Why You’re Wrong If You Think Java Is Slow And C Is Fast (Or Vice-Versa)

I’ve been meaning to write something like this for a while. This post isn’t for you experts: you’ll understand all of this already. I’m noting it here because I haven’t yet seen these claims explained, debunked and critiqued in an easily accessible format.

Here follows a selection of YouTube comments I have found recently.

Oh, please. The user experience is that if the program is written in Java it is likely to crash. The reasons for that may be interesting to dissect, but it does not change the fact that the user experience is that Java programs crash. And they are slow, too. Again, a post mortem might be interesting. But it doesn’t change the fact that Java programs are slow.

For anybody that has ever written C and C++, it’s clear that you know nothing about C/C++. Performance difference between C and C++ is negligible, and both are at least 300% faster than Java, in general any of the two is around 2000% faster than Java. Hell!! Even Python is faster than Java, and i strongly dislike Python.

Oh, a hardcore Java hater. You are simply ignoring the fact that in the posted source Java performs better than Python in every single benchmark.

I’ve seen this repeated a lot across the web. Newcomers to programming seem to thoroughly enjoy pitching their relative camps and arguing about the advantages and disadvantages of this or that language. Heck, I used to do it once upon a time. I was wrong then, and they’re still wrong now.

The Bigger Sword Phallacy Fallacy

It’s common and tiring to hear such arguments. The most frequent point of contention appears to be the relative speed of 21st Century languages like Java (admittedly pre-2000), Python and C# versus 20th Century languages like C, C++ and assembly.

I assume this all stems from the illusion that programming – and particularly the quality of the software produced – derives entirely from the tools used. This is particularly common among those who approach software development not from an abstract, mathematical position but from a practical one. Rather than attempt to critique the details and implementation of code, it’s far easier to simply slander the tools used to make it – particularly if your knowledge of programming is limited.

Often, this form of rotten debate is only amplified by the fact that those arguing each side often have no experience of the other, and hence suffer from Anchoring Bias.

In truth, the software a programmer produces is only as fast, efficient and maintainable as their own abilities permit. Bad code will always be bad code, no matter what language it is written in.

So What Is A Programming Language Anyway?

A programming language, in its purest form, is a set of rules that describes a logical program. We could get into mathematical definitions – the Chomsky–Schützenberger hierarchy, formal grammars, etc. – but this isn’t useful for practical arguments. At the end of the day, a programming language is a specification, not a platform. This misunderstanding – and the frequent inability to differentiate between these two things – lies at the root of erroneous arguments about language superiority.

Compilers, interpreters and assemblers often produce intermediate code that acts as a halfway-house or a common denominator between languages and platforms. For example, GCC makes use of GIMPLE, an intermediate program representation, to translate between different languages, platforms and output forms. It’s often useful to think of these intermediate forms as platforms in their own right, not as components of a language, because they aren’t part of the specification. For example, Java is usually translated into Java bytecode. This does not necessarily mean Java has performance issues, because it is possible to skip this step and compile straight to native machine code (the GNU Compiler for Java (GCJ) does this).

It’s important to separate criticism of language syntax and paradigms from criticism of platforms, compilers and intermediate forms. This post will mainly deal with the latter; I’m by no means an expert on language syntax, and don’t aim to lecture on a subject I would be uncomfortable talking about with any authority.

Misinterpreting Interpretation

A common argument used to justify the claims of efficiency and speed attributed to lower-level languages like C and C++ is that other languages are ‘interpreted’. This vague and rarely explained term is thrown around like some gospel evidence of a language’s inherent inferiority.

In reality, interpretation covers a vast range of mechanisms that allow a program’s execution. Often interpreters can actually compile a program at run-time before execution into native machine code, a method that will often generate faster and more memory-efficient executable code than that produced by some C and C++ compilers. Here follows a quick summary of just some interpretation and compilation techniques.

Real-Time Interpretation / Just-In-Time Interpretation (JIT)

This method involves reading the source code of a language and translating it line-by-line, as the code is run, into a set of instructions that can be executed by the computer. Needless to say, this method of execution is usually very slow. This was usually done for older programming languages such as BASIC.
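To make the idea concrete, here is a toy sketch of a line-by-line interpreter for a made-up three-instruction mini-language (the language and its instruction names are invented for illustration; no real interpreter is this simple):

```python
# A toy line-by-line interpreter for an invented mini-language.
# Each line is parsed and executed on the spot, every time it is reached --
# which is why this style of interpretation tends to be slow.

def interpret(source, env=None):
    env = {} if env is None else env
    for line in source.splitlines():
        tokens = line.split()
        if not tokens:
            continue
        op = tokens[0]
        if op == "SET":            # SET x 5   ->  x = 5
            env[tokens[1]] = int(tokens[2])
        elif op == "ADD":          # ADD x y   ->  x = x + y
            env[tokens[1]] += env[tokens[2]]
        elif op == "PRINT":        # PRINT x
            print(env[tokens[1]])
        else:
            raise ValueError(f"unknown instruction: {op}")
    return env

program = """
SET a 2
SET b 3
ADD a b
PRINT a
"""
interpret(program)   # prints 5
```

Note that every line is re-parsed each time it is reached; in a loop, that parsing cost is paid over and over, which is exactly where this technique loses its speed.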

A variant on this method, which usually makes for more uneven performance (but better optimisation of performance-critical code), is to translate entire units of code (such as functions, classes and modules) into a low-level form as and when the units are required by the program. I believe that the current version of CPython (the reference, and most popular, Python interpreter) makes use of this method (optionally combined with the Bytecode Interpretation mentioned later).

Ahead-Of-Time Compilation (AOT)

AOT compilation is perhaps closer to ‘pure’ compilation than the other forms of interpretation. It involves translating an entire program into native machine code before execution begins – typically when the program is launched. This makes for extremely fast code, but comes at the expense of a slow start-up time.

Bytecode Interpretation

Bytecode is an intermediate form between a program’s source code and native machine code. It is a low-level code, usually consisting of extremely simple instructions, that is interpreted at run-time by an interpreter. Bytecode must be compiled from the original source code, but may then be distributed across many different platforms. The advantages of this method are that it requires a considerably simpler (and hence faster) interpreter, and that most optimisation can be performed before the program is distributed, improving both start-up times and performance. Most Java Virtual Machines (JVMs) make use of this method, as did the Dalvik virtual machine shipped with older versions of Android.
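The core of a bytecode interpreter is a dispatch loop: fetch an opcode, act on it, repeat. Here is a minimal sketch using an invented four-opcode stack machine (PUSH, ADD, MUL, HALT are made up for this example – real JVM or Dalvik bytecode is far richer):

```python
# A toy stack-based bytecode interpreter with an invented instruction set.
# Because the instructions are so simple, the interpreter itself stays
# small and fast -- the main advantage of interpreting bytecode rather
# than raw source code.

PUSH, ADD, MUL, HALT = range(4)

def run(bytecode):
    stack = []
    pc = 0                         # program counter
    while True:
        op = bytecode[pc]
        pc += 1
        if op == PUSH:             # the next cell is an immediate operand
            stack.append(bytecode[pc])
            pc += 1
        elif op == ADD:
            b, a = stack.pop(), stack.pop()
            stack.append(a + b)
        elif op == MUL:
            b, a = stack.pop(), stack.pop()
            stack.append(a * b)
        elif op == HALT:
            return stack.pop()

# (2 + 3) * 4, already 'compiled' into bytecode
program = [PUSH, 2, PUSH, 3, ADD, PUSH, 4, MUL, HALT]
print(run(program))    # prints 20
```

The expensive work of parsing source code has already happened when the bytecode was produced; at run-time only this tight loop remains, and the same bytecode list could be shipped to any machine with an interpreter for it.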

Bytecode Compilation / Dynamic Recompilation

This method has the same advantages as the previously mentioned Bytecode Interpretation. However, instead of interpreting the bytecode as-is, it additionally translates the generic bytecode into native machine code and executes that. Additional optimisations can be made throughout this process, and the result is extremely fast executable code with relatively low start-up times. This is the method employed by Android’s newer Android Run-Time (ART).
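The ‘translate once, then execute the translation’ idea can be sketched in miniature, too. Here an invented stack-machine bytecode (PUSH, ADD, MUL, HALT – not real ART or JVM opcodes) is translated a single time into a compiled Python expression; after that, running the program never touches the bytecode again. This is only an analogy for recompilation, not how ART actually works:

```python
# A toy 'recompiler' for an invented stack-machine bytecode: instead of
# interpreting the bytecode on every run, we translate it once into a
# Python expression, compile that, and thereafter execute the compiled
# result directly.

PUSH, ADD, MUL, HALT = range(4)

def compile_bytecode(bytecode):
    stack = []                      # a stack of expression strings
    pc = 0
    while bytecode[pc] != HALT:
        op = bytecode[pc]
        pc += 1
        if op == PUSH:
            stack.append(str(bytecode[pc]))
            pc += 1
        elif op == ADD:
            b, a = stack.pop(), stack.pop()
            stack.append(f"({a} + {b})")
        elif op == MUL:
            b, a = stack.pop(), stack.pop()
            stack.append(f"({a} * {b})")
    code = compile(stack.pop(), "<bytecode>", "eval")  # translate once
    return lambda: eval(code)                          # run many times

# (2 + 3) * 4
fn = compile_bytecode([PUSH, 2, PUSH, 3, ADD, PUSH, 4, MUL, HALT])
print(fn())    # prints 20
```

The one-off translation cost is paid at (or before) first execution; every call to `fn` afterwards skips the dispatch loop entirely, which is where recompilation earns its speed.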

‘Pure’ Ahead-Of-Time Compilation

This is the method most commonly meant by the term ‘compilation’. It involves compiling source code directly into native machine code ready to be distributed. Obviously, this results in extremely fast executable code with no start-up time (other than that required to load the executable into memory and initialise data structures). The disadvantages are also obvious. Since this method involves distributing native machine code, the distributed executable is platform dependent. If the code is not distributed for a particular platform, the only way it can be run is through hardware emulation – a very different topic altogether.

A Bad Worker Blames Their Tools – Or The Tools Of Other People

The world of software and computers is complex, confusing and chaotic. There are few practical standards, despite what ISO would like you to believe. An immense variety of platforms, methodologies and software requirements demands an equally broad variety of tools if programmers are to make headway amongst the fragmented mess that is the modern technology sector.

To claim that one language is inherently superior and that others are redundant is the digital equivalent of claiming that toolboxes aren’t really useful and that really we should only ever use hammers for everything. It’s ridiculous, naïve, and only squashes the genuine criticism of methodologies and techniques that the industry so desperately needs.

Happy Coding! – Whether you’re using C, Java, C++, Python, Javascript, Rust, Ruby, Ada, BASIC, C#, Swift, Objective-C, CoffeeScript, Clojure, Haskell, Delphi, BASH, Visual Basic, Lua, OCaml, Scala, Go, R, F#, Lisp, MATLAB or any other.
