Any hard data on GC vs explicit memory management performance?

0 votes
asked Apr 16, 2009 by carl-seleborg

I recently read the excellent article "The Transactional Memory / Garbage Collection Analogy" by Dan Grossman. One sentence really caught my attention:

In theory, garbage collection can improve performance by increasing spatial locality (due to object-relocation), but in practice we pay a moderate performance cost for software engineering benefits.

Until then, my feeling had always been very vague about it. Over and over, you see claims that GC can be more efficient, so I always kept that notion in the back of my head. After reading this, however, I started having serious doubts.

As an experiment to measure the impact on GC languages, some people took some Java programs, traced the execution, and then replaced garbage collection with explicit memory management. According to this review of the article on Lambda the ultimate, they found out that GC was always slower. Virtual memory issues made GC look even worse, since the collector regularly touches way more memory pages than the program itself at that point, and therefore causes a lot of swapping.

This is all experimental to me. Has anybody, and in particular in the context of C++, performed a comprehensive benchmark of GC performance when comparing to explicit memory management?

Particularly interesting would be to compare how various big open-source projects, for example, perform with or without GC. Has anybody heard of such results before?

EDIT: And please focus on the performance problem, not on why GC exists or why it is beneficial.



PS. In case you're already pulling out the flame-thrower: I am not trying to disqualify GC, I'm just trying to get a definitive answer to the performance question.

14 Answers

0 votes
answered Jan 14, 2009 by soru

One pragmatic issue is that with explicit MM it is generally a lot easier to profile, identify the bottleneck, and resolve it.

With a GC system, when your nice O(N) code turns out to trash the GC in a pathological way that makes it O(heap size), it can be harder to work out what is going wrong. Sometimes even as hard as fixing a memory corruption.

0 votes
answered Jan 16, 2009 by andrew-hare

It is the fact that developers are human and miss things that caused the need for garbage collectors in the first place. With that being said let me say that garbage collection will always be slower than perfect explicit memory management. And garbage collection can often be faster than imperfect explicit memory management given the fact that garbage collectors clean up the things that developers tend to forget.

0 votes
answered Jan 16, 2009 by huaghaguah

In theory, a well profiled program can inform an intelligent GC subsystem to attain the described speedups over manually memory management. These speedups may not be visible without long runtimes, to amortize the fixed startup cost of GC.

In practice, you will likely NOT realize these speedups with present-day GC implementations. Furthermore, you will NOT get a definitive answer, because there will always be pathologically bad scenarios for both cases.

0 votes
answered Jan 16, 2009 by zifre

In theory, GC may be faster in some cases, but I have never seen that, and I doubt I ever will. Also, C++ with GC such as the Boehm GC will probably always be slower because it is conservative. With all the pointer fiddling in C++, the GC has to pretend everything is a pointer. With a language like Java, it can know exactly what is and isn't a pointer, so it may have the potential to be faster.

0 votes
answered Jan 19, 2009 by brian

As @dribeas points out, the biggest 'confound' to the experiment in the (Hertz&Berger) paper is that code is always written under some 'implicit assumptions' about what is cheap and what is expensive. Apart from that confound, the experimental methodology (run a Java program offline, create an oracle of object lifetimes, instrument back in the 'ideal' alloc/free calls) is actually quite brilliant and illuminating . (And my personal opinion is that confound does not detract too much from their results.)

Personally, my gut-feel is that using a GC-ed runtime means accepting a factor-of-three performance hit to your application (GC'd will be 3x slower). But the real landscape of programs is littered with confounds, and you'd be likely to find a huge scatterplot of data if you could perform the 'ideal' experiment on lots of programs across many application domains, with GC sometimes winning and Manual often winning. (And the landscape is continually changing - will the results change when multicore (and software designed for multicore) is mainstream?)

See also my answer to

Are there statistical studies that indicates that Python is "more productive"?

which has the thesis that "due to so many confounds, all evidence about software engineering is anecdotal".

0 votes
answered Jan 19, 2009 by brian

See also

which discusses the "sufficiently smart" compiler. The landscape of CS/software is riddled with ideas/techiques which can be in theory more performant than the status-quo. But it's all snake oil.

GC is expensive today, and may always be.

0 votes
answered Jan 19, 2009 by brian

Side note: another interesting experiment to run, that I haven't seen people try, is to compare with just leaking. Call alloc and never free. It's an interesting alternative.

Except for long-running server apps, you'll never run out of memory in practice, the OS will just start using disk for virtual memory (machines have effectively infinite memory, up to the limits of virtual address space, which I think is huge now with 64-bit machines). This highlights that a GC is nothing more than a device for improving locality. Leaked/dead objects don't "hurt" when you have infinite memory, except that memory comes in hierarchies and you want to keep the 'live' objects nearby and fast and the 'dead' objects on the faraway/slow memory. If each object was allocated on a different page, then the OS virtual memory system would effective be a GC.

0 votes
answered Jan 19, 2009 by barry-brown

GC will always be slower than the extreme alternative: perfect, non-deterministic memory management.

The questions are:

  • Are the differences large enough to quibble about?
  • Are the drawbacks of one technique enough to cause us to seriously consider the other?

There are other areas in which managed subsystems have won out over unmanaged ones:

In general, a program will always run slower on a multitasking operating system than on a uni-tasking one -- or a computer with no OS.

In general, a program will always run slower on a system with virtual memory than on one without.

Except in extreme circumstances, do we seriously consider computer systems without VM and without an OS?

0 votes
answered Jan 27, 2009 by norman-ramsey

Berger's paper is being cited a lot, but it is comparing real garbage collectors against a purely theoretical, offline, optimal algorithm. So while it may tell you something about theoretical limits, it says very little about the performance of real garbage collectors versus real implementations of malloc and free. A study that I like better took real programs and compared explicit malloc and free with Hans Boehm's conservative garbage collector:

The Measured Cost of Conservative Garbage Collection by Ben Zorn

This study isn't perfect, and Zorn is careful to note that if the programs knew they were using a garbage collector, some could be made faster. But the hard data is this: - In real programs originally written to use malloc and free, garbage-collected versions run at about the same speed but require twice as much memory. Zorn argues fairly convincingly that if you know you have GC, you can make things faster, but it's hard to eliminate the memory penalty.

I learned more from this careful experimental study than from Berger's study of an unimplementable, idealized memory manager.

0 votes
answered Apr 16, 2009 by jasper-bekkers

The cost of memory allocation is generally much lower in a garbage collected memory model, then when just using new or malloc explicitly because garbage collectors generally pre-allocate this memory. However, explicit memory models may also do this (using memory pools or memory areas); making the cost of memory allocation equivalent to a pointer addition.

As Raymond Chen and Rico Mariani pointed out, managed languages tend to out perform unmanaged languages in the general case. However, after pushing it, the unmanaged language can and will eventually beat the GC/Jitted language.

The same thing is also evident in the Computer Language Shootout because even though C++ tends to rank higher than Java most of the time, you'll often see C++ implementations jumping trough various hoops (such as object pools) to achieve optimal performance. Garbage collected languages, however, tend to have easier to follow and more straight forward implementations because the GC is better at allocating (small chunks of) memory.

However, performance isn't the biggest difference when it comes to GC vs non-GC; arguably it's the deterministic finalization (or RIIA) of non-GC (and reference counted) languages that is the biggest argument for explicit memory management because this is generally used for purposes other than memory management (such as releasing locks, closing file or window handles et cetera). 'Recently' however C# introduced the using / IDisposable construct to do exactly this.

Another problem with garbage collection is that the systems they use tend to be rather complex to prevent memory leaks. However, this also makes it way more difficult to debug and track down once you do have a memory leak (yes, even garbage collected languages can have memory leaks).

On the flip side, the garbage collected language can do the most optimal thing at the most optimal time (or approximately) without having to burden the developer with that task. This means that developing for a GC language might be more natural, so you can focus more on the real problem.

Welcome to Q&A, where you can ask questions and receive answers from other members of the community.
Website Online Counter