Replies: 11 comments 11 replies
-
I think this paper is wicked cool. The authors observe that, although tracing and reference-counting GCs are traditionally seen as completely incomparable, optimized versions of these algorithms are actually different in a more comparable and logical way. The average language designer wants some kind of Goldilocks GC that lives between these two versions, and so it is helpful to be able to compare the two GC styles directly and make more informed choices when rolling one's own Goldilocks GC. I imagine that, when making these choices, one can then factor in language-specific information that one happens to know. For instance, does this language buck the generational hypothesis for some reason? Does it frequently allocate massive contiguous chunks of memory for some reason? Is stopping the world particularly tolerable/intolerable? I'd be curious as to what the class thinks of the optimizations that the authors tack on over sections 3 and 4. It feels somewhat important that "users" of this paper subscribe to these optimized versions, and obviously this comes at the cost of choosing some other optimization strategy that, for whatever reason, breaks the duality that this paper so carefully achieves. Does this mean that this paper is somehow trapped in amber? |
Beta Was this translation helpful? Give feedback.
-
Loved this paper. It had me thinking, "obviously tracing and reference counting are algorithmic duals --- they're solving the same problem!" sort of like Prim's and Kruskal's algorithms for finding MST. Regardless, I'm still wondering how to apply the learnings of this paper beyond appreciating it for it's theoretical contribution and the cost model which might help making design decisions creating making garbage collection systems. FWIW, those two things already seem pretty groundbreaking but I'm curious to see what ideas other people have/had (especially PL people). |
Beta Was this translation helpful? Give feedback.
-
I also really enjoyed this paper. Reframing the solution space in this way gives a very unique insight into the fundamental problems of the field in a way that few other papers are able to accomplish. I imagine this paper has fundamentally changed many people view garbage collection. This kind of ties into @5hubh4m's question, but it would be very interesting to see if any future work was able to use this new framing to develop garbage collection algorithms with different tradeoffs than those that existed before this paper was written. At a quick look I didn't see any papers along those lines, but if you imagine a somewhat continuous design space between reference counting and tracing I could see there being design points that were not fully considered before this shift in perspective. Maybe there are some specific use cases where a very specific tradeoff between latency and throughput is required. In these cases, it would be interesting to see if choosing the proper garbage collection algorithm could be viewed as choosing a specific point in the reference counting vs tracing design space. |
Beta Was this translation helpful? Give feedback.
-
This paper is really long, but the key idea is very concise and shocking to me. I have never thought about tracing and reference counting can be dual problems. Yeah, it is intuitive to “think of tracing as operating upon live objects, while reference counting operates upon dead objects”. However, the authors did not stop here, they even proposed the framework of using fix-point formulation to prove and unify those garbage collection algorithms. It seems the framework is elegant and can cover the mainstream algorithms. Since this paper was published in 2004, I also wonder what new garbage collectors emerged in these two decades, and can they be categorized into this framework? The RL-based GC [MAPL’20] mentioned by @gsvic should be an interesting read, and I believe there should be something new beyond this framework. |
Beta Was this translation helpful? Give feedback.
-
It is interesting to see how ideas can be abstracted that far in order to make connections like these (e.g. that two different GC techniques are duals). Looks like this is a common pattern in research, as this reminds me the following similar work on the duality of operating systems (message-oriented vs procedure-oriented): http://cgi.di.uoa.gr/~mema/courses/m131/papers/lauer78.pdf I liked the paper because it contained a thorough analysis on the different techniques. Taking into account that GC is configurable, I believe that this paper could give a deeper understanding also to developers who would like to modify the various GC-related tuning nobs of their language. |
Beta Was this translation helpful? Give feedback.
-
This paper provides a unique way to see the two fundamental GC approaches: tracing and reference counting, and claims that all realistic, high-performance collectors are in fact hybrids that combine tracing and reference counting, which lies on the design space this paper proposes. I'm just wondering what algorithms do realistic GCs (used in Python, Java, Lisp, ...) use, and can these algorithms also be mapped into the design space this paper proposed? Is the unified theory able to explain all the real-world GCs: their characteristics and trade-off? |
Beta Was this translation helpful? Give feedback.
-
I thought this paper presented an important insight between tracing and reference collection, which I never anticipated. Even the algorithms as formulated in figures 3 and 4 demonstrate how similar the ideas are. It is also interesting to me how they can be related by the fix point formulation of garbage collection. |
Beta Was this translation helpful? Give feedback.
-
I think this paper adds to an interesting fact in research: that is, even on a topic that has been researched for 40 years, people can still find new insights just by formulating the problem in a different way and unifying existing solutions. The real value of this paper to any follow-up research in my opinion is that with this duality, we now have a very clear and quantifiable search space on GC, and we could just optimize over this search space based on different constraints. One concern I am having is that this paper seems to ignore some aspects, including unreachable memory. Although I feel like taking these aspects into consideration might break the elegance and conciseness of this formulation, the author should still offer more discussion on these issues. |
Beta Was this translation helpful? Give feedback.
-
I found the fix point formulation of garbage collectors especially interesting, and how it relates to calling these systems "duals". I really never would have guessed that ref counting and mark and sweep are doing the same thing! It was also interesting to read a more comprehensive review of all of the ways of dealing with cycles in reference counting, as that seems like a very attractive upgrade. Also in the same vein, it was interesting to read about the "everything in between" of ref counting and tracing, as this is probably where the "best" solution lies, depending on the use case. |
Beta Was this translation helpful? Give feedback.
-
I think this paper is very useful because it provides a systematic way to reason about the tradeoffs in different design choices of a garbage collector. Other than that, it reminds me of some GC hardware acceleration paper I encountered before. As the paper points out, one major drawback of tracing is the long pause time because it only frees dead objects at the end of a collection cycle. Apart from breaking it down to smaller chunks or do it concurrently with the mutator, there's also an interest to build co-processors or accelerators to do the garbage collection. For example, this paper from ISCA'18 builds a non-intrusive co-processor to accelerate tracing GC. |
Beta Was this translation helpful? Give feedback.
-
I hope it’s not too late to chime in here, but I thought this paper was super fascinating. I had never thought of tracing and reference counting as being duals of each other before, so this paper helped me think about these algorithms differently. I especially liked the fix-point formulation section since I thought the formula presented was really elegant. This paper talked in-depth about a cost model for garbage collectors, and did a cost analysis on tracing, RC, and several hybrids. I think it would have also been interesting to also see a cost analysis on garbage collectors for widely-used programming languages to see how they compare (we might have talked about this during the discussion already). |
Beta Was this translation helpful? Give feedback.
Uh oh!
There was an error while loading. Please reload this page.
-
This is the discussion thread on A Unified Theory of Garbage Collection by David F. Bacon, Perry Cheng, V.T. Rajan.
Discussion hosted by @Yasgur99 @orkosinha
Beta Was this translation helpful? Give feedback.
All reactions