-
I was unfamiliar with Smalltalk and SELF, and didn't fully understand the motivation behind prototype-based programming before reading this paper. The ability to change the behavior of an object at runtime is useful, and I now understand why SELF is a prototype-based language.

Discussion Question:
-
I thought the discussion of the approaches to optimization in Smalltalk versus SELF at the end of Section 4 raised good questions about the tradeoffs of having more or less information available to the compiler statically. The authors point to the special handling of common cases in Smalltalk-80 systems as violating the "extensible and flexible spirit" of the language. In this context it was interesting to consider how the compilation of SELF attempts to sidestep some of these issues. The crux of the approach seems to be dynamically inferring a small amount of information that enables optimization (the type of the receiver) while preserving the ability to fall back to the default slow, but flexible, handling. Since the customized compilation approach involves storing multiple versions of the compiled code, I think it would have been helpful to have more evaluation/analysis of the performance impact of incremental recompilation.

Discussion question: SELF uses maps to efficiently represent objects that belong to the same clone family, which, as the authors point out, look similar to classes but are transparent at the language level. In what situations does this transparency benefit the programmer? What is an example of concrete functionality enabled by this transparency?
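To make the map idea concrete, here is a minimal sketch (my own TypeScript illustration, not the paper's implementation) of how members of a clone family can share a single immutable descriptor while each object keeps only its assignable slot values:

```typescript
// Illustrative sketch of a shared "map": all clones of a prototype point at the
// same descriptor, so per-object storage holds only the slot values.
interface CloneMap {
  slotNames: string[];            // shared layout: which slots exist and at what offset
}

interface Obj {
  map: CloneMap;                  // shared, immutable descriptor for the whole clone family
  slots: unknown[];               // per-object storage: only the assignable slot values
}

// Cloning reuses the prototype's map, so N clones share one descriptor.
function clone(proto: Obj): Obj {
  return { map: proto.map, slots: [...proto.slots] };
}

// Slot access goes through the map to find the slot's offset.
function getSlot(obj: Obj, name: string): unknown {
  return obj.slots[obj.map.slotNames.indexOf(name)];
}

// Example: a point prototype and a clone sharing one map.
const pointMap: CloneMap = { slotNames: ["x", "y"] };
const point: Obj = { map: pointMap, slots: [0, 0] };
const p1 = clone(point);
console.log(getSlot(p1, "y"));    // 0
```

This is essentially the same trick that modern JavaScript engines call "hidden classes" or "shapes", and it is invisible at the language level, which is what the transparency question above is getting at.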
-
I found the discussion on Customized Compilation (§5.1) interesting. The way the SELF compiler generates different machine code for each possible type a method can be applied to reminds me of monomorphization in languages with parametric polymorphism (e.g., Haskell, Rust). In monomorphization, given a polymorphic function, the compiler instantiates type variables with concrete types and produces a specialized version of the function. The type prediction behavior discussed in §5.5 seems particularly intriguing, as it demonstrates how the compiler exploits the fact that certain methods are more likely to be used with specific types to generate optimized run-time tests. How did the authors determine a priori which types are more likely to be used with certain methods — did they do this in an ad hoc manner by inspecting existing Smalltalk code?

Question: When performing customized compilation / "monomorphization" as discussed above, how do we handle the tradeoff between performance and code size? (Monomorphization results in code duplication, since a copy of the same function has to be generated for different types.) The evaluation section in the paper only discusses performance and not code size — does this mean compiler designers prioritize performance over code size in practice?
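For comparison, here's a hedged sketch of the caching structure customized compilation implies: one specialized compiled version per (method, receiver map) pair, reused on later sends. The names (`compileFor`, `codeCache`) are illustrative, not from the paper:

```typescript
// Hedged sketch of customized compilation: compile one specialized version of a
// method per receiver map, and cache it so later sends with the same map reuse it.
type CompiledCode = (receiver: any, ...args: any[]) => any;

// Cache keyed by (method, receiver map); hypothetical structure, not the paper's.
const codeCache = new Map<string, CompiledCode>();

// Stand-in for the real compiler: in SELF this emits specialized machine code
// in which slot offsets and inlined primitives are baked in for that map.
function compileFor(method: string, receiverMap: string): CompiledCode {
  return (receiver, ...args) => receiver[method](...args);
}

function lookupOrCompile(method: string, receiverMap: string): CompiledCode {
  const key = `${method}@${receiverMap}`;
  let code = codeCache.get(key);
  if (code === undefined) {
    code = compileFor(method, receiverMap);  // specialize once per receiver map
    codeCache.set(key, code);                // trades code space for faster later sends
  }
  return code;
}
```

The cache makes the code-size cost in my question visible: every distinct receiver map seen at a send site can add another compiled copy of the method.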
-
My reading of this paper was handicapped by my lack of previous exposure to Smalltalk and SELF. As an example, I never quite wrapped my head around blocks: what are they for, and in what sense do they allow SELF programmers to "define their own control structures"? The authors compare blocks to closures, so maybe they are just closures. I also found the paper a little hard to read because it seemed to describe many different kinds of optimizations: some just for prototype-based languages, some for all dynamically-typed OO languages, and some for all OO languages, dynamic or static. So I was constantly re-orienting myself, trying to understand what class of languages a given optimization might apply to. All that being said, I liked the paper and I thought it offered some nice compiler optimizations. I particularly liked the first section on dynamically constructing what is essentially the class hierarchy of the running program. It made a compelling argument that all of the type/class information that programmers provide statically is basically superfluous to the runtime system.

Questions:
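On the blocks question: they are indeed closure-like, and the "define your own control structures" point just means that loops and conditionals can be ordinary methods that take blocks as arguments, rather than built-in syntax. A rough TypeScript analogue (my own illustration, not SELF syntax):

```typescript
// A user-defined "whileTrue" control structure: just a function applied to two
// closure-like blocks, much like SELF/Smalltalk blocks passed in a message send.
function whileTrue(condition: () => boolean, body: () => void): void {
  while (condition()) {   // the host loop stands in for SELF's recursive sends
    body();
  }
}

// Usage: the loop is library code, not language syntax.
let i = 0;
whileTrue(() => i < 3, () => {
  console.log(i);
  i++;
});
```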
-
I found this paper interesting mainly because JavaScript is a prototype-based language, and having dealt with all the quirks of JS for many years now, it was interesting to see the arguments for prototype-based languages and the thought going into optimizing them. I do think it is worth looking into optimizing them (simply because of how ubiquitous JavaScript is), but I have to say that as a whole I am not the biggest fan of prototype systems in general. My main critique is that the paper seems to be solving a very specific problem (optimizing the SELF compiler) and I am not sure that most of the lessons learned here are that applicable to more widely used languages. Many of the optimizations mentioned (like JIT compilation of functions for specific types dynamically) also seemed not so revolutionary, though I think that is mainly because the paper is from 1989, so I'll cut it some slack. As a whole I was not particularly surprised or excited by this paper, though I'm not sure how much of that is my distaste for prototype languages and how much is just that the paper is old and many of the things discussed are now fairly ubiquitous (maybe a sign of its success?).

Discussion Question: Prototype-based languages provide a very high level of flexibility to users, allowing not only dynamic type systems but dynamic data/struct layouts and more. This lack of structure forces compiler engineers to work with less information, which naturally prevents many opportunities for optimization. At the same time, increasing the dynamic runtime capabilities available to the programmer also correlates heavily with a lack of application security (prototype pollution, etc.), since fewer constraints on program behavior leave developers with fewer invariants to rely on. Where should language developers draw the line between flexibility and performance/security? If (as the paper states) it is extremely rare to create multiple clone families, then why even allow the developer to modify object layouts at runtime? What are some ways to minimize restrictions and load on developers while maintaining performance guarantees in dynamically-typed scripting languages?
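Since prototype pollution comes up here (and again later in the thread), here is a minimal TypeScript sketch of the attack, assuming a naive recursive merge helper (`naiveMerge` is hypothetical, not from any particular library):

```typescript
// A recursive merge that does not guard against the "__proto__" key ends up
// mutating Object.prototype, so attacker-controlled input changes the behavior
// of completely unrelated objects.
function naiveMerge(target: any, source: any): any {
  for (const key of Object.keys(source)) {
    if (typeof source[key] === "object" && source[key] !== null) {
      if (target[key] === undefined) target[key] = {};
      naiveMerge(target[key], source[key]);   // no check for "__proto__" here
    } else {
      target[key] = source[key];
    }
  }
  return target;
}

const payload = JSON.parse('{"__proto__": {"isAdmin": true}}');
naiveMerge({}, payload);
console.log(({} as any).isAdmin);  // true: every plain object now inherits the flag
```

The freedom to rewrite an object's parent chain at runtime is exactly what makes this class of bug possible, which is the flexibility-versus-security tension in the question above.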
-
Critique: Like many others in this thread, I found this paper a little bit hard to get through this week; to be honest, I was mostly interested in it as a relic of an earlier era of CS research. I'm not familiar with Smalltalk, which made it a bit hard to situate myself in the specific improvements of SELF over Smalltalk, but it was interesting to see how new the field of dynamically typed languages was (I gather this from the way they were talking about it), to the point that this analysis of optimizations, on a language that is not really used anymore today, was so important at the time.

Discussion Question: In the conclusion, the authors mention: "Our techniques are not restricted to SELF; they apply to other dynamically-typed object-oriented languages like Smalltalk, Flavors, and CLOS. Many of our techniques could even be applied to statically-typed object-oriented languages like C++ and Trellis/Owl." I wonder whether the techniques were indeed applied to other languages (perhaps some still in modern use) based on this paper? This might help people see the worth behind this paper (I've seen some uncertainty around this topic in the thread) and answer the question, "why should we care?"
-
I think this was probably the hardest paper to understand that we've read so far. I'm familiar with JavaScript, another prototype-based programming language, but it was still quite hard to wrap my head around the features of SELF described in the paper and map them to JavaScript concepts. The descriptions of how messages are sent to objects actually tripped me up a bit, because I was thinking of messages as in distributed systems, rather than the method calls on objects that SELF messages are actually much closer to. A lot of the paper seemed very specific to the SELF compiler, and I wasn't immediately seeing how these optimizations could be applicable to prototype-based languages in general, although maybe this was because I was trying to find a JavaScript equivalent for everything I didn't understand in the paper.

My discussion question comes from the section of the paper outlining how the SELF compiler supports source-level debugging. From reading that section, and from experience with C/C++, it seems clear that there is an inherent trade-off between code optimization and ease of producing useful debugging information, at least to the level of adding significant complexity to debuggers operating on optimized code. However, is the utility gained from a debugger being able to work on JIT-optimized code worth the added complexity/reduced functionality versus a debugger that just works by interpreting the code line by line? In what cases would the former be more useful than the latter?
-
The wiki page about SELF was very useful and helped put the paper in some context. For me, it is often the case with older papers that I simply do not understand what the authors are talking about. It was very validating to see others describe the paper as 'hard to read' and 'hard to relate to'; I feel the same about it. It surprised me that the language is still alive and had its latest release in 2024. Lately, I have been thinking that I do not like untyped languages. It makes it harder for me to write or read code if I cannot easily tell what the type of the arguments or returned objects is. I wonder how others feel about it. When is it preferable to use typed vs. untyped languages?
-
Overall, it was nice to see how the ideas in the paper came to be widely adopted in modern JIT compilers for dynamic languages such as JavaScript, a prototype-based language. The compiler optimizations were interesting, particularly how the SELF compiler minimizes the space usage of clones derived from the same prototype by using clone families. The authors use maps as an implementation technique to efficiently represent members of a clone family, and their reasoning for this is pretty clear: maps allow multiple objects to share a single metadata structure, and their immutability further enhances their utility by ensuring that changes to one object do not inadvertently affect others in the clone family. The authors also discuss message inlining and splitting as part of their compiler optimizations. As they mention, message inlining is helpful because it eliminates the overhead in SELF of using message passing to access variables. While message inlining improves performance by reducing runtime overhead, it relies heavily on accurate type information and can lead to substantial code growth at compile time. I would suppose that this trade-off between speed and memory usage may create challenges in memory-constrained environments, but maybe there are already workarounds to this that I am unaware of.

Discussion questions:
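To illustrate the message-inlining point above, here is a small hedged sketch (my own TypeScript illustration with made-up names like `POINT_MAP`; SELF does this at the machine-code level) of turning a variable-access send into a map-checked direct load with a fallback to the general lookup:

```typescript
interface Obj { mapId: number; slots: number[]; }

const POINT_MAP = 1;     // hypothetical map identifier for the point clone family
const X_OFFSET = 0;      // offset of the 'x' slot in that map

// General path: full dynamic lookup on every send (always correct, but slow).
function dynamicLookup(obj: Obj, slot: string): number {
  return slot === "x" ? obj.slots[X_OFFSET] : NaN;  // stand-in for real lookup machinery
}

// What the compiler might emit once it knows (or predicts) the receiver is a point:
function sendX_inlined(obj: Obj): number {
  if (obj.mapId === POINT_MAP) {
    return obj.slots[X_OFFSET];     // the send becomes a map check plus a direct load
  }
  return dynamicLookup(obj, "x");   // fall back to the general, flexible path
}

console.log(sendX_inlined({ mapId: POINT_MAP, slots: [3, 4] }));  // 3
```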
-
I appreciate the background the authors provide on SELF. I didn't understand some parts of it, but I think I understood enough to get why it's a good candidate for the kind of dynamic compilation the authors implement. I'm not really that familiar with JavaScript, so I didn't know about prototypes, which I guess made it harder to understand the language. But basically I was thinking of it vaguely as an object-oriented programming model that doesn't use classes, but has to do a similar dynamic dispatch/lookup thing? I did think it was pretty crazy that they doubled the compiler's performance over the existing one -- I can't imagine a paper coming out today that suddenly makes a mainstream-ish language twice as fast as it was just before. It also reminded me of Proebsting's law from the first lecture, lol. I also thought the authors raised an interesting point when comparing their compiler to a C compiler. They attribute a lot of the slowdown in their compiler to the fact that they have poorer implementations of standard compiler operations like register allocation and peephole optimizations. In this way, it seems pretty unfair to compare their compiler, developed by three people, to a mature C compiler that is faster at least partly because of the sheer person-hours that have gone into it. They attribute some of the slowdown also to the difference in semantics between SELF and C; I kind of wonder what the point of this apples-to-oranges comparison is.

This paper also made me wonder about the relationship between designing and implementing a language. Allegedly SELF has all this funky message-passing business going on because it makes the language simpler (is that true?); even if it were true, when might it be worth it to trade language performance for expressiveness? When do we want a language that might be easy to program in, but is slower? Where might we draw the line here? Should we design language semantics with an implementation in mind?
-
The paper seems to downplay the benefits of static typing. It feels like there is a trade-off here regarding language expressiveness. These techniques claim to achieve efficiency without compromising the language, but I am a bit skeptical. It is hard to ignore cases where the design architecture calls for object structures that may change at runtime. How could this trade-off between implementation flexibility and development difficulty be measured? Is there a point where the overhead becomes too large to be worthwhile?
-
This was a fun read! As others have mentioned, it took me a while to get through the paper, and in particular Section 5; it's been a while since I last saw Smalltalk and pure message-passing-style code, and it took me a while to get familiar with the examples presented in the paper. Personally, I love types (and while I'm not an enthusiastic fan of statically typed programming languages, I'm more of the perspective that they provide good enough benefits that I can deal with their quirks). I know that they are sometimes annoying to work with, and the compiler sometimes gets in the way of things just working (like writing code that I know will not throw a null pointer exception and still having to fight my way with the Kotlin compiler to let it know that I promise this field won't be null, and if it is, it's okay to throw an exception). But they also enable us to model systems in a way that makes reasoning about the behavior of a system easier, and they can help you catch silly mistakes quickly without having to actually run the program. From the paper I got the sense that there is an argument that static type systems are mostly for the benefit of programmers and that the runtime doesn't need much type information to be efficient. And while I understand where this perspective comes from, I believe that types can also be pretty useful for optimization purposes, like we saw with TBAA and in the class discussion about mutable vs. immutable types and the optimizations they enable. I think that types are useful for humans, but I also think that they can help optimize programs.

Discussion question: Static variable scoping has become the norm for most programming languages, as it can help to write programs that are easier to reason about, and it is now uncommon to find languages that use dynamic variable scoping; but the same is not true for statically typed vs. dynamically typed programming languages. It seems that we've been able to build large, complex systems in both of them, and there are even some languages that are mainly statically typed but allow dynamic types in some areas (like C#, which helps avoid visitor patterns by using dynamic types). It also seems that we can build performant systems in both paradigms. Is that how we expect type systems to continue evolving, with a mix of both static and dynamic typing? Or is there a future in which one of them tips the balance (like static scoping did)? Perhaps there's a middle ground where better type-inference systems help move the needle one way or another?
-
Critique: Like others mentioned, some of the concepts and references in this paper were a bit difficult to follow, which made it hard to fully appreciate some of the points and contributions being made. The background on SELF was helpful though, and there were some cool ideas, like the tagged pointer representation, that seemed pretty clever. I think it was cool to read about how the SELF compiler tackles challenges arising from the lack of static types and from the message-passing structure for accessing variables, like using message inlining to avoid the overhead of sending certain messages. Some of the sections, like Section 5.4 on message splitting and Section 5.5 on type prediction, felt like simple extensions of existing compilation strategies (optimizing the hot path), unless these ideas were not as common when this paper was written.

Discussion Question: The paper discusses many interesting compilation techniques and clever workarounds in a context that is specific to SELF and its compiler. However, some ideas, like message inlining, seem valuable for other languages with dynamic dispatch, like virtual method calls. What are some other ideas from this paper that could inspire optimizations in modern dynamic language runtimes, even if they don't follow the SELF or prototype model strictly?
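On the tagged representation: a tiny sketch of the general idea (my own TypeScript illustration; the actual SELF scheme works on raw machine words with low-order tag bits):

```typescript
// Minimal sketch of low-bit tagging: the low two bits say what a word holds,
// so checking "is this an integer?" is a single mask-and-compare.
const TAG_BITS = 2;
const TAG_MASK = 0b11;
const INT_TAG = 0b00;     // small integers: value shifted left, tag 00
const REF_TAG = 0b01;     // heap references would carry a nonzero tag

const tagInt = (n: number): number => (n << TAG_BITS) | INT_TAG;
const isInt = (w: number): boolean => (w & TAG_MASK) === INT_TAG;
const untagInt = (w: number): number => w >> TAG_BITS;

// A generic '+' can take the fast path when both operands are tagged integers;
// with tag 00, the tagged words can even be added directly without untagging.
function genericAdd(a: number, b: number): number {
  if (isInt(a) && isInt(b)) return a + b;   // (x<<2) + (y<<2) === (x+y)<<2, tag stays 00
  throw new Error("slow path: dispatch the '+' message dynamically");
}

console.log(untagInt(genericAdd(tagInt(3), tagInt(4))));  // 7
```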
-
Like others, I wasn’t familiar with SELF or prototype-based languages, so this paper was difficult to understand. One of their arguments I’m not sure I fully agree with is the claim that “Researchers seeking to improve performance should improve their compilers instead of compromising their languages.” While the idea of preserving flexible languages is appealing, I think the downside is that it places a huge burden on compiler infrastructure, which can become overly complex and hard to maintain. Not every project has the resources to build or rely on advanced compilers, and pushing all optimization responsibility into tooling seems like a bottleneck. I think small compromises in the language, like restricting certain dynamic features, can make it easier to write correct and fast code, rather than relying on the compiler to figure everything out.

The paper highlights SELF’s dynamic inheritance (the ability for objects to change their parents at runtime) as a useful object-oriented feature. At the same time, this flexibility allows for security vulnerabilities like prototype pollution, so is the flexibility of dynamic inheritance worth the risk of enabling such bugs?
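For readers unfamiliar with dynamic inheritance, a small sketch of what "changing parents at runtime" looks like in a JavaScript-style prototype model (my own hypothetical example in TypeScript, not SELF syntax):

```typescript
// The same object switches which parent it delegates to at runtime, changing the
// behavior it inherits without being reallocated or copied.
const dormantState = {
  describe(this: { name: string }) { return `${this.name} is dormant`; },
};
const activeState = {
  describe(this: { name: string }) { return `${this.name} is active`; },
};

const machine = Object.create(dormantState) as { name: string; describe(): string };
machine.name = "pump-1";
console.log(machine.describe());          // "pump-1 is dormant"

// Reparenting: the same object now delegates to a different parent.
Object.setPrototypeOf(machine, activeState);
console.log(machine.describe());          // "pump-1 is active"
```

This is also the capability that prototype pollution abuses, which is what makes the flexibility-versus-risk question above a real design tension.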
-
Critique: This paper was among the earliest efforts to efficiently implement a dynamically-typed OOP language. I appreciate its impact on later JIT design in the Java VM and in other prototype-based languages like JavaScript. Their emphasis on simplicity of design and on message passing somehow reminds me of microkernels, which also adopt IPC as the main mechanism behind system calls. Simple designs do make things easier to reason about, but their level of abstraction may put an extra burden on the user of the tool. The concept of maps, which they introduced to reduce memory usage per object, reminds me of the key-sharing dictionaries of PEP 412 for CPython, which were proposed to share the common key table across instance dictionaries of the same class.

Discussion Question: What's the difference between prototype-based and class-based design? How does each solve common OOP problems like inheritance and polymorphism?
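To make the discussion question concrete, a rough TypeScript sketch (my own, not from the paper) of the same inheritance and polymorphism expressed both ways:

```typescript
// Class-based: behavior lives in a class; instances are created from it.
class Animal {
  speak(): string { return "..."; }
}
class Dog extends Animal {
  speak(): string { return "woof"; }   // polymorphism via overriding
}
console.log(new Dog().speak());        // "woof"

// Prototype-based: behavior lives in ordinary objects; "inheritance" is
// delegation to a parent object, and "subclassing" is cloning plus overriding.
const animal = { speak(): string { return "..."; } };
const dog = Object.create(animal);     // dog delegates unknown lookups to animal
dog.speak = () => "woof";              // override by adding a slot on the clone
console.log(dog.speak());              // "woof"
```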
-
Discussion thread for An Efficient Implementation of SELF, a Dynamically-Typed Object-Oriented Language Based on Prototypes
Discussion leads: @ananyagoenka @smd21