How we talk when we talk about programming languages
Posted by Elf Sternberg as programming
Last year, I worked my way through Lisp In Small Pieces, implementing several variants of the Lisp engine Queinnec described, most of them in Coffeescript. I’ve decided this year that I’m going to continue building out my language experience and write a scripting language. I’m still working out the details, but I’m working my way now through Scott’s Programming Language Pragmatics while reading a lot of extra stuff on the side.
One of the gems I found recently was a copy of the Smalltalk-80 Implementation Guide. It is, to say the least, a fascinating book. It introduces a virtual machine in the context of programming in 1980, which was 37 years ago, and the assumptions the authors make as to what I’m expected to know as a potential developer of a Smalltalk environment fascinated me.
The book describes a stack-based, tree-walking virtual machine. As Smalltalk is a purely object-oriented language, the tree is completely reified by the OO environment; you choose an object and a function to start, and the interpreter walks that function’s statements and subroutines until execution ends at the end of the start function. Methods are dispatched according to a lookup table, and primitives are encapsulated in a large switch() statement.
What fascinates me about reading this is that I know, at least theoretically, so much more about writing VMs. The VM described here is a nightmare of cache misses and branch prediction failures; fully a third of a modern CPU’s efforts will be going into loading, evaluating, and then disposing of anticipated operations that the VM will throw away unused. There’s no mention at all of JITting the program’s functions (converting long sequences directly into machine code) or even just its spine (writing the sequence of primitives as indirect, or even direct, unconditional branches such that the CPU’s branch predictor works most of the time; direct is cool but requires the primitives themselves be copied and modified for direct uncoditional branching back to the spine).
Since in Smalltalk almost any object could be a launching point and since relationships between objects can be changed with the swipe of a mouse, the speed with which we’d have to recalculate any given set of relationships would have to be a priority. It would have to have Go’s internal speed of recompiling, only to put the results into an executable anonymous mmap()’d space, and then map everything than needs it to be able to find that code.
A modern Smalltalk VM could be a marvel to use. It would be a nightmare to write.
Be that as it may, choosing to write programming languages today, even ones on top of a system language like C or Rust, is to confront an embarrassment of riches. There are so many interesting ways to do something now. We have JITs! We have Algorithm W! We have concurrency problems like you wouldn’t believe! 37 years ago, they had two options: compiled code, or a stack-based switch statement on top of compiled code. The writers assumed you knew more about those two ways, and absolutely nothing about all the other stuff.