Wednesday, September 24, 2008

The Omnidebugger

We intend to use a number of different programming languages in the STEPS project. Even right now, if we only look under the hood (in what we call the "engine room"), we're already using three: Pepsi, Coke, and an OMeta-like language for parsing. Coke plays a special role among these languages, since the semantics of the other languages are defined via translation into Coke.

One thing programmers—especially Smalltalkers—really care about is debugging. Coke has recently started to get some debugging capabilities, and we can expect that these will soon be on par with (hopefully even better than) Smalltalk's. This is definitely a good thing, but we can't stop there; the other languages also need support for debugging, and making this work in under 20K LOC is not a trivial task.

Consider a JavaScript implementation on top of Coke, for instance. Our translation might map a single JavaScript statement to a group of three or four Coke expressions that must be evaluated in sequence. So if we use the Coke debugger on the code generated for a JavaScript program, its notion of “single-stepping” won’t make sense at the JavaScript source level. Similarly, inspecting the temporary variables on the stack won’t work unless JavaScript’s temps are represented directly as Coke temps, which may not be the case. Things are even worse for languages whose semantics are significantly different from Coke’s. For example, a debugger for Prolog should support all kinds of features (e.g., unification) that don’t really make sense in the Coke debugger.

The “conventional” way around these problems would be to implement a separate debugger for each language, but duplicating that much machinery per language clearly isn’t good enough for STEPS, given the code-size budget mentioned above. But what if we went with a kind of pluggable debugger architecture that allows each “Language X”-to-Coke translator to associate inspecting and debugging functionality, along with all kinds of useful meta-data, with the Coke parse trees it generates? (The Coke compiler would of course have to maintain these associations when it converts the parse trees to code.) This would enable a single debugger implementation, including its GUI, to customize itself to the language that is being debugged. It would also enable programmers to debug the same piece of code at different levels of abstraction (e.g., at the JavaScript level, hiding all “scaffolding”, or at the Coke level, in gory detail), which would be extremely valuable to language implementers.
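To make the idea a bit more concrete, here is a toy sketch in Python (not Coke, and not how STEPS actually does it). Everything in it is an assumption made up for illustration: the `Node` and `DebugInfo` classes, the `run_with_stops` function, the S-expression-looking labels, and the convention that scaffolding temps start with "t". It only shows the shape of the idea: the translator tags the first generated node of each source statement with meta-data and a variable "view", and a single stepping loop uses those tags to decide where to pause and what to show.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional, Tuple

Env = Dict[str, int]


@dataclass
class DebugInfo:
    """Hypothetical meta-data a translator attaches to a generated node."""
    language: str                 # source language, e.g. "javascript"
    source_text: str              # the source statement that begins at this node
    view: Callable[[Env], Env]    # how to present variables at the source level


@dataclass
class Node:
    """Stand-in for one generated low-level expression (not real Coke)."""
    label: str
    run: Callable[[Env], None]
    debug: Optional[DebugInfo] = None


def assign(target: str, compute: Callable[[Env], int]) -> Callable[[Env], None]:
    """Build an action that stores compute(env) into env[target]."""
    def run(env: Env) -> None:
        env[target] = compute(env)
    return run


def hide_temps(env: Env) -> Env:
    """Toy convention: names starting with 't' are translator scaffolding."""
    return {k: v for k, v in env.items() if not k.startswith("t")}


def translate_example() -> List[Node]:
    """Pretend output of a JavaScript translator for two statements."""
    js = lambda text: DebugInfo("javascript", text, hide_temps)
    return [
        Node("(set t0 a)", assign("t0", lambda e: e["a"]), js("x = a + b;")),
        Node("(set t0 (+ t0 b))", assign("t0", lambda e: e["t0"] + e["b"])),
        Node("(set x t0)", assign("x", lambda e: e["t0"])),
        Node("(set t1 (* x 2))", assign("t1", lambda e: e["x"] * 2), js("y = x * 2;")),
        Node("(set y t1)", assign("y", lambda e: e["t1"])),
    ]


def run_with_stops(nodes: List[Node], env: Env, level: str) -> List[Tuple[str, Env]]:
    """Run all nodes, recording what a debugger would show at each pause.

    level == "coke" pauses before every node and shows the raw environment;
    any other level pauses only before nodes tagged for that language and
    shows the environment through that language's view.
    """
    stops: List[Tuple[str, Env]] = []
    for node in nodes:
        if level == "coke":
            stops.append((node.label, dict(env)))
        elif node.debug is not None and node.debug.language == level:
            stops.append((node.debug.source_text, node.debug.view(env)))
        node.run(env)
    return stops
```

Calling `run_with_stops(translate_example(), {"a": 1, "b": 2}, "javascript")` pauses twice, once per JavaScript statement, with the temps filtered out; passing `"coke"` instead pauses before each of the five generated nodes and shows everything. The point is that the stepping loop itself knows nothing about JavaScript: all language-specific behavior rides along with the nodes.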

Scheme programmers often implement little DSLs using macros, so I hope that we can get some inspiration from Dave Herman's debugging library for PLT Scheme, which seems very interesting.
