The theory of so-called pure type systems makes precisely such an identification. This paper describes a new typed intermediate language based closely on a particular family of pure type systems, the lambda cube.
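In a pure type system, terms, types, and kinds all share a single syntactic category. As a rough illustration only (the constructor names below are ours, not the paper's), such a language can be captured in one small datatype, here sketched in OCaml:

    (* One syntactic category for terms, types, and kinds: the hallmark
       of a pure-type-system-style intermediate language.  Illustrative
       sketch only. *)
    type expr =
      | Var  of string                  (* variables at every level        *)
      | Sort of sort                    (* the sorts of the cube           *)
      | App  of expr * expr             (* application                     *)
      | Lam  of string * expr * expr    (* \x:A. b                         *)
      | Pi   of string * expr * expr    (* Pi x:A. B, generalising A -> B  *)
    and sort = Star | Box

    (* The ordinary function type A -> B is just a non-dependent Pi. *)
    let arrow a b = Pi ("_", a, b)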
One of our goals is that the language should be neutral with respect to the question of strictness/laziness. That is, it should be possible to use it as an intermediate language for compiling both strict and lazy languages, without penalising either. Since the framework to do this is monadic, we expect also to be able to accommodate richer monads such as state and exceptions. If we are successful in this, then the language may provide a common framework in which some hitherto rather separate implementation communities can talk to each other.
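As a rough sketch of why a monadic intermediate form can stay neutral between strict and lazy source languages (this is our illustration, not the paper's design): once sequencing is expressed through a bind operator, the same interface can be instantiated with either a strict or a lazy notion of computation.

    (* Hypothetical sketch: a single monadic interface for sequencing. *)
    module type Eval = sig
      type 'a t
      val return : 'a -> 'a t
      val bind   : 'a t -> ('a -> 'b t) -> 'b t
    end

    (* Strict instantiation: a computation is just its value. *)
    module Strict : Eval = struct
      type 'a t = 'a
      let return x = x
      let bind x f = f x
    end

    (* Lazy instantiation: a computation is a memoised thunk. *)
    module Lazy_eval : Eval = struct
      type 'a t = 'a Lazy.t
      let return x = lazy x
      let bind x f = lazy (Lazy.force (f (Lazy.force x)))
    end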
This is joint work with Nevin Heintze.
Stalin performs type inference using set-based analysis (SBA, also known as 0CFA). This analysis is augmented to support polyvariant procedure splitting. The results of SBA are used to reduce run-time type checking and dispatching. They are also used to perform low-level representation selection on a per-expression basis. This has two benefits. First, type tags can be eliminated for monotypes, allowing the use of base machine representations for primitive data. Second, boxing can be eliminated, alleviating the costs of indirection, allocation, and reclamation associated with boxing. Eliminating boxing requires that the run-time organization allow variables, parameters, structure slots, and vector slots to have different widths depending on the type of data that they hold. Furthermore, user-defined structures can be unboxed only if those structures are immutable. SBA is extended to determine data widths and mutability at compile time.
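The following sketch, with hypothetical names rather than Stalin's actual data structures, shows how per-expression representation selection can be driven by the abstract value set that SBA computes for the expression: a monotype set licenses an untagged, unboxed machine representation, while anything else falls back to the uniform tagged one.

    (* Illustrative only. *)
    type abstract_value = Fixnum | Flonum | Char | Pair | Closure

    module AV = Set.Make (struct
      type t = abstract_value
      let compare = compare
    end)

    type representation =
      | Unboxed_int      (* native machine word, no tag, no box   *)
      | Unboxed_float    (* native double, no tag, no box         *)
      | Tagged_pointer   (* general boxed, tagged representation  *)

    (* A monotype lets us drop both the tag and the box. *)
    let select_representation (av : AV.t) : representation =
      match AV.elements av with
      | [Fixnum] -> Unboxed_int
      | [Flonum] -> Unboxed_float
      | _        -> Tagged_pointer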
SBA is also extended to do compile-time lifetime analysis for all storage allocation operators. This allows Stalin to adopt a run-time organization based on stack allocation rather than garbage collection. Storage is divided into a set of separate heaps, one per procedure. Each heap is managed as a stack. The free pointer for the heap associated with a procedure is saved upon entry to that procedure and restored upon exit. Each storage allocation operator in the program is assigned, at compile time, to allocate storage on a particular heap. It is safe to allocate an object on the heap of a procedure if the object can be created and accessed only while the activation record of that procedure is still active. The compiler must determine on which heaps it is safe for a given operator to allocate storage. From these, it chooses the one with the shortest lifetime. With this scheme, an arbitrary amount of such storage can be reclaimed in a small constant-time operation without scanning memory or copying data, simply by restoring a free pointer.
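A sketch of that run-time protocol, with invented names: each procedure's heap is a bump-allocated region whose free pointer is saved on entry and restored on exit, so everything the analysis assigned to that heap is reclaimed in one constant-time step.

    (* Illustrative model of a per-procedure heap managed as a stack. *)
    type heap = { buffer : Bytes.t; mutable free : int }

    (* Bump allocation: no header, no scan, no collector. *)
    let alloc (h : heap) (size : int) : int =
      let p = h.free in
      h.free <- h.free + size;
      p

    (* Wrap a procedure body: storage assigned to this procedure's heap
       dies when the procedure returns, by restoring the free pointer. *)
    let with_heap_frame (h : heap) (body : unit -> 'a) : 'a =
      let saved = h.free in
      let result = body () in
      h.free <- saved;
      result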
SBA is also extended to determine the lifetime of continuations at compile time. This allows tailoring the implementation of CALL/CC. When a continuation can be called only when the activation record of the creator is still active, setjmp/longjmp can be used. When, in addition, the continuation call is inlined in the creator, a goto can be used. When a continuation can be called after the creator returns, the program is converted to CPS. Since CPS code can be less efficient than non-CPS code, Stalin determines which expressions and procedures can be reentered as a result of a continuation call and converts only those expressions and procedures to CPS.
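The case analysis can be summarised as a small table; the sketch below uses our own names for the three lifetime classes and the corresponding implementation strategies.

    (* Hypothetical classification of continuation lifetimes and the
       call/cc strategy each one permits. *)
    type cont_lifetime =
      | Inlined_in_creator      (* continuation call inlined in the creator *)
      | Within_creator_extent   (* creator's activation record still active *)
      | Outlives_creator        (* may be called after the creator returns  *)

    type strategy = Local_goto | Setjmp_longjmp | Selective_cps

    let strategy_of_lifetime = function
      | Inlined_in_creator    -> Local_goto        (* a plain jump suffices     *)
      | Within_creator_extent -> Setjmp_longjmp    (* the stack is still intact *)
      | Outlives_creator      -> Selective_cps     (* convert affected code     *)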
The results of SBA are used to support numerous other optimizations including: (a) generating direct procedure calls and direct references to global variables, (b) lambda lifting and closure conversion, (c) constant propagation, (d) globalization of non-reentrant free variables, (e) procedure integration (i.e. inlining), and (f) efficient handling of variable-length argument lists.
Stalin has been tested on a suite of benchmarks whose lengths range up to a thousand lines. Stalin outperforms Scheme->C, Gambit-C, Chez, SML/NJ, and even handwritten Ada, Pascal, Fortran, and C code, on most of these benchmarks.
We present a new algorithm for standard control-flow analysis of ML programs whose types have bounded size. Our algorithm consists of a linear-time processing phase, followed by a query phase in which queries about the possible calls from a call site can be answered in linear time. This algorithm is simple, incremental, demand-driven, and easily adapted to polyvariant usage. Early experimental evidence suggests that this new algorithm is significantly faster than the standard algorithm.
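An interface for the two phases might look as follows; this is our own hypothetical OCaml signature, not the actual implementation.

    module type Cfa = sig
      type program
      type call_site
      type func

      (* Linear-time processing phase over the whole program. *)
      val preprocess : program -> unit

      (* Demand-driven query: the functions that may be called from a
         given call site. *)
      val callees : call_site -> func list

      (* Summary query used by the derived questions discussed below. *)
      val callee_count : call_site -> int
    end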
We argue that this algorithm provides the basis for linear-time processing of control-flow information in ML compilation. In particular, we argue that the standard problem "for each call site, output the list of functions that can be called from this site" (which has O(n^2) complexity, since O(n^2) information must be output) is not directly relevant to most compilation applications. More important are questions *derived* from control-flow information, such as: "How many functions are called from each call site?" or "Which call sites call only one function?" or "Which expressions may have an effect?". We show that our algorithm can be adapted to give linear-time algorithms for these questions.
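For instance, the derived questions can be phrased against the summary query from the sketch above, so the full call-site-to-callee lists are never materialised (again, illustrative code, not the algorithm itself):

    (* Call sites known to invoke exactly one function: candidates for
       direct calls and inlining. *)
    let monomorphic_sites callee_count sites =
      List.filter (fun s -> callee_count s = 1) sites

    (* Sum of "how many functions can be called from this site?". *)
    let total_call_targets callee_count sites =
      List.fold_left (fun acc s -> acc + callee_count s) 0 sites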
This is joint work with David McAllester.
This talk presents a new flexible representation analysis technique that combines the best of both approaches. Our new scheme supports unboxed representations for recursive and mutable types, yet it requires very little runtime type analysis. In fact, we show that there is a continuum of possibilities between the coercion-based approach and the type-passing approach. By varying the amount of boxing and the type information passed at runtime, a compiler can freely explore any point in the continuum, choosing from a wide range of representation strategies based on practical concerns. Finally, our new scheme also extends easily to handle type abstractions across ML-like higher-order modules.
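To fix intuitions, the sketch below (our names, not the paper's) shows the coercion-based end of the continuum: a monomorphic unboxed value is wrapped into a uniform representation when it flows into a polymorphic context and unwrapped where its type is known again; the type-passing end would instead keep the value unboxed and pass a runtime description of its type.

    (* Coercions inserted at representation boundaries; illustrative only. *)
    type uniform =
      | UFloat of float    (* boxed float: one extra indirection *)
      | UInt   of int

    let box_float (x : float) : uniform = UFloat x

    let unbox_float (u : uniform) : float =
      match u with
      | UFloat x -> x
      | UInt _   -> assert false   (* unreachable if coercions are well-typed *)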