The second-order polymorphic lambda calculus, F2, was independently discovered by Girard and Reynolds. Girard additionally proved a representation theorem: every function on natural numbers that can be proved total in second-order intuitionistic propositional logic, P2, can be represented in F2. Reynolds additionally proved an abstraction theorem: for a suitable notion of logical relation, every term in F2 takes related arguments into related results. We observe that the essence of Girard's result is a projection from P2 into F2, that the essence of Reynolds's result is an embedding of F2 into P2, and that the Reynolds embedding followed by the Girard projection is the identity. The Girard projection discards all first-order quantifiers, so it seems unreasonable to expect that the Girard projection followed by the Reynolds embedding should also be the identity. However, we show that, in the presence of Reynolds's parametricity property, this is indeed the case, for propositions corresponding to inductive definitions of naturals, products, sums, and fixpoint types.
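The representation of naturals in F2 referred to above is the Church encoding; with the types erased, it can be sketched in Python (a type-free illustration only, not the typed F2 terms):

```python
# Church numerals: a natural n is the function that applies its first
# argument n times, i.e. n = \s. \z. s (s (... (s z) ...)).
zero = lambda s: lambda z: z
succ = lambda n: lambda s: lambda z: s(n(s)(z))

# Addition applies succ m times starting from n.
add = lambda m: lambda n: m(succ)(n)

# Convert a Church numeral back to a Python int for inspection.
to_int = lambda n: n(lambda x: x + 1)(0)

three = succ(succ(succ(zero)))
print(to_int(add(three)(three)))  # 6
```

In F2 each numeral would carry the polymorphic type forall X. (X -> X) -> X -> X; parametricity at that type is what pins these terms down as exactly the naturals.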
Hancock is a C-based domain-specific programming language designed to facilitate writing highly efficient but easily maintainable programs for analyzing high-volume streams of transaction records, especially programs that produce and consume large amounts of interrelated persistent data. Hancock provides a persistent data system to help manage such data. This system includes three persistent data types: maps store large collections of keyed data, pickles allow programmers to define custom persistent structures, and directories group collections of persistent data into a single entity. Initializing declarations allow programmers to connect in-memory program variables with on-disk representations of persistent data structures. Persistent types may be parameterized so that they can be customized at instantiation time. Collectively, these mechanisms define a persistence system that is efficient, customizable, and transparent, and that helps protect against inadvertent data corruption. In this talk, I will give a brief overview of Hancock and then describe its persistent data system in more detail.
Hancock is joint work with Anne Rogers, Fred Smith, Corinna Cortes, Daryl Pregibon, and Karin Hogstedt.
Moby is an experiment in language design and implementation. Its design combines ML-style modules and functional programming constructs with class-based object-oriented features and CML-style concurrency.
In this talk, I will start with a brief report on the status of the Moby project. The rest of the talk focuses on the design and implementation of Moby's concurrency mechanisms. Moby provides an explicit threading model with message-passing primitives for communication and synchronization. Like CML, Moby supports first-class synchronous operations, called events, which provide a mechanism for building new communication and synchronization abstractions. This talk will describe Moby's concurrency mechanisms and will illustrate how the event mechanism can be used to package complex communication protocols as synchronous operations. I will also discuss the implementation of concurrency in Moby, including compiler and runtime support.
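The flavor of first-class synchronous operations can be sketched in Python (hypothetical helper names, not Moby or CML syntax): an event is a deferred synchronous action, `sync` performs it, `wrap` post-processes its result, and so a multi-step protocol such as a request/reply exchange can be packaged as a single event value.

```python
import queue

class Event:
    """A first-class synchronous operation: a deferred action."""
    def __init__(self, action):
        self._action = action          # thunk performed at sync time

def sync(ev):
    """Perform the synchronous operation and return its result."""
    return ev._action()

def send_evt(ch, msg):
    return Event(lambda: ch.put(msg))

def recv_evt(ch):
    return Event(lambda: ch.get())

def wrap(ev, f):
    """Post-process the result of an event, yielding a new event."""
    return Event(lambda: f(sync(ev)))

def rpc_evt(req_ch, rep_ch, arg):
    """Package a request/reply protocol as one synchronous operation."""
    def action():
        req_ch.put(arg)
        return rep_ch.get()
    return Event(action)

# Usage: events are ordinary values until synchronized.
ch = queue.Queue()
sync(send_evt(ch, 7))
print(sync(wrap(recv_evt(ch), lambda x: x + 1)))  # 8
```

In CML and Moby, events additionally support choice between alternatives, which is what makes them strictly more expressive than plain message-passing calls; this sketch omits that combinator.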
The Moby project is a joint effort with Kathleen Fisher (AT&T Labs --- Research).
This talk describes a systematic method for optimizing recursive functions using both indexed and recursive data structures. The method is based on two critical ideas: first, determining a minimal input increment operation so as to compute a function on repeatedly incremented input; second, determining appropriate additional values to maintain in appropriate data structures, based on what values are needed in computation on an incremented input and how these values can be established and accessed. Once these two are determined, the method extends the original program to return the additional values, derives an incremental version of the extended program, and forms an optimized program that repeatedly calls the incremental program. The method can derive all dynamic programming algorithms found in standard algorithm textbooks. There are many previous methods for deriving efficient algorithms, but none is as simple, general, and systematic as ours.
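The shape of the method can be sketched on a textbook example, binomial coefficients (illustrative names, not the authors' notation). The minimal input increment is n to n + 1, and the additional values to maintain are the entire previous row of the table:

```python
# Naive recursion: recomputes subproblems exponentially often.
def binom_naive(n, k):
    if k == 0 or k == n:
        return 1
    return binom_naive(n - 1, k - 1) + binom_naive(n - 1, k)

# Step 1: the minimal input increment is n -> n + 1.
# Step 2: computation on input n + 1 needs all values binom(n, j), so
# the extended program maintains row n as the additional cached values.
def next_row(row):
    # Incremental version: maps row n to row n + 1.
    return [1] + [row[j - 1] + row[j] for j in range(1, len(row))] + [1]

# Optimized program: repeatedly call the incremental version.
def binom(n, k):
    row = [1]                      # row 0
    for _ in range(n):
        row = next_row(row)
    return row[k]

print(binom(10, 3))  # 120
```

The optimized program runs in O(n^2) time while the naive recursion is exponential, which is exactly the dynamic-programming speedup the method derives systematically.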
This is joint work with Scott Stoller. A paper describing this work appeared in PEPM'02.
Profiling programs to observe their dynamic behavior is important for a variety of applications, including program understanding, performance monitoring, and profile-guided optimizations. One approach for collecting profiling information is to insert instrumentation into the code to explicitly record the desired information. Unfortunately, this approach is often impractical because it can introduce substantial runtime overhead.
In this talk I describe a general mechanism for reducing the overhead of collecting instrumented profiles. Our approach transforms the code in such a way that execution can easily transfer back and forth between instrumented and uninstrumented code, thus allowing "instrumentation sampling". Our technique does not rely on any hardware or operating system support, yet provides a high frequency sample rate to allow collecting accurate profiles.
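The idea can be sketched with a counter-based check in Python (hypothetical names; the actual technique duplicates compiled code paths rather than wrapping functions). Two copies of the code exist: a fast uninstrumented one and an instrumented duplicate. A cheap check decrements a counter; when it reaches zero, one execution runs the instrumented copy, recording a sample, and control then continues in uninstrumented code.

```python
SAMPLE_INTERVAL = 100
counter = SAMPLE_INTERVAL
profile = {}          # e.g., call counts accumulated from sampled runs

def sampled(fn):
    fast = fn                                  # uninstrumented copy
    def instrumented(*args, **kwargs):         # instrumented duplicate
        profile[fn.__name__] = profile.get(fn.__name__, 0) + 1
        return fn(*args, **kwargs)
    def check(*args, **kwargs):                # check at method entry
        global counter
        counter -= 1
        if counter <= 0:
            counter = SAMPLE_INTERVAL          # reset for the next sample
            return instrumented(*args, **kwargs)
        return fast(*args, **kwargs)
    return check

@sampled
def work(x):
    return x * x

for i in range(1000):
    work(i)
print(profile)  # {'work': 10}: one sample per 100 calls
```

In the real system the checks are inserted at method entries and loop back edges of duplicated code bodies, so even long-running loops are sampled without any per-iteration instrumentation cost on the fast path.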
Our technique is implemented in the Jikes Research Virtual Machine and is being used to collect instrumented profiles for the purpose of driving feedback-directed optimizations. Our experimental results show that accurate profiles can be collected (93-98% overlap with a perfect profile) with low overhead (averaging ~3%).