Many people. Unfortunately, attendence did not get taken. Does this matter?
For the past two years, I have participated in an experiment to build systems software (in particular, a suite of network communication protocols) in an extension of the Standard ML programming language. One result of this experiment is tangible evidence that SML can be used to build practical systems, and that the type theory underlying the language provides many benefits, not just in programming, but also in system design and structuring. This is not to say, however, that the current design and implementation of the language is satisfactory. In this talk, I will describe our motivations for the project, the experiment, our experiences in carrying it out, and the results. Then, I will describe several interesting problems in SML that were exposed in the process. In particular, there are still serious problems in memory traffic (for both code and data), data representations, optimization of loops, and separate compilation that will need to be resolved if languages such as SML will ever be considered practical in the "real world". Some preliminary thoughts on how to approach some of these problems will be presented.
This is joint work with Robert Harper.
We tentatively agreed that the next meeting will be April 6th at Princeton. This meeting will be organized by Andrew Appel. We agreed that having a theme may be not be so important. This will be left up to the chairman organizing each meeting. We also agreed that we need more time for informal discussions at these meetings.
We study the cache performance of a set of ML programs, compiled by the Standard ML of New Jersey compiler. We find that more than half of the reads are for objects that have just been allocated. We also consider the effects of varying software (garbage collection frequency) and hardware (cache) parameters. Confirming results of related experiments, we found that ML programs can have good cache performance when there is no penalty for allocation. Even on caches that have an allocation penalty, we found that ML programs can have lower miss ratios than the C and Fortran SPEC92 benchmarks.
Recently systems which require support for permanence of arbitrary datatypes, in contrast to simple, fixed types like files or database records, have become increasingly important. Examples include persistent programing languages, object-oriented databases and distributed object servers. Persistent heaps are an essential component of such systems. It is critical that persistent heaps provide maximum protection of data integrity, since the data they store are likely to be valuable and difficult to recreate. Such safety requirements often conflict with the need to provide high throughput, low latency access to data. This conflict may lead to sacrificing safety for performance.
I will discuss some novel implementation techniques which avoid the need to sacrifice safety for performance. Protection of data integrity is supported by three features: transactions, which allow groups of updates to permanent data to be done atomically; garbage collection, which provides freedom from storage leaks and memory corruptions due to prematurely freeing data; and orthogonal persistence, which frees the programmer from explicitly determining what data should be persistent. A new technique, concurrent replicating collection, allows the use of garbage collection without sacrificing the throughput and latency characteristics of explicit deallocation. The use of replicating collection has resulted in the first implementation of a concurrent collector for a persistent heap. This implementation is embedded in the runtime system of SML/NJ. Replicating collection also facilitates the efficient implementation of orthogonal persistence. I will also present the results of a series of experiments that characterize the performance of my system and compare it to the performance of other systems that do not provide the same level of support for data integrity or equivalent throughput and latency.