NJPLS at Princeton

NJPLS

Friday, Oct. 20, 2006

Agenda

Arrival and Coffee: 9:30 - 10:00

Session 1: 10:00 - 10:45
Patricia Johann (Rutgers University)
Folds, Church Encodings, Builds, and Short Cut Fusion for Nested Types: A Principled Approach

Break: 10:45 - 11:00

Session 2: 11:00 - 12:30
Frances Perry (Princeton University)
Expressing Heap-shape Contracts in Linear Logic

Radu Rugina
Demand-Driven Alias Analysis for C Programs

Lunch: 12:30 - 1:50

Business meeting: 1:50 - 2:00

Session 3: 2:00 - 3:30
Saurabh Srivasta
CMod - A module system for legacy C programs

Chris Richards
A Very Modal Model of a Modern, Major, General Type System

Abstracts

Patricia Johann (Rutgers University)

Folds, Church Encodings, Builds, and Short Cut Fusion for Nested Types: A Principled Approach

Abstract: Initial algebra semantics is one of the cornerstones of the theory of modern programming languages. It provides support for fold combinators encapsulating structured recursion over data structures, thereby making it possible to both reason about and transform programs in principled ways. The usual initial algebra semantics has recently been extended to support not only standard fold combinators for inductive types, but also Church encodings and build combinators for them. In addition to being theoretically useful in ensuring that build is seen as a fundamental part of the basic infrastructure for programming with inductive types, extended initial algebra semantics has practical merit: the fold and build combinators can be used to define fold/build rules which optimize modularly constructed programs by eliminating intermediate inductive data structures.

In this talk we apply extended initial algebra semantics to the theory and practice of programming with nested data types, using it to simplify the theory of folds for nested types, as well as to provide the first Church encodings, build combinators, and fold/build fusion rules for such types. There is, of course, no a priori reason to expect initial algebra semantics to be sufficiently expressive to fully support structured programming with nested data types; in fact, the entire development of generalised folds for them is based on the widely accepted notion that it is not. But our development turns this conventional wisdom on its head. Moreover, it requires no new conceptual machinery whatsoever: we simply deploy extended initial algebra semantics in a functorial setting. Our approach thus provides a suitably expressive and elegant foundation for programming with nested types.

Frances Perry (Princeton University)

Expressing Heap-shape Contracts in Linear Logic

Abstract: Contracts (dynamically checked programmer assertions) are a widely accepted mechanism for specifying, checking and documenting properties of software components. Most, if not all, contract systems expect programmers to use the native programming language to express their program invariants. While this is most effective for many simple invariants, expressing properties of data structures and aliasing patterns can be extremely complicated. If written in the native language in an unstructured way, such contracts are bound to be unclear and ineffective as documentation.

In this talk, we show how to use linear logic as a language of contracts for an imperative programming language. The high-level nature of our linear logical contracts makes specifying memory shape and aliasing properties of complex recursive data structures easy. Moreover, since we give our logic a clear, compositional semantics, the contracts serve as effective, executable documentation for programmer expectations. In order to evaluate the truth of our linear logical contracts at run time, we use a modifed version of LolliMon, a linear logic programming language.

This is joint work with Limin Jia and David Walker.

Radu Rugina (Cornell University)

Demand-Driven Alias Analysis for C Programs

Abstract: Static disambiguation of indirect memory references is a fundamental analysis problem in pointer-based languages. Knowledge about memory aliases is required by virtually all program analyses, optimizations, or transformations. Traditionally, the alias problem has been solved indirectly, by reducing it to the points-to analysis problem. More precisely, alias queries have been answered by computing and intersecting points-to sets.

In this talk I will present a demand-driven, flow-insensitive may-alias analysis for C programs designed to answer alias queries directly and more efficiently, without computing points-to sets. The alias problem is formulated as a CFL-reachability problem over a canonical representation of program expressions and assignments. Unlike existing pointer analyses, our approach does not require transforming the program to a three-address form. The algorithm controls the amount of exploration per query using a predefined analysis budget. Our experiments show that a large fractions of the alias queries can be answered accurately using a low analysis budget, and that our approach is more efficient than building full points-to sets.

Saurabh Srivasta (University of Maryland, College Park)

CMod - A module system for legacy C programs.

Abstract: The notion of a module is an integral component of a modern programming language. Modules allow for splitting of complex programs into manageable components. The C programming language is used to build many such programs, yet it does not provide direct support for modules. Instead, C programmers have over time developed conventions that mimic more complicated linguistic support provided by other languages. The convention is to treat .h header files as interfaces and .c source files as module implementations. This convention can be effective, but there are many subtleties in using it correctly, and the lack of compiler and linker support adds complications. As a result, misuse of the conventions can lead to hard-to-find bugs, can make reasoning about code in isolation more difficult, and can complicate future code maintenance.

This talk presents CMod, a module system for C that ensures abstraction via information hiding and type-safe separate compilation. Instead of constructing a module system from scratch and constraining programming to code within its guidelines, we retrofit modular programming over existing practices. Our approach is to identify and enforce the rules which make C's current modular programming conventions sound. We have proven that the four rules that we propose are sufficient to guarantee the desired modularity properties. We have implemented CMod for the full C language and applied it to a number of benchmarks. We found that most of the time legacy programs obey CMod's rules, or can be made to with minimal effort, and rule violations often result in type errors or brittle code. Thus CMod brings the benefits of modular programming to C while still supporting legacy systems.

We are also working on extending the results of CMod to checking modularity properties under all configurations, where a configuration is defined to be a set of preprocessor flags which determine conditional compilation.

This is joint work done with Mike Hicks, Jeff Foster and Bhargav Kanagal at the University of Maryland, College Park. More information available at http://www.cs.umd.edu/~saurabhs/CMod

Chris Richards (Princeton University)

A Very Modal Model of a Modern, Major, General Type System

Abstract: We present a clean and powerful model of recursive and impredicatively quantified types with mutable references. We interpret in this model all of the type constructors needed for typed intermediate languages and typed assembly languages used for object-oriented and functional languages. We establish in this purely semantic fashion a soundness proof of the typing systems underlying these TILs and TALs---ensuring that every well-typed program is safe. The technique is generic, and applies to any small-step semantics including lambda-calculus, labeled transition systems, and von Neumann machines. It is also simple, and reduces mainly to defining a Kripke semantics of the Goedel-Loeb logic of provability. We have machine-checked proofs in Coq.

This is joint work with Andrew Appel, Paul-Andre Mellies, and Jerome Vouillon

Stephanie Weirich put together the schedule.