To permit more flexible security policies, we adapt compile-time control-flow and data-flow analysis techniques to estimate an information flow graph for an applet. A path from input i to output j indicates that output j may depend on input i. The absence of a path from input i to output j guarantees that output j is independent of the value of input i. With this graph, we can control the applet's access to local resources. If no path exists from local disk to the network in an applet's information flow graph, the applet can manipulate local data, yet is prevented from leaking sensitive data to the outside world.
In this talk, I will describe a polyvariant control-flow and data-flow analysis framework suitable for advanced languages like Scheme, ML, and Java. I will present experimental results that quantify the effectiveness of different polyvariance strategies, as applied to classical compiler optimizations. While value flow graphs computed according to this framework correctly capture value flow for the purposes of compiler optimizations, they do not capture information flow as required for security analysis. I will show how to adapt this framework to compute information flow graphs, and sketch the correctness theorem that assures security.
We have designed a special-purpose programming language, QCM, for defining security policies on the Internet. QCM programs maintain a database distributed among the nodes of a loosely-coupled network, and interact by exchanging digitally-signed database queries. The database primitives of QCM naturally support essential security functions such as the definition, distribution, and use of access control lists and public key directories. QCM has a formal semantics based on logic and automata theory which may be used to prove that programs implement desired security policies.
Joint work with Carl A. Gunter.
The demands of a peristent, multi-user, user-extensible, network space required solutions for several tricky technical problems in Java. In particular, Matrix supports the notion of resource control (no player may create an unbounded number of objects or threads); coherent snapshotting (the state of the system can be stored out to disk for backup) even in the presence of multiple threads; fine-grained object-level security (e.g., the constructions of one participant cannot be destroyed by another participant); and incremental meta-mutation (the structure of classes and objects may be changed ``on the fly'' without shutting down and restarting the server). In addition, remote method invocation (RMI) based server/client interconnections are supported, paving the way to a peer-peer model for Matrix, the basis for a planned distributed architecture.
A novel aspect of the design of Matrix is that (pure) Java is both the implementation language and the programming language of Matrix --- the Java compiler and Virtual Machine are used to compile and execute Matrix programs. Extensive use is made of Java 1.1 APIs such as Object serialization, RMI, and reflection. Java's Class Loading facilities proved key to providing modified access to the core class libraries (e.g. java.lang.{Object,Thread}) so as to support the computational facilities described above.
(This work is being done in collaboration with Chris Rath and Jim Wilson of AT&T Labs.)
Even though SML program's control flow and exception flow are in general mutually dependent, analyzing the two flows are safely decoupled. Program's control-flow is firstly estimated from a set of equations defined by simple case analysis of call expressions. For cases where exceptions carry functions (i.e., where control flow analysis needs exception analysis) our control flow analysis uses a crude approximation (by types) to assure its safety against the decoupling. Using this call-graph information, program's function-level exception flow is derived as set-constraints, whose least model is our analysis result. Both of these two analyses are proven safe and the reasons behind each design decision are discussed. A preliminary implementation shows a promising cost-accuracy performance. For the ML-Lex program, for example, the analysis takes 4.58 seconds and it reports 4 may-uncaught exceptions, among which 3 exceptions can really escape. Our final goal is to make the analysis overhead less than 10% of the compilation time (compiling the ML-Lex takes 6 to 7 seconds) and to analyze modules in isolation.
Joint work with Sukyoung Ryu.
In this talk, we propose two notions of behavioral subtyping for arbitrary abstract data types in the prescence of mutation and aliasing. Strong behavioral subtypes have objects that act like supertype objects in all cases, whereas as weak behavioral subtypes have objects that only need to act like supertype objects when manipulated as supertype objects. Both these notions allow sound modular reasoning based on the static types of variables in programs. Weak behavioral subtyping allows conclusions about programs based on the effects of individual procedures but restricts certain forms of aliasing. Strong behavioral subtyping allows all forms of aliasing but permits conclusions based only on the history constraints of the types. History constraints are the reflexive and transitive properties preserved by objects of a type across different states of a program. We prove that both these behavioral subtype notions are sufficient for sound modular reasoning.