NJPLS at Rutgers University

September 30, 2016

This instance of NJPLS will take place at Rutgers University.

All are welcome and there is no registration fee. However, if you are planning to attend, you must register in advance by filling out this form.

Travel instructions may be found here.

Agenda

9:00am Breakfast

9:30am
Checked C
David Tarditi, Microsoft Research
Top-to-bottom verification of well synchronized C programs on weakly consistent machines
Santiago Cuellar, Princeton University

10:30am Coffee

11:00am
ApproxHadoop: Approximation with MapReduce Programs
Thu Nguyen, Rutgers University
Verifying Constant Time Implementations
Michael Emmi, Nokia Bell Labs

12:15pm Lunch

1:45pm
Vertical Composition of Reversible Atomic Objects
Eric Koskinen, Yale University
A User Guided Approach to Program Analysis
Xin Zhang, Georgia Tech
Analyzing Build Systems with Partial Makefile Evaluation
Paul Gazillo, Yale University

3:15pm Coffee

3:30pm

Loopy: Programmable and Formally Verified Loop Transformations
Nimit Singhania, University of Pennsylvania
Executable Categorical Models of Type Theory
Gershom Bazerman, S P Global Market Intelligence
Visualization of a Dynamic Analysis for Finding Android Access Control Bugs
Kristopher Mcinski, University of Maryland

This meeting is organized by Santosh Nagarakatte.

Abstracts

Checked C

David Tarditi, Microsoft Research

This talk will present the Checked C research project, which is investigating how to extend the C programming language so that programmers can write more secure and reliable C programs. The project is developing an extension to C called Checked C that adds checking to C to detect or prevent common programming errors such as buffer overruns, out-of-bounds memory accesses, and incorrect type casts. The extension is designed to be used for existing system software written in C.

Top-to-bottom verification of well synchronized C programs on weakly consistent machines

Santiago Cuellar, Princeton University

Well synchronized programs should not have data races, and pro- grams whose sequentially consistent executions have no data races should run on weakly consistent multiprocessors as if they were sequentially consistent. The former claim comes from Dijkstra and Hoare through formalizations by O’Hearn and Brookes; the latter was stated by Saraswat et al. as a desideratum for compilers and multiprocessors.

We prove these claims for Concurrent Separation Logic, the CompCert C compiler, and the Intel x86 (IA-32) computer architec- ture. C programs provable in Concurrent Separation Logic, when compiled with this optimizing C compiler and run on that machine, will have the (sequentially consistent) behavior proved at the source level. We prove it in a modular way: a theorem about the soundness of CSL with respect to the operational semantics of C (and our 5 synchronization functions); a theorem about the correctness of the C compiler; a theorem about the Intel x86-TSO architecture.

ApproxHadoop: Approximation with MapReduce Programs

Thu Nguyen, Rutgers University

Verifying Constant Time Implementations

Michael Emmi, Nokia Bell Labs

The constant-time programming discipline is an effective countermeasure against timing attacks, which can lead to complete breaks of otherwise secure systems. However, adhering to constant-time programming is hard on its own, and extremely hard under additional efficiency and legacy constraints. This makes automated verification of constant-time code an essential component for building secure software.

We propose a novel approach for verifying constant time security of real-world code. Our approach is able to validate implementations that locally and intentionally violate the constant-time policy, when such violations are benign and leak no more information than the public outputs of the computation. Such implementations, which are used in cryptographic libraries to obtain important speedups or to comply with legacy APIs, would be declared insecure by all prior solutions.

We implement our approach in a publicly available, cross-platform, and fully automated prototype, ct-verif, that leverages the SMACK and Boogie tools and verifies optimized LLVM implementations. We present verification results obtained over a wide range of constant-time components from the NaCl, OpenSSL, FourQ and other off-the-shelf libraries. The diversity and scale of our examples, as well as the fact that we deal with top-level APIs rather than being limited to low-level leaf functions, distinguishes ct-verif from prior tools.

Veritical Composition of Reversible Atomic Objects

Eric Koskinen, Yale University

The classic Herlihy/Wing notion of concurrent objects has had great success in theories and implementations (e.g. java.util.concurrent), providing programmers with the simple abstraction of an atomic object. Since then, software transactions have appeared, also touting the the goal of providing an atomicity abstraction. However, despite some vertical composition strategies within particular STM implementations, a fundamental concept of vertical composition has remained elusive.

In this talk, I will present our recent results in which we distill the essence of vertical composition, with the notion of reversible atomic objects. By restricting occurrences of transactions to the method boundary and requiring that every object method construct its own inverse, we obtain a cleaner semantics that supports vertical composition. In fact, we do not even require that one layer use the same implementation (e.g. pessimism versus optimism) as another, nor that the object be transactional at all. Formally, we begin with a semantics in which abstract-level operations are composed from constituent base operations, accounting for conflict and inverses. These transactional implementations are put in the context of an environment that includes a novel a deadlock-mitigating contention manager that ensures progress. The contention manager may, at any point, apply inverses on behalf of a currently executing transaction. Our work has the first proof that programs composed with implementations in this framework are a contextual refinement of the same program instead composed with atomic specifications and that layers can be composed vertically.

Our compositional treatment in terms of a single shared log gives rise to a novel transactional variant of the universal construction. I will discuss our library of reversible atomic objects, demonstrating that it is easy and intuitive to build complex concurrent implementations (e.g. a multi-threaded layered filesystem) by vertical composition of atomic components. This is joint work with Paul Gazzillo, Timos Antonopoulos, & Zhong Shao

A User-Guided Approach to Program Analysis

Xin Zhang, Georgia Tech

Program analyses often produce undesirable output due to various approximations. We present a novel approach that allows user feedback to guide such approximations towards producing the desired output. We formulate the problem of user-guided program analysis in terms of solving a combination of hard rules and soft rules: hard rules capture soundness while soft rules capture degrees of approximations and preferences of users. We solve the rules in a manner that is sound (satisfies all hard rules), optimal (maximally satisfies soft rules), and scales to sophisticated analyses and complex programs. We demonstrate the effectiveness of our approach on diverse analyses expressed in Datalog on large, widely-used Java programs.

Analyzing Build Systems with Partial Makefile Evaluation

Paul Gazillo, Yale University

Collections of Makefiles are frequently used to specify the build-time configuration of software projects. The Linux kernel, for example, has over 14,000 configuration variables and 20,000 lines of Makefile code that specify the kernel build units. These Makefiles are written in a macro-like language and must be evaluated to obtain a given build configuration specification. Given this complexity, it is not surprising that bugs lurk in these Makefile collections resulting in invalid configurations once expanded, and that evaluating Makefiles over all configurations is intractable.

In this talk we introduce a technique for automatically analyzing configuration Makefiles. This is the first analysis capable of obtaining the complete enumeration of all possible Linux kernel compilation units and their configuration properties, critical information for project-wide software engineering tools such as bug-finders, refactorings, and code browsers. The key challenge is that Makefile programs rely heavily on meta string manipulation: Even variable names can be manipulated as strings at runtime, making static analysis difficult. Our solution is a form of partial evaluation, combining symbolic and concrete execution. The algorithm evaluates statements symbolically but unfolds an expression to its many concrete values, balancing tractability with meaningful results. Variables are treated symbolically when used in predicates, enabling efficient handling of configuration properties, but treated concretely when used as string values, producing useful program state.

We have implemented our approach as a new tool called Kmax and evaluated it on the Linux kernel build system. Kmax is the first tool that is able to find all kernel compilation units and their configuration properties. Our results show that Kmax is precise---previous tools miss over a thousand files compared to Kmax---and performant, taking only about a minute and a half for the x86 version of the Linux kernel, only two times slower than the fastest compared tool.

Loopy: Programmable and Formally Verified Loop Transformations

Nimit Singhania, University of Pennsylvania

Executable Categorical Models of Type Theory

Gershom Bazerman, S P Global Market Intelligence

We present work in progress on an approach to the problem of developing strongly typed embedded languages in Haskell in conjunction with type-safe interpreters and normalizers. Our approach is to follow the logic of the Curry-Howard-Lambek correspondence, proceeding from categorical (presheaf) models of typing judgments. We then render these models directly into Haskell via an "arrows-first" encoding using Generalized Algebraic Data Types. This yields succinct, type-safe specifications with extremely compact interpreters and normalizers. Crucial use is made of the representation of exponentials in presheaf categories as a way to capture higher-order behavior. With regards to variable binding, this approach "covers" both nested abstract syntax (type-safe De Bruijn notation) and higher-order abstract syntax, yielding some insight into their origin and relationship.

Visualization of a Dynamic Analysis for Finding Android Access Control Bugs

Kristopher Mcinski, University of Maryland

9:00am	Breakfast
9:30am	Checked C David Tarditi, Microsoft Research Top-to-bottom verification of well synchronized C programs on weakly consistent machines Santiago Cuellar, Princeton University
10:30am	Coffee
11:00am	ApproxHadoop: Approximation with MapReduce Programs Thu Nguyen, Rutgers University Verifying Constant Time Implementations Michael Emmi, Nokia Bell Labs
12:15pm	Lunch
1:45pm	Vertical Composition of Reversible Atomic Objects Eric Koskinen, Yale University A User Guided Approach to Program Analysis Xin Zhang, Georgia Tech Analyzing Build Systems with Partial Makefile Evaluation Paul Gazillo, Yale University
3:15pm	Coffee
3:30pm	Loopy: Programmable and Formally Verified Loop Transformations Nimit Singhania, University of Pennsylvania Executable Categorical Models of Type Theory Gershom Bazerman, S P Global Market Intelligence Visualization of a Dynamic Analysis for Finding Android Access Control Bugs Kristopher Mcinski, University of Maryland