New Jersey
Programming Languages and Systems
Seminar

May 10, 2024

New York University

The New Jersey Programming Languages and Systems Seminar (NJPLS) brings researchers in the New Jersey area together for a day of informal talks and discussions. We accept talks on complete research and work in progress, dealing with topics of interest to the programming languages and systems communities.

Registration and talk proposal form.

Talk Proposal Deadline April 10, 2024
Registration Deadline May 6, 2024

Please bring a government-issued ID for entry!

Registration is free thanks to Jane Street!

When Friday, May 10, 2024
Where Warren Weaver Hall 109 · 251 Mercer Street, New York, NY 10012 · Travel Guidelines
What Research talks and discussions. Breakfast, lunch, and coffee provided.

Contact Chaitanya Agarwal (ca2719@nyu.edu) with questions or concerns.


Program

9:00 AM Breakfast and Sign-in
9:55 AM Welcome
10:00 AM Modal Understanding of Protocol Reasoning Ismail Kuru
Drexel University
Abstract

State transition systems (STSes) are an increasingly popular means of specifying and verifying fine-grained concurrent programs. Unlike more traditional rely-guarantee-based approaches, they allow interference to be conveniently treated as a resource, transferred between threads or even stored in other resources. However, existing STS systems leave the traditional Hoare-style rule of consequence weaker than before. Code involving STSes is verified against one particular STS, making it unusable with other, similar transition systems, even when one is contained in the other.

We extend the notion of entailment for STS-based logics to incorporate a form of bisimulation (as on Kripke structures) between STS systems into an extended rule of consequence. This extension allows significant additional code reuse, as we demonstrate through a number of examples. Our formulation is abstract, and relies only upon the existence of a protocol state update rule in the logic, so our soundness proof implies that our new rule is admissible in a range of recent logics, such as CaReSL, and Iris.

10:20 AM Cedar: A language for expressing fast, safe, and fine-grained authorization policies Aaron Eline
Amazon Web Services
Abstract

Cedar is a new open-source authorization policy language, used to express fine-grained permissions on behalf of applications. When an application wishes to authorize an access request, it invokes Cedar's authorization engine which consults its policies to render a decision. Cedar was designed to be ergonomic, fast, safe, and analyzable. Cedar’s simple and intuitive syntax supports common authorization use-cases with easy-to-understand policies. Cedar’s policy structure ensures that access requests can be authorized quickly. Cedar's policy validator leverages optional typing to help policy writers avoid mistakes but not get in their way. Cedar's design has been finely balanced to enable a sound and complete logical encoding, which allows analysts to precisely reason about what policies do, e.g., to ensure that when refactoring a set of policies, the authorized permissions do not change.

Cedar is built using a high-assurance process we call verification-guided development. Its authorization engine and validator are formally modeled in the Lean proof-enabled programming language. Cedar’s core development team proves safety and security properties about those models in Lean, and runs millions of automated differential tests to check that the implementations of the Cedar authorization engine and validator, written in Rust, agree with the Lean models.

10:40 AM KATch: A Fast Symbolic Verifier for NetKAT Mark Moeller
Cornell University
Abstract

Joint work with Jules Jacobs, Olivier Savary Belanger, David Darais, Cole Schlesinger, Steffen Smolka, Nate Foster, and Alexandra Silva:

We develop new data structures and algorithms for checking verification queries in NetKAT, a domain-specific language for specifying the behavior of network data planes. Our results extend the techniques obtained in prior work on symbolic automata and provide a framework for building efficient and scalable verification tools. We present \KATch, an implementation of these ideas in Scala, featuring an extended set of NetKAT operators that are useful for expressing network-wide specifications, and a verification engine that constructs a bisimulation or generates a counter-example showing that none exists. We evaluate the performance of our implementation on real-world and synthetic benchmarks, verifying properties such as reachability and slice isolation, typically returning a result in well under a second, which is orders of magnitude faster than previous approaches. Our advancements underscore NetKAT's potential as a practical, declarative language for network specification and verification.

11:00 AM Coffee
11:20 AM Zero_alloc: Implementing a frontend language feature describing a backend static analysis. Chris Casinghino and Greta Yorsh
Jane Street
Abstract

Development of low-latency software in a high-level language sometimes requires reasoning about whether a given source-level function can allocate. This property can be checked by the compiler, but to be accurate this check should happen very late during compilation, after all optimization passes. This talk discusses challenges designing and implementing a language feature that gives users the ability to reason about this property at the source level in a way that can be accurately communicated to and checked by the backend.

11:40 AM Granite: Functional Meaning for Parallel Streaming Nick Rioux
University of Pennsylvania
Abstract

We study functional programming for long-running parallel computations that stream their results. Our approach to this is to make semilattice structure a first-class language feature. Surprisingly, the result is a programming model with the combined expressive power of Turing-complete functional programming and Datalog-style logic programming. We present a reduction-based streaming semantics and a domain-theoretic denotational semantics which we conjecture is fully abstract.

This calculus serves as the core of Granite, a programming language we are developing. Granite aims to provide great expressivity: programmers have access to both the rich data structures and abstractions characteristic of functional programming as well as the ability to declaratively specify algorithms over cyclic data as in Datalog. Programs written in the language are also parallelizable: they naturally incorporate deterministic parallelism in a manner reminiscent of LVars. Thanks to the CALM theorem, it appears Granite may be a fruitful setting to study not only parallel but also distributed programming.

12:00 PM Inferring Dependent Types and Effects through Abstract Interpretation Mihai Nicola
Stevens Institute of Technology
Abstract

Temporal properties of higher-order programs are often described in terms of effects, which capture program behavior such as inputs/outputs, events, resource usage, computation cost, etc. Some steps have been made towards verifying such properties via reductions to fair termination [Murase et al. 2016] or through deductive proof systems [Nanjo et al. 2018], however currently there are no efficient techniques capable of verifying the wide variety of such properties.

In this talk, I will describe our recent work on an abstract interpretation-based analysis to compute dependent effects of recursive, higher-order programs. We capture properties of a program’s effects in terms of automata that summarizes the history of observed effects using an accumulator register. The key novelty is a new abstract domain for context-dependent effects, capable of abstracting relations between the program environment, the automaton control state, and the accumulator value. The upshot is a dataflow type and effect system that computes context-sensitive effect summaries. We demonstrate our work via a prototype implementation that computes dependent effect summaries (and validates assertions) for OCaml-like recursive higher order programs. As a basis of comparison, we describe a reduction to assertion checking for effect-free programs, and demonstrate that our approach outperforms the prior tools Drift and CoAR/MoCHi.

12:20 PM Lunch
1:40 PM Integrating Logic Rules with Everything Else, Seamlessly Annie Liu
Stony Brook University
Abstract

This talk presents a language, Alda, that supports all of logic rules, sets, functions, updates, and objects as seamlessly integrated built-ins. The key idea is to support predicates in rules as set-valued variables that can be used and updated in any scope, and support queries using rules as either explicit or implicit automatic calls to an inference function.

We have defined a formal semantics of the language, implemented a prototype compiler that builds on an object-oriented language that supports concurrent and distributed programming and on an efficient logic rule system, and successfully used the language and implementation on benchmarks and problems from a wide variety of application domains. We describe the compilation method and results of experimental evaluation.

2:00 PM V-Star: Learning Visibly Pushdown Grammars from Program Inputs Xiaodong Jia
The Pennsylvania State University
Abstract

Accurate description of program inputs remains a critical challenge in the field of programming languages. Active learning, as a well-established field, achieves exact learning for regular languages. We offer an innovative grammar inference tool, V-Star, based on the active learning of visibly pushdown automata. V-Star deduces nesting structures of program input languages from sample inputs, employing a novel inference mechanism based on nested patterns. This mechanism identifies token boundaries and converts languages such as XML documents into VPLs. We then adapted Angluin’s 𝐿*, an exact learning algorithm, for VPA learning, which improves the precision of our tool. Our evaluation demonstrates that V-Star effectively and efficiently learns a variety of practical grammars, including S-Expressions, JSON, and XML, and outperforms other state-of-the-art tools.

2:20 PM An Algorithm Generator for Fixed-point Oriented Programming Henry Blanchette
University of Maryland
Abstract

Iteration is particularly effective at describing algorithms over lists while recursion is effective for trees, but how does one describe algorithms involving cyclic dependencies such as liveness analysis or automata minimization?  The traditional way is to describe these as the fixed point of some set of rules, and then to write a worklist or workqueue algorithm to implement that fixed point.  However, manually implementing fixed points can be difficult, tedious, and brittle.

In this talk, I present an early design for a tool to eliminate this tedium. The tool is a code generation framework that uses high-level specifications to generate workqueue algorithms in a process similar to how one uses a parser generator.  This allows one to focus on the high-level description of the algorithm without worrying about the details of the worklist algorithm and its auxiliary data structures, just as parser generators allow one to focus on the syntax of one's language rather than the details of LR parse-table construction.  In addition, the tool can provide algorithm-level optimizations that would be difficult to implement manually.

2:40 PM Coffee
3:00 PM Constructing Trustworthy Smart Contracts Devora Chait-Roth
New York University
Abstract

Blockchain-based decentralized computation relies on programs called smart contracts. Contracts mediate the transfer of cryptocurrency, making them irresistible targets for hackers, with losses nearing $6B over 2022 and 2023. This paper introduces Asp, a compact language and verification system designed with the aim of easing the construction of provably correct and secure smart contracts. The Asp programming language represents contracts as reactive state machines operating over blockchain-specific abstract data types. The Asp verification procedure validates programmer-supplied deductive proofs of correct and secure behavior. The Asp defensive compiler translates Asp contracts to conventional smart contract languages, introducing checks that enforce verification assumptions. These closely interlinked components work together to simplify the construction of trustworthy contracts.

3:20 PM ZKSMT: A VM for Proving SMT Theorems in Zero Knowledge John Kolesar
Yale University
Abstract

Verification of program safety is often reducible to proving the unsatisfiability of a formula in Satisfiability Modulo Theories (SMT): Boolean logic combined with theories that formalize arbitrary first-order fragments. Zero-knowledge (ZK) proofs allow SMT formulas to be validated without revealing the underlying formulas or their proofs to other parties, which is a crucial building block for proving the safety of proprietary programs. This work presents ZKSMT, a novel framework for proving the validity of SMT formulas in ZK. We design a virtual machine (VM) tailored to efficiently represent the verification process of SMT validity proofs in ZK. Our VM can support the vast majority of popular theories when proving program safety while being complete and sound. To demonstrate this, we instantiate the commonly used theories of equality and linear integer arithmetic in our VM with theory-specific optimizations for proving them in ZK. ZKSMT achieves high practicality even when running on realistic SMT formulas generated by Boogie, a common tool for software verification. It achieves a three-order-of-magnitude improvement compared to a baseline that executes the proof verification code in a general ZK system.

3:40 PM Compiling to transformers Joey Velez-Ginorio
University of Pennsylvania
Abstract

We still can't reliably predict or control the behavior of artificial neural networks (ANNs). To address this, recent work in mechanistic interpretability argues we must reverse-engineer the algorithms latent in ANNs; like reverse-engineering the binary of a program. These algorithms would provide a higher level of abstraction from which to predict and control ANNs. But this demands a programming language in which these algorithms could be written, whose programs compile to ANNs. We propose Drift, a typed functional programming language whose programs compile to a class of ANNs called transformers. By a differential logical relation we prove that Drift types ensure programs compile correctly to transformers. We additionally provide an implementation of Drift, using property-based testing to check that its behavior is consistent with our metatheory. These results mark the first step in a venture between compiler correctness and mechanistic interpretability---identifying what it means and how to prove that a neural network implements a program.

4:00 PM Break
4:20 PM Gradual type migration is safe, with both SMT and constraints Eric Wang
Stevens Institute of Technology
Abstract

Gradual type migration is a technique that allows programmers to progressively harness the advantages of static typing by adding type annotation to their code written in a dynamic typed language while still retaining some flexibility of dynamic typing. However, such a migration process typically lacks automation, with existing tools usually have limited language feature support and rely on solvers.

In this talk, we introduce an algorithm founded on standard type inference to facilitate automated gradual type migration, that is, taking a term in untyped lambda calculus we can produce an term in GTLC with as much static type as possible. Unlike previous approaches, our algorithm does not rely on solver and can easily be extended to richer language features such as polymorphism. Moreover, we argue that type error during inference could be interpreted as the dynamic type in the gradual type system. Furthermore, we prove the safety property about gradual type migration, which has not been well-studied before, by demonstrating the migration result of our algorithm will not introduce new dynamic error, while the proof technique is universal and applies to previous works.

4:40 PM Outcome Logic: Unified Reasoning for Effects and Branching Noam Zilberstein
Cornell University
Abstract

The logical foundations required for modern program analysis have evolved past the foundations developed over the last six decades. Programs in the wild display a variety of computational effects such as nondeterminism, randomization, and exceptions. Orthogonally—due in part to the scale of industrial codebases and in part to the fact that many programs just are not correct—bug-finding is often more tangible than full functional correctness (the target of traditional program analysis).

To handle the challenges of effects and incorrectness reasoning, specialized logical techniques have been introduced. While successful, the disjoint nature of those techniques means that wholly new static analysis systems must be built for each kind of program and program property that we wish to analyze. In this talk, I will present Outcome Logic, a new program logic that uses an algebraic interpretation of choice to encode many different kinds of computation with a single, complete proof theory. Through examples, we will see how Outcome Logic facilitates the reuse of proof fragments, paving the way for unified static analysis engines.

5:00 PM Closing Remarks