C, Lambda Calculus and Compiler Verification - a study in Haskell of purely-functional techniques for a formal specification of imperative programming languages and an epistemically-sound verification of their compilers

C, Lambda Calculus and Compiler Verification - a study in Haskell of purely-functional techniques for a formal specification of imperative programming languages and an epistemically-sound verification of their compilers

Access & Terms of Use

open access
Copyright: Zadarnowski, Patryk

CC BY-NC-ND 3.0

Abstract

Formal verification of a compiler is a long-standing problem in computer science and, although recent years have seen substantial achievements in the area, most of the proposed solutions do not scale very well with the complexity of modern software development environments. In this thesis, I present a formal semantic model of the popular C programming language described in the ANSI/ISO/IEC Standard 9899:1990, in the form of a mapping of C programs to computable functions expressed in a suitable variant of lambda calculus. The specification is formulated in a highly readable functional style and is accompanied by a complete Haskell implementation of the compiler, covering all aspects of the translation from a parse tree of a C program down to the actual sequence of executable machine instructions, resolving issues of separate compilation, allowing for optimising program transformations and providing rigorous guarantees of the implementation's conformance to a formal definition of the language. In order to achieve these goals, I revisit the challenge of compiler verification from its very philosophical foundations. Beginning with the basic epistemic notions of knowledge, correctness and proof, I show that a fully rigorous solution of the problem requires a constructive formulation of the correctness criteria in terms of the translation process itself, in contrast with the more popular extensional approaches to compiler verification, in which correctness is generally defined as commutativity of the system with respect to a formal semantics of the source and target languages, effectively formalising various aspects of the compiler independently of each other and separately from its eventual implementation. I argue that a satisfactory judgement of correctness should always constitute a direct formal description of the job performed by the software being judged, instead of an axiomatic definition of some abstract property such as commutativity of the translation system, since the later approach fails to establish a crucial causal connection between a judgement of correctness and a knowledge of it. The primary contribution of this thesis is the new notion of linear correctness, which strives to provide a constructive formulation of a compiler's validity criteria by deriving its judgement directly from a formal description of the language translation process itself. The approach relies crucially on a denotational semantic model of the source and target languages, in which the domains of program meanings are unified with the actual intermediate program representation of the underlying compiler implementation. By defining the concepts of a programming language, compiler and compiler correctness in category theoretic terms, I show that every linearly correct compiler is also valid in the more traditional sense of the word. Further, by presenting a complete verified translation of the standard C programming language within this framework, I demonstrate that linear correctness is a highly effective approach to the problem of compiler verification and that it scales particularly well with the complexity of modern software development environments.

Persistent link to this record

http://hdl.handle.net/1959.4/50242

DOI

https://doi.org/10.26190/unsworks/23423

Author(s)

Zadarnowski, Patryk

Supervisor(s)

Chakravarty, Manuel M.T.

Publication Year

2011

Resource Type

Thesis

Degree Type

PhD Doctorate

UNSW Faculty

Files

2.69 MB

Adobe Portable Document Format

View full record Show statistics