Gabriel Scherer c0ad56a745 move Decode from support/ to src/ for more visibility
Suggested-by: Neven Villani <neven.villani@crans.org>
2024-01-23 23:33:23 +01:00

521 lines
20 KiB
Markdown

# Generating well-typed random terms using constraint-based type inference
## Introduction
The goal of this programming project is to reuse a constraint-based
type inference engine (for a simple lamda-calculus) to implement
a random generator of well-typed term.
Contact: gabriel.scherer@inria.fr .
### Motivation
Random generation of well-typed term has been studied as a research
problem: if you want to apply random fuzzing testing techniques on the
implementation of a programming language, you need to generate a lot
of programs in this language. Generating programs that parse correctly
is not too hard, but generating well-typed programs can be hard -- and
generating ill-typed program does not test the rest of the
implementation very well.
To generate well-typed terms, a common approach is to write a random
generator that "inverts" the rules of the type-system, from the root
of the typing derivation to the leaves. If the generator wants to
generate a term at a given type, it can list the typing rule that may
produce this output type, decide to apply one of them, which
corresponds to choosing the head term-former of the generated program,
and in turn requires generating subterms whose types are given by the
chosen typing rule. For example, if you want to generate the program
(? : A -> B), choosing the typing rule for lambda-abstraction refines
it into (lambda (x : A). (? : B)) with a smaller missing hole. If a type in the derivation cannot be
An example of this brand of work is the following:
Michał H. Pałka, Koen Claessen, Alejandro Russo, and John Hughes
Testing an Optimising Compiler by Generating Random Lambda Terms
2012
https://publications.lib.chalmers.se/records/fulltext/157525.pdf
This approach (which basically corresponds to a Prolog-style
proof search) is easy for simple type systems, and rapidly becomes
difficult for advanced type systems -- for example, dealing with
polymorphism is either unsatisfying or very complex. It is frustrating
to spend a lot of effort writing a complex generator, because we
(language implementers) are already spending a lot of effort writing
complex type-checkers, and it feels like a duplication of work on
similar problems.
Is it possible to write a type-checker once, and reuse it for random
generation of well-typed programs?
### The project
In this project, you will implement a *simple* constraint-based type
inference engine, basically a miniature version of Inferno
( https://inria.hal.science/hal-01081233 ,
https://gitlab.inria.fr/fpottier/inferno ) for a small simply-typed
lambda-calculus (no ML-style polymorphism), and then turn it into
a random generator of well-typed programs.
We provide you with a skeleton for the project, with most of the
boring stuff already implemented, so that you can focus on the
interesting parts. We also provide code to help you test your
code.
### Grading
The project will be evaluated by:
- checking that the code you provided is correct, by reviewing the
code and your testsuite results
- evaluating the quality of the code (clarity, documentation, etc.)
- evaluating the coverage of the "mandatory" tasks suggested
by the provided skeleton
- evaluating the difficulty and originality of the extensions you
implemented, if any
We would like you to write a short REPORT.md file at the root of the
project, which details what you did and explains any non-obvious point,
with pointers to relevant source files. There is no length requirement
for this REPORT.md file, just include the information that *you* think
is valuable and useful -- please, no boring intro or ChatGPT prose.
### Code reuse and plagiarism
Reusing third-party code is allowed, as long as (1) the code license
allows this form of reuse, and (2) you carefully indicate which parts
of the code are not yours -- mark it clearly in the code itself, and
mention it explicitly in your REPORT.md file.
In particular, please feel free to introduce dependencies on
third-party (free software) packages as long as they are explicit in
the packaging metadata of your project -- so that I can still build
the project.
Including code that comes from someone else without proper credit to
the authors is plagiarism.
### OCaml development setup
Install [Opam](https://opam.ocaml.org/doc/Install.html), the OCaml
package manager, on your system.
If you have never used Opam before, you need to initialize it (otherwise, skip this step):
```
$ opam init
```
For convenience, we setup a [local](https://opam.ocaml.org/blog/opam-local-switches/) Opam distribution, using the following commands:
```
$ opam switch create . --deps-only --with-doc --with-test
$ eval $(opam env)
```
To configure your favorite text editor, see the [Real World OCaml setup](http://dev.realworldocaml.org/install.html#editor-setup).
#### Tip on GADTs in OCaml
The type of constraints is a GADT, a Generalized (or Guarded)
Algebraic Datatype. Writing functions that operate on these datatypes
requires a bit more ceremony than everyday OCaml code, due to their
inherent use of polymorphic recursion.
If you are unfamiliar with function declarations that look like
let rec foo : type a e . (a, e) t -> ...
then you should read the GADT chapter of the OCaml manual:
https://v2.ocaml.org/releases/5.1/htmlman/gadts-tutorial.html
### Using a different language
We wrote useful support code in OCaml, so it is going to be much
easier to implement the project in OCaml. If you insist, you are of
course free to implement the project in a different programming
language -- any language, as long as I can install a free software
implementation of it on my Linux machine to test your code.
You would still need to read OCaml code to understand the key
ingredients of the projet -- type inference constraints with
elaboration, and random generation -- and transpose them in your
chosen implementation language.
Note: we do *not* expect you to reimplement the full project skeleton
as-is (see more details on what we provide in the "Code organization"
section below). Feel free to provide the minimum amount of support
code to demonstrate/test the interesting bits of the project --
constraint generation with elaboration to an explicitly-typed
representation, constraint solving, and then random generation of
well-typed terms.
## High-level description
This project contains a simple type inference engine in the spirit of
Inferno, with a type `Untyped.term` of untyped terms, a type
`STLC.term` of explicitly-typed terms, a type `('a, 'e) Constraint.t`
of constraints that produce elaboration witnesses of type `'a`.
#### Type inference a la inferno
The general idea is to implement a constraint generator of type
Untyped.term -> (STLC.term, type_error) Constraint.t
and a constraint solving function of type
val eval : ('a, 'e) Constraint.t -> ('a, 'e) result
which extracts a result (success or failure) from a normal form
constraint.
By composing these functions together, you have a type-checker for
the untyped language, that produces a "witness of well-typedness" in
the form of an explicitly-typed term -- presumably an annotation of
the original program.
(To keep the project difficulty manageable, our "simple type inference
engine" does not handle ML-style polymorphism, or in fact any sort of
polymorphism. We are implementing type inference for the simply-typed
lambda-calculus.)
#### Abstracting over an effect
But then there is a twist: we add to the language of untyped terms
*and* to the language of constraint a `Do` constructor that represents
a term (or constraint) produced by an "arbitrary effect", where the
notion of effect is given by an arbitrary functor (a parametrized type
with a `map` function). This is implemented by writing all the code
in OCaml modules `Make(T : Utils.Functor)` parametrized over `T`.
The constraint-generation function is unchanged.
Untyped.term -> (STLC.term, type_error) Constraint.t
For constraint solving, however, new terms of the form
`Do (p : (a, e) Constraint.t T.t)`
now have to be evaluated. We propose to extend the `eval` function to
a richer type
val eval : ('a, 'e) Constraint.t -> ('a, 'e) normal_constraint
where a normal constraint is either a success, a failure, or an effectful
constraint computation `('a, 'e) Constraint.t T.t`.
We propose an evaluation rule of the form
eval E[Do p] = NDo E[p]
where E is an evalution context for constraints, `p` has type
`('a, 'e) Constraint.t T.t` (it is a computation in `T` that
returns constraints), and `E[p]` is defined by lifting the
context-surrounding function
`E[_] : ('a, 'e) Constraint.t -> ('b, 'f) Constraint.t`
through the `T` functor.
#### Two or three effect instances
An obvious instantion of this `T : Functor` parameter is to use the
functor `Id.Empty` of empty (parametrized) types with no
inhabitant. This corresponds to the case where the `Do` constructor
cannot be used, the terms and constraint are pure. In this case `eval`
will simply evaluate the constraint to a result. This is what the
`minihell` test program uses.
The other case of interest for this is when the parameter
`T : Utils.Functor` is in fact a search monad `M : Utils.MonadPlus`.
Then it is possible to define a function
val gen : depth:int -> ('a, 'e) constraint -> ('a, 'e) result M.t
on top of `eval`, that returns all the results that can be reached by
expanding `Do` nodes using `M.bind`, recursively, exactly `depth`
times. (Another natural choice would be to generate all the terms that
can be reached by expanding `Do` nodes *at most* `depth` times, but
this typically gives a worse generator.)
Finally, to get an actual program generator, you need to instantiate
this machinery with a certain choice of `M : MonadPlus` structure. We
ask you to implement two natural variants:
- `MSeq`, which simply enumerates the finite lists of possible
results.
- `MRand`, which returns random solutions -- an infinite stream of
independent randomly-sampled solutions.
## Implementation tasks
In short: implement the missing pieces to get the code working,
following the high-level description above. Then (optionally) think
about extending the project further -- we give some ideas in the
"Extensions" section later on.
In more details, we recommend the following progression (but you do as
you prefer).
0. Creating a version-controlled repository for your project, for
example by running `git init` in the root directory of the project.
Even if you program all alone, version control is a must-have for
any moderately complex project. Whenever you have achieved
progress, save/commit the current state. Whenever you are able to
backtrack, erase your last hour of work and try to something
different, save/commit the current state *before* throwing it
away. You are welcome.
1. Look at the simply-typed lambda-calculus used as a toy programming
language, implemented in Untyped.ml (the before-type-inference form
without types everywhere) and STLC.ml (the explicitly-checked
version produced by our type inference engine). These are fairly
standard.
2. Look at the datatype of constraints used for type inference,
defined in Constraint.ml.
3. Read the testsuite in tests.t/run.t to have an idea of the
sort of behaviors expected once the project is complete.
Reading the testsuite will also give you ideas on how to run your
own tests during your work. Feel free to edit the run.t file, or
add more test files.
You might want to save the run.t file somewhere for future
reference. Then we recommend "promoting" the current output
(where nothing works because nothing is implemented yet) by running
`dune runtest` (at the root of the project) and then
`dune promote`. As you implement more features you should regularly
run `dune runtest` again, and `dune promote` to save the current
testsuite output. This will help you track your progress and
notice behavior changes (improvements or regressions).
Note: the testsuite outputs are descriptive, not prescriptive. If
you implement the project differently, you may get different
outputs that are also correct. You have to read the test output to
tell if they make sense.
4. Implement a constraint generator in Infer.ml.
You should be able to test your constraint generator by running
commands such as:
dune exec -- minihell --show-constraint tests/id_poly.test
(Please of course feel free to write your own test files, and
consider adding them to the testsuite in tests/run.t.)
5. Implement a constraint solver in Solver.ml. (There is also
a missing function, Structure.merge, that you will have to
implement to get unification working.)
You should be able to test your constraint solver by running
commands such as
dune exec -- minihell --show-constraint --log-solver tests/id_poly.test
At this point you have a complete type-checker for the simply-typed
lambda-calculus. Congratulations!
6. Implement search monads in MSeq.ml and MRand.ml. They should obey
the MonadPlus interface defined in Utils.ml. The intention is that:
- MSeq should simply generate the finite sequence of all terms of
a given size (the order does not matter), while
- MRand should return an infinite sequence, each element being
a random choice of term of the given size (we do not expect the
distribution to be uniform: typically, for each syntactic node,
each term constructor can be picked with a fixed probability).
You can implement just one of them before going to the next step,
or both. You only need one to test your code at first.
6. Implement a term generator in Generator.ml.
You can test the generator instantiated with MRand by running, for
example:
dune exec -- minigen --depth 5 --count 3
You can test the generator instantiated with MSeq:
example:
dune exec -- minigen --exhaustive --depth 5 --count 20
(On my implementation this only generates 10 terms, as the
generator only produces 10 different terms at this depth.)
7. At this point you are basically done with the "mandatory" part of
this project. Please ensure that your code is clean and readable,
we will take this into account when grading.
You should now think of implementing extensions of your choice.
(See the "Extensions" section below for some potential ideas.)
## Skeleton code organization
- `tests.t`: the testsuite. See tests.t/run.t for details.
This is important.
- `bin/`: two small executable programs that are used to
test the main logic in `src/`:
+ `minihell`: a toy type-checker
+ `minigen`: a toy generator of well-typed terms
You should feel free to modify these programs if you wish, but you
do not need to.
- `src/`: the bulk of the code, that place where we expect you to
work. `src/*.{ml,mli}` are the more important modules that you will
have to modify or use directly. `src/support/*.{ml,mli}` has the
less interesting support code. (Again, feel free to modify the
support code, but you do not need to.)
+ `Generator.ml,mli`: the random term generator (you only need to
look at this late in the project, feel free to skip it at first)
+ `Infer.ml,mli`: the type-inference engine, that generates constraints
with elaboration
+ `MRand.ml,mli`: the random-sampling monad
+ `MSeq.ml,mli`: the list-all-solutions monad
+ `Solver.ml,mli`: the constraint solver
+ `STLC.ml`: the syntax of types and well-typed terms
+ `Structure.ml`: the definition of type-formers
in the type system. This is used in `STLC.ml`,
but also in constraints that manipulate types
containing inference variables.
+ `Unif.ml,mli`: the unification engine. This is fully
implemented. You will need to understand its interface
(`Unif.mli`) to solve equality constraints in `Solver.ml`.
+ `Untyped.ml`: the syntax of untyped terms.
+ `Utils.ml`: useful bits and pieces. Feel free
to add your stuff there.
+ `support/`: the boring modules that help for debugging and
testing, but you probably do not need to use or touch them
directly. (Feel free to modify stuff there if you want.)
* `ConstraintPrinter.ml,mli`: a pretty-printer for constraints
* `ConstraintSimplifier.ml,mli`: a "simplifier" for constraints,
that is used to implement the `--log-solver` option of
`minihell`.
* `Decode.ml,mli`: reconstructs user-facing types from
the current state of the unification engine.
* `Printer.ml`: support code for all pretty-printers.
* `SatConstraint.ml`: "satisfiable constraints" are constraints
that do not produce an output (no elaboration to
well-typed terms). This simpler type (no GADTs in sight) is
used under the hood for simplification and pretty-printing.
* `STLCPrinter.ml,mli`: a pretty-printer for explicitly-typed
terms.
* `UntypedLexer.mll`: an `ocamllex` lexer for the untyped
language.
* `UntypedParser.mly`: a `menhir` grammar for the untyped
language.
* `UntypedPrinter.ml,mli`: a pretty-printer for the untyped
language.
(Again: if you decide to take risks and implement the project in
a different language, you do not need to reimplement any of that,
but then you are on your own for testing and debugging.)
### Extensions
After you are done with the core/mandatory part of the project, we
expect you to implement extensions of your choosing. Surprise us!
In case it can serve as inspiration, here are a few ideas for
extensions.
- You could add some extra features to the programming language
supported. For example:
+ Base types such as integers and booleans
+ Recursive function definitions.
+ Lists, with a built-in shallow pattern-matching construction
match .. with
| [] -> ...
| x :: xs -> ...
+ n-ary tuples: this sounds easy but it is in fact quite tricky,
because the strongly-typed GADT of constraints does not make it
easy to generate a "dynamic" number of existential
quantifiers. This is a difficult exercise in strongly-typed
programming.
(We in fact wrote a short paper on how we approached this problem
in the context of Inferno, and the difficulties are the same.
https://inria.hal.science/hal-03145040
Feel free to reuse our approach if you want.)
+ Patterns and pattern-matching. You get into the same difficulties
as n-ary tuples, so working on n-ary tuples first is probably
a good idea.
+ You could implement ML-style parametric polymorphism, with
generalization during type inference. This is the hardest of the
extensions mentioned here by far: we have not tried this, and we
do not know how the interaction between enumeration/generation
and generalization will work out. (One might need to change the
term generation to stop going 'in and out' of sub-constraints, or
at least to be careful in the implementation of ML-style
generalization to use a (semi-)persistent data structure, a tree
of generalization regions instead of a stack.)
- You could use the term generator for property-based testing of
*something*: implement a procedure that manipulates explicitly-typed
terms -- for example, a semantics-preserving program transformation,
or a compilation to the programming language of your choice, etc --
and test the property using random generator.
(You could probably do some greybox fuzzing by providing an
alternative search monad using Crowbar:
https://github.com/stedolan/crowbar )
- You could improve the random-generation strategy by thinking about
the shape of terms that you want to generate, tweaking the
distribution, providing options to generate terms using only
a subset of the rules, etc.
An important improvement in practice would be to provide some form
of shrinking.
- You could look at existing work on testing of language
implementations by random generation, and see if they can be
integrated in the present framework. For example,
https://janmidtgaard.dk/papers/Midtgaard-al%3aICFP17-full.pdf
performs random generation of programs with a custom effect system,
that determines if a given sub-expression performs an observable
side-effect or not -- to only generate terms with a deterministic
evaluation order. Could one extend the inference engine to also
check that the program is deterministic in this sense, and thus
replace the custom-built program generator used in this work?