521 lines
20 KiB
Markdown
521 lines
20 KiB
Markdown
# Generating well-typed random terms using constraint-based type inference
|
|
|
|
## Introduction
|
|
|
|
The goal of this programming project is to reuse a constraint-based
|
|
type inference engine (for a simple lamda-calculus) to implement
|
|
a random generator of well-typed term.
|
|
|
|
Contact: gabriel.scherer@inria.fr .
|
|
|
|
### Motivation
|
|
|
|
Random generation of well-typed term has been studied as a research
|
|
problem: if you want to apply random fuzzing testing techniques on the
|
|
implementation of a programming language, you need to generate a lot
|
|
of programs in this language. Generating programs that parse correctly
|
|
is not too hard, but generating well-typed programs can be hard -- and
|
|
generating ill-typed program does not test the rest of the
|
|
implementation very well.
|
|
|
|
To generate well-typed terms, a common approach is to write a random
|
|
generator that "inverts" the rules of the type-system, from the root
|
|
of the typing derivation to the leaves. If the generator wants to
|
|
generate a term at a given type, it can list the typing rule that may
|
|
produce this output type, decide to apply one of them, which
|
|
corresponds to choosing the head term-former of the generated program,
|
|
and in turn requires generating subterms whose types are given by the
|
|
chosen typing rule. For example, if you want to generate the program
|
|
(? : A -> B), choosing the typing rule for lambda-abstraction refines
|
|
it into (lambda (x : A). (? : B)) with a smaller missing hole. If a type in the derivation cannot be
|
|
|
|
An example of this brand of work is the following:
|
|
|
|
Michał H. Pałka, Koen Claessen, Alejandro Russo, and John Hughes
|
|
Testing an Optimising Compiler by Generating Random Lambda Terms
|
|
2012
|
|
https://publications.lib.chalmers.se/records/fulltext/157525.pdf
|
|
|
|
This approach (which basically corresponds to a Prolog-style
|
|
proof search) is easy for simple type systems, and rapidly becomes
|
|
difficult for advanced type systems -- for example, dealing with
|
|
polymorphism is either unsatisfying or very complex. It is frustrating
|
|
to spend a lot of effort writing a complex generator, because we
|
|
(language implementers) are already spending a lot of effort writing
|
|
complex type-checkers, and it feels like a duplication of work on
|
|
similar problems.
|
|
|
|
Is it possible to write a type-checker once, and reuse it for random
|
|
generation of well-typed programs?
|
|
|
|
### The project
|
|
|
|
In this project, you will implement a *simple* constraint-based type
|
|
inference engine, basically a miniature version of Inferno
|
|
( https://inria.hal.science/hal-01081233 ,
|
|
https://gitlab.inria.fr/fpottier/inferno ) for a small simply-typed
|
|
lambda-calculus (no ML-style polymorphism), and then turn it into
|
|
a random generator of well-typed programs.
|
|
|
|
We provide you with a skeleton for the project, with most of the
|
|
boring stuff already implemented, so that you can focus on the
|
|
interesting parts. We also provide code to help you test your
|
|
code.
|
|
|
|
### Grading
|
|
|
|
The project will be evaluated by:
|
|
|
|
- checking that the code you provided is correct, by reviewing the
|
|
code and your testsuite results
|
|
|
|
- evaluating the quality of the code (clarity, documentation, etc.)
|
|
|
|
- evaluating the coverage of the "mandatory" tasks suggested
|
|
by the provided skeleton
|
|
|
|
- evaluating the difficulty and originality of the extensions you
|
|
implemented, if any
|
|
|
|
We would like you to write a short REPORT.md file at the root of the
|
|
project, which details what you did and explains any non-obvious point,
|
|
with pointers to relevant source files. There is no length requirement
|
|
for this REPORT.md file, just include the information that *you* think
|
|
is valuable and useful -- please, no boring intro or ChatGPT prose.
|
|
|
|
### Code reuse and plagiarism
|
|
|
|
Reusing third-party code is allowed, as long as (1) the code license
|
|
allows this form of reuse, and (2) you carefully indicate which parts
|
|
of the code are not yours -- mark it clearly in the code itself, and
|
|
mention it explicitly in your REPORT.md file.
|
|
|
|
In particular, please feel free to introduce dependencies on
|
|
third-party (free software) packages as long as they are explicit in
|
|
the packaging metadata of your project -- so that I can still build
|
|
the project.
|
|
|
|
Including code that comes from someone else without proper credit to
|
|
the authors is plagiarism.
|
|
|
|
### OCaml development setup
|
|
|
|
Install [Opam](https://opam.ocaml.org/doc/Install.html), the OCaml
|
|
package manager, on your system.
|
|
|
|
If you have never used Opam before, you need to initialize it (otherwise, skip this step):
|
|
|
|
```
|
|
$ opam init
|
|
```
|
|
|
|
For convenience, we setup a [local](https://opam.ocaml.org/blog/opam-local-switches/) Opam distribution, using the following commands:
|
|
|
|
```
|
|
$ opam switch create . --deps-only --with-doc --with-test
|
|
$ eval $(opam env)
|
|
```
|
|
|
|
To configure your favorite text editor, see the [Real World OCaml setup](http://dev.realworldocaml.org/install.html#editor-setup).
|
|
|
|
#### Tip on GADTs in OCaml
|
|
|
|
The type of constraints is a GADT, a Generalized (or Guarded)
|
|
Algebraic Datatype. Writing functions that operate on these datatypes
|
|
requires a bit more ceremony than everyday OCaml code, due to their
|
|
inherent use of polymorphic recursion.
|
|
|
|
If you are unfamiliar with function declarations that look like
|
|
|
|
let rec foo : type a e . (a, e) t -> ...
|
|
|
|
then you should read the GADT chapter of the OCaml manual:
|
|
https://v2.ocaml.org/releases/5.1/htmlman/gadts-tutorial.html
|
|
|
|
### Using a different language
|
|
|
|
We wrote useful support code in OCaml, so it is going to be much
|
|
easier to implement the project in OCaml. If you insist, you are of
|
|
course free to implement the project in a different programming
|
|
language -- any language, as long as I can install a free software
|
|
implementation of it on my Linux machine to test your code.
|
|
|
|
You would still need to read OCaml code to understand the key
|
|
ingredients of the projet -- type inference constraints with
|
|
elaboration, and random generation -- and transpose them in your
|
|
chosen implementation language.
|
|
|
|
Note: we do *not* expect you to reimplement the full project skeleton
|
|
as-is (see more details on what we provide in the "Code organization"
|
|
section below). Feel free to provide the minimum amount of support
|
|
code to demonstrate/test the interesting bits of the project --
|
|
constraint generation with elaboration to an explicitly-typed
|
|
representation, constraint solving, and then random generation of
|
|
well-typed terms.
|
|
|
|
## High-level description
|
|
|
|
This project contains a simple type inference engine in the spirit of
|
|
Inferno, with a type `Untyped.term` of untyped terms, a type
|
|
`STLC.term` of explicitly-typed terms, a type `('a, 'e) Constraint.t`
|
|
of constraints that produce elaboration witnesses of type `'a`.
|
|
|
|
#### Type inference a la inferno
|
|
|
|
The general idea is to implement a constraint generator of type
|
|
|
|
Untyped.term -> (STLC.term, type_error) Constraint.t
|
|
|
|
and a constraint solving function of type
|
|
|
|
val eval : ('a, 'e) Constraint.t -> ('a, 'e) result
|
|
|
|
which extracts a result (success or failure) from a normal form
|
|
constraint.
|
|
|
|
By composing these functions together, you have a type-checker for
|
|
the untyped language, that produces a "witness of well-typedness" in
|
|
the form of an explicitly-typed term -- presumably an annotation of
|
|
the original program.
|
|
|
|
(To keep the project difficulty manageable, our "simple type inference
|
|
engine" does not handle ML-style polymorphism, or in fact any sort of
|
|
polymorphism. We are implementing type inference for the simply-typed
|
|
lambda-calculus.)
|
|
|
|
#### Abstracting over an effect
|
|
|
|
But then there is a twist: we add to the language of untyped terms
|
|
*and* to the language of constraint a `Do` constructor that represents
|
|
a term (or constraint) produced by an "arbitrary effect", where the
|
|
notion of effect is given by an arbitrary functor (a parametrized type
|
|
with a `map` function). This is implemented by writing all the code
|
|
in OCaml modules `Make(T : Utils.Functor)` parametrized over `T`.
|
|
|
|
The constraint-generation function is unchanged.
|
|
|
|
Untyped.term -> (STLC.term, type_error) Constraint.t
|
|
|
|
For constraint solving, however, new terms of the form
|
|
`Do (p : (a, e) Constraint.t T.t)`
|
|
now have to be evaluated. We propose to extend the `eval` function to
|
|
a richer type
|
|
|
|
val eval : ('a, 'e) Constraint.t -> ('a, 'e) normal_constraint
|
|
|
|
where a normal constraint is either a success, a failure, or an effectful
|
|
constraint computation `('a, 'e) Constraint.t T.t`.
|
|
|
|
We propose an evaluation rule of the form
|
|
|
|
eval E[Do p] = NDo E[p]
|
|
|
|
where E is an evalution context for constraints, `p` has type
|
|
`('a, 'e) Constraint.t T.t` (it is a computation in `T` that
|
|
returns constraints), and `E[p]` is defined by lifting the
|
|
context-surrounding function
|
|
`E[_] : ('a, 'e) Constraint.t -> ('b, 'f) Constraint.t`
|
|
through the `T` functor.
|
|
|
|
#### Two or three effect instances
|
|
|
|
An obvious instantion of this `T : Functor` parameter is to use the
|
|
functor `Id.Empty` of empty (parametrized) types with no
|
|
inhabitant. This corresponds to the case where the `Do` constructor
|
|
cannot be used, the terms and constraint are pure. In this case `eval`
|
|
will simply evaluate the constraint to a result. This is what the
|
|
`minihell` test program uses.
|
|
|
|
The other case of interest for this is when the parameter
|
|
`T : Utils.Functor` is in fact a search monad `M : Utils.MonadPlus`.
|
|
Then it is possible to define a function
|
|
|
|
val gen : depth:int -> ('a, 'e) constraint -> ('a, 'e) result M.t
|
|
|
|
on top of `eval`, that returns all the results that can be reached by
|
|
expanding `Do` nodes using `M.bind`, recursively, exactly `depth`
|
|
times. (Another natural choice would be to generate all the terms that
|
|
can be reached by expanding `Do` nodes *at most* `depth` times, but
|
|
this typically gives a worse generator.)
|
|
|
|
Finally, to get an actual program generator, you need to instantiate
|
|
this machinery with a certain choice of `M : MonadPlus` structure. We
|
|
ask you to implement two natural variants:
|
|
|
|
- `MSeq`, which simply enumerates the finite lists of possible
|
|
results.
|
|
|
|
- `MRand`, which returns random solutions -- an infinite stream of
|
|
independent randomly-sampled solutions.
|
|
|
|
|
|
## Implementation tasks
|
|
|
|
In short: implement the missing pieces to get the code working,
|
|
following the high-level description above. Then (optionally) think
|
|
about extending the project further -- we give some ideas in the
|
|
"Extensions" section later on.
|
|
|
|
In more details, we recommend the following progression (but you do as
|
|
you prefer).
|
|
|
|
0. Creating a version-controlled repository for your project, for
|
|
example by running `git init` in the root directory of the project.
|
|
Even if you program all alone, version control is a must-have for
|
|
any moderately complex project. Whenever you have achieved
|
|
progress, save/commit the current state. Whenever you are able to
|
|
backtrack, erase your last hour of work and try to something
|
|
different, save/commit the current state *before* throwing it
|
|
away. You are welcome.
|
|
|
|
1. Look at the simply-typed lambda-calculus used as a toy programming
|
|
language, implemented in Untyped.ml (the before-type-inference form
|
|
without types everywhere) and STLC.ml (the explicitly-checked
|
|
version produced by our type inference engine). These are fairly
|
|
standard.
|
|
|
|
2. Look at the datatype of constraints used for type inference,
|
|
defined in Constraint.ml.
|
|
|
|
3. Read the testsuite in tests.t/run.t to have an idea of the
|
|
sort of behaviors expected once the project is complete.
|
|
|
|
Reading the testsuite will also give you ideas on how to run your
|
|
own tests during your work. Feel free to edit the run.t file, or
|
|
add more test files.
|
|
|
|
You might want to save the run.t file somewhere for future
|
|
reference. Then we recommend "promoting" the current output
|
|
(where nothing works because nothing is implemented yet) by running
|
|
`dune runtest` (at the root of the project) and then
|
|
`dune promote`. As you implement more features you should regularly
|
|
run `dune runtest` again, and `dune promote` to save the current
|
|
testsuite output. This will help you track your progress and
|
|
notice behavior changes (improvements or regressions).
|
|
|
|
Note: the testsuite outputs are descriptive, not prescriptive. If
|
|
you implement the project differently, you may get different
|
|
outputs that are also correct. You have to read the test output to
|
|
tell if they make sense.
|
|
|
|
4. Implement a constraint generator in Infer.ml.
|
|
|
|
You should be able to test your constraint generator by running
|
|
commands such as:
|
|
|
|
dune exec -- minihell --show-constraint tests/id_poly.test
|
|
|
|
(Please of course feel free to write your own test files, and
|
|
consider adding them to the testsuite in tests/run.t.)
|
|
|
|
5. Implement a constraint solver in Solver.ml. (There is also
|
|
a missing function, Structure.merge, that you will have to
|
|
implement to get unification working.)
|
|
|
|
You should be able to test your constraint solver by running
|
|
commands such as
|
|
|
|
dune exec -- minihell --show-constraint --log-solver tests/id_poly.test
|
|
|
|
At this point you have a complete type-checker for the simply-typed
|
|
lambda-calculus. Congratulations!
|
|
|
|
6. Implement search monads in MSeq.ml and MRand.ml. They should obey
|
|
the MonadPlus interface defined in Utils.ml. The intention is that:
|
|
|
|
- MSeq should simply generate the finite sequence of all terms of
|
|
a given size (the order does not matter), while
|
|
- MRand should return an infinite sequence, each element being
|
|
a random choice of term of the given size (we do not expect the
|
|
distribution to be uniform: typically, for each syntactic node,
|
|
each term constructor can be picked with a fixed probability).
|
|
|
|
You can implement just one of them before going to the next step,
|
|
or both. You only need one to test your code at first.
|
|
|
|
6. Implement a term generator in Generator.ml.
|
|
|
|
You can test the generator instantiated with MRand by running, for
|
|
example:
|
|
|
|
dune exec -- minigen --depth 5 --count 3
|
|
|
|
You can test the generator instantiated with MSeq:
|
|
example:
|
|
|
|
dune exec -- minigen --exhaustive --depth 5 --count 20
|
|
|
|
(On my implementation this only generates 10 terms, as the
|
|
generator only produces 10 different terms at this depth.)
|
|
|
|
7. At this point you are basically done with the "mandatory" part of
|
|
this project. Please ensure that your code is clean and readable,
|
|
we will take this into account when grading.
|
|
|
|
You should now think of implementing extensions of your choice.
|
|
(See the "Extensions" section below for some potential ideas.)
|
|
|
|
|
|
## Skeleton code organization
|
|
|
|
- `tests.t`: the testsuite. See tests.t/run.t for details.
|
|
This is important.
|
|
|
|
- `bin/`: two small executable programs that are used to
|
|
test the main logic in `src/`:
|
|
|
|
+ `minihell`: a toy type-checker
|
|
|
|
+ `minigen`: a toy generator of well-typed terms
|
|
|
|
You should feel free to modify these programs if you wish, but you
|
|
do not need to.
|
|
|
|
- `src/`: the bulk of the code, that place where we expect you to
|
|
work. `src/*.{ml,mli}` are the more important modules that you will
|
|
have to modify or use directly. `src/support/*.{ml,mli}` has the
|
|
less interesting support code. (Again, feel free to modify the
|
|
support code, but you do not need to.)
|
|
|
|
+ `Generator.ml,mli`: the random term generator (you only need to
|
|
look at this late in the project, feel free to skip it at first)
|
|
|
|
+ `Infer.ml,mli`: the type-inference engine, that generates constraints
|
|
with elaboration
|
|
|
|
+ `MRand.ml,mli`: the random-sampling monad
|
|
|
|
+ `MSeq.ml,mli`: the list-all-solutions monad
|
|
|
|
+ `Solver.ml,mli`: the constraint solver
|
|
|
|
+ `STLC.ml`: the syntax of types and well-typed terms
|
|
|
|
+ `Structure.ml`: the definition of type-formers
|
|
in the type system. This is used in `STLC.ml`,
|
|
but also in constraints that manipulate types
|
|
containing inference variables.
|
|
|
|
+ `Unif.ml,mli`: the unification engine. This is fully
|
|
implemented. You will need to understand its interface
|
|
(`Unif.mli`) to solve equality constraints in `Solver.ml`.
|
|
|
|
+ `Untyped.ml`: the syntax of untyped terms.
|
|
|
|
+ `Utils.ml`: useful bits and pieces. Feel free
|
|
to add your stuff there.
|
|
|
|
+ `support/`: the boring modules that help for debugging and
|
|
testing, but you probably do not need to use or touch them
|
|
directly. (Feel free to modify stuff there if you want.)
|
|
|
|
* `ConstraintPrinter.ml,mli`: a pretty-printer for constraints
|
|
|
|
* `ConstraintSimplifier.ml,mli`: a "simplifier" for constraints,
|
|
that is used to implement the `--log-solver` option of
|
|
`minihell`.
|
|
|
|
* `Decode.ml,mli`: reconstructs user-facing types from
|
|
the current state of the unification engine.
|
|
|
|
* `Printer.ml`: support code for all pretty-printers.
|
|
|
|
* `SatConstraint.ml`: "satisfiable constraints" are constraints
|
|
that do not produce an output (no elaboration to
|
|
well-typed terms). This simpler type (no GADTs in sight) is
|
|
used under the hood for simplification and pretty-printing.
|
|
|
|
* `STLCPrinter.ml,mli`: a pretty-printer for explicitly-typed
|
|
terms.
|
|
|
|
* `UntypedLexer.mll`: an `ocamllex` lexer for the untyped
|
|
language.
|
|
|
|
* `UntypedParser.mly`: a `menhir` grammar for the untyped
|
|
language.
|
|
|
|
* `UntypedPrinter.ml,mli`: a pretty-printer for the untyped
|
|
language.
|
|
|
|
(Again: if you decide to take risks and implement the project in
|
|
a different language, you do not need to reimplement any of that,
|
|
but then you are on your own for testing and debugging.)
|
|
|
|
|
|
### Extensions
|
|
|
|
After you are done with the core/mandatory part of the project, we
|
|
expect you to implement extensions of your choosing. Surprise us!
|
|
|
|
In case it can serve as inspiration, here are a few ideas for
|
|
extensions.
|
|
|
|
- You could add some extra features to the programming language
|
|
supported. For example:
|
|
|
|
+ Base types such as integers and booleans
|
|
|
|
+ Recursive function definitions.
|
|
|
|
+ Lists, with a built-in shallow pattern-matching construction
|
|
|
|
match .. with
|
|
| [] -> ...
|
|
| x :: xs -> ...
|
|
|
|
+ n-ary tuples: this sounds easy but it is in fact quite tricky,
|
|
because the strongly-typed GADT of constraints does not make it
|
|
easy to generate a "dynamic" number of existential
|
|
quantifiers. This is a difficult exercise in strongly-typed
|
|
programming.
|
|
|
|
(We in fact wrote a short paper on how we approached this problem
|
|
in the context of Inferno, and the difficulties are the same.
|
|
https://inria.hal.science/hal-03145040
|
|
Feel free to reuse our approach if you want.)
|
|
|
|
+ Patterns and pattern-matching. You get into the same difficulties
|
|
as n-ary tuples, so working on n-ary tuples first is probably
|
|
a good idea.
|
|
|
|
+ You could implement ML-style parametric polymorphism, with
|
|
generalization during type inference. This is the hardest of the
|
|
extensions mentioned here by far: we have not tried this, and we
|
|
do not know how the interaction between enumeration/generation
|
|
and generalization will work out. (One might need to change the
|
|
term generation to stop going 'in and out' of sub-constraints, or
|
|
at least to be careful in the implementation of ML-style
|
|
generalization to use a (semi-)persistent data structure, a tree
|
|
of generalization regions instead of a stack.)
|
|
|
|
- You could use the term generator for property-based testing of
|
|
*something*: implement a procedure that manipulates explicitly-typed
|
|
terms -- for example, a semantics-preserving program transformation,
|
|
or a compilation to the programming language of your choice, etc --
|
|
and test the property using random generator.
|
|
|
|
(You could probably do some greybox fuzzing by providing an
|
|
alternative search monad using Crowbar:
|
|
https://github.com/stedolan/crowbar )
|
|
|
|
- You could improve the random-generation strategy by thinking about
|
|
the shape of terms that you want to generate, tweaking the
|
|
distribution, providing options to generate terms using only
|
|
a subset of the rules, etc.
|
|
|
|
An important improvement in practice would be to provide some form
|
|
of shrinking.
|
|
|
|
- You could look at existing work on testing of language
|
|
implementations by random generation, and see if they can be
|
|
integrated in the present framework. For example,
|
|
|
|
https://janmidtgaard.dk/papers/Midtgaard-al%3aICFP17-full.pdf
|
|
|
|
performs random generation of programs with a custom effect system,
|
|
that determines if a given sub-expression performs an observable
|
|
side-effect or not -- to only generate terms with a deterministic
|
|
evaluation order. Could one extend the inference engine to also
|
|
check that the program is deterministic in this sense, and thus
|
|
replace the custom-built program generator used in this work?
|