Example: The Well-Typed Interpreter
In this section, we’ll use the features we’ve seen so far to write a
larger example, an interpreter for a simple functional programming
language, with variables, function application, binary operators and
an if...then...else
construct. We will use the dependent type
system to ensure that any programs which can be represented are
well-typed.
Representing Languages
First, let us define the types in the language. We have integers,
booleans, and functions, represented by Ty
:
data Ty = TyInt | TyBool | TyFun Ty Ty
We can write a function to translate these representations to a concrete Idris type — remember that types are first class, so can be calculated just like any other value:
interpTy : Ty -> Type
interpTy TyInt = Integer
interpTy TyBool = Bool
interpTy (TyFun a t) = interpTy a -> interpTy t
We’re going to define a representation of our language in such a way
that only well-typed programs can be represented. We’ll index the
representations of expressions by their type, and the types of
local variables (the context). The context can be represented using
the Vect
data type, so we’ll need to import Data.Vect
at the top of our
source file:
import Data.Vect
Expressions are indexed by the types of the local variables, and the type of the expression itself:
data Expr : Vect n Ty -> Ty -> Type
The full representation of expressions is:
data HasType : (i : Fin n) -> Vect n Ty -> Ty -> Type where
Stop : HasType FZ (t :: ctxt) t
Pop : HasType k ctxt t -> HasType (FS k) (u :: ctxt) t
data Expr : Vect n Ty -> Ty -> Type where
Var : HasType i ctxt t -> Expr ctxt t
Val : (x : Integer) -> Expr ctxt TyInt
Lam : Expr (a :: ctxt) t -> Expr ctxt (TyFun a t)
App : Expr ctxt (TyFun a t) -> Expr ctxt a -> Expr ctxt t
Op : (interpTy a -> interpTy b -> interpTy c) ->
Expr ctxt a -> Expr ctxt b -> Expr ctxt c
If : Expr ctxt TyBool ->
Lazy (Expr ctxt a) ->
Lazy (Expr ctxt a) -> Expr ctxt a
The code above makes use of the Vect
and Fin
types from the
base libraries. Fin
is available as part of Data.Vect
. Throughout,
ctxt
refers to the local variable context.
Since expressions are indexed by their type, we can read the typing rules of the language from the definitions of the constructors. Let us look at each constructor in turn.
We use a nameless representation for variables — they are de Bruijn
indexed. Variables are represented by a proof of their membership in
the context, HasType i ctxt T
, which is a proof that variable i
in context ctxt
has type T
. This is defined as follows:
data HasType : (i : Fin n) -> Vect n Ty -> Ty -> Type where
Stop : HasType FZ (t :: ctxt) t
Pop : HasType k ctxt t -> HasType (FS k) (u :: ctxt) t
We can treat Stop as a proof that the most recently defined variable
is well-typed, and Pop n as a proof that, if the n
th most
recently defined variable is well-typed, so is the n+1
th. In
practice, this means we use Stop
to refer to the most recently
defined variable, Pop Stop
to refer to the next, and so on, via
the Var
constructor:
Var : HasType i ctxt t -> Expr ctxt t
So, in an expression \x. \y. x y
, the variable x
would have a
de Bruijn index of 1, represented as Pop Stop
, and y 0
,
represented as Stop
. We find these by counting the number of
lambdas between the definition and the use.
A value carries a concrete representation of an integer:
Val : (x : Integer) -> Expr ctxt TyInt
A lambda creates a function. In the scope of a function of type a ->
t
, there is a new local variable of type a
, which is expressed
by the context index:
Lam : Expr (a :: ctxt) t -> Expr ctxt (TyFun a t)
Function application produces a value of type t
given a function
from a
to t
and a value of type a
:
App : Expr ctxt (TyFun a t) -> Expr ctxt a -> Expr ctxt t
We allow arbitrary binary operators, where the type of the operator informs what the types of the arguments must be:
Op : (interpTy a -> interpTy b -> interpTy c) ->
Expr ctxt a -> Expr ctxt b -> Expr ctxt c
Finally, If
expressions make a choice given a boolean. Each branch
must have the same type, and we will evaluate the branches lazily so
that only the branch which is taken need be evaluated:
If : Expr ctxt TyBool ->
Lazy (Expr ctxt a) ->
Lazy (Expr ctxt a) ->
Expr ctxt a
Writing the Interpreter
When we evaluate an Expr
, we’ll need to know the values in scope,
as well as their types. Env
is an environment, indexed over the
types in scope. Since an environment is just another form of list,
albeit with a strongly specified connection to the vector of local
variable types, we use the usual ::
and Nil
constructors so
that we can use the usual list syntax. Given a proof that a variable
is defined in the context, we can then produce a value from the
environment:
data Env : Vect n Ty -> Type where
Nil : Env Nil
(::) : interpTy a -> Env ctxt -> Env (a :: ctxt)
lookup : HasType i ctxt t -> Env ctxt -> interpTy t
lookup Stop (x :: xs) = x
lookup (Pop k) (x :: xs) = lookup k xs
Given this, an interpreter is a function which
translates an Expr
into a concrete Idris value with respect to a
specific environment:
interp : Env ctxt -> Expr ctxt t -> interpTy t
The complete interpreter is defined as follows, for reference. For each constructor, we translate it into the corresponding Idris value:
interp env (Var i) = lookup i env
interp env (Val x) = x
interp env (Lam sc) = \x => interp (x :: env) sc
interp env (App f s) = interp env f (interp env s)
interp env (Op op x y) = op (interp env x) (interp env y)
interp env (If x t e) = if interp env x then interp env t
else interp env e
Let us look at each case in turn. To translate a variable, we simply look it up in the environment:
interp env (Var i) = lookup i env
To translate a value, we just return the concrete representation of the value:
interp env (Val x) = x
Lambdas are more interesting. In this case, we construct a function which interprets the scope of the lambda with a new value in the environment. So, a function in the object language is translated to an Idris function:
interp env (Lam sc) = \x => interp (x :: env) sc
For an application, we interpret the function and its argument and apply
it directly. We know that interpreting f
must produce a function,
because of its type:
interp env (App f s) = interp env f (interp env s)
Operators and conditionals are, again, direct translations into the
equivalent Idris constructs. For operators, we apply the function to
its operands directly, and for If
, we apply the Idris
if...then...else
construct directly.
interp env (Op op x y) = op (interp env x) (interp env y)
interp env (If x t e) = if interp env x then interp env t
else interp env e
Testing
We can make some simple test functions. Firstly, adding two inputs
\x. \y. y + x
is written as follows:
add : Expr ctxt (TyFun TyInt (TyFun TyInt TyInt))
add = Lam (Lam (Op (+) (Var Stop) (Var (Pop Stop))))
More interestingly, a factorial function fact
(e.g. \x. if (x == 0) then 1 else (fact (x-1) * x)
),
can be written as:
fact : Expr ctxt (TyFun TyInt TyInt)
fact = Lam (If (Op (==) (Var Stop) (Val 0))
(Val 1)
(Op (*) (App fact (Op (-) (Var Stop) (Val 1)))
(Var Stop)))
Running
To finish, we write a main
program which interprets the factorial
function on user input:
main : IO ()
main = do putStr "Enter a number: "
x <- getLine
printLn (interp [] fact (cast x))
Here, cast
is an overloaded function which converts a value from
one type to another if possible. Here, it converts a string to an
integer, giving 0 if the input is invalid. An example run of this
program at the Idris interactive environment is:
$ idris2 interp.idr
____ __ _ ___
/ _/___/ /____(_)____ |__ \
/ // __ / ___/ / ___/ __/ / Version 0.6.0
_/ // /_/ / / / (__ ) / __/ https://www.idris-lang.org
/___/\__,_/_/ /_/____/ /____/ Type :? for help
Welcome to Idris 2. Enjoy yourself!
Main> :exec main
Enter a number: 6
720
Aside: cast
The prelude defines an interface Cast
which allows conversion
between types:
interface Cast from to where
cast : from -> to
It is a multi-parameter interface, defining the source type and object type of the cast. It must be possible for the type checker to infer both parameters at the point where the cast is applied. There are casts defined between all of the primitive types, as far as they make sense.