- Lambda calculus syntax
- $\beta$-reduction
- Free variables
- $\alpha$-conversion
- Substitution
- Evaluation: $\beta$ normal form
- Extension: Program Equivalence
- Notes
It's been half a year since I last wrote about exploiting the TypeScript type system to solve non-trivial problems. Those problems were relatively simple as far as computational complexity goes - we weren't trying to do everything a Turing machine can do.
This time, we'll solve the computationally-harder problem of emulating the lambda calculus. The (untyped) lambda calculus is a model of computation equivalent to the Turing machine model, which means that the lambda calculus can compute anything a Turing machine can. (The typed lambda calculus, which only admits terms that are "well-typed", is not computationally equivalent to the Turing machine: all well-typed terms in the typed lambda calculus halt, but there are programs for which a Turing machine does not halt.) In other words, the lambda calculus can simulate any computer algorithm. So, if we can emulate the lambda calculus in the TypeScript type system, we will have a proof that the type system is Turing-complete! (It has been well known for several years that TypeScript's type system is Turing-complete. However, I am not aware of any prior implementations of the lambda calculus in the type system. Perhaps this is because it is straightforward, as we will see, or because Turing machine emulators are more appealing as proofs.) Furthermore, our construction will not be esoteric; in fact, it will read similarly to typical functional implementations of the lambda calculus.
The lambda calculus is one of the most well-studied topics in computer science, and it is fundamental to computability theory. Almost all functional languages, type systems, and type-based theorem provers are built on an extension of the lambda calculus with types, known as the simply typed lambda calculus. The simply typed lambda calculus is part of a beautiful trilogy known as the Curry-Howard-Lambek correspondence, which says that typed lambda calculi, mathematical logic, and category theory are equivalent models of computation.
I mention these consequences not because they're particularly important to know, but to say that applications of the lambda calculus across mathematics and computer science are well known. Our construction of the lambda calculus will not be anything new, so if you are already familiar with it, perhaps only the type-level implementation in TypeScript will be interesting. Nevertheless, I will not assume existing knowledge of the lambda calculus. I will, however, assume familiarity with TypeScript's syntax and type system. For a refresher on the latter and an introduction to computation with the TS type system, see my last post on this topic.
Lambda calculus syntax
The untyped lambda calculus is a language consisting of three forms:

$$expr ::= x \quad | \quad (\lam x. {\tt e}) \quad | \quad ({\tt e_1}\ {\tt e_2})$$

that is, variables, abstractions, and applications, where each ${\tt e}$ is itself an $expr$.
An abstraction $(\lam x. {\tt e})$ binds the variable $x$ in the $expr$ ${\tt e}$; that is, it's a one-parameter function. We can't do anything with an abstraction until we have something to apply to it, which is what the application form does. Because the calculus is untyped, the left ${\tt e}$ in the application form can be an arbitrary $expr$, not necessarily an abstraction.
A program in the untyped lambda calculus is just an expression composed from these three forms. Here's one:

$$((\lam x. x)\ ((\lam y. y)\ z))$$
"Intuitively", we might reduce this as
but there's one other reduction sequence to $z$ (what is it?). This reduction process is exactly evaluation of the lambda calculus. Sequences of reductions are also called normalization, for reasons that will become apparent later.
This program doesn't communicate anything useful, but as we mentioned before, the lambda calculus is equivalent to a Turing machine! It's crazy to imagine, and it suggests we have our work cut out for us if we want to implement it in the TypeScript type system.
We'll start off with something easy and fun: the syntax, a pretty-printer, and parser. First, the basic definitions:
```ts
type Expr = Var<string> | Abs<string, any> | App<any, any>;

type Var<X extends string> = { x: X };
type Abs<S extends string, B extends Expr> = { s: S, b: B };
type App<E1 extends Expr, E2 extends Expr> = { e1: E1, e2: E2 };
```
Notice the use of `any` in the definition of `Expr`. Writing `Expr` in place of `any` would cause `Expr` to be a self-referential type, which TypeScript no longer permits. Fortunately, the use of `any` here does not hurt our implementation.
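For illustration, here's the direct recursive definition that gets rejected (shown commented out; the exact diagnostic may vary by compiler version):

```ts
// Rejected by the compiler with an error along the lines of:
//   error TS2456: Type alias 'Expr' circularly references itself.
//
// type Expr = Var<string> | Abs<string, Expr> | App<Expr, Expr>;
```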
Since TypeScript added template literal string types, we can provide a type-level pretty-printer for expressions:
```ts
type StringOfExpr<T extends Expr> =
  T extends Var<string> ? `${T['x']}` :
  T extends Abs<string, any> ?
    StringOfExpr<T['b']> extends infer B ?
      B extends string ? `(λ${T['s']}. ${B})` : never
    : never :
  T extends App<any, any> ?
    StringOfExpr<T['e1']> extends infer E1 ? E1 extends string ?
    StringOfExpr<T['e2']> extends infer E2 ? E2 extends string ?
      `(${E1} ${E2})`
    : never : never : never : never :
  never;
```
We will often see patterns of the form `StringOfExpr<T['b']> extends infer B ? B extends string ? ...`, as above, in our implementation. This unravelling of a type to an intermediate inference variable `B` aids the typechecker by instructing it to solve for one unconstrained variable at a time. Without this guidance, the type instantiation space may balloon exponentially, causing the checker to give up. Usually this ballooning isn't too bad, but `StringOfExpr` is a pretty complex type, so nested references can grow quite large.
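As a minimal sketch of the pattern in isolation (the `PrintedExpr` name is ours, purely for illustration):

```ts
// Bind the recursive result to a fresh inference variable R, so the checker
// solves for R on its own before narrowing it to a string.
type PrintedExpr<T extends Expr> =
  StringOfExpr<T> extends infer R   // solve the recursive call first...
    ? R extends string              // ...then narrow the solved result
      ? `printed: ${R}`
      : never
    : never;

const printed_test: PrintedExpr<Var<"x">> = "printed: x";
```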
Template literal types can be pattern-matched by the typechecker, so we can also provide a type-level parser for lambda calculus expressions.
```ts
type ParseVar<T> = T extends `${infer X}` ? Var<X> : never;
type ParseAbs<T> = T extends `(λ${infer S}. ${infer B})` ? Abs<S, ParseExpr<B>> : never;
type ParseApp<T> = T extends `(${infer E1} ${infer E2})` ? App<ParseExpr<E1>, ParseExpr<E2>> : never;

type ParseExpr<T> =
  ParseAbs<T> extends never
    ? (ParseApp<T> extends never
        ? (ParseVar<T> extends never ? never : ParseVar<T>)
        : ParseApp<T>)
    : ParseAbs<T>;
```
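Here's a quick round-trip check through the parser and pretty-printer (the `RoundTrip` helper is ours, not part of the construction):

```ts
// Parse a string to an Expr, then print it back.
type RoundTrip<T extends string> =
  ParseExpr<T> extends infer E ? E extends Expr ? StringOfExpr<E> : never : never;

const round_trip_test: RoundTrip<"(λx. (x y))"> = "(λx. (x y))";
```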
Since our parser is top-down and pattern-based, it is even weaker than a regex-based parser. Indeed, it fails to properly parse nested applications or abstractions on the left. For example,
```ts
type BadParse = ParseExpr<"((λx. x) y)">;
```
But this is not a post about parsing. We'll live with this solution, and write out expressions in the internal syntax when parsing does a bad job.
$\beta$-reduction
The only place where something can get evaluated in an expression is at an application to an abstraction. The process of evaluating an abstraction body with an applied argument is called $\beta$-reduction. Explicitly, $\beta$-reduction is the rule

$$((\lam x. b)\ a) \Ra [a/x]b$$
where the substitution $[a/x]b$ replaces free occurrences of the variable $x$ in $b$ with $a$. Expressions to which reductions can be applied are called redexes (reducible expressions); in particular, expressions to which $\beta$-reductions can be applied are called $\beta$-redexes.
Performing substitution is a bit tricky, so before we give a type-level algorithm to do so, we should talk about some subtle cases.
Free variables
What do we mean by free occurrences of a variable? Consider $((\lam x. (x\ (\lam x. x)))\ (y\ z))$. If we were to perform the substitution $[(y\ z)/x]\hspace{0pt}(x\ (\lam x. x))$ by replacing all occurrences of $x$ we wouldn't even get a syntactically-valid expression! If we replaced all variables named $x$ we would get $((y\ z)\ (\lam x. (y\ z)))$, which is incorrect because the innermost abstraction is no longer the identity function. Rather, we only want to replace the first variable $x$, as it is free of binding by any enclosing abstraction. The second variable $x$ is bound to an enclosing abstraction, so we should not touch it during substitution. Distinguishing between free and bound variables this way, we see that the correct reduction is $((y\ z)\ (\lam x. x))$.
Discovery of free variables in an expression will be important for our evaluator. We can construct a recursive type constructor `Free` that collects all free variables into a union of string literals by traversing a type bottom-up. The only variables we need to exclude from this collection are those bound by an abstraction.
```ts
type Free<T extends Expr> =
  T extends Var<infer X> ? X :
  T extends Abs<infer S, infer B> ? Exclude<Free<B>, S> :
  T extends App<infer E1, infer E2> ? Free<E1> | Free<E2> :
  never;

type FreeTest1 = Free<ParseExpr<"(x z)">>;           // "x" | "z"
type FreeTest2 = Free<ParseExpr<"(λy. (x (y z)))">>; // "x" | "z"
```
$\alpha$-conversion
There's another tricky case in substitution. Consider $[x/y]\hspace{0pt}(\lam x. y)$. This time, a literal replacement yields the identity function $(\lam x. x)$ rather than a constant function to $x$ in the outer scope. If only the bound $x$ had been named something else, say $x'$, we would end up with the constant function $(\lam x'. x)$ we expect.
Indeed, there is no reason we can't arbitrarily rename bound variables in such cases. For example, $(\lam x. (\lam y. (x\ y)))$ and $(\lam x. (\lam z. (x\ z)))$ differ in the name of their second bound variables, but are computationally equivalent. Renaming bound variables to computationally equivalent expressions is called $\alpha$-conversion. We say that two expressions are $\alpha$-equivalent if they are syntactically equivalent, up to $\alpha$-conversion. You may be familiar with $\alpha$-conversion and $\alpha$-equivalence if you've ever renamed a variable in a codebase; hopefully it didn't cause a failure!
We'll return to $\alpha$-equivalence later on when talking about program equivalence. Regarding our present issue, it is enough to stipulate that when we have a $\beta$-redex $((\lambda x. b)\ a)$ where $x\in\text{Free}(a)$, we should choose a fresh variable name for $x$ and perform $\alpha$-conversion on the abstraction before proceeding. The only requirement is that we choose a variable name that is not already free in $a$ or $b$, otherwise we end up with the same problem again (in the former case) or accidentally binding a free variable (in the latter case). The popular strategy is to tack on single quotes until these requirements are met.
Here's a type constructor to find a fresh variable name given a collection (union) of reserved names.
```ts
type Freshen<X extends string, Used> =
  X extends Used ? Freshen<`${X}'`, Used> : X;

const freshen_test1: Freshen<"x", "x" | "y" | "x'"> = "x''";
```
Substitution
We now use our notes from the two sections above to come up with a proper algorithm for the right-hand side of $\beta$-reduction, via a type constructor `Subst<X, A, E>` that performs the substitution `[A/X]E`.
In the interest of a performant type-level algorithm that the typechecker will be willing to compute, we make some adjustments to the high-level ideas previously discussed.
Rather than explicitly performing $\alpha$-conversion on an abstraction body every time its binder (say $x$) interferes with a free variable in the replacement expression, we generate a fresh binder name (say $x'$) and store the mapping $x\to x'$ in a "substitution context" we pass on to substitution of the abstraction body. Then, whenever we see a variable, if it is in the substitution context (either because it is the original substitution target, or because it needs to undergo $\alpha$-conversion), we replace it with its mapping. Otherwise, the variable is untouched. This implicit $\alpha$-conversion as we go requires only one pass over the expression, whereas explicit $\alpha$-conversion every time it is needed would result in a quadratic substitution algorithm.
To avoid having to compute the free variables of the expression we wish to substitute every time we need to check a binder for $\alpha$-conversion, we compute the collection of free variables once and pass it down to nested expressions as needed. (At an abstraction, why don't we need to check if the binder is free in the right-hand mappings of all variables in our context? Hint: recall that other than the substitution $[X\to A]$, there is only one other kind of mapping in the context.)
```ts
type Subst<X extends string, A extends Expr, E extends Expr> =
  Subst_<X, E, [[X, A]], Free<A>>;

type Subst_<X extends string, E extends Expr, Subs extends Ctx<Expr>, FreeInA> =
  E extends Var<infer U> ? (
    Find<Expr, U, Subs> extends infer R ? R extends Expr ? R : E : never
  ) :
  E extends App<infer E1, infer E2> ?
    Subst_<X, E1, Subs, FreeInA> extends infer E11 ? E11 extends Expr ?
    Subst_<X, E2, Subs, FreeInA> extends infer E21 ? E21 extends Expr ?
      App<E11, E21>
    : never : never : never : never :
  E extends Abs<infer S, infer B> ? (
    // Case 1: S = X, shadowing X, so no need to substitute at all.
    S extends X ? E :
    // Case 2: S ∈ Free(A), so naively substituting A in the body will
    // change what S in A refers to. For example, [x->s](λs. x) => λs. s,
    // which is incorrect. Instead, come up with a fresh name S->S1.
    S extends FreeInA ? (
      FreeInA | Free<B> extends infer FreeXB ?
      Freshen<S, FreeXB> extends infer S1 ? S1 extends string ?
      Subst_<X, B, [[S, Var<S1>], ...Subs], FreeInA> extends infer B1 ? B1 extends Expr ?
        Abs<S1, B1>
      : never : never : never : never : never
    ) :
    // Case 3: A doesn't conflict with the introduced binder S, so direct
    // substitution is fine.
    Subst_<X, B, Subs, FreeInA> extends infer B1 ? B1 extends Expr ?
      Abs<S, B1>
    : never : never
  ) : never;
```
Let's write some tests to check our work.
```ts
type SubstTest<X extends string, E extends string, B extends string> =
  Subst<X, ParseExpr<E>, ParseExpr<B>> extends infer R ?
    R extends Expr ? StringOfExpr<R> : never : never;

const subst_test1: SubstTest<"x", "y", "(λz. x)"> = "(λz. y)";
const subst_test2: SubstTest<"x", "y", "(λy. x)"> = "(λy'. y)";
const subst_test3: SubstTest<"x", "y", "(λx. x)"> = "(λx. x)";
```
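Here's one more test of our own, exercising case 2 above: the binder $y$ collides with $\text{Free}(A) = \{y\}$, so it is freshened to $y'$ through the substitution context:

```ts
const subst_test4: SubstTest<"x", "y", "(λy. (x y))"> = "(λy'. (y y'))";
```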
Contexts
Hopefully, the completist side of you is nagging at this `Ctx` type constructor - we mentioned contexts, but how exactly are they implemented? And how does `Find` work?
As case 2 of substitution in abstractions hints, we implement contexts as associative lists backed by type tuples. If you know associative lists from OCaml or Lisp, this should be familiar. Each element of the type tuple is a "mapping" represented by a two-element tuple. For our purposes contexts only need to map from strings to something else, so we parameterize `Ctx`s by only the value type.
```ts
type Ctx<T> = [string, T][];
```
`Find` is pretty straightforward, returning the first mapping for a key from the front of the context. This is why we add new mappings to the front of a context: newer mappings shadow older ones. Since we can't throw type-level errors, we report failures to find a mapping for a key with a special `NotFound` type.
```ts
type NotFound = { __brand: 'Not found' };

type Find<T, X extends string, L extends Ctx<T>> =
  L extends [[infer Y, infer N], ...infer B] ?
    (Y extends X ? N : B extends Ctx<T> ? Find<T, X, B> : never) :
  NotFound;

const find_test: Find<number, "a", [["b", 10], ["a", 7], ["c", 15]]> = 7;
```
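One more example of our own, showing the miss case:

```ts
// A missing key yields NotFound rather than a type error.
const find_missing: Find<number, "z", [["a", 1]]> = { __brand: 'Not found' };
```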
Evaluation: $\beta$ normal form
As we've seen, $\beta$-reduction is the only meaningful computation to make in an expression. This leads naturally to a notion of the $\beta$ normal form of an expression, in which no $\beta$-reductions are possible anywhere in the term - not in either side of applications, nor inside abstractions. (There are weaker notions of normal form; for example, an expression is in weak head normal form (whnf) if there are no $\beta$-reductions possible in the head position (i.e. top-level applications), but there may be $\beta$-reductions possible inside arguments to applications or in the bodies of abstractions. Whnf is the reduction strategy of Haskell's evaluation semantics; this is part of what is necessary for lazy evaluation and sharing of computational results.)
Let's look at some examples.
- As we saw before, the expression $((\lam x. x)\ ((\lam y. y)\ z))$ has a $\beta$-redex in head position. If we apply this reduction we get $((\lam y. y)\ z)$, which has another $\beta$-redex in head position. After we apply it, we get $z$, which has no further $\beta$-reduction, and so $z$ is the $\beta$ normal form of these expressions.
- The expression $(x\ ((\lam y. y)\ z))$ does not have a $\beta$-redex in head position, but there is one in the argument to the top-level application. We apply the reduction to get $(x\ z)$, which is in $\beta$ normal form.
- The expression $(\lam x. ((\lam y. y)\ z))$ has no $\beta$-redex in head or argument position, but there is one inside the top-level abstraction. Applying the reduction yields $(\lam x. z)$, which is in $\beta$ normal form.
An important decision to make is what order we should apply $\beta$-reductions in. As we mentioned when first introducing $\beta$-reduction, $((\lam x. x)\ ((\lam y. y)\ z))$ has multiple normalization sequences - either by applying the reduction in head position first, or the one in argument position first. Either way we go, normalization to $\beta$ normal form ends up at the expression $z$. Unfortunately, in general we may not be so lucky. One example (taken from Wikipedia) is

$$((\lam x. z)\ ((\lam w. ((w\ w)\ w))\ (\lam w. ((w\ w)\ w))))$$
If we try to apply the $\beta$-redex in argument position first, we step to the expression

$$((\lam x. z)\ (((\lam w. ((w\ w)\ w))\ (\lam w. ((w\ w)\ w)))\ (\lam w. ((w\ w)\ w))))$$
with a larger redex in argument position. Indeed, if we keep trying to reduce in argument position, our normalization sequence will never halt. But there is a $\beta$ normal form for this expression - if only we had reduced the $\beta$-redex in head position first, we would wind up with just $z$, since $x$ is never referenced in the left-hand abstraction body.
It turns out that applying $\beta$-reductions in the leftmost, outermost position before considering arguments to the right or abstraction bodies one level down is the "best" normalization strategy in the sense that if a $\beta$ normal form for an expression exists, this strategy always finds it. This evaluation strategy is appropriately named normal order; the seminal result is known as the Standardization Theorem, proven by Curry and Feys in 1958.
With a strategy in our hands, we can now describe an algorithm for evaluating to $\beta$ normal form explicitly. Given an expression, we first check if the top level is a $\beta$-redex. If it is, we perform the $\beta$-reduction and compute the $\beta$ normal form of the resulting expression. Otherwise, the top-level expression must be a variable, an abstraction, or an application with a non-abstraction on the left-hand side. In the first case there is nothing more to do. In the second case, we compute the $\beta$ normal form of the abstraction body and re-package the abstraction. In the third case, we compute the $\beta$ normal form of the application argument and re-package the application.
```ts
type BetaNF<E extends Expr> =
  E extends App<infer F, infer A> ? (
    BetaNF<F> extends infer F ? F extends Expr ? (
      F extends Abs<infer S, infer B>
        ? (Subst<S, A, B> extends infer B1 ? B1 extends Expr ? BetaNF<B1> : never : never)
        : BetaNF<A> extends infer A1 ? A1 extends Expr ? App<F, A1> : never : never
    ) : never : never
  ) :
  E extends Abs<infer S, infer B> ?
    BetaNF<B> extends infer B1 ? B1 extends Expr ? Abs<S, B1> : never : never :
  E;
```
As always, we write some tests to check our implementation:
```ts
type BetaNFTest<E extends Expr> =
  BetaNF<E> extends infer PE ? PE extends Expr ?
  StringOfExpr<PE> extends infer R ? R extends string ?
  StringOfExpr<E> extends infer E ? E extends string ?
    `${E} => ${R}`
  : never : never : never : never : never : never;

const nf_test1: BetaNFTest<ParseExpr<"(x y)">>
  = "(x y) => (x y)";
const nf_test2: BetaNFTest<App<Var<"x">, App<Abs<"y", Var<"y">>, Var<"z">>>>
  = "(x ((λy. y) z)) => (x z)";
const nf_test3: BetaNFTest<Abs<"x", App<Abs<"y", Var<"y">>, Var<"z">>>>
  = "(λx. ((λy. y) z)) => (λx. z)";
const nf_test4: BetaNFTest<
  App<Abs<"x", Var<"z">>,
      App<Abs<"w", App<App<Var<"w">, Var<"w">>, Var<"w">>>,
          Abs<"w", App<App<Var<"w">, Var<"w">>, Var<"w">>>>>>
  = "((λx. z) ((λw. ((w w) w)) (λw. ((w w) w)))) => z";
```
That's it! We now have a type-level evaluator for the lambda calculus to $\beta$ normal form. This is enough to show that TypeScript's type system is Turing-complete.
Extension: Program Equivalence
One way of viewing an evaluator to $\beta$ normal form is as a checker of equivalence between expressions, up to $\beta$-reduction. In general, evaluators of programs in any language never introduce anything "new"; they simply yield equivalent, (hopefully) simpler forms of their inputs under some set of evaluation rules.
This is one reason normal forms need to be described formally and evaluation strategies need to be chosen carefully. After all, $((\lam x. z)\ ((\lam w. ((w\ w)\ w))\ (\lam w. ((w\ w)\ w))))$ and $((\lam y. z)\ a)$ reduce to the same $\beta$ normal form if we use a normal order evaluation strategy, but are certainly not equivalent if we apply $\beta$-reductions in argument position first!
As a more familiar example, the TypeScript functions
```ts
function strip_impl1(what: string) {
  return function (from: string) {
    return from.split(what).join('');
  }
}

function strip_impl2(what: string) {
  return function (target: string) {
    while (target.indexOf(what) >= 0) {
      target = target.replace(what, '');
    }
    return target;
  }
}
```
are equivalent for all string inputs `what`, up to $\alpha$-conversion of their returned closure parameters and reduction of the expressions in the closure bodies. But if we don't reduce the closure bodies, it is very hard to prove that they produce the same output for all inputs without enumerating and reducing all string inputs, an impossible task.
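As a quick value-level spot check (the sample input is ours), the two implementations agree:

```ts
console.log(strip_impl1("l")("hello world"));  // "heo word"
console.log(strip_impl2("l")("hello world"));  // "heo word"
```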
It is for this reason we chose to evaluate to $\beta$-normal form, which looks inside abstraction bodies and unapplied arguments. Furthermore, our normal order evaluation strategy for the lambda calculus yields a normal form whenever one exists, so two expressions are always comparable whenever there is some route for their termination.
Can we be confident that two lambda calculus expressions are equivalent when their $\beta$-normal forms are equal? Only if by equality we mean equivalence up to $\alpha$-conversion! After all, we don't want to say two programs are semantically different just because we changed a variable name. Since we already have a $\beta$-normal form evaluator, we just need to complete an algorithm for calculating $\alpha$-equivalence, as promised earlier.
$\alpha$-equivalence
Since $\alpha$-equivalence measures syntactic equality of two terms up to the renaming of bound variables, we only have to concern ourselves with some way of normalizing bound variables at abstraction sites. The typical approach, which we will employ, is to recurse on the structure of expressions being compared in parallel. If the structures differ (for example a variable vs an application), the expressions cannot possibly be $\alpha$-equivalent. Otherwise, there are three remaining cases of parallel forms.
For applications, we simply check $\alpha$-equivalence of corresponding subexpressions.
For abstractions, we normalize introduced bound variables in each expression by creating a fresh id and mapping each bound variable to that id in a context carried alongside each respective expression. The fresh id we create can be of any form; in our implementation, we use type-level integers.
When comparing two variables, we check if both have mappings to some id in their respective contexts. If they do, and the mapped ids are equal, then the variables are equivalent up to $\alpha$-conversion of some enclosing abstraction. If neither variable has a mapping in its respective context, then both variables are free in all enclosing abstractions; for them to be $\alpha$-equivalent, they must have the same name. Otherwise, one variable is free and the other is bound in some enclosing abstraction, so they cannot be $\alpha$-equivalent.
```ts
type AlphaEq<
  E1 extends Expr, E2 extends Expr,
  L1 extends Ctx<number> = [], L2 extends Ctx<number> = [],
  Fr extends number = 0
> =
  E1 extends Var<infer X1> ? (
    E2 extends Var<infer X2>
      ? (Find<number, X1, L1> extends number
          ? (Find<number, X2, L2> extends number
              ? Find<number, X1, L1> extends Find<number, X2, L2> ? true : false
              : false)
          : (Find<number, X2, L2> extends number
              ? false
              : X1 extends X2 ? true : false))
      : false
  ) :
  E1 extends App<infer E11, infer E12> ? (
    E2 extends App<infer E21, infer E22> ?
      AlphaEq<E11, E21, L1, L2, Fr> extends true ?
      AlphaEq<E12, E22, L1, L2, Fr> extends true ?
        true : false : false : false
  ) :
  E1 extends Abs<infer S1, infer B1> ? (
    E2 extends Abs<infer S2, infer B2> ?
      Incr<Fr> extends infer Fr1 ? Fr1 extends number ?
        AlphaEq<B1, B2, [[S1, Fr1], ...L1], [[S2, Fr1], ...L2], Fr1>
      : never : never : false
  ) : never;
```
Here are some tests:
```ts
type AlphaEqTest<E1 extends string, E2 extends string> =
  AlphaEq<ParseExpr<E1>, ParseExpr<E2>>;

const alpha_eq_test1: AlphaEqTest<"(λx. x)", "(λy. y)"> = true;
const alpha_eq_test2: AlphaEqTest<"(λx. z)", "(λy. z)"> = true;
const alpha_eq_test3: AlphaEqTest<"(λz. (λx. (z x)))", "(λx. (λw. (x w)))"> = true;
const alpha_eq_test4: AlphaEqTest<"(y (λx. (λz. (z x))))", "(y (λx. (λw. (w x))))"> = true;
const alpha_eq_test5: AlphaEqTest<"(u (λx. (λz. (z x))))", "(y (λx. (λw. (w x))))"> = false;
```
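We can also add a test of our own for the mixed case described above, where one variable is bound and the other is free:

```ts
// A bound variable is never α-equivalent to a free one.
const alpha_eq_test6: AlphaEqTest<"(λx. x)", "(λy. x)"> = false;
```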
Type-level numbers
Our implementation of type-level numbers is backed by tuples for arithmetic. To perform addition, we deserialize a number type `n` to an `n`-length tuple type, add `m` entries to the tuple type, and serialize it back to an `n+m` number type.
```ts
type TupToNum<T extends unknown[]> = T extends { length: infer N } ? N : never;
type NumToTup<N extends number, Tup extends unknown[] = []> =
  Tup extends { length: N } ? Tup : NumToTup<N, [...Tup, unknown]>;
type Incr<N extends number> = TupToNum<[...NumToTup<N>, unknown]>;

const incr_test: Incr<11> = 12;
```
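Though we only need `Incr` above, the same trick extends to full addition; here is a sketch (the `AddNum` name is ours):

```ts
// Concatenate an m-length tuple and an n-length tuple, then read off the length.
type AddNum<M extends number, N extends number> =
  TupToNum<[...NumToTup<M>, ...NumToTup<N>]>;

const add_num_test: AddNum<3, 4> = 7;
```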
$\alpha\beta$-equivalence
With implementations of evaluation to $\beta$ normal form and $\alpha$-equivalence in place, checking for program equivalence under $\beta$-reduction and $\alpha$-conversion is trivial. Since $\alpha$-conversion cannot introduce new $\beta$-redexes, comparing two expressions consists of normalization followed by a check for $\alpha$-equivalence.
```ts
type ABEq<E1 extends Expr, E2 extends Expr> =
  BetaNF<E1> extends infer E1 ? E1 extends Expr ?
  BetaNF<E2> extends infer E2 ? E2 extends Expr ?
    AlphaEq<E1, E2>
  : never : never : never : never;
```
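A quick smoke test of our own, in the internal syntax:

```ts
// ((λx. x) y) and y normalize to the same expression, hence are αβ-equivalent.
const abeq_smoke: ABEq<App<Abs<"x", Var<"x">>, Var<"y">>, Var<"y">> = true;
```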
Proving arithmetic equivalences
As a comprehensive application of all we've built up, let's show off what all introductions to the lambda calculus show off - simple arithmetic.
There's no syntax for the natural numbers in the lambda calculus language we've worked on, but there is a classic encoding of the naturals known as the Church numerals. Wikipedia gives a very good overview of the encoding and its presentation of various arithmetic operations.
In short, the idea is to encode the Peano axioms for natural numbers as abstractions. The Peano axioms define a symbol $0$ and a 1-parameter function $\S$ called the successor. We take as axioms that $0$ is a natural number, and that if $n$ is a natural number, $\S(n)$ is a natural number. (There are many other axioms, but they are not relevant here.) We can interpret $\S(0)$ as $1$, $\S(\S(\S(0)))$ as $3$, and so on.
The Church encoding represents the $n$th natural number as a second-order abstraction that takes a "successor function" and a "zero element" and yields an $n$-ary composition of the successor applied to the zero element, as in the Peano axioms. Taking $s$ to bind the "successor function" and $z$ to bind the "zero element", we can represent the first few natural numbers as follows.

$$\begin{aligned} 0 &\sim (\lam s. (\lam z. z)) \\ 1 &\sim (\lam s. (\lam z. (s\ z))) \\ 2 &\sim (\lam s. (\lam z. (s\ (s\ z)))) \end{aligned}$$
It's important to note that, for example, $2$ is represented by the abstraction $(\lam s. (\lam z. (s\ (s\ z))))$, not by its application to some yet-unknown successor and zero arguments. Said another way, the $s$ and $z$ arguments here are only in name, axiomatized into existence.
However, applications to these numerals do come into play when we want to perform arithmetic operations, like addition. Take the Church encoding of $m + n$, which is

$$m + n \sim (\lam s. (\lam z. (\underbrace{s\ (\ldots\ (s}_{m + n}\ (z))))))$$
If we apply a "reverse $\beta$-reduction", known as a $\beta$-abstraction, we can rewrite the above expression to the equivalent form
Now, we parameterize this expression on Church numerals $m$ and $n$, where $m\sim(\lam s. (\lam z. (\underbrace{s\ (\ldots\ (s}_{m}\ (z))))))$. This yields an equation for the addition of $m$ and $n$.

$$\text{add} \sim (\lam m. (\lam n. (\lam s. (\lam z. ((m\ s)\ ((n\ s)\ z)))))), \quad m + n \sim ((\text{add}\ m)\ n)$$
Our evaluator can prove the equality of certain expressions under addition, if we take $\alpha\beta$-equivalence as equality. Below we demonstrate this for the proposed equivalence $1 + 1 \overset{\mbox{$\alpha\beta$}}{\equiv} 2$.
```ts
type ABEqTest<E1 extends Expr, E2 extends Expr> =
  StringOfExpr<E1> extends infer SE1 ? SE1 extends string ?
  StringOfExpr<E2> extends infer SE2 ? SE2 extends string ?
    ABEq<E1, E2> extends true ? `${SE1} ≡ ${SE2}` : `${SE1} ≢ ${SE2}`
  : never : never : never : never;

type Zero = ParseExpr<"(λs. (λz. z))">;
type One  = ParseExpr<"(λs. (λz. (s z)))">;
type Two  = ParseExpr<"(λs. (λz. (s (s z))))">;

type Add = Abs<"m", Abs<"n", Abs<"s", Abs<"z",
  App<ParseExpr<"(m s)">, App<ParseExpr<"(n s)">, Var<"z">>>>>>>;

const abeq_test1: ABEqTest<App<App<Add, One>, One>, Two>
  = "(((λm. (λn. (λs. (λz. ((m s) ((n s) z)))))) (λs. (λz. (s z)))) (λs. (λz. (s z)))) ≡ (λs. (λz. (s (s z))))";
```
With the notion of program equivalence and this example, we can start to see glimmers of the Curry-Howard correspondence. (But whatever correspondence we might interpret here is really just a glimmer; we cannot get very far until we add types on top of the lambda calculus. Dependent type theories in particular provide a correspondence to all sorts of things in classical and intuitionistic logic. With a bit of knowledge about Cartesian closed categories and adjunctions, we get even farther, to the trilogy we mentioned in the introduction.) Also, Peano arithmetic is Turing-complete, so if we wanted to, we could provide yet another proof that TypeScript's type system is Turing-complete by modeling the rest of Peano arithmetic in the lambda calculus (which is doable, if tedious).
Notes
The trickiest and "ugliest" part of our evaluator is the procedure of substitution and $\alpha$-conversion. Indeed, it feels a bit unpleasant to have to deal with collisions between variable names, which are a syntactic concern, when performing substitution, which is a semantic concern. By using a nameless representation of bound variables, like de Bruijn indices, we trade off the inconveniences of $\alpha$-conversion for the semantically-more-appealing operation of "shifting" variable indices to account for changes in enclosing abstractions. de Bruijn indices are a bit tricky to internalize so we didn't use them here. For a good introduction on de Bruijn indices and various typed extensions of the lambda calculus see Pierce's classic Types and Programming Languages.
Besides $\alpha$- and $\beta$-reductions, there is a third common, fundamental notion in the lambda calculus known as $\eta$-equivalence. $\eta$-equivalence says that for all $f$, $(\lambda x. (f\ x)) \equiv f$, provided $x$ is not free in $f$. This leads to the rewrite rule $(\lambda x. (f\ x)) \Ra f$, known as $\eta$-reduction, and to the reverse rule $f \Ra (\lambda x. (f\ x))$, known as $\eta$-abstraction. $\eta$-reduction is useful for cleaning up code with lots of "unnecessary" abstractions and for converting lazy (thunked) code into eagerly-evaluated code. Conversely, $\eta$-abstraction makes eager evaluation lazy. Importantly, $\eta$-equivalence communicates the extensional equivalence of functions, specifying that two functions are equivalent (more strictly, equal) iff they yield equivalent (more strictly, equal) outputs for all inputs. This equivalence is sometimes used for optimizations in compilers of functional languages. An expression is said to be in $\beta\eta$ normal form when no $\beta$- or $\eta$-reductions are possible. We did not explore $\eta$-reductions here because they do not impact our notion of program equivalence (why?).
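Though we don't need it for our evaluator, a single top-level $\eta$-reduction is easy to express with the machinery we already have. Here is a hedged sketch (the `EtaReduce` name and implementation are ours, not part of the construction above):

```ts
// (λx. (f x)) rewrites to f, provided x does not occur free in f.
type EtaReduce<E extends Expr> =
  E extends Abs<infer S, App<infer F, Var<infer X>>> ?
    (X extends S
      ? (F extends Expr
          ? (S extends Free<F> ? E : F)  // keep E if the binder is free in F
          : E)
      : E)
  : E;

const eta_test: StringOfExpr<EtaReduce<ParseExpr<"(λx. (f x))">>> = "f";
```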
Lastly, an unpleasant implementation detail is our constant unwrapping of recursive type constructors in the pattern of `Rec<T> extends infer Infer1 ? Infer1 extends Target`. As mentioned before, this verbose expansion is needed to help guide the typechecker in computing type instantiations efficiently. I could not find any way to get around this pattern except in a few places. If you find a better implementation, please let me know.