Homotopy Type Theory

After a productive year at the Institute for Advanced Study, the Univalent Foundations Program has written a book on Homotopy Type Theory (HoTT). The foreword gives a succinct description of the purpose of this book:

We did not set out to write a book. The present work has its origins in our collective attempts to develop a new style of "informal type theory" that can be read and understood by a human being, as a complement to a formal proof that can be checked by a machine. Univalent foundations is closely tied to the idea of a foundation of mathematics that can be implemented in a computer proof assistant. Although such a formalization is not part of this book, much of the material presented here was actually done first in the fully formalized setting inside a proof assistant, and only later "unformalized" to arrive at the presentation you find before you — a remarkable inversion of the usual state of affairs in formalized mathematics.

The danger in writing such a book is to fall into the minefields of logic wars. The authors successfully avoided much of these traps, so logicians from other perspectives can read the book without too much cringing. To avoid unnecessary confusion, I recommend mentally substituting most instances of "set theory" by the more apropos "classical mathematics." Readers from strongly opposing points of view should be prepared for a certain amount of propaganda, which is to be expected in a book written to promote one point of view. Barring these caveats, you will find an enjoyable and well-written book on a very interesting subject. Readers should not be too concerned with the word "homotopy" in the title, homotopy theory is not required background for the book though some basic knowledge of the ideas of topology and homotopy theory helps to understand the motivation behind certain concepts.

Having addressed all the necessary caveats, let’s talk about why this book is interesting and why you should read it…

What is so hot about HoTT?

The most interesting aspect from my point of view is that HoTT fully supports proof-relevant mathematics, a way of doing mathematics where proofs are real objects that are manipulated on the same level as numbers, sets, functions and all the usual objects of classical mathematics. This is not a brand new idea, logicians have been playing with proofs in this way for a long while, but HoTT brings this idea into the realm of everyday mathematics and that is a major step forward in mathematics.

The key difference with first-order logic is that equality is not primitive. To define a type \(A\) one must also define what equality means for \(A\). Formally if \(x,y:A\) then \(x =_A y\) (or \(\mathsf{Id}_A(x,y)\)) is another type, an identity type, which can be thought of as made of reasons to identify \(x\) and \(y\). Elements \(p:x =_A y\) are often called "paths" by analogy with topolgy. Indeed, these paths can be inverted and concatenated to realize symmetry and transitivity of equality, respectively; reflexivity is realized by a path \(\mathsf{refl}_x:x =_A x\). Thus each type \(A\) is actually a groupoid rather than a plain set. In fact, since each \(x =_A y\) is itself a type with its own identity types and so forth, the type \(A\) is actually a kind of \(\infty\)-groupoid.

It is this rich structure associated with each type is what permits HoTT to support proof relevant mathematics. To get a basic feel of how this works, the statement "\(x =_A y\) and \(y =_A z\)" is interpreted via the product type \((x =_A y)\times(y =_A z)\), whose elements are pairs of paths that explain why \(x\) is to be identified with \(y\) and why \(y\) is to be identified with \(z\). Similarly, "\(x =_A y\) or \(y =_A z\)" is interpreted via the coproduct type \((x =_A y) + (y =_A z)\), whose elements are either paths that explain why \(x\) is to be identified with \(y\) or paths that explain why \(y\) is to be identified with \(z\). The catch, as you may have guessed from the last example, is that this form of constructive reasoning is intuitionistic and thus not as familiar to mathematicians.

Interestingly, the learning curve for constructive reasoning appears to be much less steep with HoTT than with other constructive frameworks. One of the reasons is that the topological interpretation of the key concepts is very intuitive but more significantly HoTT provides many tools to revert to more familiar territory. The analogue of a plain set in HoTT is a \(0\)-type: a type \(A\) where the identity types \(x =_A y\) always contain at most one path. In other words, these are types where the groupoid structure is trivial and contains no other information than how to handle equality of elements. It is consistent with HoTT that the \(0\)-types form a model of ETCS, a classical theory of sets and functions. Thus, by "truncating" thoughts to \(0\)-types, one can revert to a more familiar classical setting.

What is the big deal with univalence?

It is natural to identify things that are not significantly different. For example, the axiom of extensionality in set theory identifies sets that have the same elements since the elements of a set are all that matter in this context. Extensionality for functions identifies functions that agree on all inputs. Univalence is an indiscernibility axiom in the same spirit: it identifies types that are not significantly different.

To make sense of equality for types, we first need to put them in an ambient type, a universe, with its associated identity types. We can’t have a truly universal type since that directly leads to the usual paradoxes of self-reference. Instead, we have a bunch of universes such that each type belongs to some universe and each universe is closed under the basic type formation rules. Once we have a universe \(\mathcal{U}\) we can talk about equality of types in \(\mathcal{U}\), and because \(\mathcal{U}\) is a type we have a lot of freedom in defining what equality means for types in \(\mathcal{U}\).

This is exactly what the univalence axiom does. It can be stated elegantly: \[(A \simeq B) \simeq (A =_\mathcal{U} B).\] The equivalence relation \({\simeq}\) is similar to isomorphism but it is slightly more permissive. To say \(A \simeq B\) requires the existence of a function \(f:A \to B\) together with \(\ell,r:B \to A\) such that \(\ell \circ f\) is homotopic to \(\mathsf{id}_A\) and \(f \circ r\) is homotopic to \(\mathsf{id}_B\). Given two functions \(f\) and \(g\) of the same dependent product type \(\prod_{x:A} B(x)\), a homotopy from \(f\) to \(g\) is an element of \(\prod_{x:A} (f(x) =_{B(x)} g(x))\). So \(f\) and \(g\) are homotopic if they agree on all inputs, which does not mean that \(f = g\) in the absence of function extensionality.

In general type theory, \(A \simeq B\) is definitely a good way to say "\(A\) and \(B\) are not siginificantly different" and thus univalence arguably captures the right indiscernibility axiom for type theory. The surprising fact is that such a strong axiom does not appear to collapse more than it should. The benefits of univalence are interesting and need to be explored further.

I still need to digest the book so this is probably only the first of many posts on HoTT. The next posts will undoubtedly wander deeper into the technical levels of HoTT. There are a few interesting leads but nothing definite yet.

There is only one thing that bugs me right now, which is the way universes are handled in the book. However, since these assumptions do not appear to be crucial for the development of HoTT and there are plenty of alternatives out there, I’m not overly concerned about this at the moment.

I will eventually need to talk about higher inductive types. These are really interesting and I’m happy to see that the book devotes an entire chapter to them. This is a very interesting outgrowth of this project and which deserves study even independently of HoTT.


Classically valid theorems of intuitionistic analysis

It is well-known that proofs in intuitionistic logic are more constructive than proofs in classical logic. Indeed, this is what the (informal) Brouwer–Heyting–Kolmogorov (BHK) interpretation leads to believe. Thus, intuitionistic proofs tend to have more information content than classical proofs, a fact that is well exploited in proof mining.

This suggests that looking at intuitionistic proofs of classical results from classical analysis may yield some new insights on the constructive nature of these classical results. However, there is one major hurdle with this program: theorems and axioms of intuitionistic analysis are not necessarily classically valid. Thus, inuitionistic proofs must be inspected very closely before drawing any kind of classical conclusions from them. Often, this is done by only looking at intuitionistic proofs over weak systems of intuitionistic analysis whose axioms are all classically valid. This amounts to doing classical analysis with a very limited set of axioms and deduction rules and it basically prohibits the use of the most important methods and results from intuitionistic analysis. This is not very satisfactory, but there is another way. There is a rich class of statements for which intuitionistic analysis is conservative over some weak subtheory whose axioms are all classically valid. So long as we restrict our attention to statements from this class, we can use methods and results of intuitionistic analysis with impunity and draw classical conclusions from the mere existence of a proof in the weaker subsystem, without going through the pain of finding such a proof.

One important result of intuitionistic analysis that is not classically valid is Brouwer’s Continuity Theorem, which simply says that every function defined on the unit interval is (uniformly) continuous [1].  The Continuity Theorem becomes especially powerful when combined with the Axiom of Choice, together they imply that if the statement $\forall\xi\exists\zeta A(\xi,\zeta)$ is true, then there must be a continuous function $F$ such that $\forall\xi A(\xi,F(\xi))$. On the one hand, this combination captures the constructive strength of intuitionistic analysis, as it provides a uniform and effective way to obtain witnesses to existential statements. On the other hand, this combination has some disastrous consequences for classically minded mathematicians. For example, the only subsets of $\R$ that have complements are $\emptyset$ and $\R$. Indeed, if $A \subseteq \R$ has a complement then it has a characteristic function by the Axiom of Choice, which must then be continuous by the Continuity Theorem. Since $\R$ is connected, the only continuous $\set{0,1}$-valued functions on $\R$ are the two constant functions. Therefore, $A$ must either be empty or all of $\R$. In particular, it follows that equality for real numbers is undecidable: $\forall\xi(\xi = 0 \lor \xi \neq 0)$ cannot be true since we cannot partition $\R$ into the singleton zero and the set of nonzero real numbers.

There are many ways to formalize all of this. In my paper [2], I look at two related systems intuitionistic analysis and I exploit conservation results by Anne Troelstra [3] and Jaap van Oosten [4] to draw some classical consequences in subsystems of second-order arithmetic commonly used in reverse mathematics. The main results are of the following form: if $A$ and $B$ satisfy certain syntactic constraints and $$\forall\xi(B(\xi) \lthen \exists\zeta A(\xi,\zeta))$$ is provable in a certain formal system of intuitionistic analysis, then the sequential form $$\forall\xi(\forall n B(\xi_n) \lthen \exists\zeta\forall n A(\xi_n,\zeta_n))$$ is provable in a corresponding classical subsystem of second-order arithmetic. Results of this kind had already been obtained by Jeff Hirst and Carl Mummert for a different class of intuitionistic systems [5]. However, their systems only include classically valid axioms, so the fact that both systems I used prove Brouwer’s Continuity Theorem and some form of the Axiom of Choice adds an interesting new twist to this picture.1

$\newcommand{\EL}{\mathsf{EL}}\newcommand{\GC}{\mathsf{GC}}\newcommand{\RCA}{\mathsf{RCA}}\newcommand{\krf}{\mathrel{\mathtt{rf}}}\newcommand{\cat}{\mathop{{}^\frown}}$The basic intuitionistic system I will use is called $\EL$. The system $\EL$ has two sorts: natural numbers ($\N$) and functions from numbers to numbers ($\N^\N$). I will use roman letters for numbers and greek letters for functions. The details of $\EL$ can be found in [3] and [2]. The axioms of $\EL$ are classically valid. In fact, by adding the law of excluded middle to $\EL$, you obtain a classical system equivalent to the subsystem $\RCA$ of second-order arithmetic. To make this into a full-blown system for intuitionistic analysis, we can add Troelstra’s Generalized Continuity ($\GC$) axiom scheme: $$\forall\xi(B(\xi) \lthen \exists\zeta A(\xi,\zeta)) \lthen \exists\alpha\forall\xi(B(\xi) \lthen \alpha|\xi{\downarrow} \land A(\xi,\alpha|\xi)),$$ where $B(\xi)$ is restricted to the syntactic class $\mathrm{N}_K$ (defined below) but $A(\xi,\zeta)$ is arbitrary. The notation $\alpha|\xi$ is a clever way of coding partial continuous functions $\N^\N\to\N^\N$ due to Kleene that I will describe shortly. Since the class $\mathrm{N}_K$ includes the tautology $0 = 0$, $\EL + \GC$ directly implies the Axiom of Choice: if the set $A_\xi = \set{\zeta \in \N^\N : A(\xi,\zeta)}$ is nonempty for every $\xi \in \N^\N$, then there is an $\alpha$ such that $\alpha|\xi$ picks out a single element of  $A_\xi$ for every $\xi \in \N^\N$. $\EL + \GC$ also implies Brouwer’s Continuity Theorem: if $A(\xi,\zeta)$ describes the graph of a function $F:\N^\N\to\N^\N$, then there must be an $\alpha$ such that for every $\xi$, $\alpha|\xi$ is the unique $\zeta$ such that $A(\xi,\zeta)$. Since $\xi \mapsto \alpha|\xi$ is necessarily continuous, the function $F$ must be continuous.

In order to talk about the relevant conservation result for $\EL + \GC$, I need to talk about Kleene’s realizability with functions. The basic idea of realizability with functions is both natural and beautiful. Unfortunately, the beauty is hidden behind a thick wall of coding tricks. Three standard coding tricks are the following:

  • Pairs of functions can be coded by alternately meshing their values: given functions $\xi_0,\xi_1$ there is a unique function $\seq{\xi_0,\xi_1}$ such that $\seq{\xi_0,\xi_1}\pi_0 = \xi_0$ and $\seq{\xi_0,\xi_1}\pi_1 = \xi_1$, where $\pi_0 = \lambda n.2n$ and $\pi_1 = \lambda n.2n+1$.
  • Number-function pairs can be coded by prefixing the number: given a number $x$ and a function $\xi$ there is a unique function $x\cat\xi$ such that $(x\cat\xi)(0) = x$ and $(x\cat\xi)\sigma = \xi$, where $\sigma = \lambda n.n+1$.
  • Every function $\xi$ can be thought of as encoding an infinite list of functions $\xi_n = \lambda m.\xi(2^n(2m+1)-1)$.

The next trick is a very clever way of encoding partial continuous functions $\N^\N\to\N^\N$ due to Kleene that I alluded to above. Given a function $\xi$ and a number $\ell$, let $\bar{\xi}\ell$ denote the finite initial segment $\seq{\xi(0),\dots,\xi(\ell-1)}$ coded as a number in some primitive recursive fashion. Kleene’s encoding trick is best thought of as a two-step process.

  1. A function $\alpha \in \N^\N$ encodes a partial continuous function $\N^\N\to\N$ as follows: to compute $\alpha(\xi)$ for some $\xi \in \N^\N$, look for the first number $\ell$ (if any) such that $\alpha(\bar{\xi}\ell) \neq 0$ and then set $\alpha(\xi) = \alpha(\bar{\xi}\ell)-1$. Write $\alpha(\xi){\uparrow}$ when no such $\ell$ can be found, and $\alpha(\xi){\downarrow}$ when $\alpha(\xi)$ is defined.
  2. Thinking of functions in $\N^\N$ as encoding number-function pairs $n\cat\xi$, the above scheme lets $\alpha \in \N^\N$ encode a partial continuous function $\N\times\N^\N\to\N$ via $\alpha(n\cat\xi)$. By currying, we obtain a partial continuous function $\N^\N\to\N^\N$. Thus, for each $\xi \in \N^\N$, the value $\alpha|\xi$ of the partial continuous function $\N^\N\to\N^\N$ coded by $\alpha$ is the function $\lambda n.\alpha(n\cat\xi)$ provided that $\alpha(n\cat\xi){\downarrow}$ for every $n$. Write $\alpha|\xi{\uparrow}$ when $\alpha(n\cat\xi){\uparrow}$ for some $n$, and $\alpha|\xi{\downarrow}$ when $\alpha|\xi$ is defined.

This encoding of partial continuous functions is rather dense at first sight, but it is actually quite natural after getting used to it. The $\alpha|\xi$ notation may seem strange, but it is actually well motivated. Thinking of $|$ as a partial binary operation on the set $\N^\N$, we obtain a partial combinatory algebra known as Kleene’s second algebra. The post-modern approach to realizability goes through realizability topos, which can be constructed on top of any partial combinatory algebra.

Kleene’s realizability with functions is a syntactic transformation of formulas which uses all the tricks above to pack most existential quantifiers into an auxiliary function parameter. The transformation is built by induction on complexity as follows2:

  • $\alpha \krf A$ is $A$ for atomic $A$.
  • $\alpha \krf (A \land B)$ is $\alpha\pi_0 \krf A \land \alpha\pi_1 \krf B$.
  • $\alpha \krf (A \lthen B)$ is $\forall\beta(\beta \krf A \lthen \alpha|\beta{\downarrow} \land \alpha|\beta \krf B)$.
  • $\alpha \krf \forall x A$ is $\forall x(\alpha_x \krf A)$.3
  • $\alpha \krf \forall \xi A$ is $\forall \xi(\alpha|\xi{\downarrow} \land \alpha|\xi \krf A)$.
  • $\alpha \krf \exists x A$ is $\alpha\sigma \krf A[x/\alpha(0)]$.
  • $\alpha \krf \exists \xi A$ is $\alpha\pi_1 \krf A[\xi/\alpha\pi_0]$.

Note that the translated formula $\alpha \krf A$ never involves existential quantifiers, except to say that $\alpha|\xi{\downarrow}$ and the scope of this implicit existential quantifiers is quantifier-free. Therefore, $\alpha \krf A$ always belongs to the class of formulas $\mathrm{N}_K$, which is defined as follows:

  • If $A$ is quantifier-free, then $A$, $\exists x A$, $\exists\xi A$ are in $\mathrm{N}_K$.
  • If $A, B$ are in $\mathrm{N}_K$, then so are $A \land B$, $A \lthen B$, $\forall x A$, $\forall \xi A$.

It seems odd to include existential function quantifiers in the first clause but this is harmless since a quantifier-free formula $A$ can only use a finite initial segment of $\xi$ and thus $\exists\xi A$ is easily simulated by an existential number quantifier.

Now that we have described the system $\EL + \GC$ and Kleene’s realizability with functions, we can start tying everything together. The first step is to relate provability in $\EL + \GC$ with Kleene’s realizability with functions.

Troelstra’s Characterization Theorem. For every formula $A$:

  1. $\EL + \GC \vdash A \liff \exists\alpha(\alpha \krf A)$.
  2. $\EL + \GC \vdash A \IFF \EL \vdash \exists\alpha(\alpha \krf A)$.

An important consequence of this characterization of $\krf$ is that $\EL + \GC$ is relatively consistent with $\EL$. In particular, Brouwer’s Continuity Theorem is relatively consistent with the Axiom of Choice so that the above discussion was not meaningless.

The next step is to formulate the relevant conservation result. The class of formulas $\Gamma_K$ is defined as follows:

  • Quantifier-free formulas are in $\Gamma_K$.
  • If $A, B$ are in $\Gamma_K$ then so are $A \land B$, $\forall x A$, $\forall \xi A$, $\exists x A$, $\exists \xi A$.
  • If $A$ is in $\mathrm{N}_K$ and $B$ is in $\Gamma_K$ then $A \lthen B$ is in $\Gamma_K$.

Although $\Gamma_K$ is far from including all formulas, it is actually a rather rich fragment which plays well with Kleene’s realizability with functions.

Troelstra’s Conservation Theorem. If $A \in \Gamma_K$ then $\EL \vdash \exists\alpha(\alpha \krf A) \lthen A$.

Combining this with the second part of Troelstra’s Characterization Theorem, we see that if $A \in \Gamma_K$, then $$\EL + \GC \vdash A \quad\IFF\quad \EL \vdash A.$$ In other words, $\EL + \GC$ is conservative over $\EL$ for formulas in $\Gamma_K$.

What is this all good for? Well, suppose your friend Bertus (a hard-core intuitionist working in $\EL + \GC$) tells you (a hard-core classicist working in $\mathsf{ZFC}$) about his most recent result:

Every $n \times n$ real matrix with nonzero determinant is invertible.

A priory, you have no reason to believe your friend since he also believes in false statements like every function is continuous. After some thought, you realize that $\det(A) \neq 0$ can be formalized as a formula in $\mathrm{N}_K$ and that $$(\forall A \in M_{n\times n}(\R))(\det(A) \neq 0 \lthen (\exists B \in M_{n\times n}(\R))(AB = BA = I))$$ can be formalized as a formula in $\Gamma_K$. You can then conclude that Bertus’s result is classically true and, because of $\GC$, that there is a continuous function to compute the inverse of a $n \times n$ real matrix $A$ with nonzero determinant. Not only were you able to meaningfully communicate with your friend Bertus, but you also gained a new classical result that you can call your own!4

The system $\EL + \GC$ is not the only system for which the above scheme goes through. In [4], Jaap van Oosten proves a similar characterization result for an intuitionistic system that incorporates the Weak König Lemma. This system also proves Brouwer’s Continuity Theorem. However, since the Weak König Lemma and Brouwer’s Continuity Theorem are incompatible with the Axiom of Choice, van Oosten has to settle for a weaker form of choice that continuously selects a nonempty compact set of witnesses to existential statements instead of a single witness. These results of van Oosten are also exploited in my paper [2].


  1. I think that it is probably safe to add certain intuitionistic principles like Brouwer’s Continuity Theorem to the systems considered by Hirst and Mummert, but I have not checked that.
  2. There is no clause for disjunction since because I prefer to think of $A \lor B$ as an abbreviation for $\exists x((x = 0 \lthen A) \land (x \neq 0 \lthen B))$. Since equality of numbers is decidable, this interpretation is intuitionistically correct.
  3. Troelstra [3] actually uses $\forall x(\alpha|\lambda n.x{\downarrow} \land \alpha|\lambda n.x \krf A)$ for this case. This amounts to the same idea except that $\alpha_x$ is always defined whereas $\alpha|\lambda n.x$ might not. There does not appear to be any need for the added complexity except to make this case more parallel to the other universal quantification case.
  4. Don’t start writing that paper right away: this is not a new result… However, you can now captivate your friends for hours explaining how you discovered the existence of Cramer’s Rule without opening a linear algebra textbook!


[1] [doi] L. E. J. Brouwer, “Über Definitionsbereiche von- Funktionen,” Math. ann., vol. 97, iss. 1, pp. 60-75, 1927.
@article {Brouwer27,
AUTHOR = {Brouwer, Luitzen Egbertus Jan},
TITLE = {\"{U}ber {D}efinitionsbereiche von- {F}unktionen},
JOURNAL = {Math. Ann.},
FJOURNAL = {Mathematische Annalen},
VOLUME = {97},
YEAR = {1927},
NUMBER = {1},
PAGES = {60--75},
ISSN = {0025-5831},
MRCLASS = {Contributed Item},
MRNUMBER = {1512354},
DOI = {10.1007/BF01447860},
URL = {http://dx.doi.org/10.1007/BF01447860},
NOTE = {Translated in \cite{VanHeijenoort67}, pp.~446--463},
[2] F. G. Dorais, “Classical consequences of continuous choice principles from intuitionistic analysis,” Preprint, 2011.
@article {Dorais11,
AUTHOR = {Dorais, Fran{\c{c}}ois G.},
TITLE = {Classical consequences of continuous choice principles from intuitionistic analysis},
JOURNAL = {preprint},
YEAR = {2011},
URL = {http://boolesrings.org/dorais/files/2011/10/gc.pdf},
[3] [doi] Metamathematical investigation of intuitionistic arithmetic and analysis, A. S. Troelstra, Ed., Berlin: Springer-Verlag, 1973.
@book {Troelstra73,
TITLE = {Metamathematical investigation of intuitionistic arithmetic and analysis},
SERIES = {Lecture Notes in Mathematics, Vol. 344},
EDITOR = {Troelstra, Anne Sjerp},
PUBLISHER = {Springer-Verlag},
ADDRESS = {Berlin},
YEAR = {1973},
PAGES = {xvii+485},
MRCLASS = {02CXX (02DXX 02HXX)},
MRNUMBER = {0325352 (48 \#3699)},
DOI = {10.1007/BFb0066739},
URL = {http://dx.doi.org/10.1007/BFb0066739},
[4] [doi] J. van Oosten, “Lifschitz’ realizability,” J. symbolic logic, vol. 55, iss. 2, pp. 805-821, 1990.
@article {VanOosten90,
AUTHOR = {van Oosten, Jaap},
TITLE = {Lifschitz' realizability},
JOURNAL = {J. Symbolic Logic},
FJOURNAL = {The Journal of Symbolic Logic},
VOLUME = {55},
YEAR = {1990},
NUMBER = {2},
PAGES = {805--821},
ISSN = {0022-4812},
MRCLASS = {03F50 (03F30 03F55)},
MRNUMBER = {1056390 (92a:03076)},
MRREVIEWER = {Wim Veldman},
DOI = {10.2307/2274666},
URL = {http://dx.doi.org/10.2307/2274666},
[5] [doi] J. L. Hirst and C. Mummert, “Reverse mathematics and uniformity in proofs without excluded middle,” Notre dame j. form. log., vol. 52, iss. 2, pp. 149-162, 2011.
AUTHOR = {Hirst, Jeffry L. and Mummert, Carl},
TITLE = {Reverse mathematics and uniformity in proofs without excluded middle},
JOURNAL = {Notre Dame J. Form. Log.},
FJOURNAL = {Notre Dame Journal of Formal Logic},
VOLUME = {52},
YEAR = {2011},
NUMBER = {2},
PAGES = {149--162},
DOI = {10.1215/00294527-1306163},
URL = {http://dx.doi.org/10.1215/00294527-1306163},
EPRINT = {1010.5165},