In this post, I want to outline some subtle and interesting aspects and issues with respect to universes in HoTT. Some of these issues were brought up in a comment discussion with Mike Shulman some time ago and they came up again more recently in a comment discussion with Andrej Bauer and Alexander Kuklev. I will use the HoTT book as a reference though this post is not intended as a criticism of the handling of universes in the book. Indeed, to serve its purpose, the book is deliberately vague about the exact handling of universes in HoTT. In fact, this post is a good illustration why this is necessary!

Also note that the issues I bring up are mostly irrelevant to conducting everyday mathematics in HoTT. While this post is largely for people interested in technical aspects of HoTT, I will nevertheless try to keep the the post at a medium level of technicality since these aspects of HoTT do come occasionally up in everyday mathematics.

In section 1.3, the HoTT book introduces universes as follows:

[W]e introduce a hierarchy of universes $\mathcal{U}_0 : \mathcal{U}_1 : \mathcal{U}_2 : \cdots$ where every universe $\mathcal{U}_i$ is an element of the next universe $\mathcal{U}_{i+1}$. Moreover, we assume that our universes are cumulative, that is that all the elements of the $i$th universe are also elements of the $(i+1)$st universe, i.e. if $A : \mathcal{U}_i$ then also $A : U_{i+1}.$

In the comment discussion mentioned in the preamble, I expressed concerns with the nature of the subscripts in this description. As a result of this discussion, some minor clarification were made to the book. In particular, the following paragraph was added at the end of section 1.3:

As a non-example, in our version of type theory there is no type family "$\lambda (i:\mathbb{N}).\,\mathcal{U}_i$". Indeed, there is no universe large enough to be its codomain. Moreover, we do not even identify the indices $i$ of the universes $\mathcal{U}_i$ with the natural numbers $\mathbb{N}$ of type theory (the latter to be introduced in §1.9).

And further remarks were added in the notes to chapter 1. The purpose of this post is to explain why this matters and also to explain how fiddling with the structure of universes affects HoTT. I will focus on Classical HoTT (HoTT with the Axiom of Choice for sets and hence the Law of Excluded Middle for propositions) though comparable issues arise without these assumptions.

To illustrate the issue, let’s peek at chapter 10 and in particular at section 10.5, where the book describes the interpretation of the cumulative hierarchy of sets inside HoTT. In any universe $\mathcal{U}_i$ one can find a type $V_i$ that faithfully interprets the cumulative hierarchy and (since we are working in Classical HoTT) Theorem 10.5.11 shows that $V_i$ is a model ZFC. In particular, HoTT proves the consistency of ZFC. In fact, HoTT proves the consistency of even stronger theories! With some work, one can show that each $V_i$ is a level of $V_{i+1}$ whose ordinal rank $\kappa_i$ is an inaccessible cardinal in $V_i.$ Therefore, for each $i,$ Classical HoTT proves the consistency of $\mathsf{ZFC}+ \mathsf{IC}_i,$ where $\mathsf{IC}_i$ abbreviates the statement "there are at least $i$ inaccessible cardinals."

Does this mean that Classical HoTT proves $\forall i\,\operatorname{Con}(\mathsf{ZFC}+ \mathsf{IC}_i)$? No, it doesn’t. To understand this, remember that $\operatorname{Con}(\mathsf{ZFC}+ \mathsf{IC}_i)$ is a purely arithmetical statement which basically states that there is no proof of $0 = 1$ from the axioms of $\mathsf{ZFC}+ \mathsf{IC}_i.$ Hence $\forall i\,\operatorname{Con}(\mathsf{ZFC}+ \mathsf{IC}_i)$ is also an arithmetical statement. The only way for HoTT to make sense of this is for $i$ to range over the type of natural numbers $\mathbb{N}.$ This is explicitly excluded by the remark quoted above from the end of section 1.3.

The reason for this exclusion becomes more visible if we use an alternative way to handle universes, replacing the hierarchy with a rule which simply says that any two types are contained in a common universe. More precisely (but still suppressing many technical details) given any two types $A,B$ we can introduce a universe type $\mathcal{U}$ such that $A, B : \mathcal{U}.$ The above argument showing that Classical HoTT proves the consistency of $\mathsf{ZFC}+ \mathsf{IC}_i$ requires $i$ applications of this rule in order to obtain a sufficiently long chain of universes. Since proofs are finite, a proof can only use the above universe rule finitely often. Therefore, this process does not lead to a proof of $\forall i:\mathbb{N}\,\operatorname{Con}(\mathsf{ZFC}+ \mathsf{IC}_i)$ in Classical HoTT.

In order to get an actual proof of $\forall i:\mathbb{N}\,\operatorname{Con}(\mathsf{ZFC}+ \mathsf{IC}_i)$ in we need to have a hierarchy of universes indexed by $\mathbb{N}.$ More formally, we need a type family $\lambda i:\mathbb{N}\,.\,\mathcal{U}_i$ which will allow us to use induction on $\mathbb{N}$ to obtain an actual proof of $\forall i:\mathbb{N}\,\operatorname{Con}(\mathsf{ZFC}+ \mathsf{IC}_i).$ Doing so, we inadvertently obtain more than we set out. The type family $\lambda i:\mathbb{N}\,.\,\mathcal{U}_i$ must live in some universe $\mathcal{U}$ and the cumulative hierarchy $V$ associated with this $\mathcal{U}$ satisfies $\mathsf{ZFC}+ \mathsf{IC}_\omega,$ where $\mathsf{IC}_\omega$ says that there are infinitely many inaccessible cardinals. Therefore Classical HoTT with the hypothesis of such a family $\lambda i:\mathbb{N}\,.\,\mathcal{U}_i$ proves $\operatorname{Con}(\mathsf{ZFC}+ \mathsf{IC}_\omega),$ which is much stronger than what we set out to do.

In fact, Classical HoTT with $\lambda i:\mathbb{N}\,.\,\mathcal{U}_i$ proves $\operatorname{Con}(\mathsf{ZFC}+ \mathsf{IC}_{\omega+1}),$ $\operatorname{Con}(\mathsf{ZFC}+ \mathsf{IC}_{\omega+2}),$ and so on. Why stop there? What if we have a family $\lambda i:W\,.\,\mathcal{U}_i$ for every wellordering $W$? Can we go beyond? These "large universe hypotheses" are direct analogues of large cardinal hypotheses from set theory. These may seem frivolous at first but such hypotheses necessarily come into play to answer concrete mathematical questions in Classical HoTT such as whether it is possible to extend Lebesgue measure to all subsets of $\mathbb{R}.$

Let me use this opportunity to point out a feature of the alternative way to handle universes using a rule as outlined above: the rule implies that every type belongs to some universe. The book handles this with this brief remark in section 1.3:

When we say that $A$ is a type, we mean that it inhabits some universe $\mathcal{U}_i.$

This is a clever workaround that unfortunately has some side effects. There is no way to formally say "every type belongs to some $\mathcal{U}_i$." The universes $U_i$ are named constants which means that there is no way to quantify over them. Thus effect of the remark is not to prohibit other types but only to declare anything else to be an "illegetimate" type. The intent is to simplify technical issues but it makes types like $\lambda i:\mathbb{N}\,.\,U_i$ illegitimate and effectively blocks the possibility of large universe hypotheses as described above.

Mike Shulman suggested a band-aid for this problem in the discussion mentioned earlier, pointing out that there could be universes, even legitimate ones, other than the named universes $\mathcal{U}_i.$ For example, we could introduce an infinite hierarchy of universes in $\mathcal{U}_{17}$ if we wanted to. This is true but I do not find this fix fully satisfactory. Universes are very special types that are equipped with a lot of technical machinery to allow elements of universes to become actual types and much more. The book mostly suppresses this machinery, and rightfully so since there are many options which are all very technical and not particularly enlightening. Nevertheless, this machinery is implicitly present and it does not appear on its own. My reading of the syntactic hierarchy used in the book is that this machinery is instantiated for each named universe $\mathcal{U}_i,$ specifically. To say that there might be other universe suggests that players don’t have all the rules of the game, which is disconcerting. My alternative proposal with a rule to introduce a universe containing any two types implicitly assumes that there is some form of additional judgment for universes (or similar device) and that this judgment instantiates all the machinery to make a universe. I find this kind of approach more satisfactory since it allows one to add hypotheses such as the existence of $\lambda i:\mathbb{N}\,.\,\mathcal{U}_i$ without changing the rules of the game.

Let me close by mentioning that there are plenty of alternatives for these issues. For example, it is not necessarily desirable for every type to belong to a universe. While formalizing a proof $\forall i:\mathbb{N}\,\operatorname{Con}(\mathsf{ZFC}+ \mathsf{IC}_i)$ above, we accidentally proved a lot more because $\lambda i:\mathbb{N}\,.\,\mathcal{U}_i$ had to live in some universe. If $\lambda i:\mathbb{N}\,.\,\mathcal{U}_i$ doesn’t live in a universe, but induction on $\mathbb{N}$ still works for such types, we can still prove $\forall i:\mathbb{N}\,\operatorname{Con}(\mathsf{ZFC}+ \mathsf{IC}_i)$ but we can’t necessarily prove $\operatorname{Con}(\mathsf{ZFC}+ \mathsf{IC}_\omega)$ as we did above. In fact, there is no need for a single solution: $\mathsf{ZFC},$ $\mathsf{NBG},$ $\mathsf{MK}$ and other set theories have always coexisted pleasantly. However, there is a need to discuss these solutions and to appropriately identify them so that there is no pointless confusion when such aspects of HoTT do matter.

## 11 thoughts on “On the structure of universes in HoTT”

1. Interesting. Let me just mention that people studied “large cardinals” in the contect of type theory. Since ordinals do not play nice constructively, they get replaced by inductive constructions, or rather closure properties under inductive definitions. For instance, a type universe is a type which is closed under dependent sums and products. I think people came up with principles that correspond to Mahlo cardinals (see the work of Anton Setzer), but no further.

You suggested that we simply index universes by ordinals, which probably works in classical HoTT. Without classical logic it is going to get a bit trickier, I suspect. It would be interesting to see if we can formulate inductive constructions or some other device that gets us to measurables.

1. Indeed, there is a fair amount of prior work here. I hope that experts, like Setzer, will eventually contribute their expertise regarding large universes in HoTT. I guess the first question here is whether that existing work is compatible with univalence. The fact that ordinals/wellorderings are not nearly as well behaved without LEM is the main reason why I restricted this post to Classical HoTT but this is not an essential restriction as there are plenty of workarounds.

Note that patterns of universes are not the only way to get large universe hypotheses. (This is also true for large cardinal hypotheses, e.g., $0^\#$ merely states that a particular real number exists.) It makes perfect sense to have a set with a countably additive $\{0,1\}$-measure defined on all of its subsets in Classical HoTT. This gives a measurable cardinal when transferred to the cumulative hierarchy. Because universes don’t happen on their own, this can happen even if there is just a simple hierarchy of universes as described in the book. It’s only by reinterpreting HoTT inside the set-theoretic universe with a measurable cardinal that you can get HoTT with an incredibly rich universe structure.

2. While attempting to dig up old notes on Setzer’s Mahlo universes, I stumbled across a nice paper On Universes in Type Theory by Erik Palmgren wherein he discusses in much greater detail a lot of the issues brought up here (though not in the specific context of HoTT since it was published in 1998).

2. Thanks for this interesting post. I think we made the right decision in not getting into these issues in the book, but it would be nice to have a more consistent and flexible treatment of universes. I think most formal presentations of type theory do consider “being a type” to be a judgment, with a derivation rule which says that if A is an element of a universe, then some a-la-Tarski coercion of it “El(A)” is a type. From this point of view, identifying elements of universes with types is just a convenient abuse of language, and so perhaps universes are not really all that “special” of types after all. Indeed, one direction of possible modifications of the theory (which seems helpful in order to find more sheaf models) is to relax the special-ness of universes even further.

However, I don’t understand what you mean by

Thus effect of the remark is not to prohibit other types but only to declare anything else to be an “illegitimate” type.

In particular, I don’t understand the difference between “prohibiting” something and “declaring it illegitimate” — in fact I find both to be a misnomer, because all that we have are rules which allow things, rather than rules which prohibit things. The example of $\lambda(i:\mathbb{N}) U_i$ is not allowed by the rules of the theory presented in the book. It’s not as if there is a “rule of type formation” which can produce it and then a further “rule of legitimacy” which excludes it. The rule of lambda-abstraction says that if you have an expression belonging to some type which contains a free variable, then you can abstract it over the variable to obtain an element of a (dependent) function type. But “U_i” is not an expression belonging to some single type, so the rule of abstraction does not apply, and so $\lambda(i:\mathbb{N}) U_i$ is not part of the syntax of the theory.

Finally, I’ve said this before elsewhere, but I’ll say it again: I’d like to ask that people not use “HoTT” to refer specifically to the precise type theory used in the first edition of the book. Homotopy type theory refers to the whole subject, which has room for many different formal theories, analogously to how “set theory” includes ZFC, ZF, Z, BZC, NBG, MK, etc. The book made some specific choices for reasons of history, convenience, simplicity, and pedagogy, but other choices are more appropriate in other contexts, and I hope that in the future we will come up with improved homotopical type theories (and perhaps the second edition of the book will use such an improved theory). If you want to refer specifically to the book’s type theory, one possibility I’ve suggested is “HoTT v1.0”.

1. Thanks for your comments Mike! I was hoping you would want to discuss this further.

Yes, that was without a doubt absolutely the right decision for the book. Of course, this doesn’t mean that those things shouldn’t be discussed elsewhere. Such discussions are difficult to find; I suspect they did happen, at the IAS for example, but there is little record of these. My hope for this post is to spark such a discussion.

Before going to the other parts, let me address your last paragraph. In the post, I used HoTT as an abbreviation for Homotopy Type Theory in a general sense. There is a formal system presented in the appendix of the book; I will follow your lead and call that HoTT v1.0 if you want though that’s not what I meant by HoTT in the post where it refers ambiguously to HoTT v1.0, all the variants I suggested and more. The point of the last paragraph is that HoTT has too few formal systems and the various options need to be discussed and compared. This discussion will likely last for a long while. If the development of HoTT follows a similar path to set theory, over the next two decades or so, one formal system will emerge as the dominant one along with some major and minor alternatives. (Fortunately, the development of type theory appears to be happening at a much faster pace than that of set theory.) More importantly, as a result of this discussion, the differences and correspondences between the formal systems that emerge in the process will be well understood.

Regarding your first paragraph. I think you might be misunderstanding the issue I’m bringing up, it’s not about what types are and how they are coerced from universes, the question is: which types are universes and how are they introduced? The precise mechanism for promoting elements of universes to actual types is mostly immaterial. That is best left under a pretty rug but let’s say we use Tarski-style coercions, for the purpose of picking one mechanism. These coercions don’t apply to every type; it makes little sense to coerce $\pi : \mathbb{R}$ to a standalone type. There must be a way to identify which types are universes and therefore have such coercions as well as all other rules associated to universes.

In HoTT v1.0, the solution is to have named constants $\mathcal{U}_0, \mathcal{U}_1,\dots$ each of which is equipped with all the necessary mechanisms. This is done separately for each constant $\mathcal{U}_i$. For example, the $\Pi$-form rules explicitly name the universe $\mathcal{U}_i$ and therefore doesn’t apply to anything other than this $\mathcal{U}_i$; if I introduce a new universe constant, I must then introduce a brand new family of $\Pi$-form instances to accommodate this new universe. This is the kind of thing I meant by saying “players don’t have all the rules of the game” in the post. There is nothing wrong with this approach but it is not very flexible and I find fiddling with basic rules like that disconcerting. Consider what you actually would need to do to the formal system to accommodate a family $\mathcal{V}:\mathbb{N}\to\mathcal{U}_{17}$ consisting of universes…

I think the alternative scheme I proposed is more pleasant (though, admittedly, I haven’t sat down to write a complete formal system). The idea is to have a judgment $\mathcal{U}\ \mathrm{univ}$ (or something similar) with associated rules instantiate all the mechanisms necessary for universes. To ensure that there are enough universes, I proposed an introduction rule of the rough form $A,B\ \mathrm{type} / A, B : \mathcal{U}\ \mathrm{univ}.$ Without additional features, is essentially equivalent to HoTT v1.0 in the sense that the two systems can interpret each other almost directly. However, this is much more flexible: adding a new universe only involves positing a judgment $\mathcal{V}\ \mathrm{univ}$ without changing the basic rules of the system. Even formalizing a type family of universes isn’t that much trouble.

Then again, this is only one possibility. I’m sure there are even better ones though it may take years until we see convergence to solution that minimizes the total amount of awkwardness and inconvenience.

Regarding your second paragraph. The issue with the quote from the book is that “some universe $\mathcal{U}_i$” has no formal meaning: the $\mathcal{U}_i$ are constants, there is no formal way to quantify over them. The closest formal interpretation of that quote is through the context rules. The context rules from the book’s appendix don’t allow us to derive $a : A\ \mathrm{ctx}$ unless $A$ belongs to some universe $\mathcal{U}_i$. But this is not at all the same thing. As you said: “all that we have are rules which allow things, rather than rules which prohibit things.” In fact, illegitimate types don’t pose much problems and since they can’t be derived using the rules HoTT v1.0, they can be entirely nonexistent. But they can also exist and they can actually be quite fun to play with!

I know the notation $\lambda i : \mathbb{N}\,.\,\mathcal{U}_i$ is not meaningful (without a lot of additional context); I was avoiding getting into technical issues by reusing the notation from the quote to suggest an informal idea. I guess the main point there wasn’t completely clear. The argument that Classical HoTT (v1.0) proves $\operatorname{Con}(\mathsf{ZFC} + \mathsf{IC}_i)$ for every $i$ is a perfectly legitimate mathematical argument. Homotopy Type Theory, if it is to be a foundation, should therefore be able to formalize the argument. For purely technical and unavoidable reasons, this argument cannot be formalized into a proof of $\forall i:\mathbb{N}\,.\,\operatorname{Con}(\mathsf{ZFC} + \mathsf{IC}_i)$ in Classical HoTT v1.0. The proof can be formalized using a large universe hypothesis. Unfortunately, HoTT v1.0 makes this kind of argument rather difficult. In fact, a strict reading of certain parts of the book explicitly prohibits this kind of argument. This is mostly a non-issue for the book and these parts do simplify some inessential aspects of formal HoTT. However, this kind of thing is unacceptable for a foundation — a foundation shouldn’t prohibit or otherwise exclude intuitively correct proofs. It’s very hard to read the last paragraph of section 1.3 without thinking that HoTT v1.0 prohibits the existence of a type family of nested universes indexed by $\mathbb{N}$.

1. Well, you’re still saying “prohibits” when I would find it more accurate to say “does not allow”, but I see the point.

For reference, I’d like “HoTT v1.0” to refer to the type theory used throughout the book, which is not completely formalized in the appendix. In particular, it includes all the (higher) inductive types used in the book. Admittedly, the class of allowable HITs is not precisely defined, which means that “HoTT v1.0” is not fully precisely specified, but I think it is more useful to use the phrase this way.

3. If you are using universes a la Tarski, then the $\Pi$-rules are set once and for all, but when you introduce a universe (or a family of them) you need to provide encoding for the $\Pi$ operator, and that is where the extra work comes in (but it is to be expected).

Do we really need a separate judgment for universes? A universe is a structure $(\mathcal{U}, \tau, \pi, \sigma, \ldots)$ where $\mathcal{U}$ is a type, $\tau$ is a type family indexed by $\mathcal{U}$ (the map from codes in $\mathcal{U}$ to types), $\pi$ and $\sigma$ compute codes of dependent products and dependent sums, etc. Then there are equations explaining what $\tau$ does to $\pi$ and $\sigma$. If we make these propositional, we can pack everything up in a structure and talk about it inside type theory.

1. A separate judgment for universe is not entirely necessary. Your approach is similar to the “super-universe” approach. The issue is your type family $\tau$, which has to live somewhere. If you have a super-universe, then $\tau$ can live there. One could say that this just pushes the problem back one layer but specifically implementing Tarski coercions just for the super universe is a nice way to work around a lot of issues. For example, the super-universe doesn’t need to be univalent or have anything complex going on. One possible drawback is that in this case you do have types that don’t live in some universes.

1. $\tau$ does not “have to live somewhere”. In type theory there is no need for every type to belong to a universe.

2. That was poor wording on my part. What I meant is that a type family needs to have a domain ($\mathcal{U}$) and a codomain. The codomain is what I meant by “living somewhere.” The super universe approach (basically having a type $\mathsf{Type}$ without the problematic $\mathsf{Type}:\mathsf{Type}$) is one way around that.

I suppose one can also have “free” type families but I think that requires a more elaborate formal framework. I don’t think I ever looked at a formal development that has these. Are you aware of such a thing? I would love to have a look.