This is a followup to my recent post on the structure of universes in HoTT. In this post I will outline one of the possible alternative ways of handling universes in HoTT, which I will colorfully call Super HoTT for the time being. This is not an original idea; something like that (not in the context of HoTT) can be found in Erik Palmgren’s paper On Universes in Type Theory.^{1} The idea comes from a comment by Andrej Bauer where he described a universe as "a structure \((\mathcal{U},\tau,\pi,\sigma,\ldots)\) where \(\mathcal{U}\) is a type, \(\tau\) is a type family indexed by \(\mathcal{U},\) \(\pi\) and \(\sigma\) compute codes of dependent products and dependent sums, etc." This is a perfectly reasonable way of describing universes. By the Foundational Thesis of HoTT, this approach should be formalizable in some flavor of HoTT. This is indeed possible and the flavor (or rather flavor family) Super HoTT is one way to do that.
The issue with Bauer’s description is the type family \(\tau.\) Formally, a type family indexed by a type \(\mathcal{U}\) is an element \(\tau:\mathcal{U}\to\mathcal{V},\) where \(\mathcal{V}\) is some larger universe. This type \(\tau\) becomes a type family only after applying Tarskistyle coercion associated to \(\mathcal{V}\) that promotes elements of \(\mathcal{V}\) to actual types. On the face of it, this is circular: \(\tau\) is the Tarskistyle coercion associated to the universe \(\mathcal{U},\) which is defined in terms of that of \(\mathcal{V},\) which…
One way to break this circularity is to have a designated super universe \(\mathcal{U}^\ast\) — hence the name Super HoTT — whose associated Tarskistyle coercion and associated machinery are implemented in the formal system itself. In particular, \(\mathcal{U}^\ast\) is associated to a primitive decoding function \(T\) such that \(T(A)\ \mathsf{type}\) for any element \(A:\mathcal{U}^\ast.\) In Super HoTT, this is the only such decoding function. Other universes arise à la Bauer, i.e., as structures \((\mathcal{U},\tau,\pi,\sigma,\ldots)\) where now \(\tau:\mathcal{U}\to\mathcal{U}^\ast\) is a type family that realizes this universe’s Tarskistyle coercion via composition with the super universe’s decoding function \(T.\) In fact, since there is only one super universe, the Russellstyle approach doesn’t lead to many problems. In other words, there is little harm in suppressing the decoding function \(T\) in favor of a simple Russellstyle rule \[\frac{A:\mathcal{U}^\ast}{A\ \mathsf{type}}.\] In fact, the \(\mathsf{type}\) judgment is often informally confounded with an actual universal type \(\mathsf{Type}.\) This type of identification can lead to serious issues such as Girard’s Paradox; the super universe approach is one of the many safe ways around these problems.
So far, this is pure type theory, we haven’t said anything HoTT specific. To make this a flavor of HoTT, we postulate that there are enough univalent universes or even lots of them. Universes à la Bauer are mere structures, they can be univalent or not, they can have extra structure or lack some structure. Univalent universes can be identified among these structures so we can formalize a rule saying that any two \(A,B:\mathcal{U}^\ast\) are contained in a univalent universe. One can also formalize rules saying that one can have cumulative univalent universes, cumulative univalent universes with propositional resizing, cumulative classical univalent universes, etc. The possibilities for specialized flavors of Super HoTT are vast and varied!
Note that the super universe doesn’t have to be univalent. It doesn’t need to have much fancy structure at all. It’s not even necessary for \(\mathcal{U}^\ast\) to have a natural number type. All of that can happen inside the univalent universes à la Bauer. In fact, a barebones super universe is probably more flexible framework than a rich one. Different choices of structure for the super universe and how it interacts with the universes inside it will lead to different flavors of Super HoTT with variable strength. Richer super universe structure allows one to express stronger and stronger properties for the univalent universes it contains.
One advantage of this approach is that universes can arise "on their own" in the sense that any suitable structure \((\mathcal{U},\tau,\pi,\sigma,\ldots)\) is a universe, no matter how it comes about. Though this can be a mixed blessing if one wants more handson control over which types are universes. One potential drawback of Super HoTT is that not all types can belong to univalent universes. In particular, the super universe \(\mathcal{U}^\ast,\) cannot belong to any univalent universe inside \(\mathcal{U}^\ast.\) In this way, Super HoTT is more like \(\mathsf{NBG}\) or \(\mathsf{MK}\) than \(\mathsf{ZF}.\) For ontological reasons, it might be preferable to have a more selfcontained system analogously to \(\mathsf{ZF}.\) In any case, Super HoTT is one of the more flexible formal ways of handling universes in HoTT and it is a perfectly viable alternative to the formal system presented in the appendix of the HoTT book.^{2}
Notes

E. Palmgren, On universes in type theory. In Twentyfive years of constructive type theory (Venice, 1995), Oxford Logic Guides 36, Oxford University Press, New York, 1998, pages 191–204.

Univalent Foundations Program, Homotopy Type Theory: Univalent Foundations of Mathematics, Institute for Advanced Study, Princeton, 2013.
Interesting. So your point is that I cannot “bundle together” the universe structure $(\mathcal{U}, \tau, \pi, \sigma, \ldots)$ unless $\tau$ lands somewhere. Let me try to explain for my own benefit and that of others: in type theory it is perfectly ok to have a primitive judgment “is a type” in which case something like $\tau$ can just be a type family, i.e., its validity is established as the judgment “in the context $a : \mathcal{U}$ the term $\tau(a)$ is a type”. However, as soon as I want to make a structure $(\mathcal{U}, \tau, \pi, \sigma, \ldots)$ then the structure must have a type, and then asking “what is the type of the second component?” forces $\tau$ to have a codomain.
There are type theories which distinguish between types and kinds. The distinction is akin to that of sets and classes. Would it make sense to do it that way so that $\mathcal{U}$ would be a kind?
Yes, that is correct. Thanks for reformulating the issue, I think it’s good for readers to see multiple perspectives. Issues that are at a purely formal level like this one are difficult to explain.
I like your idea of kinds; it sounds like an attractive alternative to a barebones super universe, if only on a purely philosophical level. As I remarked at the end, Super HoTT isn’t particularly attractive for the ontology of HoTT since the super universe is basically a kludge whose existence is difficult to motivate. It seems that type theory with kinds wouldn’t suffer as much from this even if it turns out to be equivalent for all practical purposes. Do you have a reference so I can see how kinds are handled formally?
Normally for an “a la Tarski” universe, the coercion $\tau$ is required to preserve the type forming operations judgmentally. In other words, if $\pi : \prod_{A : \mathcal{U}} (\tau(A) \to \mathcal{U}) \to \mathcal{U}$ computes codes for dependent products, then we would have $\tau(\pi(A,B)) \equiv \prod_{x:\tau(A)} \tau(B(x))$ judgmentally. This is also not something that can be “packaged up” into a structure living inside of type theory, and I don’t think introducing a super universe fixes that problem. So it seems to me that your “a la Bauer” universes can only be “weakly a la Tarski”, in the sense that $T(\pi(A,B))$ is equivalent (or propositionally equal) to $\prod_{x:T(A)} T(B(x))$. Weakly alaTarski universes are also much easier to obtain in models (though not, as far as I know, if you also want a super universe that is strictly alaTarski) but I think they are noticeably more cumbersome to use.
Thanks Mike, this is very interesting and your comment gave me stuff to think about for a few days… Yes, it is a desirable feature for Tarskistyle coercion to preserve type forming operations judgmentally. Unfortunately, this appears to be somewhat incompatible with the presence of a super universe. I looked for a balance of features here but I couldn’t reach a definite conclusion.
To recap, for my own benefit and that of others: In the Super HoTT approach there is only one true Tarskistyle coercion for the super universe and other universe only have a $\tau:\mathcal{U} \to \mathcal{U}^\ast$ that acts like one but isn’t one. After composing $\tau$ with the true Tarskistyle coercion for the super universe $\mathcal{U}^*$, we can only obtain a weak Tarskistyle universe since, for example, there is no obvious way to enforce judgmental equality between $\tau(\pi(A,B))$ and $\prod_{x:\tau(A)} \tau(B(x))$ (as understood by $\mathcal{U}^*$).
This brings up the issue of what judgmental equality is in universes à la Bauer. I haven’t found a clear choice here. However, whatever it is, it should be required to match between any two cumulative universes (and this should be part of the definition of cumulative in the context of Super HoTT). It’s not clear whether the super universe should be seen as an accumulation of all smaller universes in Super HoTT (which is basically what requiring $\tau:\mathcal{U}\to\mathcal{U}^*$ to preserve type forming operations judgmentally amounts to).
In my initial conception, this was not the case as I envisioned that different universes might interpret natural numbers and other type differently and sometimes in incompatible ways. That way, for example, classical universes and constructive universes could peacefully inhabit the same super universe. The alternative, to formally see the super universe as an accumulation of all universes, would require some additional machinery such as a formal universe operator, as Palmgren describes. While there are advantages to this, additional heavy machinery would ruin the simplicity of Super HoTT and also reduce (or even eliminate) the need for a super universe.
Another possibility to consider is inductiverecursive universes; are you familiar with those? Inductiverecursive definitions allow you to construct a new “universe” inside a given one, containing any specified collection of types and closed under any desired type forming operations, and with judgmental Tarski conversion. And if you allow a naive sort of “higher inductiverecursive definition”, then such a universe can also be univalent (as long as the containing universe is already univalent).
However, we don’t yet know even whether ordinary inductiverecursive definitions are consistent with the homotopy interpretation, let alone “higher” ones.
Yes, this is much closer to what I believe universes in HoTT ought to be. Thanks for the pointer, I will need to look into this…
It would be great if universe formation could fit into some kind of generalization of higher inductive types. This possibility also puts an interesting spin on the ongoing discussion on the formal rules for higher inductive types in HoTT. Such an approach would also be ontologically much more satisfying than ad hoc approaches such as Super HoTT.
(Mike gave me a pointer to this discussion after a post in HoTT mailing list where I was reporting my frustration with the current definition of Universes in HoTT)
That way looks better to me than the current HoTT’s infinite hierarchy of type universes.
I’m just an amateur with fuzzy type theory knowledge, but it seems to me that an ideal solution would be to be able to define Univeral types in a similar way that we can define Universal Turing machines or Universal diophantine equations.
For this you would need to have fully formalized rules for type construction – including rules for induction, and higher inductive types (and certainly also a constructive justification for the Univalent axiom).
Any type being able to encode these rules would be a Universal type.
Then it seems to me that no hiearchy of type Universe would be needed any more. The formal rules of type theory would just define what is a (single) universal type. Any other Universal type or hiearchy of Universal types could then be build within the theory.
If this is correct, the problem with Universe is just a side effect of the current lack of complete formalization of type theory.
I don’t quite understand what you mean, but my best guess is that you’re suggesting something like Francois says that Andrej suggested: define a universe to be a type of (“codes for”) types together with operations exhibiting how all the typeforming operations act on those codes. The problem with that is, as Francois said and I expanded on above, it won’t behave judgmentally the way we want universes to behave.
I have just discoved your post “HoTT should eat itself”. I was thinking at exactly the problem you describe in the post. After reading your article I realize that it is not as easy as I was feeling. And I miss knowledge in model theory to really understand most of the discussion…
However Following your words: “That’s just what we need: more people with more different backgrounds and perspectives working on the problem.”, I will keep going my own way and see if it can lead to anything valuable… Thanks for your always clear answers.
What is “Tarski style coercion”?
Tarskistyle refers to having an explicit decoding function $T$ which takes an element of a universe $\mathcal{U}$ to its type, i.e., with an introduction rule of the form $$\frac{a : \mathcal{U}}{T(a)\quad\text{type}}.$$ In opposition, Russellstyle universes have a rule of the form $$\frac{A : \mathcal{U}}{A\quad\text{type}}$$ which says that elements of the universe type $\mathcal{U}$ are themselves types rather than codes for actual types.
The two approaches differ on how they handle relations between universes. In Russellstyle, one can simply say that one universe is included in another using a rule $$\frac{A : \mathcal{U}}{A : \mathcal{V}}.$$ In Tarskistyle, one needs conversions $t:\mathcal{U} \to \mathcal{V}$ that translate $\mathcal{U}$codes into $\mathcal{V}$codes as well as rules that explain how these conversions fit with the decoding functions as expected.
The issue with Russellstyle is that one gets confusing inclusions, especially with inductive types where one gets noncanonical elements coming from smaller universes. Tarskistyle doesn’t suffer from such confusion since coercions are clearly marked. In practice, Russellstyle is construed as shorthand for Tarskistyle and the conversions are only invoked to resolve confusions. The “Tarskistyle coercions” are all the hidden conversions that happen in this translation.
When I wrote this post two years ago, I seemed to use “coercion” specifically for the decoding function that convert elements of universes to actual types, separate from the machinery to convert between universes. I don’t think this distinction is the typical use; I would probably use the unambiguous “decoding function” if I were to rewrite this today.
What is the “Foundational Thesis of HOTT”?
If I had to guess, it would be the thesis that HOTT can replace ZFC as an adequate “foundation for mathematics”?
Your guess is correct.