This is the fourth entry in my series on type theory. We will introduce a lot of notation that is reminiscent of set theory, to make everyone feel more at home, and then we’ll discuss the axiom of choice.
Subtypes
Our next topic is “subsets” within types. I mentioned in the beginning that we can define “subsets” within types via predicates. This usually works great, but sometimes the “subset” needs to be more type-like and then a predicate won’t do.
As an example, say we have the following predicate on the natural numbers
even(n):≡∃(m:N).(n=2⋅m)
with the type even:N→Ω (where Ω represents all truth values). Then in order to claim that e is an even number, we write even(e). If you were really into using the membership symbol ∈, then you could also write this as “e∈even” and as long as you make clear what is meant by that (namely, even(e)), this is well-defined.
However, we can’t do something like “∀(e:even)” with our predicate, because you absolutely need a type there in the quantifier. So what do we do when we want to quantify over all even numbers?
No problem, we’ll just use the same trick we used to define the subuniverse TruthVal. With a sum type, we can define a type which basically corresponds to the subset defined by the predicate:
E:≡∑n:Neven(n).
What do the elements of E look like? They are pairs where the first element is a natural number n and the second element is a witness element p for the fact that n is even: (n,p). There can be no (n,⋅) in E where n is not even because then there wouldn’t be a witness. Likewise, as long as you can prove even(n) for some n (with witness p), then (n,p) is clearly an element of the type. So, the elements of the type E aren’t directly the elements of the subset of even numbers, but there is a very obvious mapping between the two (namely just throwing away the witness). In fact, we will often pretend that the elements of E are just the elements of the subset defined by the predicate. And now we can quantify over the even numbers: ∀(e:E).
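For readers who want to experiment, this construction can be sketched in Lean 4, whose subtype and anonymous-constructor syntax mirror the (n,p) pairs described above (the names even, E, and four are mine, not standard):

```lean
-- The evenness predicate, as a Prop-valued function on Nat.
def even (n : Nat) : Prop := ∃ m : Nat, n = 2 * m

-- The subtype E: pairs of a number n and a witness that n is even.
def E : Type := { n : Nat // even n }

-- An element of E: the number 4 together with the witness m = 2.
def four : E := ⟨4, ⟨2, rfl⟩⟩
```

Quantifying over the even numbers is then just `∀ e : E, ⋯`, exactly as in the text.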
Types that have been constructed from some base type in the way we constructed E here from N (i.e., with ∑ and a predicate) are called subtypes of the base type. Thus, E is a subtype of N. If the base type is a universe U, we’ll say subuniverse instead of subtype, but the mechanism is exactly the same.
As this is a common construct, there is a special notation for it, but we only use it if the expression after the ∑ symbol is a predicate, i.e., a function that returns a truth value/subsingleton, ϕ:A→Ω:
{x:A|ϕ(x)}:≡∑x:Aϕ(x).
This is of course simply set-builder notation. With this, we could have defined the subtype E as
E:≡{n:N|∃(m:N).n=2⋅m}.
Note that in set theory, the axiom of specification is needed to be able to construct subsets from predicates, but in type theory, we can just do it with sum types, as long as we encode logic the way we did. (Another good reason for propositions-as-types.)
Set operations
At this point in the HoTT book, the authors define set operators, like union ∪ or intersection ∩, on these subtypes, because we want to do all kinds of mathematics and those operators are often very useful. For example, here is how they define inclusion, ⊆:
({x:A|ϕ(x)}⊆{x:A|ψ(x)}):⇔(∀(x:A).ϕ(x)⇒ψ(x))
So, a subtype is included in another subtype if the underlying predicate of the first subtype implies the predicate of the second subtype. This definition works because subtypes are backed by predicate functions and so we can define these set operations in terms of operations on predicates.
However, defining these set operations as acting on subtypes seems to me like trying to be a bit too clever. One problem is the type signature of the operators. We could write it like this:
⊆:∏A:USubtype(A)×Subtype(A)→Ω
but then how to define Subtype(A) – the type of all subtypes on A? I think the definition would be this
Subtype(A)≡{T:U|∃(ϕ:A→Ω).T={a:A|ϕ(a)}}
but this does not look like a nice definition at all. This seems like a hint to me that the subuniverse of subtypes is not a natural concept, and we shouldn’t define functions on this abomination.
The second problem I see is that defining these operations on subtypes risks blurring the distinction between types and objects with set-like semantics. For example, you might accidentally use ⊆ on a type which wasn’t a subtype, because it might sometimes be hard to tell the difference.
For these reasons, I would define set-operators (and relations) like “⊆” on good old predicate functions:
⊆:∏A:U(A→Ω)×(A→Ω)→Ω
ϕ⊆Aξ:≡∀(a:A).ϕ(a)⇒ξ(a)
In words: in the domain A, property ϕ is a sub-property of ξ if, for all objects in the domain, ξ is true whenever ϕ is true. This definition is concise and easy to read.
The goal here is to make set-like mathematics (which uses relations and operators like ⊆, ∈, ∩, ∪) less cumbersome when done on top of type theory. Working with these predicates in type theory should feel as natural as working with sets in set theory.
So, let’s define more set operators and relations. We already mentioned “∈” above:
∈:∏A:UA×(A→Ω)→Ω
a∈Aϕ:≡ϕ(a)
And as a reminder, we can also write “P(A)” (power set type) instead of “A→Ω” to make the type signature even more concise.
The following operators return a new predicate, so we have to use anonymous functions to define them (I skipped the type signatures and type subscripts this time):
ϕ∩ξ:≡λa.ϕ(a)∧ξ(a)
ϕ∪ξ:≡λa.ϕ(a)∨ξ(a)
A∖ϕ:≡λa.¬ϕ(a)
As is usual with these operators, we’re writing them in infix notation.
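These predicate-level definitions translate almost one-to-one into Lean 4 (a sketch with my own identifiers; Lean’s Prop plays the role of Ω):

```lean
-- Inclusion, membership, intersection, union, and complement,
-- all defined on predicates A → Prop rather than on subtypes.
def subset {A : Type} (φ ξ : A → Prop) : Prop := ∀ a : A, φ a → ξ a
def mem    {A : Type} (a : A) (φ : A → Prop) : Prop := φ a
def inter  {A : Type} (φ ξ : A → Prop) : A → Prop := fun a => φ a ∧ ξ a
def union  {A : Type} (φ ξ : A → Prop) : A → Prop := fun a => φ a ∨ ξ a
def compl  {A : Type} (φ : A → Prop) : A → Prop := fun a => ¬ φ a
```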
But, as nice as working with predicate functions is, if you want to quantify over elements, you still need a (sub-)type! You have two choices here: ∀(a:{a:A|ϕ(a)}).(⋯) or ∀(a:A).ϕ(a)⇒(⋯). Both are formally correct, but we will prefer the latter.
(If I could suggest another notation, it would be this: A|ϕ:≡{a:A|ϕ(a)}, which reduces the notation to only the essential symbols; such that we can write ∀(a:A|ϕ).(⋯), but I’ll try to keep the notation pretty standard here, so I won’t be using this.)
If S:P(P(A)) is a predicate on predicates on A (“a set of subsets”), we can also define arbitrary, non-binary intersection and union on it.[1] For once, I will be pulling out all the stops to make this look like set theory (even though it’s type theory in disguise!):[2]
⋂S:≡λa.[∀(T:P(A)).T∈S⇒a∈T]
⋃S:≡λa.[∃(T:P(A)).T∈S∧a∈T]
and don’t panic – the square brackets [⋯] mean exactly the same as round brackets (⋯). It was just getting a bit hard to parse with only round ones.
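The arbitrary intersection and union can be phrased the same way in Lean 4 (a sketch; sInter and sUnion are my names):

```lean
-- S is a "set of subsets": a predicate on predicates on A.
-- An element a is in the intersection iff it is in every member of S,
-- and in the union iff it is in some member of S.
def sInter {A : Type} (S : (A → Prop) → Prop) : A → Prop :=
  fun a => ∀ T : A → Prop, S T → T a

def sUnion {A : Type} (S : (A → Prop) → Prop) : A → Prop :=
  fun a => ∃ T : A → Prop, S T ∧ T a
```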
There are other beloved set theory notations that can be “rescued” by giving them new semantics in type theory. For example, if you have a,b,c:A then you could write {a,b,c}A to refer to the subtype {x:A|(x=a)∨(x=b)∨(x=c)}. But I will stop here now.
To summarize again the difference between subtypes and predicates: if you’re working in a fixed base set (like N or R) and you want to use set-like operations, use predicates; if you are in a context where a type is needed or you only expect to do “type-level operations” (like quantifying over elements), use subtypes. In the following, we will see many examples where subtypes are very useful.
Construction of other numbers
Mathematicians don’t only want to work with boring old natural numbers, so with the power set type and subtypes in hand, we can also construct other numbers, like the integers Z and the rational numbers Q. They can be defined as equivalence classes on binary product types like N×N. I’ll be very brief in this section; it’s mostly just to show that it’s possible.
So, let’s first define equivalence classes. If A is some base type and R is a relation on A with type A×A→Ω, then the equivalence classes of R belong to the following type (which is also called the quotient of A by R):
A/R:≡{S:P(A)|∃(a:A).∀(b:A).R(a,b)⇔b∈S}.
(I wrote “b∈S” here to mean “S(b)”.) As you can see, this is a subtype of the power set of A. Each element of A/R is one equivalence class, which is a “subset” of A whose members are related according to R.
For the integers Z, we define a relation on the product N×N with the following definition:
diff:(N×N)×(N×N)→Ω
diff((a,b),(x,y)):≡(a+y=x+b)
such that
Z:≡(N×N)/diff
where we can choose the canonical forms (n,0) and (0,n) for the equivalence classes, corresponding to +n and −n respectively.
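Lean 4’s built-in Quot former plays the role of the quotient A/R, so this construction of the integers can be sketched as follows (Lean’s actual Int is defined differently; diff, Z, and two here are illustrative names):

```lean
-- (a, b) is meant to represent the integer a - b, so two pairs are
-- related when a + y = x + b (stated without subtraction).
def diff (p q : Nat × Nat) : Prop := p.1 + q.2 = q.1 + p.2

-- The quotient of Nat × Nat by the relation diff.
def Z : Type := Quot diff

-- The integer +2, i.e. the equivalence class of (2, 0).
def two : Z := Quot.mk diff (2, 0)
```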
To get from the integers to the rational numbers, we take Z×N and use this relation:
ratio((a,b),(x,y)):≡(a⋅(y+1)=x⋅(b+1)).
The +1 is there to prevent division by zero; we could have also just used the natural numbers without 0 instead. With this we have
Q:≡(Z×N)/ratio.
An element of Q represented by a pair (a,b):Z×N corresponds to the fraction a/(b+1). The canonical representative of a rational number is a fraction that can’t be reduced any further (“fraction in lowest terms”) – every rational number has a unique representation as an irreducible fraction with a positive denominator.
Constructing the real numbers is a bit more involved but it basically works like it usually does; for example via Dedekind cuts, which we can represent by two predicates on Q. So, let L and U be two predicates (aka “subsets”) on Q. Then if isCut(L,U) is a predicate that tells us whether the pair (L,U) is a Dedekind cut (you can look up the definition on Wikipedia), then the (Dedekind) reals are given by the following:
RDedekind:≡{(L,U):P(Q)×P(Q)|isCut(L,U)}.
The usual operations and relations for the real numbers (e.g., < and ≤) can then be defined on this type.
We can also define restrictions like R>0, the positive real numbers, by defining a subtype with a predicate like λx.(x>0). The very common “∀ε>0” from calculus can then be written as ∀(ε:R>0), but I suppose you could also write ∀(ε>0:R).
The family of finite types
To round out this section on numbers, I’ll quickly comment on a type family that I’ve already briefly mentioned before: the family of finite types. As a reminder, a family of types is a type-valued function from some index set I to some types: I→U. A very simple type family B with the index set 2 would be, for example, B(0₂):≡R, B(1₂):≡Q.
The family of finite types is indexed by N and can be defined like this:
Fin:N→U
Fin(n):≡{m:N|m<n}
The essential point is that Fin(n) has exactly n elements. And if A≃B means that types A and B are equivalent/isomorphic (see the next post for a rigorous definition of that), then of course we’ll have[3]
0≃Fin(0)
1≃Fin(1)
2≃Fin(2)
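Lean 4 has this family built in as Fin; the definition below (under the hypothetical name Fin') restates it as a subtype, matching the description in the text:

```lean
-- Fin'(n): natural numbers m together with a proof that m < n.
def Fin' (n : Nat) : Type := { m : Nat // m < n }

-- 2 is an element of Fin' 3, since 2 < 3.
example : Fin' 3 := ⟨2, by decide⟩

-- Fin' 0 is empty: no m satisfies m < 0.
example : Fin' 0 → False := fun ⟨_, h⟩ => Nat.not_lt_zero _ h
```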
The type family Fin(n) is very useful when you want to iterate over a finite index. For example, instead of
∀(i:N).(i<n)⇒(⋯)
you can just write
∀(i:Fin(n)).(⋯)
The family of finite types is generally very handy for indexing things; for example other type families: B:Fin(10)→U. However, finite types can also be used to index vectors! Now, usually, vectors are realized as tuples, like v:R×R×R (or v:R3), but there are some advantages to realizing vectors as “functions” on finite types, v:Fin(3)→R. Vector entries are then accessed by function application, v(0), where 0:Fin(3). This is how the dex programming language formalizes arrays, for example. (Though these “array functions” are memoized for performance.)
The advantage is that it allows you to write functions that work with generic vector sizes while still being type-safe (or rather, shape-safe). For example, a concatenation function concat, which concatenates two vectors of arbitrary length, would have the following type:
concat:(I→R)×(J→R)→((I+J)→R)
where I and J are arbitrary index types and I+J is the binary sum type. If we now pass in two vectors, u:Fin(2)→R and v:Fin(4)→R, then the result has the type (Fin(2)+Fin(4))→R, which isn’t quite Fin(6)→R, but the two types are isomorphic. So, we can read off from the types that the result is a vector of length 6.
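In Lean 4 the type of concat can be written down directly; the pattern match on the binary sum type decides whether an index falls into the first or the second vector (a sketch with made-up names):

```lean
-- Vectors as functions out of an index type.
def Vec (n : Nat) (R : Type) : Type := Fin n → R

-- Shape-safe concatenation: an index in I ⊕ J selects from u or from v.
def concat {I J R : Type} (u : I → R) (v : J → R) : I ⊕ J → R
  | Sum.inl i => u i
  | Sum.inr j => v j
```

Applied to u : Vec 2 R and v : Vec 4 R, the result has type Fin 2 ⊕ Fin 4 → R, which is isomorphic to Fin 6 → R, just as described above.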
More examples of subtypes
When we introduced sum types, we used the type of all groups as an example. The type we gave was
Group≡∑T:U(T×T→T)×T
but we can actually be more precise than that. As this definition is written, an element of Group is just a type with some binary operation and some element, but those do not necessarily satisfy the group axioms! For example, in a group, the binary operation needs to be associative and the special element needs to be a neutral element (with respect to the binary operation), but the type doesn’t guarantee that! To solve this, we can filter for those elements that satisfy the group axioms with set-builder notation:
Group:≡{(T,∗,e):∑T:U(T×T→T)×T | isGroup(T,∗,e)}
isGroup(T,∗,e):≡(∀(a,b,c:T).(a∗b)∗c=a∗(b∗c))∧(∀(a:T).(e∗a=a)∧(a∗e=a))∧(∀(a:T).∃(b:T).(a∗b=e)∧(b∗a=e))
(As usual, the binary operation is written in infix notation.) Technically, an element of this new Group type will not be the triplet (T,∗,e), but the 4-tuple (T,∗,e,p) where p is the witness for isGroup(T,∗,e), but, again, we’ll usually ignore this.
If you now want to make statements about an arbitrary group, you can just say, “let (G,∗,e):Group” and then derive clever things about G.
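As a sketch in Lean 4 (with my own spelling of the axioms as isGroup), the refined Group type looks like this:

```lean
-- The group axioms: associativity, neutral element, inverses.
def isGroup {T : Type} (op : T → T → T) (e : T) : Prop :=
  (∀ a b c, op (op a b) c = op a (op b c)) ∧
  (∀ a, op e a = a ∧ op a e = a) ∧
  (∀ a, ∃ b, op a b = e ∧ op b a = e)

-- A group: a type T, a pair (operation, unit), and a proof of the axioms.
def Group : Type 1 :=
  Σ T : Type, { p : (T → T → T) × T // isGroup p.1 p.2 }
```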
As another example of subtypes, consider the type of continuous functions between X and Y. The plain function type is X→Y, but if we have a predicate iscont(f) which tells us whether function f is continuous, then we can narrow down the type:
Cont(X,Y):≡{f:X→Y|iscont(f)}.
Cont is then a type family indexed by X:U and Y:U. Of course, if the definition of “continuous” is clear from the context, then you can just informally write
Cont(X,Y)≡{f:X→Y|f is continuous}
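In Lean 4 this narrowing is again just a subtype; since defining continuity requires a topology, the predicate is taken as a parameter in this illustrative sketch:

```lean
-- Functions X → Y satisfying a given continuity predicate iscont.
def Cont (X Y : Type) (iscont : (X → Y) → Prop) : Type :=
  { f : X → Y // iscont f }
```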
Axiom of choice
As mentioned before, the law of excluded middle (LEM) is actually implied by the (type-theoretic version of the) axiom of choice. So, let’s talk about it!
In set theory, one common way to state the axiom is this:
For any set X of nonempty sets, there exists a choice function f that is defined on X and maps each set S∈X to an element of that set S.
As always, a set is a predicate in our theory. So, for a set of sets, we need multiple predicates – a family of predicates.[4] We can index those predicates with an index type I. If the predicates are on the type A, then the indexed predicates are given by
ϕ:I→A→Ω,
a function ϕ(i,a) with two parameters (we used currying): an index i:I (to index the predicate) and an element a:A. This may look more like a relation than a predicate, but the first element is really only there for indexing; we could also write it as “ϕi(a)” (or even “a∈ϕi”) instead of ϕ(i,a) (but ϕ(i,a) is more standard, so we’ll use that).
So, ϕ plays the role of the set of sets – X in the above definition. Now, how to ensure the sets are nonempty? We will need to assume ∀(i:I).∃(a:A).ϕ(i,a); for all indices i:I, there exists an a:A such that the predicate holds.
Finally, what is the thing we want to assert? The existence of a function f which takes an index i:I and returns an element of A for which the predicate holds:
(∀(i:I).∃(a:A).ϕ(i,a))⇒(∃(f:I→A).∀(i:I).ϕ(i,f(i)))
Translated: “if all the sets in the family ϕ are non-empty, then a function f exists, which, for all sets in the family, picks an element of the set.”
However, we can make this slightly more general by allowing that the sets don’t all need to come from the same base type A. So, one set could be from N and another from R but there should still be a choice function that picks an element from all of these. In order to allow this, we have to replace the fixed type A with a type-valued function A(i) that may return different types based on the value of i. Then we get:[5]
(∀(i:I).∃(a:A(i)).ϕ(i,a))⇒(∃(f:∏(i:I)A(i)).∀(i:I).ϕ(i,f(i)))
where all the ϕ(i,a)’s are subsingletons. We can see that f needs to belong to a dependent function type in this more general setup.
And that’s the axiom of choice transcribed to type theory! As with the axiom of choice in set theory, for finite index types I this is actually a theorem that can be proved. (Finite types are those types where an equivalence exists between the type and a member of the family of finite types, Fin(n), for some n. We’ll define “equivalence” later on.) The axiom is only needed for infinite I.
So, let’s prove the axiom of choice for the case where I is the unit type 1 (i.e., it’s very much finite), because we can learn something from that. In this case, we don’t really need to index ϕ and so we can just write
(∃(a:A).ϕ(a))⇒(∃(f:1→A).ϕ(f(⋆)))
Now, with “classical” reasoning, this is easy to prove: if there exists an element in A that is in ϕ, then let a be such an element; next, we’ll just define the constant function f(x)≡a; surely we’ll then have ϕ(f(⋆)).
But let’s check how this works with the underlying types:
∥∑a:Aϕ(a)∥→∥∑f:1→Aϕ(f(⋆))∥.
It kind of looks like we are given the pair (a,p) where a:A and p is the witness for ϕ(a), and then it seems like we can again construct f as the constant function λx.a and use p as the witness for ϕ(f(⋆)).
However, we don’t actually get the pair (a,p); we only get the double negation of it: ∥…∥ and we can’t apply DNE to this because the inner type might not be a subsingleton. So what can we do?
To be clear, the “classical” proof above is correct. But in order to understand why it is correct in the underlying type system, we’ll need to understand more about the operation we wrote as ∥…∥. We implemented this with double negation, but there are actually other ways to do this, and so this operation has a more general name: propositional truncation. It is called that because it “truncates” a type down to a subsingleton, and subsingletons are what we use for propositions. We will later get to know other truncations as well.
Propositional truncation in detail
The core properties of propositional truncation are these:
1. If some type A is inhabited, then so is ∥A∥. Furthermore, ∥A∥ is a subsingleton.
2. If A is a subsingleton, then A is inhabited if ∥A∥ is inhabited.
3. If P is a subsingleton and we have a function f:A→P, then there exists a function g:∥A∥→P such that g(~a)=f(a) for all a:A (where ~a denotes the element of ∥A∥ obtained from a:A via property 1).
The third property means that if the inhabitedness of A can prove the proposition P, then the inhabitedness of ∥A∥ is already sufficient to prove it. This is very powerful. It means that if you know some ∥A∥ is inhabited, and you want to prove the proposition P (which is a proper subsingleton), then you may just assume you have an element of A and if that’s enough to prove P, then P also follows from just ∥A∥. This is a way to actually prove things from ∥A∥.
So, let’s convince ourselves that double negation (with LEM/DNE as an axiom) actually satisfies these properties. We already showed properties 1 and 2 when we introduced the notation ∥…∥, so we only need to show property 3. This will be the last real proof in this series.
So, say P is a subsingleton and we have an f:A→P. Our goal is to construct g:∥A∥→P. As a reminder:
∥A∥≡(A→0)→0.
We cannot apply DNE to this because A might not be a subsingleton. In order to construct g, we will first construct h:∥A∥→∥P∥, because we can easily get g from this via property 2 (i.e., just apply DNE to ∥P∥). Expanding the propositional truncation, the type of h is:
h:((A→0)→0)→(P→0)→0.
If you look carefully at that type and also consider the type of f, you can come up with the following:
h(ǎ,p̄):≡ǎ(p̄∘f)
where “p̄∘f” is the function composition of p̄ and f, defined like this:
∘:∏X:U∏Y:U∏Z:U(Y→Z)→(X→Y)→(X→Z)
f∘g:≡λx.f(g(x))
We then get g by composing h with dne:DNE, the double-negation-elimination function (for the type P):
g:≡dneP∘h≡dneP∘(λǎ.(λp̄.ǎ(p̄∘f))).
What’s left to show is that ∀(a:A).g(~a)=f(a). But this is trivial because both g(~a) and f(a) are in P and P is a subsingleton where all elements are equal to each other. So, this proves property 3 for the truncation based on double negation.
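The core of this proof, the function h, typechecks mechanically. Here is the construction in Lean 4, with Empty playing the role of 0 (a sketch; dne itself cannot be defined without the axiom, so only h is given):

```lean
-- From f : A → P, build h : ¬¬A → ¬¬P, encoding ¬X as X → Empty.
-- Given notNotA : (A → Empty) → Empty and notP : P → Empty,
-- the composite notP ∘ f : A → Empty is fed back into notNotA.
def h {A P : Type} (f : A → P) :
    ((A → Empty) → Empty) → ((P → Empty) → Empty) :=
  fun notNotA notP => notNotA (notP ∘ f)
```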
Axiom of choice for a single set
Now that we have a better understanding of propositional truncation, let’s look again at our very finite (single set only) axiom of choice:
∥∑a:Aϕ(a)∥→∥∑f:1→Aϕ(f(⋆))∥
According to property 3 of propositional truncation, we may actually prove this instead:
(∑a:Aϕ(a))→∥∑f:1→Aϕ(f(⋆))∥
This in turn is implied by
(∑a:Aϕ(a))→∑f:1→Aϕ(f(⋆))
because by property 1 of propositional truncation, A always implies ∥A∥.
So, we may use our naïve proof from above after all: From (a,p) we can construct an element of the codomain like this: (λx.a,p). It’s then left to show that p is actually an element of ϕ(f(⋆)). By the rules of function application, f(⋆) is replaced by a, so ϕ(f(⋆))≡ϕ(a), which concludes the proof.
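The untruncated version of this one-set choice principle is a two-line construction in Lean 4, mirroring the proof above (Σ' is Lean’s dependent pair former that allows a Prop in the second component; choiceUnit is my name):

```lean
-- From a witness pair (a, p), build the constant function λ_. a
-- and reuse p: φ (f ()) reduces definitionally to φ a.
def choiceUnit {A : Type} {φ : A → Prop}
    (w : Σ' a : A, φ a) : Σ' f : Unit → A, φ (f ()) :=
  ⟨fun _ => w.1, w.2⟩
```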
In order to prove the axiom of choice for other finite sets, we can use induction over N.
I’ll leave it to you to think about why this proof fails for infinite index types.
Proving LEM/DNE from AC
The proof showing that the axiom of choice implies LEM/DNE is quite complicated so we’ll only discuss the requirements here. In addition to the axiom of choice, the proof also requires propositional extensionality and function extensionality which we’ll discuss in the next article.
One tricky aspect is that we of course can’t assume LEM/DNE for the proof (because that’s what we’re trying to prove), so we can’t use double negation to implement the propositional truncation that is necessary to formulate the axiom of choice. One possible solution to this problem is to assume a propositional truncation operation with the 3 necessary properties as an additional, black-box part of our type theory.
Another solution is to use the following for propositional truncation:
∥A∥:≡∏V:Ω((A→V)→V)
however, this only works if the set Ω exists – the set that indexes all truth values/subsingletons. We mentioned before that Ω is guaranteed to exist under LEM+propositional extensionality, but, again, we can’t assume LEM, so we would instead need to explicitly assume the existence of Ω as an axiom for this encoding to work.
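This encoding can be written in Lean 4, where the impredicative universe Prop plays the role of Ω (a sketch; Trunc, mk, and lift are illustrative names — Lean’s own built-in truncation is Nonempty):

```lean
-- Propositional truncation by quantifying over all propositions V.
def Trunc (A : Type) : Prop := ∀ V : Prop, (A → V) → V

-- Property 1: every a : A yields an element of Trunc A.
def mk {A : Type} (a : A) : Trunc A := fun _ k => k a

-- Property 3: a map into a proposition P factors through Trunc A.
def lift {A : Type} {P : Prop} (f : A → P) : Trunc A → P := fun t => t P f
```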
As we will later encounter other truncations though, I think it makes most sense to simply assume propositional truncation as an additional element of type theory.
Finally, is there some intuition for why AC (axiom of choice) would imply LEM? Well, you could actually view LEM as a sort of mini-AC. AC allows you to pick an element as long as you can prove that there are elements there (i.e., that the subsets are all non-empty). And similarly, LEM/DNE allows you to pick an element of A as long as you can prove that A→0 is not inhabited. In a way, both allow you to conjure up an element out of nothing, as long as you have sufficient reason to believe that an element exists.
In the next and final installment of this series, we will, at last, find out what the deal is with extensionality. We will also discuss the celebrated univalence axiom.
[1] Note that if S is “empty” (i.e., false everywhere), then the intersection of it will be all of A, which might not be what you want. So make sure S isn’t empty!
[3] Why use Fin(2) when you can just use 2? The problem with the “2” notation is that it stops working for higher numbers, like Fin(100). I think “100” would be very confusing as a type.
[4] Why not a predicate of predicates? Because we later want to allow the predicates to live in different base types, which you can only express with a family. And anyway, you can turn a predicate of predicates Ψ:(A→Ω)→Ω easily into a family of predicates; just use the corresponding subtype, I≡{ϕ:A→Ω|Ψ(ϕ)}, as the index type. The family is then given by the function ξ≡(λ(ϕ,p).ϕ) where p is the witness for Ψ(ϕ).
[5] You could call this the “truncated version” of the axiom of choice in type theory because “∃” gets translated to ∥∑…∥. The “untruncated version” with plain sum types ∑ is a theorem in type theory.
This is the fourth entry in my series on type theory. We will introduce a lot of notation that is reminiscent of set theory, to make everyone feel more at home, and then we’ll discuss the axiom of choice.
Subtypes
Our next topic is “subsets” within types. I mentioned in the beginning that we can define “subsets” within types via predicates. This usually works great, but sometimes the “subset” needs to be more type-like and then a predicate won’t do.
As an example, say we have the following predicate on the natural numbers
even(n):≡∃(m:N).(n=2⋅m)with the type even:N→Ω (where Ω represents all truth values). Then in order to claim that e is an even number, we write even(e). If you were really into using the membership symbol ∈, then you could also write this as “e∈even” and as long as you make clear what is meant by that (namely, even(e)), this is well-defined.
However, we can’t do something like “∀(e:even)” with our predicate, because you absolutely need a type there in the quantifier. So what do we do when we want to quantify over all even numbers?
No problem, we’ll just use the the same trick we used to define the subuniverse TruthVal. With a sum type, we can define a type which basically corresponds to the subset defined by the predicate:
E:≡∑n:Neven(n) .What do the elements of E look like? They are pairs where the first element is a natural number n and the second element is a witness element p for the fact that n is even: (n,p). There can be no (n,⋅) in E where n is not even because then there wouldn’t be a witness. Likewise, as long as you can prove even(n) for some n (with witness p), then (n,p) is clearly an element of the type. So, the elements of the type E aren’t directly the elements of the subset of even numbers, but there is a very obvious mapping between the two (namely just throwing away the witness). In fact, we will often pretend that the elements of E are just the elements of the subset defined by the predicate. And now we can quantify over the even numbers: ∀(e:E).
Types that have been constructed from some base type in the way we constructed E here from N (i.e., with ∑ and a predicate) are called subtypes of the base type. Thus, E is a subtype of N. If the base type is a universe U, we’ll say subuniverse instead of subtype, but the mechanism is exactly the same.
As this is a common construct, there is a special notation for it, but we only use it if the expression after the ∑ symbol is a predicate, i.e., a function that returns a truth value/subsingleton, ϕ:A→Ω:
{x:A | ϕ(x)}:≡∑x:Aϕ(x) .This is of course simply set-builder notation. With this, we could have defined the subtype E as
E:≡{n:N | ∃(m:N).n=2⋅m} .Note that in set theory, the axiom of specification is needed to be able to construct subsets from predicates, but in type theory, we can just do it with sum types, as long as we encode logic the way we did. (Another good reason for propositions-as-types.)
Set operations
At this point in the HoTT book, the authors define set operators, like union ∪ or intersection ∩, on these subtypes, because we want to do all kinds of mathematics and those operators are often very useful. For example, here is how they define inclusion, ⊆:
({x:A|ϕ(x)}⊆{x:A|ψ(x)}):⇔(∀(x:A).ϕ(x)⇒ψ(x))So, a subtype is included in another subtype if the underlying predicate of the first subtype implies the predicate of the second subtype. This definition works because subtypes are backed by predicate functions and so we can define these set operations in terms of operations on predicates.
However, defining these set operations as acting on subtypes seems to me like trying to be a bit too clever. One problem is the type signature of the operators. We could write it like this:
⊆:∏A:USubtype(A)×Subtype(A)→Ωbut then how to define Subtype(A) – the type of all subtypes on A? I think the definition would be this
Subtype(A)≡{T:U|∃(ϕ:A→Ω).T={a:A|ϕ(a)}}but this does not look like a nice definition at all. This seems like a hint to me that the subuniverse of subtypes is not a natural concept, and we shouldn’t define functions on this abomination.
The second problem I see is that defining these operations on subtypes risks blurring the distinction between types and objects with set-like semantics. For example, you might accidentally use ⊆ on a type which wasn’t a subtype, because it might sometimes be hard to tell the difference.
For these reasons, I would define set-operators (and relations) like “⊆” on good old predicate functions:
⊆:∏A:U(A→Ω)×(A→Ω)→Ωϕ⊆Aξ :≡ ∀(a:A).ϕ(a)⇒ξ(a)In words: in the domain A, property ϕ is a sub-property of ξ if, for all objects in the domain, ξ is true whenever ϕ is true. This definition is concise and easy to read.
The goal is here to make set-like mathematics (which uses relations and operators like ⊆, ∈, ∩, ∪) less cumbersome when done on top of type theory. Working with these predicates in type theory should feel as natural as working with sets in set theory.
So, let’s define more set operators and relations. We already mentioned “∈” above:
∈:∏A:UA×(A→Ω)→Ωa∈Aϕ :≡ ϕ(a)And as a reminder, we can also write “P(A)” (power set type) instead of “A→Ω” to make the type signature even more concise.
The following operators return a new predicate, so we have to use anonymous functions to define them (I skipped the type signatures and type subscripts this time):
ϕ∩ξ:≡λa.ϕ(a)∧ξ(a)ϕ∪ξ:≡λa.ϕ(a)∨ξ(a)A∖ϕ:≡λa.¬ϕ(a)As is usual with these operators, we’re writing them in infix notation.
But, as nice as working with predicate functions is, if you want to quantify over elements, you still need a (sub-)type! You have two choices here: ∀(a:{a:A|ϕ(a)}).(⋯) or ∀(a:A).ϕ(a)⇒(⋯). Both are formally correct, but we will prefer the latter.
(If I could suggest another notation, it would be this: A|ϕ:≡{a:A|ϕ(a)}, which reduces the notation to only the essential symbols; such that we can write ∀(a:A|ϕ).(⋯), but I’ll try to keep the notation pretty standard here, so I won’t be using this.)
If S:P(P(A)) is a predicate on predicates on A (“a set of subsets”), we can also define arbitrary, non-binary intersection and union on it.[1] For once, I will be pulling out all the stops to make this look like set theory (even though it’s type theory in disguise!):[2]
⋂S:≡λa.[∀(S:P(A)).S∈S⇒a∈S]⋃S:≡λa.[∃(S:P(A)).S∈S∧a∈S]and don’t panic – the square brackets [⋯] mean exactly the same as round brackets (⋯). It was just getting a bit hard to parse with only round ones.
There are other beloved set theory notations that can be “rescued” by giving them new semantics in type theory. For example, if you have a,b,c:A then you could write {a,b,c}A to refer to the subtype {x:A|(x=a)∨(x=b)∨(x=c)}. But I will stop here now.
To summarize again the difference between subtypes and predicates: if you’re working in a fixed base set (like N or R) and you want to use set-like operations, use predicates; if you are in a context where a type is needed or you only expect to do “type-level operations” (like quantifying over elements), use subtypes. In the following, we will see many examples where subtypes are very useful.
Construction of other numbers
Mathematicians don’t only want to work with boring old natural numbers, so with the power set type and subtypes in hand, we can also construct other numbers, like the integers Z and the rational numbers Q. They can be defined as equivalence classes on binary product types like N×N. I’ll be very brief in this section; it’s mostly just to show that it’s possible.
So, let’s first define equivalence classes. If A is some base type and R is a relation on A with type A×A→Ω, then the equivalence classes of R belong to the following type (which is also called the quotient of A by R) :
A/R:≡{S:P(A)|∃(a:A).∀(b:A).R(a,b)⇔b∈S} .(I wrote “b∈S” here to mean “S(b)”.) As you can see, this is a subtype of the power set of A. Each element of A/R is one equivalence class, which is a “subset” of A whose members are related according to R.
For the integers Z, we define a relation on the product N×N with the following definition:
diff((a,b),(x,y)):≡(a+y=x+b):(N×N)×(N×N)→Ωsuch that
Z:≡(N×N)/diffwhere we can choose the canonical forms (n,0) and (0,n) for the equivalence classes, corresponding to +n and −n respectively.
To get from the integers to the rational numbers, we take Z×N and use this relation:
ratio((a,b),(x,y)):≡(a⋅(y+1)=x⋅(b+1)) .The +1 is there to prevent division by zero; we could have also just used the natural numbers without 0 instead. With this we have
Q:≡(Z×N)/ratio .A pair (a,b):Q corresponds to the fraction a/(b+1). The canonical form of a rational number is a fraction that can’t be reduced any further (“fraction in lowest terms”) – every rational number has a unique representation as an irreducible fraction with a positive denominator.
Constructing the real numbers is a bit more involved but it basically works like it usually does; for example via Dedekind cuts, which we can represent by two predicates on Q. So, let L and U be two predicates (aka “subsets”) on Q. Then if isCut(L,U) is a predicate that tells us whether the pair (L,U) is a Dedekind cut (you can look up the definition on Wikipedia), then the (Dedekind) reals are given by the following:
RDedekind:≡{(L,U):P(Q)×P(Q) | isCut(L,U)}
The usual operations and relations for the real numbers (e.g., < and ≤) can then be defined on this type.
We can also define restrictions like R>0, the positive real numbers, by defining a subtype with a predicate like λx.(x>0). The very common “∀ε>0” from calculus can then be written as ∀(ε:R>0), but I suppose you could also write ∀(ε>0:R).
The family of finite types
To round out this section on numbers, I’ll quickly comment on a type family that I’ve already briefly mentioned before: the family of finite types. As a reminder, a family of types is a type-valued function from some index type I into the universe: I→U. A very simple type family B with the index type 2 would be, for example, B(0₂):≡R, B(1₂):≡Q.
The family of finite types is indexed by N and can be defined like this:
Fin:N→U
Fin(n):≡{m:N | m<n}
The essential point is that Fin(n) has exactly n elements. And if A≃B means that types A and B are equivalent/isomorphic (see the next post for a rigorous definition of that), then of course we’ll have[3]
0≃Fin(0)
1≃Fin(1)
2≃Fin(2)
The type family Fin(n) is very useful when you want to iterate over a finite index. For example, instead of
∀(i:N).(i<n)⇒(⋯)
you can just write
∀(i:Fin(n)).(⋯)
The family of finite types is generally very handy for indexing things; for example other type families: B:Fin(10)→U. However, finite types can also be used to index vectors! Now, usually, vectors are realized as tuples, like v:R×R×R (or v:R³), but there are some advantages to realizing vectors as “functions” on finite types, v:Fin(3)→R. Vector entries are then accessed by function application, v(0), where 0:Fin(3). This is how the Dex programming language formalizes arrays, for example. (Though these “array functions” are memoized for performance.)
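Lean happens to have this exact type family built in, as `Fin n`, so the “vectors as functions” idea can be tried out directly (the vector `v` below is a made-up example):

```lean
-- An element of Fin 3: the number 2 together with a proof that 2 < 3.
example : Fin 3 := ⟨2, by decide⟩

-- A length-3 vector as a function on Fin 3, with entries 1.0, 2.0, 3.0.
def v : Fin 3 → Float := fun i => Float.ofNat (i.val + 1)

-- Entries are accessed by function application; the 0 here has type Fin 3.
#eval v 0
```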
The advantage is that it allows you to write functions that work with generic vector sizes while still being type-safe (or rather, shape-safe). For example, a concatenation function concat, which concatenates two vectors of arbitrary length, would have the following type:
concat:(I→R)×(J→R)→((I+J)→R)
where I and J are arbitrary index types and I+J is the binary sum type. If we now pass in two vectors, u:Fin(2)→R and v:Fin(4)→R, then the result has the type (Fin(2)+Fin(4))→R, which isn’t quite Fin(6)→R, but the two types are isomorphic. So, we can read off from the types that the result is a vector of length 6.
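A possible Lean version of `concat`, with `Float` standing in for R and `⊕` for the binary sum type:

```lean
-- Concatenation of two "vectors as functions": index the result with
-- I ⊕ J and dispatch on which summand the index came from.
def concat {I J : Type} (u : I → Float) (v : J → Float) : I ⊕ J → Float
  | .inl i => u i
  | .inr j => v j
```

Applied to `u : Fin 2 → Float` and `v : Fin 4 → Float`, the result has type `Fin 2 ⊕ Fin 4 → Float` – isomorphic, though not identical, to `Fin 6 → Float`, just as in the text.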
More examples of subtypes
When we introduced sum types, we used the type of all groups as an example. The type we gave was
Group ≡ ∑T:U(T×T→T)×T
but we can actually be more precise than that. As this definition is written, an element of Group is just a type with some binary operation and some element, but those do not necessarily satisfy the group axioms! For example, in a group, the binary operation needs to be associative and the special element needs to be a neutral element (with respect to the binary operation), but the type doesn’t guarantee that! To solve this, we can filter for those elements that satisfy the group axioms with set-builder notation:
Group:≡{(T,∗,e):∑T:U(T×T→T)×T | isGroup(T,∗,e)}
where
isGroup(T,∗,e):≡assoc(T,∗)∧neutr(T,∗,e)∧inv(T,∗,e)
assoc(T,∗):≡∀(a:T).∀(b:T).∀(c:T).((a∗b)∗c=a∗(b∗c))
neutr(T,∗,e):≡∀(a:T).((e∗a=a)∧(a∗e=a))
inv(T,∗,e):≡∀(a:T).∃(b:T).((a∗b=e)∧(b∗a=e))
(As usual, the binary operation is written in infix notation.) Technically, an element of this new Group type will not be the triplet (T,∗,e), but the 4-tuple (T,∗,e,p) where p is the witness for isGroup(T,∗,e), but, again, we’ll usually ignore this.
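In Lean, this refined Group type is most naturally written as a structure bundling the triple with the three witnesses (the structure and field names are my own):

```lean
-- A group: a carrier type T, a binary operation, a distinguished element,
-- plus witnesses for associativity, neutrality, and inverses.
structure Grp where
  T     : Type
  op    : T → T → T
  e     : T
  assoc : ∀ a b c : T, op (op a b) c = op a (op b c)
  neutr : ∀ a : T, op e a = a ∧ op a e = a
  inv   : ∀ a : T, ∃ b : T, op a b = e ∧ op b a = e
```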
If you now want to make statements about an arbitrary group, you can just say, “let (G,∗,e):Group” and then derive clever things about G.
As another example of subtypes, consider the type of continuous functions between X and Y. The plain function type is X→Y, but if we have a predicate iscont(f) which tells us whether function f is continuous, then we can narrow down the type:
Cont(X,Y):≡{f:X→Y | iscont(f)}
Cont is then a type family indexed by X:U and Y:U. Of course, if the definition of “continuous” is clear from the context, then you can just informally write
Cont(X,Y)≡{f:X→Y | f is continuous}
Axiom of choice
As mentioned before, the law of excluded middle (LEM) is actually implied by the (type-theoretic version of the) axiom of choice. So, let’s talk about it!
In set theory, one common way to state the axiom is this: for any set X of non-empty sets, there exists a choice function f, defined on X, which maps each set A∈X to one of its elements, f(A)∈A.
As always, a set is a predicate in our theory. So, for a set of sets, we need multiple predicates – a family of predicates.[4] We can index those predicates with an index type I. If the predicates are on the type A, then the indexed predicates are given by
ϕ:I→A→Ω
a function ϕ(i,a) with two parameters (we used currying): an index i:I (to index the predicate) and an element a:A. This may look more like a relation than a predicate, but the first element is really only there for indexing; we could also write it as “ϕᵢ(a)” (or even “a∈ϕᵢ”) instead of ϕ(i,a) (but ϕ(i,a) is more standard, so we’ll use that).
So, ϕ plays the role of the set of sets – X in the above definition. Now, how to ensure the sets are nonempty? We will need to assume ∀(i:I).∃(a:A).ϕ(i,a); for all indices i:I, there exists an a:A such that the predicate holds.
Finally, what is the thing we want to assert? The existence of a function f which takes an index i:I and returns an element of A for which the predicate holds:
(∀(i:I).∃(a:A).ϕ(i,a))⇒(∃(f:I→A).∀(i:I).ϕ(i,f(i)))
Translated: “if all the sets in the family ϕ are non-empty, then a function f exists, which, for all sets in the family, picks an element of the set.”
However, we can make this slightly more general by allowing that the sets don’t all need to come from the same base type A. So, one set could be from N and another from R but there should still be a choice function that picks an element from all of these. In order to allow this, we have to replace the fixed type A with a type-valued function A(i) that may return different types based on the value of i. Then we get:[5]
AC:≡(∀(i:I).∃(a:A(i)).ϕ(i,a))⇒(∃(f:∏i:IA(i)).∀(i:I).ϕ(i,f(i)))
where all the ϕ(i,a)’s are subsingletons. We can see that f needs to belong to a dependent function type in this more general setup.
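The “untruncated version” mentioned in the footnote – with plain sum types in place of ∃ – really is just a theorem. Here is a proof sketch in Lean, where `Σ'` is the sum type and the dependent function type is written `(i : I) → A i` (the name `acUntruncated` is mine):

```lean
-- Untruncated choice: given, for every i, a pair of an element and a
-- witness, split the pairs into a choice function and a proof about it.
def acUntruncated {I : Type} {A : I → Type} {φ : (i : I) → A i → Prop}
    (h : (i : I) → Σ' a : A i, φ i a) :
    Σ' f : (i : I) → A i, ∀ i : I, φ i (f i) :=
  ⟨fun i => (h i).1, fun i => (h i).2⟩
```

The proof is nothing more than reshuffling the pairs – the difficulty of AC only appears once the hypothesis is truncated.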
And that’s the axiom of choice transcribed to type theory! As with the axiom of choice in set theory, for finite index types I this is actually a theorem that can be proved. (Finite types are those types where an equivalence exists between the type and a member of the family of finite types, Fin(n), for some n. We’ll define “equivalence” later on.) The axiom is only needed for infinite I.
So, let’s prove the axiom of choice for the case where I is the unit type 1 (i.e., it’s very much finite), because we can learn something from that. In this case, we don’t really need to index ϕ and so we can just write
(∃(a:A).ϕ(a))⇒(∃(f:1→A).ϕ(f(⋆)))
Now, with “classical” reasoning, this is easy to prove: if there exists an element in A that is in ϕ, then let a be such an element; next, we’ll just define the constant function f(x)≡a; surely we’ll then have ϕ(f(⋆)).
But let’s check how this works with the underlying types:
∥∑a:Aϕ(a)∥→∥∑f:1→Aϕ(f(⋆))∥
It kind of looks like we are given the pair (a,p) where a:A and p is the witness for ϕ(a), and then it seems like we can again construct f as the constant function λx.a and use p as the witness for ϕ(f(⋆)).
However, we don’t actually get the pair (a,p); we only get the double negation of it: ∥…∥ and we can’t apply DNE to this because the inner type might not be a subsingleton. So what can we do?
To be clear, the “classical” proof above is correct. But in order to understand why it is correct in the underlying type system, we’ll need to understand more about the operation we wrote as ∥…∥. We implemented this with double negation, but there are actually other ways to do this, and so this operation has a more general name: propositional truncation. It is called that because it “truncates” a type to a subsingleton – and subsingletons are what we use for propositions. We will later get to know other truncations as well.
Propositional truncation in detail
The core properties of propositional truncation are these:
1. There is always a function A→∥A∥, taking each a to its image ã.
2. ∥A∥ is always a subsingleton, and if A is itself a subsingleton, then we also have a function ∥A∥→A.
3. If P is a subsingleton, then any function f:A→P gives rise to a function g:∥A∥→P with g(ã)=f(a) for all a:A.
The third property means that if the inhabitedness of A can prove the proposition P, then the inhabitedness of ∥A∥ is already sufficient to prove it. This is very powerful. It means that if you know some ∥A∥ is inhabited, and you want to prove the proposition P (which is a proper subsingleton), then you may just assume you have an element of A and if that’s enough to prove P, then P also follows from just ∥A∥. This is a way to actually prove things from ∥A∥.
So, let’s convince ourselves that double negation (with LEM/DNE as an axiom) actually satisfies these properties. We already showed properties 1 and 2 when we introduced the notation ∥…∥, so we only need to show property 3. This will be the last real proof in this series.
So, say P is a subsingleton and we have an f:A→P. Our goal is to construct g:∥A∥→P. As a reminder:
∥A∥≡(A→0)→0
We cannot apply DNE to this because A might not be a subsingleton. In order to construct g, we will first construct h:∥A∥→∥P∥, because we can easily get g from this via property 2 (i.e., just apply DNE to ∥P∥). Expanding the propositional truncation, the type of h is:
h:((A→0)→0)→(P→0)→0
If you look carefully at that type and also consider the type of f, you can come up with the following:
h(ǎ,p̄):≡ǎ(p̄∘f)
where “p̄∘f” is the function composition of p̄ and f, defined like this:
∘:∏X:U∏Y:U∏Z:U(Y→Z)→(X→Y)→(X→Z)
f∘g:≡λx.f(g(x))
We then get g by composing h with dne:DNE, the double-negation-elimination function (for the type P):
g:≡dneP∘h≡dneP∘(λǎ.(λp̄.ǎ(p̄∘f)))
What’s left to show is that ∀(a:A).g(ã)=f(a). But this is trivial because both g(ã) and f(a) are in P and P is a subsingleton where all elements are equal to each other. So, this proves property 3 for the truncation based on double negation.
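The construction of h can be replayed in Lean, with `Empty` playing the role of 0 (`Trunc` is my name for the double-negation truncation, and `na`, `np` stand for ǎ and p̄):

```lean
-- Propositional truncation implemented as double negation.
def Trunc (A : Type) : Type := (A → Empty) → Empty

-- The map h : ∥A∥ → ∥P∥ from the text: h(ǎ, p̄) :≡ ǎ(p̄ ∘ f).
def h {A P : Type} (f : A → P) : Trunc A → Trunc P :=
  fun na np => na (np ∘ f)
```

Composing `h f` with double-negation elimination for P (which is not constructively available, so it gets no Lean definition here) then yields g.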
Axiom of choice for a single set
Now that we have a better understanding of propositional truncation, let’s look again at our very finite (single set only) axiom of choice:
∥∑a:Aϕ(a)∥→∥∑f:1→Aϕ(f(⋆))∥
According to property 3 of propositional truncation, we may actually prove this instead:
(∑a:Aϕ(a))→∥∑f:1→Aϕ(f(⋆))∥
This in turn is implied by
(∑a:Aϕ(a))→∑f:1→Aϕ(f(⋆))
because by property 1 of propositional truncation, A always implies ∥A∥.
So, we may use our naïve proof from above after all: From (a,p) we can construct an element of the codomain like this: (λx.a,p). It’s then left to show that p is actually an element of ϕ(f(⋆)). By the rules of function application, f(⋆) is replaced by a, so ϕ(f(⋆))≡ϕ(a), which concludes the proof.
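In Lean, this untruncated single-set version is again a one-liner (`Unit` is the unit type 1, `()` is ⋆, and the name `acUnit` is mine):

```lean
-- From the pair (a, p) build the constant function λx.a and reuse p:
-- the result is the pair (λx.a, p).
def acUnit {A : Type} {φ : A → Prop}
    (h : Σ' a : A, φ a) : Σ' f : Unit → A, φ (f ()) :=
  ⟨fun _ => h.1, h.2⟩
```

Note that `h.2 : φ h.1` is accepted as the second component because `(fun _ => h.1) ()` reduces to `h.1` – exactly the function-application step in the proof above.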
In order to prove the axiom of choice for the other finite index types, we can use induction over N.
I’ll leave it to you to think about why this proof fails for infinite index types.
Proving LEM/DNE from AC
The proof showing that the axiom of choice implies LEM/DNE is quite complicated so we’ll only discuss the requirements here. In addition to the axiom of choice, the proof also requires propositional extensionality and function extensionality which we’ll discuss in the next article.
One tricky aspect is that we of course can’t assume LEM/DNE for the proof (because that’s what we’re trying to prove), so we can’t use double negation to implement the propositional truncation that is necessary to formulate the axiom of choice. One possible solution to this problem is to assume a propositional truncation operation with the 3 necessary properties as an additional, black-box part of our type theory.
Another solution is to use the following for propositional truncation:
∥A∥:≡∏V:Ω((A→V)→V)
However, this only works if the set Ω exists – the set that indexes all truth values/subsingletons. We mentioned before that Ω is guaranteed to exist under LEM+propositional extensionality, but, again, we can’t assume LEM, so we would instead need to explicitly assume the existence of Ω as an axiom for this encoding to work.
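Lean’s impredicative `Prop` plays exactly the role of Ω here, so this encoding can be written down directly (the names are mine):

```lean
-- Truncation by quantifying over all propositions V that A implies.
def TruncProp (A : Type) : Prop := ∀ V : Prop, (A → V) → V

-- Property 1: A implies its truncation.
theorem truncIntro {A : Type} (a : A) : TruncProp A := fun _ f => f a

-- Property 3: a map into a proposition P factors through the truncation.
theorem truncRec {A : Type} {P : Prop} (f : A → P) : TruncProp A → P :=
  fun t => t P f
```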
As we will later encounter other truncations though, I think it makes most sense to simply assume propositional truncation as an additional element of type theory.
Finally, is there some intuition for why AC (axiom of choice) would imply LEM? Well, you could actually view LEM as a sort of mini-AC. AC allows you to pick an element as long as you can prove that there are elements there (i.e., that the subsets are all non-empty). And similarly, LEM/DNE allows you to pick an element of A as long as you can prove that A→0 is not inhabited. In a way, both allow you to conjure up an element out of nothing, as long as you have sufficient reason to believe that an element exists.
In the next and final installment of this series, we will, at last, find out what the deal is with extensionality. We will also discuss the celebrated univalence axiom.
We can also define these for an indexed family of predicates, S:I→P(A):
⋃i:ISᵢ≡λa.∃(i:I).a∈Sᵢ
and similarly for intersection.
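For reference, both operations written out in Lean, with predicates as `A → Prop` (the names are mine):

```lean
-- Union of an indexed family of predicates: a is in some S i.
def iUnion {I A : Type} (S : I → A → Prop) : A → Prop :=
  fun a => ∃ i : I, S i a

-- Intersection of the family: a is in every S i.
def iInter {I A : Type} (S : I → A → Prop) : A → Prop :=
  fun a => ∀ i : I, S i a
```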
Note that if S is “empty” (i.e., false everywhere), then the intersection of it will be all of A, which might not be what you want. So make sure S isn’t empty!
Why use Fin(2) when you can just use 2? The problem with the “2” notation is that it stops working for higher numbers, like Fin(100). I think “100” would be very confusing as a type.
Why not a predicate of predicates? Because we later want to allow the predicates to live in different base types, which you can only express with a family. And anyway, you can turn a predicate of predicates Ψ:(A→Ω)→Ω easily into a family of predicates; just use the corresponding subtype, I≡{ϕ:A→Ω|Ψ(ϕ)}, as the index type. The family is then given by the function ξ≡(λ(ϕ,p).ϕ) where p is the witness for Ψ(ϕ).
You could call this the “truncated version” of the axiom of choice in type theory because “∃” gets translated to ∥∑…∥. The “untruncated version” with plain sum types ∑ is a theorem in type theory.