This is the third part of a series on type theory. This time we’re developing real, classical logic in type theory. However, before we can get there, we have to talk about equality first; specifically, propositional equality, which is different from judgmental equality.
Equality
Up until now, we’ve always used this weird triple equality “≡” when defining stuff. This is judgmental equality, and, like type judgments, it is a “meta” operation; it can’t be used within propositions. The following unfortunately doesn’t make any sense: “(x≡y)⇒(2x≡2y)”, because “≡” doesn’t return a truth value that our logic can do anything with. This is of course a problem.
Another problem is that, by default, judgmental equality is very strict – especially for functions![1] This is why we might want to expand the concept of equality and assume axioms that declare two things to be equal that weren’t equal before, but, again, we can’t do that with “≡”, because we can’t use it in logic statements, so we can’t formulate axioms about it.
The solution is to define a new equality, which will be a relation defined wholly within our logic.
As you’ll recall, predicates are functions from values to truth values (i.e., true and false). Relations are the same, except that they expect more than one argument (usually two). Examples of relations are: less than (<), greater than (>), subset of (⊂), element of (∈).
But the most important relation of all is probably equality. We could write equality in function notation: Eq(x,y), but everyone writes it in infix notation, x=y, and so will we. But keep in mind that it’s actually a type-valued function with two arguments, even if we’re not writing it like that.
If we’re precise, equality is actually a generic function with respect to the type of its arguments. So, it needs to accept three arguments where the first argument is a type:
=:∏T:UT→T→U.
Just from this type we can tell that only elements of the same type can be equal to each other. In infix notation, we indicate the type argument with a subscript: x=Ty. However, we will often drop it if it’s clear from context. (Just like we can drop type annotations in functional programming languages if the compiler can infer them from context!)
Now, which things should be equal in our new equality? Type theorists have condensed the essence of equality and have come up with the following definition: for all elements a of any type A, a is equal to itself. That is, the following type is inhabited:
∏A:U∏a:A(a=a).
In other words: equality is reflexive. We can then take this as the starting point to prove that other things are equal to other things.
(You might wonder whether this definition will really make our new equality behave like equality should behave. To really convince you of that, we would probably need to consider other possible definitions and compare the results. But I won’t do that here, so you’ll have to take it on faith that this is a good definition.)
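This reflexivity-based definition is exactly how propositional equality is set up in proof assistants. A minimal sketch in Lean 4, using a custom `Eq'` so as not to clash with Lean’s built-in equality:

```lean
-- Propositional equality as an inductive type family: the only way to
-- construct an element of  Eq' a b  is `refl`, which inhabits  Eq' a a.
inductive Eq' {A : Type} : A → A → Prop where
  | refl (a : A) : Eq' a a

-- The type  ∏(A:U) ∏(a:A) (a = a)  from the text is inhabited:
theorem eq_refl_all (A : Type) (a : A) : Eq' a a :=
  Eq'.refl a
```

Everything else about equality will be derived from this single constructor plus the induction principle discussed below.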
The big question is now, when to use this equality and when to use the one from before: “≡”? My short advice: whenever possible, use judgmental equality, ≡, otherwise use propositional equality, =. That’s because judgmental equality is a stronger statement. It means the symbols on the two sides of ≡ are completely interchangeable. In fact, “x≡y” directly implies “x=y”. Proof: x=x is true by definition; now replace one of the x’s with y (we can do this because we have x≡y, which implies complete substitutability); now we have x=y.
However, this might not always be true in the other direction. So, whenever we define something, we use judgmental equality because why use a weaker equality when we can use a stronger one?
But we aren’t done yet with the definition of propositional equality. We already said that one essential characteristic of equality is that all values are equal to themselves. Another characteristic is a sort of “induction principle”: if two things are equal, then everything that’s true about one of them should also be true of the other.
More formally: if ϕ(x) is true (where ϕ is some predicate/property with ϕ:A→U) and x=y is true, then ϕ(y) is also true. This is guaranteed by the following type’s inhabitedness:
∏x,y:A(x=Ay)→ϕ(x)→ϕ(y).
(The existence of this function is still part of the definition; it’s not a theorem. It’s just like how the binary product type A×B came with projection functions.) The type implies that if x=y, then y can be substituted for x in any proposition that we can express in our type system. If ϕ(n) stands for the proposition that n is an even number, then if n=m, m is also an even number. Within our logic, we cannot distinguish between two elements that are propositionally equal (though they may still be distinguishable with judgmental equality).
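In Lean 4, this induction principle is available as the substitution operator `▸`. A sketch of the type from the text (the name `transport` is mine, not standard notation from this article):

```lean
-- The induction principle: given  x = y  and a proof of φ(x), produce a
-- proof of φ(y).  Lean's built-in `▸` performs exactly this substitution.
def transport {A : Type} (φ : A → Prop) {x y : A} (h : x = y) : φ x → φ y :=
  fun hx => h ▸ hx

-- Example instance from the text: if n = m and n is even, then m is even.
example (n m : Nat) (h : n = m) (hn : n % 2 = 0) : m % 2 = 0 :=
  transport (fun k => k % 2 = 0) h hn
```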
We can use this principle to prove that equality is transitive: define ϕ(x)≡(x=Az) where z is arbitrary, and plug this into the induction principle from above:
∏z:A∏x,y:A(x=Ay)→(x=Az)→(y=Az)
Meaning: if x=y and x=z, then we also have y=z. More informally we could have argued: if x=y and y=z, then replace y with z in the first equality and we get x=z. In a similar manner, we can easily prove that propositional equality is symmetric: start with x=x, which is always true, then given x=y, replace the first x with y.
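Both of these little proofs can be carried out mechanically. A self-contained Lean 4 sketch (restating the substitution principle locally so the snippet compiles on its own):

```lean
-- Substitution (equality induction), restated here for self-containedness.
def transport {A : Type} (φ : A → Prop) {x y : A} (h : x = y) : φ x → φ y :=
  fun hx => h ▸ hx

-- Symmetry: take φ(w) := (w = x), start from x = x (rfl), substitute y for x.
theorem symm' {A : Type} {x y : A} (h : x = y) : y = x :=
  transport (fun w => w = x) h rfl

-- Transitivity in the article's form: from x = y and x = z conclude y = z.
theorem trans' {A : Type} {x y z : A} (hxy : x = y) (hxz : x = z) : y = z :=
  transport (fun w => w = z) hxy hxz
```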
As another example, say that A and B are any two types, and let’s say that F(A,B) is the relation that holds true iff there is a function from A to B.[2] Then F(A,A) obviously holds for any A because of the identity function idA. Now, if A=B, replace the second A with B. We get: (A=B)→F(A,B). Thus, equality between two types asserts that there must be a function between them (in both directions actually).
So, this is our definition of (propositional) equality. It mostly behaves like you would expect it to: if two things are equal, they can’t be distinguished. However, this equality is a bit too strict for functions, which is why we’ll introduce the principle of function extensionality later on, but for now let’s just accept it as it is.
Subsingletons for truth values
As a reminder, our goal is now to assume DNE/LEM as an axiom in order to get classical logic:
LEM≡∏A:UA+(A→0).
However, for reasons that I myself don’t fully understand, we’ll get big problems if we just assume DNE/LEM for all types. But luckily, there is no problem if we assume DNE/LEM only for types that have either zero or exactly one element. This includes 0 and 1 but also, for example, the type 1→1 which contains one function: λx.x, and the type 1×1 which contains a single pair: (⋆,⋆). And in terms of types with zero elements, there is 0×0 but also, for example, 0×1.
However, if we only assume DNE/LEM for these types, then all of our logic must only produce types with either zero or exactly one element, otherwise we will get propositions for which DNE/LEM doesn’t hold! This means we’ll have to check whether we can make our predicates and logic operators work with this restricted collection of types. That’s what we’ll do in a second, but let’s take another look at the big picture first.
With hindsight, we can see that when we went from “a proposition-as-type is true if it is the type 1” to “a proposition-as-type is true if it is inhabited at all” in the previous article, we went too far. In order to get to classical logic, we should have said “a proposition-as-type is true if it is basically like 1, i.e., it contains exactly one element”. Indeed, we will see that this new idea will work out splendidly.
Types with either zero or exactly one element are sometimes called (−1)-types, but this is a terrible name, so we’ll call them subsingletons instead, which is actually a term from set theory, but it describes the concept well. (The name comes from the fact that sets with one element are called singletons, but we’re also allowing sets with fewer elements than that, so it’s a sub-singleton.) If we ignore for the moment the question of whether the operator ∏ will work correctly with propositions-as-subsingleton-types, we can define a predicate that tells us whether a given type A is a subsingleton, based on the definition of equality that we conveniently just learned about:
isTruthVal : U→U
isTruthVal(A) :≡ ∏x:A∏y:A(x=Ay).
Meaning that isTruthVal(A) is inhabited if all elements of A are equal to each other. If A is the empty type, this is vacuously true, i.e., Πx:AΠy:A(x=y) will just be our good friend, the empty function. If A contains just one element, a, then isTruthVal(A) is of course inhabited as well; namely by an element of the type “a=a” which, by the definition of equality, is inhabited for any a. We will take the above definition as the official definition of a subsingleton: that all its elements are equal to each other.[3]
I called the above predicate “isTruthVal” because we will be using subsingletons to represent truth values, as mentioned before. Classical logic is a two-valued logic, which means the truth values are true (also denoted by ⊤), and false (denoted ⊥). And so we use subsingletons to encode these truth values: if the type has no elements, it corresponds to ⊥, and if it has a single element, it corresponds to ⊤. Note in particular that any type with exactly one element can represent ⊤ and any type with zero elements can represent ⊥.
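As a Lean 4 sketch (with Empty playing the role of 0 and PUnit the role of 1):

```lean
-- A type is a truth value (subsingleton) iff all its elements are equal.
def isTruthVal (A : Type) : Prop := ∀ x y : A, x = y

-- The empty type qualifies vacuously: any x : Empty can be eliminated.
theorem empty_isTruthVal : isTruthVal Empty :=
  fun x => x.elim

-- The one-element type qualifies: both sides reduce to PUnit.unit.
theorem unit_isTruthVal : isTruthVal PUnit :=
  fun _ _ => rfl
```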
But now we have to address the important question of whether we can still use ∏, ∑, ×, + and “→0” to encode ∀, ∃, ∧, ∨ and ¬, when working with propositions as subsingleton types (so that we can safely apply DNE/LEM to all propositions). We’ll need to make sure that the operations we use will result in subsingleton types if they’re getting subsingleton types as inputs.
Operations on subsingletons
Say that P and Q are subsingleton types. Is P→Q one as well then? Well, it kind of makes sense that there is only one way to map one element to one element. So we’ll assume for the moment that this checks out if both P and Q have a single element (but we’ll revisit this question later when we talk about function extensionality). Conversely, if P is 0, then P→Q can only contain the empty function which is also unique. And if Q is 0 while P is not, then P→Q has no elements at all, which is also fine for a subsingleton type.
How about P×Q? Here, we can also be certain that there can be at most one element; namely (p,q) if both types have an element: p:P, q:Q, and zero elements otherwise.
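The product case can be checked directly: equality of pairs follows componentwise from the equalities in P and Q. A Lean 4 sketch:

```lean
def isTruthVal (A : Type) : Prop := ∀ x y : A, x = y

-- If P and Q are subsingletons, so is P × Q: compare the components.
theorem prod_isTruthVal {P Q : Type}
    (hP : isTruthVal P) (hQ : isTruthVal Q) : isTruthVal (P × Q) := by
  intro a b
  cases a with
  | mk p₁ q₁ =>
    cases b with
    | mk p₂ q₂ => rw [hP p₁ p₂, hQ q₁ q₂]
```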
Now, let’s say we have a predicate, ϕ, such that ϕ(x) is a subsingleton type for any x:X. Then is
∏x:Xϕ(x)
a subsingleton as well? An element of this type can only exist if there is anything to map the x’s to. So, it requires that all the ϕ(x) are inhabited types (and by assumption, they can at most have a single element). Furthermore, it makes intuitive sense that there is only one way to map a given x to a single value, which allows us to conclude that if all the ϕ(x)’s are singletons, then Πx:Xϕ(x) is a singleton as well. As I keep mentioning, we will revisit equality of functions later and make it rigorous, but for now let’s just use the intuitive notion where two functions are equal if all their outputs are equal, so:
∏x:X(g(x)=f(x))
And this is easy to prove here for two functions g,f:Πx:Xϕ(x). We’ll just invoke the fact that for all p,q:ϕ(x), we have p=q (by the assumption that ϕ(x) is a singleton). Thus, g(x)=f(x) for all x.
We now turn our attention to P+Q and it’s pretty clear that there is a problem here. If both P and Q have an element (say, p and q respectively), then P+Q will have two elements: left(p) and right(q). Similarly, if for ϕ(x) there is more than one x for which ϕ(x) has an element (say, y and z:X and ϕ(y)≡ϕ(z)≡1), then ∑x:Xϕ(x) will have more than one element; in this case (y,⋆) and (z,⋆).
But, as I said before, this is good news! We don’t need ∨ and ∃ anyway, because we already have ∧, ∀ and ¬.
The last one to investigate is negation, ¬P, which we encoded as P→0. This is entirely unproblematic because the type P→0 is a subsingleton for any type P, not just subsingletons. This is because the only element this type can ever contain is the empty function. If we again use the intuitive notion of function equality, then it’s easy to show that for g,f:P→0, we have
∏p:P(g(p)=f(p))
if we already know that 0 is a subsingleton. Do we know that 0 is a subsingleton? Well, on the one hand, there are no elements, so we can make the vacuously-true argument mentioned before. On the other hand, if we assume we have two elements, x,y:0, then we’re again in the position to prove anything, including x=y, via the principle of explosion. So either way, we may conclude that 0 is a subsingleton.
By the way, the fact that the type A→0 is a subsingleton for any type A is somewhat intriguing… let’s put a pin in that; it might become relevant later!
And that’s all we need in order to encode classical logic in type theory:
First-order Logic    Type Theory
∀(x:X).ϕ(x)          ∏x:Xϕ(x)
P∧Q                  P×Q
¬P                   P→0
⊤ (true)             1
⊥ (false)            0
Additionally, we’ll assume the following axiom:
DNE−1:≡∏P:U(isTruthVal(P)→((P→0)→0)→P)
That is, if P is a proposition (actually, a subsingleton) and we have a double-negated version of P, then we may conclude P. As you can see, DNE−1 as defined above is a type; assuming DNE−1 as an axiom means assuming it has exactly one element. We can call that element dne (it’s sometimes useful for proofs). The subscript “−1” is there to distinguish it from the more general DNE that holds for all types; the “−1” stems from the other name for subsingletons: (−1)-types.
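Written out in Lean 4 (Empty for 0), the axiom is literally a Π-type that we postulate an inhabitant of:

```lean
def isTruthVal (A : Type) : Prop := ∀ x y : A, x = y

-- DNE restricted to subsingletons.  "Assuming the axiom" means assuming
-- we are given some element  dne : DNE₁.  (The type lives one universe
-- level up because it quantifies over all types P : Type.)
def DNE₁ : Type 1 :=
  ∀ P : Type, isTruthVal P → ((P → Empty) → Empty) → P
```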
Encoding implication
Previously, we’ve used the function type P→Q to encode P⇒Q (“P implies Q”), but in classical logic, implication is usually not a fundamental operator; rather, it’s defined via “¬” and “∨”, or, if you only have “¬” and “∧”, you can define “P⇒Q” as ¬(P∧¬Q), which is “(P×(Q→0))→0” when encoded as a type. However, it turns out that the type “P→Q” is logically equivalent to this.
(“Logically equivalent” here means that one type is inhabited if and only if the other type is inhabited.)
Let’s quickly prove this! We will again explicitly construct the witnesses.
Say, we have a function f:P→Q, and the function g that we want to construct takes a pair (p,¯q):P×(Q→0) as input, then we can define:
g : (P×(Q→0))→0
g((p,¯q)) :≡ ¯q(f(p))
This proves one direction of the equivalence because we were able to construct g from f.
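This direction needs no axioms at all. In Lean 4 (Empty for 0):

```lean
-- From f : P → Q, build the refutation of P × (Q → 0) from the text:
-- feed f(p) to the refutation of Q.
def g {P Q : Type} (f : P → Q) : P × (Q → Empty) → Empty :=
  fun (p, nq) => nq (f p)
```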
For the opposite direction, we are given a function g of the above type, and want to construct a function f which takes a p:P as input. First, we’ll construct a helper function ˇq (which will be used inside f, so we can assume we have p:P):
ˇq : (Q→0)→0
ˇq(¯q) :≡ g((p,¯q))
where ¯q is of type Q→0 as before. Now, through double-negation elimination, we can turn ˇq into an element of Q, which is exactly what we wanted to construct. If we use “dne” to denote the function that inhabits DNE, then we can define f:P→Q as
f(p):≡dneQ(λ¯q.g((p,¯q)))
Thus, thanks to DNE, we were able to construct f from g. So, “P→Q” and “(P×(Q→0))→0” are logically equivalent under DNE and we can safely encode implication as P→Q.
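The second direction can be sketched in Lean 4 as well, taking double-negation elimination for Q as a hypothesis `dne` rather than as a global axiom:

```lean
-- Given dne for Q and g refuting P × (Q → 0), recover f : P → Q exactly
-- as in the text:  f(p) :≡ dne (λ q̄. g (p, q̄)).
def impOfNeg {P Q : Type}
    (dne : ((Q → Empty) → Empty) → Q)
    (g : P × (Q → Empty) → Empty) : P → Q :=
  fun p => dne (fun nq => g (p, nq))
```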
Encoding disjunction
Another somewhat interesting question is what the types look like that you get when negating ∏ and × in order to encode ∃ and ∨. That is, we know that we can get P∨Q from ¬(¬P∧¬Q) in classical logic, but what does that look like as types?
Well, it turns out that if we use × and “→0” to encode P∨Q, then the result is logically equivalent to (P+Q→0)→0, i.e., double-negated P+Q. We will prove that in a second, but let me first address a concern you might have here: if we have double-negated P+Q, doesn’t that mean we get P+Q by our DNE axiom? No! Because P+Q is not (in general) a subsingleton, so our DNE−1 does not apply.
OK, let’s prove that
((P→0)×(Q→0))→0
(i.e., ¬(¬P∧¬Q)) is logically equivalent to
(P+Q→0)→0
which would mean that we can encode the logical statement “P∨Q” as (P+Q→0)→0. First we’ll prove the direction top to bottom. So, we are given f:((P→0)×(Q→0))→0, and the function h that we have to construct takes g:(P+Q)→0 as its argument. The solution is then:
h(g) :≡ f((λp.g(left(p)), λq.g(right(q))))
That is, g refutes P+Q, so λp.g(left(p)) refutes P and λq.g(right(q)) refutes Q; pairing these two refutations gives exactly the input that f expects.
For the opposite direction, we are given h:(P+Q→0)→0, and the function f that we want to construct takes the pair (¯p,¯q):(P→0)×(Q→0) as an argument. We’ll again need a helper function k:P+Q→0 which is defined within f so we have access to ¯p and ¯q. k is defined by case analysis (aka pattern matching):
k(left(p)):≡¯p(p)k(right(q)):≡¯q(q)
With this, we can construct f:
f((¯p,¯q)):≡h(k)
where k depends on (¯p,¯q) but I’m using loose notation so it’s not indicated.
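Both directions of this proof can be written out in Lean 4 (Empty for 0, ⊕ for +); note that no DNE/LEM is used anywhere:

```lean
-- ¬(¬P ∧ ¬Q) implies ¬¬(P + Q): build the pair of refutations from g.
def fwd {P Q : Type} (f : (P → Empty) × (Q → Empty) → Empty) :
    (P ⊕ Q → Empty) → Empty :=
  fun g => f (fun p => g (Sum.inl p), fun q => g (Sum.inr q))

-- ¬¬(P + Q) implies ¬(¬P ∧ ¬Q): the helper k is the match on P ⊕ Q.
def bwd {P Q : Type} (h : (P ⊕ Q → Empty) → Empty) :
    (P → Empty) × (Q → Empty) → Empty :=
  fun (np, nq) => h (fun s =>
    match s with
    | Sum.inl p => np p
    | Sum.inr q => nq q)
```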
So, we have proved that ((P→0)×(Q→0))→0 and (P+Q→0)→0 are logically equivalent, and we didn’t even need DNE/LEM for it; this is just always true in type theory. We noted before that A→0 is a subsingleton for any type A, so one way to interpret this result is to say that the double negation turns a non-subsingleton type into a subsingleton while preserving the semantics of it. In other words: P+Q had the problem that it might contain two elements, but (P+Q→0)→0 doesn’t have that problem but still sort of means the same thing (at least if we have DNE; though again, we can’t actually apply DNE here). So it seems double negation might be a general strategy for turning arbitrary types into subsingletons. This is in fact such a useful trick that it has its own notation:
∥A∥≡(A→0)→0
We always have A→∥A∥ for any type A, but we only have ∥A∥→A if A is a subsingleton (provided we have assumed DNE−1). That is, ∥A∥ and A are equivalent for subsingletons in the presence of DNE.
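The map A→∥A∥ is just function application flipped around. A Lean 4 sketch of the double-negation “squash” (named `Squash'` to avoid clashing with Lean’s built-in Squash):

```lean
-- ∥A∥ encoded as double negation (Empty for 0).
def Squash' (A : Type) : Type := (A → Empty) → Empty

-- A → ∥A∥ holds for every type: given a : A, refute any refutation of A.
def squashIntro {A : Type} (a : A) : Squash' A :=
  fun na => na a
```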
We can show something similar for Σ. We know that ∃(x:X).ϕ(x) is equivalent to ¬∀(x:X).¬ϕ(x) in classical logic. This identity can be used to derive that
∥∑x:Xϕ(x)∥
can be used to encode ∃(x:X).ϕ(x). But this derivation is left as an exercise to the reader.
This means we can extend our translation table:
First-order Logic    Type Theory
P⇒Q                  P→Q
P⇔Q                  (P→Q)×(Q→P)
∃(x:X).ϕ(x)          ∥∑x:Xϕ(x)∥
P∨Q                  ∥P+Q∥
This is now the final scheme; this is the correct way to encode classical first-order logic in type theory. It is fully compatible with DNE/LEM.
As a final note on this: if you look back at our proof that DNE implies LEM, you can see that we applied DNE to the following type:
((B+(B→0))→0)→0
which you may think wasn’t allowed after all, because a union type is not a subsingleton. However, in this case, the inner union type is a subsingleton because the two halves of the union are mutually exclusive (at least if our whole theory is consistent), so this was valid. In general, ∥P+Q∥ is equivalent to P+Q if P and Q are mutually exclusive.
At this point, we should verify that with the general rules of type theory, this encoding of first-order logic actually has all the rules of inference, like modus ponens, existential generalization and conjunction introduction. I won’t go through them here to prove them all, but if you’re interested, see this page.
The subuniverse of truth values
Now that we have set up logic, we will use the familiar logic symbols instead of Π, Σ, etc. So, when constructing propositions and functions that output truth values (like predicates and relations), we will use ∃, ∀, ∨, ∧, ¬, ⇒ and ⇔. The tables above define how these symbols get translated to the underlying operators on types. We can, for example, express LEM now as
LEM≡∀(P:U).P∨¬P.
However, this is the version of LEM that applies to all types and not just subsingletons. The actual version we use is this
LEM−1≡∀(P:U).isTruthVal(P)⇒(P∨¬P)
but wouldn’t it be nicer if we could quantify over the “subuniverse” of all subsingletons directly? This would be especially useful for specifying the type of predicate functions.
Until now we’ve always written the type of a predicate on X as X→U, where U is a universe of types. However, the codomain is not actually all types in the universe; just the subsingletons.
We can define the subuniverse of all subsingletons through the following trick:
TruthValU :≡ ∑A:U isTruthVal(A).
Let’s think about what that means. Elements of this sum type are pairs where the first element is some type A and the second element is a witness p to the proposition isTruthVal(A). So, such a pair can only exist if the type representing the proposition is inhabited, meaning the proposition is true. Thus, the elements of TruthValU are exactly those types A:U which are subsingletons!
Technically, the elements of TruthValU are pairs of subsingletons and witnesses, (A,p), but we will usually pretend the elements are just the types. In any case, there is an obvious mapping from (A,p) to A.
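This Σ-type is essentially a subtype of the universe. In Lean 4 one would write it as follows (the names are mine, not standard):

```lean
def isTruthVal (A : Type) : Prop := ∀ x y : A, x = y

-- The subuniverse of truth values: pairs of a type A and a witness that
-- A is a subsingleton.  It lives one universe level above its members.
def TruthVal : Type 1 := { A : Type // isTruthVal A }

-- The "obvious mapping" from a pair (A, p) back to the type A.
def carrier (T : TruthVal) : Type := T.val
```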
With this definition, the type of a predicate function on a base type A is ϕ:A→TruthValU. I called this subuniverse “TruthVal” because it encompasses all truth values; that is, it encompasses all types that can encode falseness and truthiness.
However, with our current axioms, we can only define the type of all subsingletons for a particular universe U. That is, TruthValU doesn’t give us all the “falsy” and “truthy” types out there but only the ones in universe U. And this is a problem if we want to talk about power sets.
Power sets and propositional extensionality
The power set type P(A) is of course the type of all “subsets” of some type A. In our theory, subsets are actually predicates – functions from A to truth values – so the power set type is the function type A→TruthValU. And here we see the problem. If this is supposed to be a real power set, then TruthValU better contain all “falsy” and “truthy” types. But it actually only contains those from universe U!
If there were some “truthy” type T (i.e., a type with exactly one element) that’s not in TruthValU but we have a function ϕ that maps some a:A to this type, ϕ(a)≡T, then this function ϕ wouldn’t be captured by the function type A→TruthValU, but it would still be a predicate, so “A→TruthValU” wouldn’t actually encompass all predicates.
So, we need some way to get the type of all the possible truth values.
One radical solution to this problem is to drastically reduce the space of truth values. After all, what we want in the end is a two-valued logic, so do we really need a distinction beyond “has no elements” and “has exactly one element”? What TruthVal contains is the empty type 0 and the unit type 1 and then an infinite number of types that are combinations of these: 1×1, 0×1, 0→1, 1×1×0, … But the distinction between these types (beyond whether they’re “truthy” or “falsy”) carries basically no information. So, we might say, “let’s just declare all truthy types to be equal to each other, and the same for all falsy types, and be done with it”. This would correspond to the principle of propositional extensionality, and it would solve our problem, because then we would know that any subsingleton is equal to either 0 or 1 (which are both present in all universes).
We’ll talk more about propositional extensionality when we discuss extensionality in general. But for now, let’s assume something a bit weaker (in order to be able to define power sets): there exists a type Ω:U together with an Ω-indexed family of subsingletons (i.e., a function f:Ω→U where every f(ω) is a subsingleton) such that this family (the range of f) contains all subsingletons up to equality. Thus, the elements of Ω constitute an index for all falsy and truthy types. And as it is easy to turn an element of Ω into a truth value (via the function f), we can define the power set type to be
P(A):≡(A→Ω)
and this will be guaranteed to encompass all predicates on A because Ω indexes all “falsy” and “truthy” types.
The good news is that the existence of Ω follows from LEM and propositional extensionality (mentioned above). So, we won’t actually have to assume a new axiom for Ω if we adopt propositional extensionality (which I think we do want to adopt, but more on that later). A possible (and obvious) choice for Ω under LEM and propositional extensionality is in fact 2 – the type that has exactly two elements.
This is kind of funny because that means we have come full circle: in the beginning we decided against using 2 to represent truth values – instead using types that are either inhabited or not – but now we find out that what we ended up constructing is equivalent to using 2 for representing truth values (assuming LEM and propositional extensionality). Still, the concept of propositions-as-types allowed us to reuse all the type operators like ∏ and × to construct first-order logic, so we didn’t have to separately define all these operators for our encoding of logic.
With propositional extensionality, we may thus define the power set type like this instead:
P(A):≡(A→2)
which looks quite similar to how it works in set theory.
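With Ω :≡ 2 (Lean’s Bool), the power set is literally the type of boolean-valued functions. A small sketch:

```lean
-- The power set of A as the type of predicates A → 2.
def PowerSet (A : Type) : Type := A → Bool

-- Example subset: the even natural numbers, as an element of PowerSet Nat.
def evens : PowerSet Nat := fun n => n % 2 == 0

#eval evens 4  -- true
#eval evens 7  -- false
```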
Now, you might wonder whether we need a new type forming rule/axiom for power sets analogous to the axiom of power set in ZF set theory. But we don’t, because our theory already has a type forming rule for function types and as we saw, we can express the power set type as a function type. It’s actually also true in set theory that you can get the existence of power sets from the existence of function sets (set of all functions from X to Y), but it’s usually done the other way around when axiomatizing set theory.
(Another thing that worried me about this definition of power sets is that in practice I can only access those elements of P(A) for which I can formulate a predicate function in the lambda calculus formalism, which surely doesn’t cover all possible mappings from A to 2, if A is infinite. However, the same problem actually also exists in set theory: in practice I can only access those elements of P(A) for which I have a first-order logic formula proving that they are subsets of A. Still, the meaning we give to P(A) in set theory is that it really does contain all subsets, even if we can’t access all of them. We declare the same thing for function types in type theory.)
With the type Ω, our version of logic actually looks more like second-order logic than first-order, because we can quantify over predicates too:
∀(ϕ:A→Ω).Φ(ϕ)   and   ∃(ϕ:A→Ω).Ψ(ϕ).
In fact, we could also quantify over predicates of predicates (i.e., ϕ′:(A→Ω)→Ω) and so on; meaning it’s actually a higher-order logic. (Though, higher-order logic is of course not any more powerful than second-order logic.) One peculiar aspect of type theory is, however, that we don’t just have a single domain of discourse; instead, we have a lot of different types, and the quantifiers ∀ and ∃ only range over one type at a time. This is similar to many-sorted logics where quantifiers always range over only one “sort”. But I don’t know exactly how to compare the “strengths” of these formal systems. I assume they’re all about equally powerful? Type theory is in any case certainly sufficiently expressive to be affected by Gödel’s incompleteness theorems.
This concludes the third post in this series. Next time, we’ll talk about how to do set-like mathematics in type theory and we’ll discuss the axiom of choice.
This is the third part of a series on type theory. This time we’re developing real, classical logic in type theory. However, before we can get there, we have to talk about equality first; specifically, propositional equality, which is different from judgmental equality.
Equality
Up until now, we’ve always used this weird triple equality “≡” when defining stuff. This is judgmental equality, and, like type judgments, it is a “meta” operation; it can’t be used within propositions. The following unfortunately doesn’t make any sense: “(x≡y)⇒(2x≡2y)”, because “≡” doesn’t return a truth value that our logic can do anything with. This is of course a problem.
Another problem is that, by default, judgmental equality is very strict – especially for functions![1] This is why we might want to expand the concept of equality and assume axioms that declare two things to be equal that weren’t equal before, but, again, we can’t do that with “≡”, because we can’t use it in logic statements, so we can’t formulate axioms about it.
The solution is to define a new equality, which will be a relation defined wholly within our logic.
As you’ll recall, predicates are functions from values to truth values (i.e., true and false). Relations are the same, except that they expect more than one argument (usually two). Examples of relations are: less than (<), greater than (>), subset of (⊂), element of (∈).
But the most important relation of all is probably equality. We could write equality in function notation: Eq(x,y), but everyone writes it in infix notation, x=y, and so will we. But keep in mind that it’s actually a type-valued function with two arguments, even if we’re not writing it like that.
If we’re precise, equality is actually a generic function with respect to the type of its arguments. So, it needs to accept three arguments where the first argument is a type:
=:∏T:UT→T→U .Just from this type we can tell that only elements of the same type can be equal to each other. In infix notation, we indicate the type argument with a subscript: x=Ty. However, we will often drop it if it’s clear from context. (Just like we can drop type annotations in functional programing languages if the compiler can infer them from context!)
Now, which things should be equal in our new equality? Type theorists have condensed the essence of equality and have come up with the following definition: for all elements a of any type A, a is equal to itself. That is, the following type is inhabited:
∏A:U∏a:A(a=a) .In other words: equality is reflexive. We can then take this as the starting point to prove that other things are equal to other things.
(You might wonder whether this definition will really make our new equality behave like equality should behave. To really convince you of that, we would probably need to consider other possible definitions and compare the results. But I won’t do that here, so you’ll have to take it on faith that this is a good definition.)
The big question is now, when to use this equality and when to use the one from before: “≡”? My short advice: whenever possible, use judgmental equality, ≡, otherwise use propositional equality, =. That’s because judgmental equality is a stronger statement. It means the symbols on the two sides of ≡ are completely interchangeable. In fact, “x≡y” directly implies “x=y”. Proof: x=x is true by definition; now replace one of the x’s with y (we can do this because we have x≡y, which implies complete substitutability) now we have x=y.
However, this might not always be true in the other direction. So, whenever we define something, we use judgmental equality because why use a weaker equality when we can use a stronger one?
But we aren’t done yet with the definition of propositional equality. We already said that one essential characteristic of equality is that all values are equal to themselves. Another characteristic is a sort of “induction principle”: if two things are equal, then everything that’s true about one of them should also be true of the other.
More formally: if ϕ(x) is true (where ϕ is some predicate/property with ϕ:A→U) and x=y is true, then ϕ(y) is also true. This is guaranteed by the following type’s inhabitedness:
∏x,y:A(x=Ay)→ϕ(x)→ϕ(y) .(The existence of this function is still part of the definition; it’s not a theorem. It’s just like how the binary product type A×B came with projection functions.) The type implies that if x=y, then y can be substituted for x in any proposition that we can express in our type system. If ϕ(n) stands for the proposition that n is an even number, then if n=m, m is also an even number. Within our logic, we cannot distinguish between two elements that are propositionally equal (though they may still be distinguishable with judgmental equality).
We can use this principle to prove that equality is transitive: define ϕ(x)≡(x=Az) where z is arbitrary, and plug this into the induction principle from above:
∏z:A∏x,y:A(x=Ay)→(x=Az)→(y=Az)Meaning: if x=y and x=z, then we also have y=z. More informally we could have argued: if x=y and y=z, then replace y with z in the first equality and we get x=z. In a similar manner, we can easily prove that propositional equality is symmetric: start with x=x, which is always true, then given x=y, replace the first x with y.
As another example, say that A and B are any two types, and let’s say that F(A,B) is the relation that holds true iff there is a function from A to B.[2] Then F(A,A) obviously holds for any A because of the identity function idA. Now, if A=B, replace the second A with B. We get: (A=B)→F(A,B). Thus, equality between two types asserts that there must be a function between them (in both directions, actually).
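In Lean 4 this particular instance of the principle is built in as `cast`:

```lean
-- A proof that two types are equal yields an actual function between them:
example {A B : Type} (h : A = B) : A → B := cast h
```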
So, this is our definition of (propositional) equality. It mostly behaves like you would expect it to: if two things are equal, they can’t be distinguished. However, this equality is a bit too strict for functions, which is why we’ll introduce the principle of function extensionality later on, but for now let’s just accept it as it is.
Subsingletons for truth values
As a reminder, our goal is now to assume DNE/LEM as an axiom in order to get classical logic:
LEM ≡ ∏A:U A+(A→0).

However, for reasons that I myself don’t fully understand, we’ll get big problems if we just assume DNE/LEM for all types. But luckily, there is no problem if we assume DNE/LEM only for types that have either zero or exactly one element. This includes 0 and 1 but also, for example, the type 1→1, which contains one function: λx.x, and the type 1×1, which contains a single pair: (⋆,⋆). And in terms of types with zero elements, there is 0×0 but also, for example, 0×1.
However, if we only assume DNE/LEM for these types, then all of our logic must only produce types with either zero or exactly one element, otherwise we will get propositions for which DNE/LEM doesn’t hold! This means we’ll have to check whether we can make our predicates and logic operators work with this restricted collection of types. That’s what we’ll do in a second, but let’s take another look at the big picture first.
With hindsight, we can see that when we went from “a proposition-as-type is true if it is the type 1” to “a proposition-as-type is true if it is inhabited at all” in the previous article, we went too far. In order to get to classical logic, we should have said “a proposition-as-type is true if it is basically like 1, i.e., it contains exactly one element”. Indeed, we will see that this new idea will work out splendidly.
Types with either zero or exactly one element are sometimes called (−1)-types, but this is a terrible name, so we’ll call them subsingletons instead, which is actually a term from set theory, but it describes the concept well. (The name comes from the fact that sets with one element are called singletons, but we’re also allowing sets with fewer elements than that, so it’s a sub-singleton.) If we ignore for the moment the question of whether the operator ∏ will work correctly with propositions-as-subsingleton-types, we can define a predicate that tells us whether a given type A is a subsingleton, based on the definition of equality that we conveniently just learned about:
isTruthVal : U → U
isTruthVal(A) :≡ ∏x:A ∏y:A (x=Ay).

Meaning that isTruthVal(A) is inhabited if all elements of A are equal to each other. If A is the empty type, this is vacuously true, i.e., Πx:AΠy:A(x=y) will just be our good friend, the empty function. If A contains just one element, a, then isTruthVal(A) is of course inhabited as well; namely by an element of the type “a=a”, which, by the definition of equality, is inhabited for any a. We will take the above definition as the official definition of a subsingleton: all its elements are equal to each other.[3]
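As a sketch in Lean 4 (where the empty type is called `Empty` and the unit type `Unit`), the definition and the two base cases look like this:

```lean
-- A subsingleton: all elements are equal to each other.
def isTruthVal (A : Type) : Prop := ∀ x y : A, x = y

-- The empty type: vacuously true (the "empty function" case).
example : isTruthVal Empty := fun x => x.elim
-- The unit type: its single element is equal to itself.
example : isTruthVal Unit := fun _ _ => rfl
```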
I called the above predicate “isTruthVal” because we will be using subsingletons to represent truth values, as mentioned before. Classical logic is a two-valued logic, which means the truth values are true (also denoted by ⊤), and false (denoted ⊥). And so we use subsingletons to encode these truth values: if the type has no elements, it corresponds to ⊥, and if it has a single element, it corresponds to ⊤. Note in particular that any type with exactly one element can represent ⊤ and any type with zero elements can represent ⊥.
But now we have to address the important question of whether we can still use ∏, ∑, ×, + and “→0” to encode ∀, ∃, ∧, ∨ and ¬, when working with propositions as subsingleton types (so that we can safely apply DNE/LEM to all propositions). We’ll need to make sure that the operations we use will result in subsingleton types if they’re getting subsingleton types as inputs.
Operations on subsingletons
Say that P and Q are subsingleton types. Is P→Q one as well then? Well, it kind of makes sense that there is only one way to map one element to one element. So we’ll assume for the moment that this checks out if both P and Q have a single element (but we’ll revisit this question later when we talk about function extensionality). If instead P is 0, then P→Q can only contain the empty function, which is also unique. And if Q is 0 while P is not, then P→Q has no elements at all, which is also fine for a subsingleton type.
How about P×Q? Here, we can also be certain that there can be at most one element; namely (p,q) if both types have an element: p:P, q:Q, and zero elements otherwise.
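The product case is easy to make precise; here is a Lean 4 sketch (the statement “is a subsingleton” is inlined as “all elements are equal”):

```lean
-- If P and Q are subsingletons, so is P × Q: both components of any
-- two pairs must be equal, hence the pairs themselves are equal.
theorem prod_subsingleton {P Q : Type}
    (hp : ∀ x y : P, x = y) (hq : ∀ x y : Q, x = y) :
    ∀ a b : P × Q, a = b :=
  fun ⟨p₁, q₁⟩ ⟨p₂, q₂⟩ => by rw [hp p₁ p₂, hq q₁ q₂]
```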
Now, let’s say we have a predicate, ϕ, such that ϕ(x) is a subsingleton type for any x:X. Then is
∏x:X ϕ(x)

a subsingleton as well? An element of this type can only exist if there is anything to map the x’s to. So, it requires that all the ϕ(x) are inhabited types (and by assumption, they can have at most a single element). Furthermore, it makes intuitive sense that there is only one way to map a given x to a single value, which allows us to conclude that if all the ϕ(x)’s are singletons, then Πx:Xϕ(x) is a singleton as well. As I keep mentioning, we will revisit equality of functions later and make it rigorous, but for now let’s just use the intuitive notion where two functions are equal if all their outputs are equal, so:
∏x:X (g(x)=f(x))

And this is easy to prove here for two functions g,f:Πx:Xϕ(x). We’ll just invoke the fact that for all p,q:ϕ(x), we have p=q (by the assumption that ϕ(x) is a singleton). Thus, g(x)=f(x) for all x.
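In Lean 4, where function extensionality is available as `funext` (we’ll discuss extensionality later; here we simply use it), the argument is a one-liner:

```lean
-- A Π-type over a family of subsingletons is itself a subsingleton:
-- any two functions agree at every input, so by funext they are equal.
theorem pi_subsingleton {X : Type} {φ : X → Type}
    (h : ∀ x, ∀ p q : φ x, p = q) :
    ∀ f g : (∀ x, φ x), f = g :=
  fun f g => funext fun x => h x (f x) (g x)
```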
We now turn our attention to P+Q and it’s pretty clear that there is a problem here. If both P and Q have an element (say, p and q respectively), then P+Q will have two elements: left(p) and right(q). Similarly, if for ϕ(x) there is more than one x for which ϕ(x) has an element (say, y and z:X and ϕ(y)≡ϕ(z)≡1), then ∑x:Xϕ(x) will have more than one element; in this case (y,⋆) and (z,⋆).
But, as I said before, this is good news! We don’t need ∨ and ∃ anyway, because we already have ∧, ∀ and ¬.
The last one to investigate is negation, ¬P, which we encoded as P→0. This is entirely unproblematic because the type P→0 is a subsingleton for any type P, not just subsingletons. This is because the only element this type can ever contain is the empty function. If we again use the intuitive notion of function equality, then it’s easy to show that for g,f:P→0, we have
∏p:P (g(p)=f(p))

if we already know that 0 is a subsingleton. Do we know that 0 is a subsingleton? Well, on the one hand, there are no elements, so we can make the vacuously-true argument mentioned before. On the other hand, if we assume we have two elements, x,y:0, then we’re again in the position to prove anything, including x=y, via the principle of explosion. So either way, we may conclude that 0 is a subsingleton.
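The same argument in Lean 4 (again leaning on `funext`, and using `Empty` for 0):

```lean
-- ¬P, encoded as P → Empty, is a subsingleton for *any* type P:
-- at every input p the outputs live in Empty, so explosion equates them.
theorem neg_subsingleton (P : Type) : ∀ f g : (P → Empty), f = g :=
  fun f g => funext fun p => (f p).elim
```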
By the way, the fact that the type A→0 is a subsingleton for any type A is somewhat intriguing… let’s put a pin in that; it might become relevant later!
And that’s all we need in order to encode classical logic in type theory:
Additionally, we’ll assume the following axiom:
DNE−1 :≡ ∏P:U (isTruthVal(P) → ((P→0)→0) → P)

That is, if P is a proposition (actually, a subsingleton) and we have a double-negated version of P, then we may conclude P. As you can see, DNE−1 as defined above is a type; assuming DNE−1 as an axiom means assuming it has exactly one element. We can call that element dne (it’s sometimes useful for proofs). The subscript “−1” is there to distinguish it from the more general DNE that holds for all types; the “−1” stems from the other name for subsingletons: (−1)-types.
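In a proof assistant, “assuming an axiom” literally means postulating an inhabitant of that type. A Lean 4 sketch (this is not part of Lean’s core theory; we just declare it):

```lean
-- Double-negation elimination, restricted to subsingletons,
-- postulated as an axiom exactly as DNE₋₁ in the text:
axiom dne {P : Type} (h : ∀ x y : P, x = y) :
  ((P → Empty) → Empty) → P
```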
Encoding implication
Previously, we’ve used the function type P→Q to encode P⇒Q (“P implies Q”), but in classical logic, implication is usually not a fundamental operator; rather, it’s defined via “¬” and “∨”, or, if you only have “¬” and “∧”, you can define “P⇒Q” as ¬(P∧¬Q), which is “(P×(Q→0))→0” when encoded as a type. However, it turns out that the type “P→Q” is logically equivalent to this.
(“Logically equivalent” here means that one type is inhabited if and only if the other type is inhabited.)
Let’s quickly prove this! We will again explicitly construct the witnesses.
Say we have a function f:P→Q. The function g that we want to construct takes a pair (p,¯q):P×(Q→0) as input, so we can define:
g : (P×(Q→0)) → 0
g((p,¯q)) :≡ ¯q(f(p))

This proves one direction of the equivalence because we were able to construct g from f.
For the opposite direction, we are given a function g of the above type, and want to construct a function f which takes a p:P as input. First, we’ll construct a helper function ˇq (which will be used inside f, so we can assume we have p:P):
ˇq : (Q→0) → 0
ˇq(¯q) :≡ g((p,¯q))

where ¯q is of type Q→0 as before. Now, through double-negation elimination, we can turn ˇq into an element of Q, which is exactly what we wanted to construct. If we use “dne” to denote the function that inhabits DNE, then we can define f:P→Q as
f(p) :≡ dneQ(λ¯q.g((p,¯q)))

Thus, thanks to DNE, we were able to construct f from g. So, “P→Q” and “(P×(Q→0))→0” are logically equivalent under DNE and we can safely encode implication as P→Q.
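Both directions can be written out in Lean 4. The axiom-free direction is a plain definition; the converse postulates DNE (for brevity, the subsingleton side condition is dropped here):

```lean
-- P → Q gives ¬(P ∧ ¬Q); no axioms needed.
def toNegForm {P Q : Type} (f : P → Q) : (P × (Q → Empty)) → Empty :=
  fun pq => pq.2 (f pq.1)

-- The converse needs double-negation elimination, postulated as an axiom
-- (the restriction to subsingletons is omitted in this sketch):
axiom dne {Q : Type} : ((Q → Empty) → Empty) → Q

def fromNegForm {P Q : Type} (g : (P × (Q → Empty)) → Empty) : P → Q :=
  fun p => dne (fun nq => g (p, nq))
```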
Encoding disjunction
Another somewhat interesting question is what the types look like that you get when negating ∏ and × in order to encode ∃ and ∨. That is, we know that we can get P∨Q from ¬(¬P∧¬Q) in classical logic, but what does that look like as types?
Well, it turns out that if we use × and “→0” to encode P∨Q, then the result is logically equivalent to (P+Q→0)→0, i.e., double-negated P+Q. We will prove that in a second, but let me first address a concern you might have here: if we have double-negated P+Q, doesn’t that mean we get P+Q by our DNE axiom? No! Because P+Q is not (in general) a subsingleton, so our DNE−1 does not apply.
OK, let’s prove that
((P→0)×(Q→0)) → 0

(i.e., ¬(¬P∧¬Q)) is logically equivalent to
(P+Q→0) → 0

which would mean that we can encode the logical statement “P∨Q” as (P+Q→0)→0. First we’ll prove the direction top to bottom. So, we are given f:((P→0)×(Q→0))→0, and the function h that we have to construct takes g:(P+Q)→0 as its argument. The solution is then:
h : (P+Q→0) → 0
h(g) :≡ f((λp.g(left(p)), λq.g(right(q))))

which proves we can get h from f.
For the opposite direction, we are given h:(P+Q→0)→0, and the function f that we want to construct takes the pair (¯p,¯q):(P→0)×(Q→0) as an argument. We’ll again need a helper function k:P+Q→0 which is defined within f so we have access to ¯p and ¯q. k is defined by case analysis (aka pattern matching):
k(left(p)) :≡ ¯p(p)
k(right(q)) :≡ ¯q(q)

With this, we can construct f:
f((¯p,¯q)) :≡ h(k)

where k depends on (¯p,¯q), but I’m using loose notation, so it’s not indicated.
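Both directions go through in Lean 4 without any axioms (Lean writes the sum type P+Q as `P ⊕ Q` with constructors `Sum.inl`/`Sum.inr` in place of left/right):

```lean
-- From ¬(¬P ∧ ¬Q) to ¬¬(P ⊕ Q):
def fwd {P Q : Type} (f : ((P → Empty) × (Q → Empty)) → Empty) :
    ((P ⊕ Q) → Empty) → Empty :=
  fun g => f (fun p => g (Sum.inl p), fun q => g (Sum.inr q))

-- And back, with the helper k written as a case analysis:
def bwd {P Q : Type} (h : ((P ⊕ Q) → Empty) → Empty) :
    ((P → Empty) × (Q → Empty)) → Empty :=
  fun pq => h (fun s => match s with
    | Sum.inl p => pq.1 p
    | Sum.inr q => pq.2 q)
```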
So, we have proved that ((P→0)×(Q→0))→0 and (P+Q→0)→0 are logically equivalent, and we didn’t even need DNE/LEM for it; this is just always true in type theory. We noted before that A→0 is a subsingleton for any type A, so one way to interpret this result is to say that the double negation turns a non-subsingleton type into a subsingleton while preserving the semantics of it. In other words: P+Q had the problem that it might contain two elements, but (P+Q→0)→0 doesn’t have that problem but still sort of means the same thing (at least if we have DNE; though again, we can’t actually apply DNE here). So it seems double negation might be a general strategy for turning arbitrary types into subsingletons. This is in fact such a useful trick that it has its own notation:
∥A∥ ≡ (A→0) → 0

We always have A→∥A∥ for any type A, but we only have ∥A∥→A if A is a subsingleton (provided we have assumed DNE−1). That is, ∥A∥ and A are equivalent for subsingletons in the presence of DNE.
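The easy half of that claim, A→∥A∥, is a two-line Lean 4 exercise:

```lean
-- Double negation as a "truncation" operator on types:
def trunc (A : Type) : Type := (A → Empty) → Empty

-- Any element of A yields an element of ‖A‖:
def truncIntro {A : Type} (a : A) : trunc A := fun na => na a
```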
We can show something similar for Σ. We know that ∃(x:X).ϕ(x) is equivalent to ¬∀(x:X).¬ϕ(x) in classical logic. This identity can be used to derive that
∥∑x:X ϕ(x)∥

can be used to encode ∃(x:X).ϕ(x). But this derivation is left as an exercise to the reader.
This means we can extend our translation table:
This is now the final scheme; this is the correct way to encode classical first-order logic in type theory. It is fully compatible with DNE/LEM.
As a final note on this: if you look back at our proof that DNE implies LEM, you can see that we applied DNE to the following type:
((B+(B→0))→0)→0which you may think wasn’t allowed after all, because a union type is not a subsingleton. However, in this case, the inner union type is a subsingleton because the two halves of the union are mutually exclusive (at least if our whole theory is consistent), so this was valid. In general, ∥P+Q∥ is equivalent to P+Q if P and Q are mutually exclusive.
At this point, we should verify that with the general rules of type theory, this encoding of first-order logic actually has all the rules of inference, like modus ponens, existential generalization and conjunction introduction. I won’t go through them here to prove them all, but if you’re interested, see this page.
The subuniverse of truth values
Now that we have set up logic, we will use the familiar logic symbols instead of Π, Σ, etc. So, when constructing propositions and functions that output truth values (like predicates and relations), we will use ∃, ∀, ∨, ∧, ¬, ⇒ and ⇔. The tables above define how these symbols get translated to the underlying operators on types. We can, for example, express LEM now as
LEM ≡ ∀(P:U). P∨¬P.

However, this is the version of LEM that applies to all types and not just subsingletons. The actual version we use is this:
LEM−1 ≡ ∀(P:U). isTruthVal(P) ⇒ (P∨¬P)

But wouldn’t it be nicer if we could quantify over the “subuniverse” of all subsingletons directly? This would be especially useful for specifying the type of predicate functions.
Until now we’ve always written the type of a predicate on X as X→U, where U is a universe of types. However, the codomain is not actually all types in the universe; just the subsingletons.
We can define the subuniverse of all subsingletons through the following trick:
TruthValU :≡ ∑A:U isTruthVal(A).

Let’s think about what that means. Elements of this sum type are pairs where the first element is some type A and the second element is a witness p to the proposition isTruthVal(A). So, such a pair can only exist if the type representing the proposition is inhabited, meaning the proposition is true. Thus, the elements of TruthValU are exactly those types A:U which are subsingletons!
Technically, the elements of TruthValU are pairs of subsingletons and witnesses, (A,p), but we will usually pretend the elements are just the types. In any case, there is an obvious mapping from (A,p) to A.
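This “pair of a type and a witness” construction can be sketched in Lean 4 using subtype notation (Lean’s analogue of a Σ-type whose second component is a proposition):

```lean
-- The subuniverse of subsingletons: a type together with a witness
-- that all of its elements are equal. Lives one universe level up.
def TruthVal : Type 1 := { A : Type // ∀ x y : A, x = y }

-- An example element: the unit type with its (trivial) witness.
example : TruthVal := ⟨Unit, fun _ _ => rfl⟩

-- The "obvious mapping" from the pair (A, p) back to the type A itself:
def carrier (T : TruthVal) : Type := T.val
```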
With this definition, the type of a predicate function on a base type A is ϕ:A→TruthValU. I called this subuniverse “TruthVal” because it encompasses all truth values; that is, it encompasses all types that can encode falseness and truthiness.
However, with our current axioms, we can only define the type of all subsingletons for a particular universe U. That is, TruthValU doesn’t give us all the “falsy” and “truthy” types out there but only the ones in universe U. And this is a problem if we want to talk about power sets.
Power sets and propositional extensionality
The power set type P(A) is of course the type of all “subsets” of some type A. In our theory, subsets are actually predicates – functions from A to truth values – so the power set type is the function type A→TruthValU. And here we see the problem. If this is supposed to be a real power set, then TruthValU better contain all “falsy” and “truthy” types. But it actually only contains those from universe U!
If there were some “truthy” type T (i.e., a type with exactly one element) that’s not in TruthValU but we have a function ϕ that maps some a:A to this type, ϕ(a)≡T, then this function ϕ wouldn’t be captured by the function type A→TruthValU, but it would still be a predicate, so “A→TruthValU” wouldn’t actually encompass all predicates.
So, we need some way to get the type of all the possible truth values.
One radical solution to this problem is to drastically reduce the space of truth values. After all, what we want in the end is a two-valued logic, so do we really need a distinction beyond “has no elements” and “has exactly one element”? What TruthVal contains is the empty type 0 and the unit type 1 and then an infinite number of types that are combinations of these: 1×1, 0×1, 0→1, 1×1×0, … But the distinction between these types (beyond whether they’re “truthy” or “falsy”) carries basically no information. So, we might say, “let’s just declare all truthy types to be equal to each other, and the same for all falsy types, and be done with it”. This would correspond to the principle of propositional extensionality, and it would solve our problem, because then we’d know that any subsingleton is equal to either 0 or 1 (which are both present in all universes).
We’ll talk more about propositional extensionality when talking about extensionality in general. But for now, let’s assume something a bit weaker (in order to be able to define power sets): there exists a type Ω:U together with an Ω-indexed family of subsingletons (i.e., a function f:Ω→U where every f(ω) is a subsingleton) such that this family (the range of f) contains all subsingletons up to equality. Thus, the elements of Ω constitute an index for all falsy and truthy types. And as it is easy to turn an element of Ω into a truth value (via the function f), we can define the power set type to be
P(A) :≡ (A→Ω)

and this will be guaranteed to encompass all predicates on A because Ω indexes all “falsy” and “truthy” types.
The good news is that the existence of Ω follows from LEM and propositional extensionality (mentioned above). So, we won’t actually have to assume a new axiom for Ω if we adopt propositional extensionality (which I think we do want to but more on that later). A possible (and obvious) choice for Ω under LEM and propositional extensionality is in fact 2 – the type that has exactly two elements.
This is kind of funny because that means we have come full circle: in the beginning we decided against using 2 to represent truth values – instead using types that are either inhabited or not – but now we find out that what we ended up constructing is equivalent to using 2 for representing truth values (assuming LEM and propositional extensionality). Still, the concept of propositions-as-types allowed us to reuse all the type operators like ∏ and × to construct first-order logic, so we didn’t have to separately define all these operators for our encoding of logic.
With propositional extensionality, we may thus define the power set type like this instead:
P(A) :≡ (A→2)

which looks quite similar to how it works in set theory.
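Lean 4 takes essentially this route with its two-element `Bool` type; a sketch of a “subset” as a characteristic function:

```lean
-- The power "set" of A as the type of predicates into a two-element type:
def PowerSet (A : Type) : Type := A → Bool

-- The subset of even natural numbers, as its characteristic function:
def evens : PowerSet Nat := fun n => n % 2 == 0

example : evens 4 = true := rfl
```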
Now, you might wonder whether we need a new type forming rule/axiom for power sets analogous to the axiom of power set in ZF set theory. But we don’t, because our theory already has a type forming rule for function types and as we saw, we can express the power set type as a function type. It’s actually also true in set theory that you can get the existence of power sets from the existence of function sets (set of all functions from X to Y), but it’s usually done the other way around when axiomatizing set theory.
(Another thing that worried me about this definition of power sets is that in practice I can only access those elements of P(A) for which I can formulate a predicate function in the lambda calculus formalism, which surely doesn’t cover all possible mappings from A to 2, if A is infinite. However, the same problem actually also exists in set theory: in practice I can only access those elements of P(A) for which I have a first-order logic formula proving that they are subsets of A. Still, the meaning we give to P(A) in set theory is that it really does contain all subsets, even if we can’t access all of them. We declare the same thing for function types in type theory.)
With the type Ω, our version of logic actually looks more like second-order logic than first-order, because we can quantify over predicates too:
∀(ϕ:A→Ω). Φ(ϕ)  and  ∃(ϕ:A→Ω). Ψ(ϕ).

In fact, we could also quantify over predicates of predicates (i.e., ϕ′:(A→Ω)→Ω) and so on; meaning it’s actually a higher-order logic. (Though, higher-order logic is of course not any more powerful than second-order logic.) One peculiar aspect of type theory is, however, that we don’t just have a single domain of discourse; instead, we have a lot of different types, and the quantifiers ∀ and ∃ only range over one type at a time. This is similar to many-sorted logics, where quantifiers always range over only one “sort”. But I don’t know exactly how to compare the “strengths” of these formal systems. I assume they’re all about equally powerful? Type theory is in any case certainly sufficiently expressive to be affected by Gödel’s incompleteness theorems.
This concludes the third post in this series. Next time, we’ll talk about how to do set-like mathematics in type theory and we’ll discuss the axiom of choice.
Two functions that return exactly the same values for all inputs might not be judgmentally equal. More on that later.
We could express that simply as F(A,B):≡(A→B). F will be inhabited if there is a function f:A→B.
This might seem like a somewhat strange definition, but the advantage is that this definition generalizes to 0-types, 1-types, and so on.