# What do functions do to paths?

Let’s say we have a function $f : A \rightarrow B$, and also $x, y : A$, from which we can form the type $x=_Ay$. If we have a witness of this equality, i.e. a path, what should our function do to it? Intuitively, we would like to be able to say that we can create a witness for $f(x) =_B f(y)$ i.e. applying the same function to both sides of an equality should take you to another equality. Or in other words, functions should be truth-preserving. Another way of thinking about it is that functions should be continuous: they preserve paths. Using path induction, we can show that this is indeed the case.

Slightly more complicated is the case where we have a dependent function $f : \prod_{x : A}B(x)$ for a type family $B : A \rightarrow \mathcal U$. The problem is that even if $p : x=_A y$, the types $B(x)$ and $B(y)$ can still be different. Luckily, the path $p : x=_Ay$ can give us a function $p_*: B(x) \rightarrow B(y)$. This is called the transport of the path $p$. If we need to specify $B$ explicitly, we may also write this as $\mathsf{transport}^B(p, -) : B(x) \rightarrow B(y)$.

Notice that we can also look at $p_*^{-1} : B(y) \rightarrow B(x)$. Thinking about types as propositions, it would make sense to say that together, these show that $B(x)$ is equivalent to $B(y)$. This ends up being right in this case! But we need to also specify what properties the paths should have in an equivalence, which will require the notion of a homotopy, which will come later!

So from a topological point of view, what is going on here? Something to do with fibrations, which are apparently a generalization of a fiber bundle. According to Hatcher, it’s a map which has the homotopy lifting property with respect to all spaces. I assume that Lemma 2.3.2 is essentially this property. I really want to wrap my head around this, some I’m going to dive into a specific example: the Hopf fibration.

oooh pretty..

The idea is that we can describe a 3-sphere in terms of circles and a sphere. Every point on the sphere corresponds to a circle. The video gives a good sense of what this correspondence is like, but how does this match with a 3-sphere? Well each of these circles is a circle on the 3-sphere. Locally, it just looks like the product space of $S_1$ and $S_2$, but its global structure is different. The circles are the “fibers” and the sphere is the “bundle” of the fiber bundle. The map $p : S_3 \rightarrow S_2$ is the fibration. Ah! So when we say that the type family $P : A \rightarrow \mathcal U$ is a fibration  with base space $A$, we mean that we can think of $P(x)$ as the fibers, with the total space $\sum_{x:A}P(x)$. The fibration itself is then $\mathsf{pr_1} : \sum_{x:A}P(x) \rightarrow A$. So for the Hopf fibration, $A$ would be the sphere, and $P : A \rightarrow \mathcal U$ would be the map taking each point of the sphere to a circle. The collection of all these circles is $\sum_{x:A}P(x)$, which we can map back down to the sphere with the first projection map. It’s starting to make sense!

To really be a fibration, we need it to have the Path Lifting property. If we have a path $p: x=y$ in the base space $A$, and a point in the fiber above the starting point $x$, i.e. $u : P(x)$, then we can lift $p$ to a path in the total space starting at $u$. It’s probably good to note that this is a constructive proof in HoTT.

Another term that I needed to look up was section. A dependent function $f : \prod_{x:A}P(x)$ is called the section of the fibration $P$. Basically, it’s just giving you a map from the base space to the total space – so a section will give you a collection of particular points from each fiber. Like a cross-section!

Anyway, we still need to show how to apply a dependent function. Our function $f:\prod_{x:A}P(X)$ is a section of $P$. Ideally, we want this to lift a path $p: x=y$ up to a path in the total space. To make sure that this happens, we will use the transport function to lift the path up, and then specify a path in $P(y)$ which connects the end of the transported path to the correct point. The function that specifies this path in $P(y)$ from our function $f$ is called $\mathsf{apd}_f : \prod_{p: x=y}\left(p_*(f(x)) =_{P(y)} f(y)\right)$. While this function has a different type from ap, it satisfies the expected consistency relations with it.

# HoTT in Coq

It seems like I would be missing the point of HoTT if I didn’t learn how to use it in a proof assistant. And it also seemed like it would make things more fun! So I decided to get and learn the HoTT’s computational implementation in Coq.

The installation went relatively smoothly, except that I had to patch one of the configuration files before it would compile (which has since been patched in the official version). I also got Proof General to use it with Emacs. I was impressed by how easy this was to install, even after reading that it was easy to install.

Once everything was working, I poked around and looked at all the theory files. There were some things I recognized, but also a lot of things that were foreign. I found these video tutorials for Coq by Andrej Bauer, which were very helpful. I tried following along, but I didn’t get very far because the firstorder command didn’t work in the HoTT version! But they were still very helpful for showing the way around Coq.

By this time, I was itching to get started proving things with Coq. So first, I tried looking for some of the theorems and definitions from the book in the theory files that came with my version of Coq. I quickly found that the relevant files were Overture.v and PathGroupoids.v. Somewhat confusingly, it looks like Notation is used for definitions, and Definition is used for lemmas. Well ok, whatever works. Once I got a feel for the notation, I tried writing out the type for some of the lemmas, and checking to see if I got it right.

Definition ap_pp {A B : Type} (f : A -> B) {x y z: A} (p : x = y) (q : y = z) :
ap f (p @ q) = (ap f p) @ (ap f q). 

Yay, it works! Now to prove it. (I’ll talk more about ap next time.) Looking at the proofs in the PathGroupoids.v file, it looked like the eliminator could be invoked with destruct. So now everything thing was replaced with identity paths, good good. It looked pretty trivial after this… and after using reflexivity, everything was proved! These two commands got me surprisingly far, and helped cement how induction worked in my head.

I was curious to see what would happen for the $p\cdot p = p$ example from last time. First let’s set things up…

Definition why_not_pp_p {A : Type} {x : A} (p : x = x):
p @ p = p.

*****************************
1 subgoals, subgoal 1 (ID 158)

A : Type
x : A
p : x = x
============================
p @ p = p


Now let’s try destroying p!

Proof.
> destruct p.

******************************
Toplevel input, characters 0-10:
Error: Cannot instantiate metavariable P of type
"forall a : A, x0 = a -> Type" with abstraction
"fun (x0 : A) (p : x0 = x0) => p @ p = p" of incompatible type
"forall x0 : A, x0 = x0 -> Type".


Good! Looks like me and Coq caught the same error. So everything works as expected.

Playing with Coq is actually pretty fun, and it is a very convenient way of writing out (and verifying) my proofs to some of the exercises! Also, if I have a hard time understanding a new concept, writing it out in Coq helps clarify things for me – usually since it forces me to make sure all the types match up, which is usually where my confusion lies.

# Basic properties of identity types

Reflexivity is built into our specification of the identity type, but we really expect it to be symmetric and transitive as well. So section 2.1 of the HoTT book starts out by establishing these. The proofs were straightforward at a formal level, but intuitively, it felt like they shouldn’t work! I got that all you needed to do was essentially check it for the reflexive case, and then it would work – but that just didn’t feel like enough! It seemed like you should be able to take any path $p: x = x$, compose it with itself, and showed that the composition was equal to $p$, since that would be true if you replaced $p$ with the identity path. I also expected this to be false, since I know that the fundamental group of the circle is $\mathbb Z$. Clearly, I wasn’t understanding something!

So let’s work this out carefully, and see what happens. To set things up, take $a : A$, and $p : a =_A a$. We want to try showing that $p \cdot p = p$ is true, i.e. has an element. So encoding this as a type, we set $C : \prod_{x,y : A} (x =_A y) \rightarrow \mathcal U$ to take $x, y : A$ and $p : x =_A y$ to $C(x,y,p) :\equiv p\cdot p = p$. Now, we need to make sure this is true in the reflexive case, i.e. we need $c: \prod_{x : A} C(x,x,\mathsf{refl}_x)$. Expanding this out, if we are given $x : A$, we want to be able to construct an element of type $\mathsf{refl}_x \cdot \mathsf{refl}_x = \mathsf{refl}_x$. Now $\mathsf{refl}_x \cdot \mathsf{refl}_x$ is judgementally equal to $\mathsf{refl}_x$, and so this type is inhabited by the reflexive object for it. So we can take $c:\equiv \mathsf{refl}_{\mathsf{refl}_x}$. Hence by induction, the statement is true! … right??

Did you catch the mistake? Look at $C$ more carefully. We assumed that for any $x, y : A$, and $p : x =_A y$, that $p \cdot p = p$ was a meaningful type. But is it? Let’s unfold the concatenation a bit. We expect $p\cdot p$ to be a specific object of type $x=_A y$, which comes from a function defined by Lemma 2.1.2. So in this case, the function we need should have type $(x=_A y)\rightarrow (x =_A y) \rightarrow (x =_A y)$. But this doesn’t represent transitivity, so it isn’t actually concatenation anymore in general – only when $y$ is judgementally equal to $x$! So it is not the case that for every $x, y : A$ and $p : x =_A y$, that $C(x,y,p):\equiv (p\cdot p = p)$, because $p\cdot p$ doesn’t always make sense! At least not for an arbitrary type $A$. So we can’t use path induction to prove a statement like this unless we know the type $A$ only has one element, in which case $x$ is judgementally equal to $y$.

While I was able to catch the mistake this time, it sure seems like the sort of thing that would be really easy to miss. The only reason I even caught it was because I already knew that there were fundamental groups which were non-trivial, and that these paths were supposed to have the same structure. It would be nice if there was a way to automatically catch these sorts of errors…

# Identity types

Here (section 1.12 of the HoTT book) is where things start to become more novel. We have a type $x =_A y$ for equalities between $x, y :A$, which corresponds to the proposition that $x = y$. An object of this type means the equality is true – but there is more information here! We can think of the different objects of this type as being different ways we can identify our $x$ and $y$. Or more vividly, as a path between $x$ and $y$. Each path is a witness of the veracity of the equality, that is, a proof. This is a really compelling idea, but I think it will take some time for it to sink in at an intuitive level.

Remember that we have a general pattern for specifying a type. Since identity types are a more subtle concept than the other types we have seen so far, let’s write this all out carefully. If you still feel confused about the distinction between elimination and computation (as I was when I wrote the first draft of this post), you should read my post where I explore this issue.  Anyway, here we go!

1. Formation rule. Given a type $A : \mathcal U$, and $a, b : A$, we can form the type $(a =_A b) :\mathcal U$ in the same universe. So this will give us a comparison of any two objects of a given type.
2. Introduction rule. To make this type worthy of being called the identity type, we need an introduction rule that captures this idea properly. Intuitively, it seems like we should be able to make an element of $(a =_A b)$ (i.e. show that $a = b$ is true) if we know that $a$ and $b$ are the same. So our introduction rule shall be a dependent function $\mathsf {refl} : \prod_{a : A} (a =_A a)$. You can think of this as saying that for every object $a$ of type $A$, $a = a$ is true. Or in other words, this comparison that we are defining has the property that it is reflexive (which is why we call this function $\mathsf {refl}$). Note that we get the reflexive element for $(a =_A b)$ if we have a judgemental equality $a:\equiv b$. That’s an important part of the difference between judgemental and propositional equalities in HoTT.
3. Elimination rules. But just because our comparison is reflexive doesn’t mean it is a good notion of equality. We also need to know what we can do with an equality. Well, intuitively, if we know two things are equal, we can substitute one with the other, and expect all the same things to be true. So how do we encode this as a rule? Let’s think of $C : A \rightarrow \mathcal U$ as being a predicate over elements of type $A$, i.e. a statement about an object $a$ of type $A$, which may or may not be true depending on which specific $a$ we have. Alright, so since we are trying to define equality here, we want this to have the property if $x =_A y$, and $C(x)$ is true, then $C(y)$ has to be true as well. Or in other words, we want to guarantee there is a function $\displaystyle\mathsf{rec}_{=_A} : \prod_{x y: A}\prod_{p:x=_A y}C(x)\rightarrow C(y)$. Viola! This is the recursor for the identity type. That’s pretty cool. But we can dream larger!! These aren’t mere propositions! We have all this potential extra structure lying at our feet, the different paths between two of our elements of $A$. These aren’t just witnesses of truth! So instead of looking at predicates over elements of type $A$, let’s look at something that can depend on the path as well, $C : \prod_{x, y:A}(x=_A y) \rightarrow \mathcal U$. But now we risk showing a bit too much – we can think of this as a predicate depending on two elements $x, y$ of $A$, and a path $p : x =_A y$ between them, and it might be the case that there are no witnesses of $C(x,x,\mathsf{refl}_x)$, maybe $C$ is just constantly the empty type. But equality is reflexive! So we want to make sure that whatever we are talking about, it is something which is true in the reflexive case. Hence, we’ll also require a function $c:\prod_{x :A}C(x,x,\mathsf{refl}_x)$. So once we know that $C$ is reflexive, we want to guarantee that it’s still true if we switch out $x$ with something that is equal to it. Formally, we require a function $\displaystyle\mathsf{ind}_{=_A}: \prod_{C: \prod_{x, y:A}(x=_A y) \rightarrow\mathcal U}\prod_{x:A}C(x,x,\mathsf{refl}_x)\rightarrow\prod_{x, y:A}\prod_{p: x=_A y}C(x,y,p).$ This, is path induction. This is what we can always get from an equality.
4. Computation rules. Well we have these awesome recursor and inductor objects, but we aren’t really done until we’ve defined how they act on objects defined by our introduction rule. Fortunately, this is a lot easier to do than determining the elimination rules. For the recursor, this means we need to specify for $x:A$, $\mathsf{rec}_{=_A}(x,x,\mathsf{refl}_x)$. Remember, this is basically saying that in the case when we have the identity path $\mathsf{refl}_x$, and a predicate over elements of $A$, which we call $C$, we have that $C(x)$ is true whenever $C(x)$ is true – kind of a trivial statement. This is represented by a function from $C(x)$ to $C(x)$, which is what we need to specify. So the most natural choice is just to have this be the identity function for $C(x)$. So we say that for $x:A$, the recursor can be computed for the reflexive case as $\mathsf{rec}_{=_A}(x,x,\mathsf{refl}_x):\equiv \mathsf{id}_{C(x)}.$ We do pretty much the same thing in the inductive case. Rather conveniently, we required a function $c:\prod_{x : A}C(x,x,\mathsf{refl}_x)$, which we can think of as saying that for all $x : A$, the reflexive case of our statement $C$ is true. So given $x : A$, and the reflexive path for $x$, we want to show that $C(x,x,\mathsf{refl}_x)$ is inhabited with a specific object. Naturally, we can just choose $c(x)$ as such an object! So we say that $\displaystyle\mathsf{ind}_{=_A}(C,c,x,x,\mathsf{refl}_x):\equiv c(x).$ That takes care of all the computations we need!
5. Uniqueness principle. I don’t quite understand the uniqueness principle yet, but luckily, the identity type doesn’t need one to specify it! If I’m understanding things correctly, its absence is significant, since it ultimately allows there to be a richer homotopy structure – I think the univalence axiom might even play into this. But I don’t really know much about that stuff yet, so take this speculation with a grain of salt.

So that fully specifies the identity type. Great! I’ll briefly mention that there is also an equivalent induction for it called based path induction, where you stick with a certain base element the whole time for one end of all your paths, which can make things more convenient sometimes. Oh yeah, and you can make a disequality type by taking the negation of an identity type.

One thing to keep in mind about identity types is that we only specify the computation rule in the reflexive case. So if we want to prove a statement which depends on something with an identity type, we only have to show that it is true in the reflexive case, since that is enough to show the right inductor exists (with the little $c$ function), and it’s the only case where it is forced to be a certain thing. We’ll see a lot more of this in chapter 2, so don’t worry!

# Elimination vs. Compuation – dissolving my confusion

Previously, I was confused about  how the inductor for the type $A\times B$ could be well defined, since we aren’t assuming that every object of that type has the form $(a, b).$ Though my resolution was basically correct, I didn’t realize until later when I was trying to understand identity types that my understanding of the issue was not complete. It wasn’t until after I had waded through appendix A.2 that I finally got this.

An elimination rule is only saying that given an object of our new type, along with some other things, we can create something of an already known type. This doesn’t tell us what exactly it is though – we need a computation rule to get that. And the computation rule only tells us how to compute this when we created it through an introduction rule.

For example, let’s look at the product type again. Our introduction rule says we can construct an object of type $A\times B$ from an object $a : A$ and an object $b : B$, with $(a, b) : A\times B$. We also have an elimination rule which tells us that given a type family $C : A\times B\rightarrow \mathcal U$, and a function $g: \prod_{x: A}\prod_{y:B}C((x,y))$, we have our inductor for the type, which is an object $\mathsf {ind}_{A\times B}$ of type $\displaystyle\prod_{C:A\times B\rightarrow\mathcal U}\left(\prod_{x: A}\prod_{y:B}C((x,y))\right)\rightarrow\prod_{x:A\times B}C(x)$. The computation rule then says that when we have an object of type $A\times B$ which we constructed using our introduction rule,  then we can define what the inductor does in this case. So we can say that for $a : A$ and $b : B$,  $\mathsf {ind}_{A\times B}(C,g, (a,b)):\equiv g(a)(b)$.

I also said last time that the recursor was not just a baby version of the inductor, since I thought you needed the recursor to define the inductor. But actually, that’s not the case.

Using the above example again, consider your function of type $C: A\times B \rightarrow \mathcal U$. In the general case, you would need the elimination rule for $A\times B$ to define such a function. But in the case where this is a constant function, you can just define the function using the computation rule for function types, since you don’t have to worry about the defining expression $\mathbf\Phi$ having the correct type, being constant. And this case is just the recursor! So yes, you do need what is essentially the recursor to get your feet off the ground, but you don’t have to define the recursor seperately first.

# Starting out

So to start out, I’m  just going through the Homotopy Type Theory book.  The introduction is really inspiring and exciting! It looks like it’s what I hoped it would be.

Sections 1.1-1.3 seem to be pretty straightforward, although I still feel like I’m missing some of the nuance behind judgements, definitional equalities, and the like. Function types make sense, since I have already run into them through Haskell programming. The universes splitting into an infinite hierarchy was slightly disappointing, but completely expected.

The section on dependent function types is where I stopped having a good intuition for what was going on.The notation for a function type, $A \rightarrow B$, makes intuitive sense to me, but for a dependent function type, the notation $\prod_{x:A} B(x)$ is just really hard for me to wrap my head around. Why are they using the product symbol for this? I skipped around a little looking for something that might be helpful, and in section 1.11, they mention that you can think of $\prod_{x:A} B(x)$ as saying, for all $x : A$,  $B(x)$ holds. The idea is that types are like logical propositions, and a type having an object basically means the proposition is true. That helps ground the concept for me, not perfectly, but enough to continue.

Pressing on, section 1.5 has a nice list of what you need to specify a type. Recursion and induction are tricky for me to get at first, but after going through to section 1.10 and seeing many different examples, the concept starts to make sense. The recursor for a type is just a function that you use to define functions on that type, and induction is basically the same thing for dependent functions. For example, if you want to specify a function of type $A\times B \rightarrow C$, you take a function of type $A \rightarrow B \rightarrow C$, and then plug this into the recursor (along with some other data) to get your function.This kind of reminds me of currying….

There was one thing that made me feel a bit uncomfortable. You can construct an object of type $A \times B$ just by taking a tuple $(a, b)$ for $a : A$ and $b : B$. But it is not assumed that every object of type $A \times B$ has this form, this can apparently be proved using the inductor. But then, how is the recursor well-defined? It seemed like it already just assumes you can split any object of type $A \times B$ into a tuple $(a,b)$. After rereading the section, I realized that the function is defined for elements that are tuples, and that’s it! That’s all you need to know to define a function.

Also, I thought at first that recursion was just the baby version of induction, but actually, it looks like you need to have the recursor first to be able to construct an element of the inductor.

Dependent pair types were also a bit confusing, but things are starting to fit together better now. In section 1.8, on the binary type, it’s finally explained why the product and sum notation for dependent types makes sense. You can think of $A\times B :\equiv \prod_{x:\mathbf{2}} \mathsf{rec}_{\mathbf{2}}(\mathcal U, A, B, x)$, and $A + B :\equiv \sum_{x:\mathbf{2}}\mathsf{rec}_{\mathbf{2}}(\mathcal U, A , B, x)$. I like that!

For the natural number type, I was really confused by how we could use a function $\mathbb N \rightarrow C \rightarrow C$ in the recursor for the natural number type! Wasn’t the whole point of recursors to tell you how to build functions from that type? Luckily, section 1.10 clears up the issue – this is simply a convenient notational shortcut.

Section 1.11 was really useful for me, and helped me develop a good bit of intuition for what was going on. It was really nice to think about types as being propositions, with objects of that type witnessing the type’s truth. Since there can be multiple objects of a certain type, there is more information kept than in normal logic. At first I felt kinda weird about the law of excluded middle being excluded, but after seeing that you could still add it back in if you needed, and that it really just gave you another level of nuance of proofs, and that you automatically had nice constructibility properties – it began to feel more natural and good to me.

# Motivations

Well, I’ve always been interested in learning how things really work at the deepest most fundamental level, but somehow, I ended up becoming a number theorist. *shrug* I taught myself some axiomatic set theory when I was younger – but I always thought it was kind of ugly. So imagine my excitement when I came across a new foundational system for mathematics based on type theory. As soon as I saw it, I knew that I needed to learn it! It just looks so sleek and powerful – although I can’t say that I’m in a good position to judge that yet.

My background doesn’t really intersect naturally with any of the ways of thinking about HoTT. I don’t know type theory, except for basic stuff I picked up from programming. I only know a little tiny bit about homotopy theory – just stuff from the first few chapters of Hatcher’s Algebraic Topology. My knowledge of logic is rather minimal, and I don’t really know that much category theory either. I also haven’t used anything like Coq before. So this should be lots of fun!

The reason I wanted to make a blog was just to have a place to write out my ideas, and keep my insights written up somewhere. It might also be nice to have other people giving me feedback and direction, should they stumble across this. And hopefully, it could be helpful for other people interested in learning this stuff but who also don’t really have any background. I don’t want it to be too serious – I feel like I am most creative when things are fun and light-hearted. So yeah, there might be silly things here sometimes :P

There’s another hope I have in learning this, and which actually happens to be how I came across this in the first place. I’ve been interested for some time in the problem of constructing a Friendly Artificial Intelligence – a general artifical intelligence that shares human values, instead of just killing us all or whatever. There’s this organization called MIRI which is doing research on this problem, and their director, Luke Muelhauser, recently interviewed John Baez. And in the comments of the interview on Baez’s blog, Baez started talking about Homotopy Type Theory, and Univalent Foundations! Anyway, may daydream is that maybe thinking about some of the problems (the Tiling Agents problem is the one I have in mind, specifically) from a HoTT perspective might yield some interesting insights. But that is still very far away, and just a dream for now…