A topos is a categorical interpretation of boolean logic. We will start with some prerequisite set theory and category theory, build up the intuition for subobject classifiers, give an interpretation of set theory, and then move to formalizing boolean algebra using sheaves. The study of topoi is the study of cartesian-closed categories with subobject classifiers, and a category of sheaves over $\mathbb{B}$ is a complete boolean algebra.

## Some set theoretic foundations

The study of set theory starts with the study of `Russell's paradox`, and axiomatization into `Zermelo-Frankel set theory` with no paradoxes. To understand Russell's paradox, we first introduce the `principle of comprehension`: $\{x : \varphi(x)\}$ says that $x$ is an element of a set that satisfies the predicate $\varphi(x)$. Now, we construct the `Russell set` $R = \{x : x \not\in x\}$, and ask if $R$ is a member of this set. If $R \not\in R$, then its elements belong to a set satisfying the predicate $x \in x$, yielding $R \in R$; if $R \in R$, then its elements belong to a set satisfying the predicate $x \not\in x$, yielding $R \not\in R$. This leads to a contradictory conclusion.

In ZF set theory, we say that a set can only be built up from another set using logical connectives like $\cup$, $\cap$, and $-$. The axioms can only be applied to a set and the result is always a set. The comprehension principle is replaced by a `separation principle`: Given a set $A$, and a predicate $\varphi(x)$, there exists a set whose members are precisely the members of $A$ that satisfy the predicate. $R$ isn't admitted as a set in this theory, thus resolving the contradiction.

In category theory, the central concept of membership of an element in a set is replaced by the concept of a morphism between objects whose internal structures (elements) are opaque.

## Functions to elementary category theory

We would first like to ask how functions are formalized in terms of sets. A function $f$ takes objects of $\text{dom } f$ to objects of $\text{codom } f$, represented as the pair $f = \langle x, y\rangle$, such that $\langle x, y\rangle = \langle z, y\rangle \implies x = z$. This formalism is weak because it doesn't include the domain, codomain, or the relation explicitly. To remedy this, we define the $f = \langle A, B, R\rangle$, with $A$ as the domain, $B$ the codomain, and $R \subseteq A \times B = \{\langle x, y\rangle : x \in A, y \in B\}$ the relation. Note that functions are composable yielding a unique $f \circ g = f(g(x))$ satisfying the associative law $f \circ (g \circ h) = (f \circ g) \circ h$.

We can further abstract functions to arrows between objects in a category. The notion of "set membership" is an "internal notion" between the various members of the object, and needs to be "externalized". Let's start with some category theory.

A category $\mathcal{C}$ consists of a bunch of objects, a bunch of arrows between the objects, including the identity arrow $a \rightarrow a$ for each $a \in C$, with an associative law for compositions:

$$ \begin{xy} \xymatrix{ a\ar[r]^{f}\ar[rd]_>>>>>{h \circ g}\ar[d]_{h \circ g \circ f = h \circ (g \circ f)} & b\ar[d]^{g}\ar[dl]^<<<<<{g \circ f} \\ c & d\ar[l]^{h} } \end{xy} $$

A `pre-order` is defined as a category in which there is at most one arrow, $\langle p, q\rangle$, between any two objects $p$ and $q$. In this kind of category, called there is only one way to define composites. If $p$ and $q$ are any objects on a pre-order category, and $R$ is a binary relation, then we can identify two properties of $R$:

- Reflexivity: for each $p \in \mathcal{C}$, we have $pRp$, the identity arrow.
- Transitivity: whenever $pRq$ and $qRs$, we have $pRs$.

The typical example is to think of two non-comparable subsets of a set. With identity arrows omitted, this looks like:

$$ \begin{xy} \xymatrix{ \bullet\ar@/^/[rd] & \\ & \bullet\ar@/^/[lu] \\ \bullet\ar[ru] & \\ } \end{xy} $$

With one additional property described below, a `pre-order` becomes a `partial ordering`:

- Anti-symmetry: whenever $pRq$ and $qRp$, we have $p = q$.

$$ \begin{xy} \xymatrix{ \bullet\ar[rd] & \\ & \bullet \\ \bullet\ar[ru] & \\ } \end{xy} $$

A set $P$ together with the relation $\sqsubseteq$, denoted as $\langle P, \sqsubseteq \rangle$, is called a `poset`. The specific category we have used to encode $\mathbb{N}$ above is called `$\textbf{Finord}$`, the category of finite ordinals. In $\textbf{Finord}$, these three statements are equivalent:

- $m \lt n$
- $m \subset n$
- $m \in n$

With one additional property, a `partial ordering` becomes a `total ordering`:

- Connexity: For all $p$, $q \in \mathcal{C}$, `pRq` or `qRp`.

$$ \begin{xy} \xymatrix{ \bullet\ar@/^/[r] & \bullet \ar@/^/[r] & \bullet } \end{xy} $$

Using $\sqsubseteq$ to denote the relation, we can encode $\mathbb{N}$ as $0 = \{\} = \phi$, $1 = \{\phi\}$, $2 = \{\phi, \{\phi\} \{\phi, \phi\}\}$, and so on, yielding $0 \sqsubseteq 1 \sqsubseteq 2 \sqsubseteq \ldots$

$$ \begin{xy} \xymatrix{ 0\ar[r] & 1\ar[r] & 2\ar[r] & \ldots } \end{xy} $$

This is a total order.

Let us now consider categories with more than one arrow between two objects. A `monoid` is defined as an algebraic structure $(M, \circ, \mathbb{1})$ where $\circ$ is a function from $M \times M \rightarrow M$. A monoid naturally gives rise to a category with just one object: $M$. We can instantiate it with $M = \mathbb{N}$, $\circ = +$, and $\mathbb{1} = 0$ to obtain the familiar notion of addition of natural numbers:

$$ \begin{xy} \xymatrix{ \mathbb{N}\ar[r]^{m}\ar[dr]_{m + n} & \mathbb{N}\ar[d]^{n} \\ & \mathbb{N} } \end{xy} $$

$$ \begin{xy} \xymatrix{ \mathbb{N}\ar[r]^{m}\ar[dr]_{m} & \mathbb{N}\ar[d]^{0}\ar[dr]^{n} & \\ & \mathbb{N}\ar[r]_{n} & \mathbb{N} } \end{xy} $$

If $a, b \in \mathcal{C}$, then we use $\mathcal{C}(a, b)$ to denote the set of arrows from $a$ to $b$. A `subcategory` $\mathcal{C} \subseteq \mathcal{D}$ is defined as:

- All objects of $\mathcal{C}$ are objects of $\mathcal{D}$.
- $\mathcal{C}(a ,b) \subseteq \mathcal{D}(a, b)$; all arrows of $\mathcal{C}$ are present in $\mathcal{D}$.

A `full subcategory` $\mathcal{C}$ of $\mathcal{D}$ is defined as $\mathcal{C} \subseteq \mathcal{D}$ such that $\mathcal{C}(a, b) = \mathcal{D}(a, b)$; in other words, $\mathcal{D}$ has no arrows $a \rightarrow b$ other than the ones already present in $\mathcal{C}$. $\textbf{Finord}$ is a full subcategory of $\textbf{Finset}$, the category of finite sets.

Before introducing comma categories, let us briefly introduce arrow categories as a prerequisite. An `arrow category` $\textbf{Set}^\rightarrow$ has as objects, the functions $f : A \rightarrow B$, and arrows between objects $f : A \rightarrow B$ and $g : C \rightarrow D$, the pair of arrows $\langle h, k\rangle$ such that the following diagram commutes:

$$ \begin{xy} \xymatrix{ A\ar[r]^{h}\ar[d]_{f} & C\ar[d]^{g} \\ B\ar[r]_{k} & D } \end{xy} $$

A `comma category` can be viewed as a special case of an arrow categories with fixed domain or codomain. $\textbf{Set} \downarrow \mathbb{R}$ is the category of real-valued functions; the codomain of the category is fixed to $\mathbb{R}$. An arrow from $f: A \rightarrow \mathbb{R}$ to $g: B \rightarrow \mathbb{R}$ is the arrow $k: A \rightarrow B$ that makes the following commute:

$$ \begin{xy} \xymatrix{ A\ar[rd]_{f}\ar[rr]^{k} & & B\ar[dl]^{g} \\ & \mathbb{R} & } \end{xy} $$

$\textbf{Set} \downarrow \mathbb{R}$ can be construed to a $\textbf{Set}^\rightarrow$ as follows:

$$ \begin{xy} \xymatrix{ A\ar[d]_{f}\ar[r]^{k} & B\ar[d]^{g} \\ \mathbb{R}\ar[r]^{id} & \mathbb{R} } \end{xy} $$

Similarly, we can define $\mathcal{C} \uparrow a$ to be a comma category with a fixed domain instead of codomain:

$$ \begin{xy} \xymatrix{ & a\ar[ld]_{f}\ar[rd]^{g} & \\ b\ar[rr]^{k} & & c } \end{xy} $$

## Set membership, categorically

To work our way up to `subobject classifiers`, we need to build some prerequisite material. First, let us introduce the notion of `monic` and `epic` arrows, corresponding to injective and surjective functions in $\textbf{Set}$. $f$ is injective if $f(x) = f(y) \implies x = y$, and surjective if $\forall y, \exists x : y = f(x)$.

$$ \begin{xy} \xymatrix{ a\ar[r]^{g}\ar[d]_{h} & b\ar[d]^{f} \\ b\ar[r]_f & c } \end{xy} $$

If $f$ is monic, indicated as $f : b \rightarrowtail c$, the diagram above commutes, and $(f \circ g)(x) = (f \circ h)(x) = f(g(x)) = f(h(x)) \implies g = h$. In other words, a monic is said to be `left-cancelable`.

$$ \begin{xy} \xymatrix{ a\ar[r]^{f}\ar[d]_{f} & b\ar[d]^{g} \\ b\ar[r]_h & c } \end{xy} $$

If $g \circ f$ is monic, so is $f$.

$$ \begin{xy} \xymatrix{ \bullet\ar@<-0.5ex>[r]\ar@<0.5ex>[r] & \bullet\ar[dr]_{f}\ar[rr]^{g \circ f} && \bullet \\ && \bullet\ar[ru]_{g} & } \end{xy} $$

If $f$ is epic, indicated as $f : a \twoheadrightarrow b$, the diagram above commutes, and $(g \circ f)(x) = (h \circ f)(x) = g(f(x)) = h(f(x)) \implies g = h$. In other words, an epic is said to be `right-cancelable`.

An `iso` arrow, indicated as $f : a \rightarrowtail\mathrel{\mspace{-11mu}}\twoheadrightarrow b$, corresponds to a bijective function in $\textbf{Set}$. It is invertible, and its action can be thought of as a "relabeling"; as an example, $A \times \{0\}$ attaches $0$ to every element of $A$.

Every iso is both monic and epic, but "iso" isn't equivalent to "monic and epic". Consider $(\mathbb{N}, +, 0)$ in which all arrows are monic and epic, but the only iso arrows are the identities, since no natural number has an additive inverse.

$f \circ g$ is iso, if $f$ and $g$ are, with $(f \circ g)^{-1} = g^{-1} \circ f^{-1}$.

$$ \begin{xy} \xymatrix{ A\ar@(ul, dl)[]_{\texttt{id}_A}\ar@/^/[rr]^{f \circ g} && B\ar@(ur, dr)[]^{\texttt{id}_B}\ar[ld]^{f^{-1}}\ar@/^/[ll]^{(f \circ g)^{-1}} \\ & \bullet\ar[lu]^{g^{-1}} & } \end{xy} $$

An `initial object` $\mathbb{0}$ is such that, for every $a$ in $\mathcal{C}$, there exists only one arrow $\mathbb{0} \rightarrow a$, and a `terminal object` $\mathbb{1}$ is such that, for every $a$ in $\mathcal{C}$, there exists only one arrow $a \rightarrow \mathbb{1}$. In $\textbf{Set}$, $\phi$ is distinguished to be the initial object because there exists only one function from $\phi$ to every other element in the category, namely $\phi$, the empty function; the graph of $\phi \times a$ is empty; and $\{e\}$, the singleton set is distinguished to be the terminal object.

All initial objects are isomorphic to each other. By duality, all terminal objects are also isomorphic to each other.

Indeed, let $\mathbb{0}, \mathbb{0'}$ be two initial objects. Consider $f : \mathbb{0} \rightarrow \mathbb{0'}, g : \mathbb{0'} \rightarrow \mathbb{0}$ so that $g \circ f = \mathbb{0'} \rightarrow \mathbb{0'} = \texttt{id}_{\mathbb{0'}}$ and $f \circ g = \texttt{id}_{\mathbb{0}}$; so $f = g^{-1}$, and $\mathbb{0} \simeq \mathbb{0'}$.

$\mathbb{1} \rightarrow a$ is monic.

$$ \begin{xy} \xymatrix{ \bullet\ar@<-0.5ex>[r]_{g}\ar@<0.5ex>[r]^{h} & \mathbb{1}\ar[r]^f & a } \end{xy} $$

$f \circ g = f \circ h \Leftrightarrow g = h$. Now, $g = h$ is true because there is only one arrow to it from any given object, and hence, $f$ is monic.

$\langle \texttt{pr}_a, \texttt{pr}_b\rangle = \mathbb{1}_{a \times b}$.

$$ \begin{xy} \xymatrix{ & a \\ a \times b\ar[ur]^{\texttt{pr}_a}\ar[dr]_{\texttt{pr}_b}\ar@{.>}[r]^{\langle \texttt{pr}_a, \texttt{pr}_b \rangle} & a \times b\ar[u]_{\texttt{pr}_a}\ar[d]^{\texttt{pr}_b} \\ & b } \end{xy} $$

By uniqueness of identity arrow, $\mathbb{1}_{a \times b} = \langle \texttt{pr}_a, \texttt{pr}_b\rangle$.

If $\langle f, g\rangle = \langle k, h\rangle$, then $f = k$ and $g = h$.

$$ \begin{xy} \xymatrix{ & a \\ c \ar@{.>}[r]^{\langle f, g\rangle}_{\langle k, h\rangle}\ar[ur]^p\ar[rd]_q & a \times b\ar[u]_{\texttt{pr}_A}\ar[d]^{\texttt{pr}_B} \\ & b } \end{xy} $$

By definition of product, since this diagram commutes,

$$ \begin{matrix} p &=& \texttt{pr}_A \circ \langle f, g \rangle = f \\ &=& \texttt{pr}_A \circ \langle k, h \rangle = k \\ q &=& \texttt{pr}_B \circ \langle f, g \rangle = g \\ &=& \texttt{pr}_B \circ \langle k, h \rangle = h \end{matrix} $$

Hence, $f = k$ and $g = h$.

$\langle f \circ h, g \circ h\rangle = \langle f, g\rangle \circ h$.

The following diagram commutes:

$$ \begin{xy} \xymatrix{ & & a \\ d\ar[r]^h\ar@/^/[rru]^p\ar@/_/[rrd]_q & c\ar@{.>}[r]^{\langle f, g\rangle}\ar[ru]^f\ar[rd]_g & a \times b\ar[u]_{\texttt{pr}_A}\ar[d]^{\texttt{pr}_B} \\ & & b } \end{xy} $$

yielding:

$$ \begin{matrix} p = f \circ h = \texttt{pr}_A \circ \langle f, g\rangle \circ h \\ q = g \circ h = \texttt{pr}_B \circ \langle f, g\rangle \circ h \end{matrix} $$

Combining $\texttt{pr}_A$ and $\texttt{pr}_B$, we get:

$$ \langle f, g\rangle \circ h = \langle f \circ h, g \circ h\rangle $$

$a \times b \simeq b \times a$.

For isomorphism, there needs to exist $f : a \rightarrow b$ and $g : b \rightarrow a$, such that $f \circ g = g \circ f = \mathbb{1}$. From the diagram, we see that this is indeed the case:

$$ \begin{xy} \xymatrix{ a \times b\ar@/^/[rr]^{\texttt{pr}_B \times \texttt{pr}_A} & & b \times a\ar@/^/[ll]^{\texttt{pr}_A \times \texttt{pr}_B} } \end{xy} $$