Idea

In deep learning and game theory, we usually think of neural networks/economic agents as processes taking in an input $A$ and producing an output $B$. However, we additionally want to model that these processes have extra, “hidden” inputs not available to the outside world. In neural networks we call these weights (or parameters) and in game theory we call these strategies.

In other words, we want to form a category where a morphism $A \to B$ contains the data of a) a parameter space $P$ and b) a morphism $f : P \otimes A \to B$. From this description we see that this construction necessitates a choice of some underlying monoidal category $\mathcal{C}$.
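As a minimal sketch of this description, assume $\mathcal{C}$ is the category of sets and functions with the cartesian product as $\otimes$ (a choice made here purely for illustration). A morphism $A \to B$ of the new category is then an ordinary function of two arguments, a parameter and an input. The names `affine`, `Params` and `at` below are hypothetical.

```python
from typing import Callable, Tuple

# Hypothetical parameter space P = R x R (a weight and a bias).
Params = Tuple[float, float]


def affine(p: Params, a: float) -> float:
    """A P-parameterised map R -> R, i.e. a function P x A -> B.

    The parameter p = (w, b) plays the role of the hidden input.
    """
    w, b = p
    return w * a + b


def at(f: Callable, p) -> Callable:
    """Fix a parameter p to obtain an ordinary map A -> B."""
    return lambda a: f(p, a)


line = at(affine, (2.0, 1.0))
print(line(3.0))  # prints 7.0
```

The parameter space is part of the data of `affine` itself: two functions with the same input and output types but different parameter types are different morphisms.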

Such a morphism might be visualised using the string diagram language of monoidal categories (below, left). However, this notation does not emphasise the special role played by $P$, which is part of the data of the morphism itself. Parameters and data in machine learning have different semantics; by separating them on two different axes, we obtain a graphical language which is more closely tied to these semantics (below, right). This also gives us an intuitive way to compose parameterised maps.

This construction is called $\mathbf{Para}(\mathcal{C})$. It was originally introduced in a specialised form in (Fong, Spivak and Tuyeras 2019), then successively refined in (Gavranovic 2019), (Capucci et al. 2020) and (Cruttwell et al. 2021).

Definition

Let $(\mathcal{C}, I, \otimes)$ be a symmetric monoidal category. Then $\mathbf{Para}(\mathcal{C})$ is a bicategory with the following data:

• Its 0-cells are the objects of $\mathcal{C}$.
• A 1-cell $A \to B$ is a choice of a parameter object $P : \mathcal{C}$ and a morphism $f : P \otimes A \to B$ in $\mathcal{C}$.

• A 2-cell $(P, f) \Rightarrow (Q, g)$ is a morphism $r : P \to Q$ in $\mathcal{C}$ such that $f = g \circ (r \otimes A)$.
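The 2-cell condition can be illustrated by a small sketch, again assuming $\mathcal{C}$ is sets and functions with the cartesian product. A 2-cell is a reparameterisation $r : P \to Q$ through which $f$ factors. The functions `f`, `g`, `r` and the helper `is_two_cell` are hypothetical, and the check only tests the equation $f = g \circ (r \otimes A)$ on finitely many sample points; it does not prove it.

```python
import math


def g(q, a):
    """A Q-parameterised map R -> R, with Q a scale factor."""
    return q * a


def f(p, a):
    """A P-parameterised map R -> R, with P a log-scale."""
    return math.exp(p) * a


def r(p):
    """Candidate reparameterisation r : P -> Q."""
    return math.exp(p)


def is_two_cell(f, g, r, samples):
    """Test f(p, a) == g(r(p), a) on the given (p, a) samples only."""
    return all(math.isclose(f(p, a), g(r(p), a)) for p, a in samples)


samples = [(0.0, 1.0), (1.0, 2.0), (-1.0, 3.0)]
print(is_two_cell(f, g, r, samples))            # prints True
print(is_two_cell(f, g, lambda p: p, samples))  # prints False
```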

The sequential composition of two maps $f : P \otimes A \to B$ and $g : Q \otimes B \to C$ is depicted in the animation above. The composite is the $Q \otimes P$-parameterised map defined as

$Q \otimes P \otimes A \xrightarrow{Q \otimes f} Q \otimes B \xrightarrow{g} C$
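In the illustrative case where $\mathcal{C}$ is sets and functions with the cartesian product (an assumption, not part of the definition), the composite sends $(q, p, a)$ to $g(q, f(p, a))$. The following sketch spells this out; `compose`, `layer1` and `layer2` are hypothetical names.

```python
def compose(f, g):
    """Composite of f : P x A -> B and g : Q x B -> C.

    The result is parameterised by pairs (q, p) in Q x P, mirroring
    Q (x) P (x) A -> Q (x) B -> C: first apply f, then g.
    """
    def composite(qp, a):
        q, p = qp
        return g(q, f(p, a))
    return composite


def layer1(p, a):
    """Affine map with parameter p = (w, b)."""
    w, b = p
    return w * a + b


def layer2(q, b):
    """Thresholded map with a single parameter q."""
    return max(0.0, b - q)


net = compose(layer1, layer2)
print(net((0.5, (2.0, 1.0)), 3.0))  # prints 6.5
```

The parameter spaces of the two maps accumulate rather than merge. Composing three maps in the two possible orders gives parameters nested as `((s, q), p)` or `(s, (q, p))`, which agree only up to reassociation; this is one reason the structure is a bicategory rather than a category.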

The construction defined here works in the general setting of actegories.

todo

todo

Examples

When the base category is taken to be the category of optics $\mathbf{Optic}(\mathcal{C})$, the construction $\mathbf{Para}(\mathbf{Optic}(\mathcal{C}))$ recovers the category of neural networks defined in (Capucci et al. 2020).