In deep learning and game theory, we usually think of neural networks/economic agents as processes taking in an input $A$ and producing an output $B$. However, we additionally want to model that these processes have extra, “hidden” inputs not available to the outside world. In neural networks we call these weights (or parameters) and in game theory we call these strategies.
In other words, we want to form a category where a morphism $A \to B$ contains the data of a) a parameter space $P$ and b) a morphism $f : P \otimes A \to B$. From this description we see that this construction necessitates a choice of some underlying monoidal category $\mathcal{C}$.
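As a concrete illustration of this data, here is a minimal Python sketch in which the underlying monoidal category is assumed to be sets and functions with the cartesian product. The class `Para`, its fields and the example `affine` are hypothetical names chosen for illustration; they do not come from the cited papers.

```python
from dataclasses import dataclass
from typing import Any, Callable


@dataclass(frozen=True)
class Para:
    """A parameterised map A -> B: a parameter space P together with f : P x A -> B."""
    param_space: str                  # informal label for P (illustrative only)
    f: Callable[[Any, Any], Any]      # f(p, a) -> b

    def __call__(self, p: Any, a: Any) -> Any:
        return self.f(p, a)


# An affine map R -> R whose hidden input is the pair p = (w, b).
affine = Para("R x R", lambda p, a: p[0] * a + p[1])

# The outside world supplies a = 3.0; the parameter (2.0, 1.0) is chosen separately.
result = affine((2.0, 1.0), 3.0)
```

The point of the sketch is only that the parameter space travels with the morphism: two `Para` values with the same input and output types may carry entirely different parameter spaces.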
Such a morphism might be visualised using the string diagram language of monoidal categories (below, left). However, this notation does not emphasise the special role played by $P$, which is part of the data of the morphism itself. Parameters and data in machine learning have different semantics; by separating them on two different axes, we obtain a graphical language which is more closely tied to these semantics (below, right).
This gives us an intuitive way to compose parameterised maps:
This construction is called $\mathbf{Para}(\mathcal{C})$, originally introduced in (Fong, Spivak and Tuyéras 2019) in a specialised form, then successively refined in (Gavranović 2019), (Capucci et al. 2020) and (Cruttwell et al. 2021).
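Let $(\mathcal{C}, I, \otimes)$ be a symmetric monoidal category. Then $\mathbf{Para}(\mathcal{C})$ is a bicategory with the following data:
- Objects are the objects of $\mathcal{C}$.
- A morphism $A \to B$ is a pair $(P, f)$, where $P$ is an object of $\mathcal{C}$ and $f : P \otimes A \to B$ is a morphism of $\mathcal{C}$.
- A 2-cell $(P, f) \Rightarrow (P', f')$ is a reparameterisation: a morphism $r : P' \to P$ such that $f' = f \circ (r \otimes A)$ in $\mathcal{C}$.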
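The 2-cells of $\mathbf{Para}(\mathcal{C})$ can be read as reparameterisations: a map between parameter objects that turns one parameterised map into another. The Python sketch below assumes sets and functions as the base category and assumes the convention that $r : P' \to P$ acts by precomposition on the parameter; `reparameterise`, `affine` and `tied` are hypothetical names used only for illustration.

```python
from typing import Any, Callable

ParaMap = Callable[[Any, Any], Any]  # f(p, a) -> b


def reparameterise(f: ParaMap, r: Callable[[Any], Any]) -> ParaMap:
    """Given f : P x A -> B and r : P' -> P, return f'(p', a) = f(r(p'), a)."""
    def f_prime(p_prime: Any, a: Any) -> Any:
        return f(r(p_prime), a)
    return f_prime


def affine(p: Any, a: float) -> float:
    """Affine map with parameter p = (w, b)."""
    w, b = p
    return w * a + b


# r ties the two parameters together: a single scalar t is sent to (w, b) = (t, t).
tied = reparameterise(affine, lambda t: (t, t))

out = tied(2.0, 3.0)  # same as affine((2.0, 2.0), 3.0)
```

Here the new parameter space is smaller than the old one, which is one way to picture weight tying; the input and output of the map are left untouched.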
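The sequential composition of maps $f : P \otimes A \to B$ and $g : Q \otimes B \to C$ is given by the animation above. The composite is a $Q \otimes P$-parameterised map defined as

$$Q \otimes P \otimes A \xrightarrow{\;Q \otimes f\;} Q \otimes B \xrightarrow{\;g\;} C,$$

that is, $g \circ (Q \otimes f)$, with the associator of $\mathcal{C}$ left implicit.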
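The composition rule can be sketched in Python, again assuming sets and functions as the base category, so that $Q \otimes P$ becomes a pair `(q, p)`. The names `compose`, `scale`, `shift` and `times` are hypothetical and serve only as an illustration.

```python
from typing import Any, Callable, Tuple

ParaMap = Callable[[Any, Any], Any]  # f(p, a) -> b


def compose(f: ParaMap, g: ParaMap) -> ParaMap:
    """Composite of f : P x A -> B and g : Q x B -> C, parameterised by Q x P."""
    def composite(qp: Tuple[Any, Any], a: Any) -> Any:
        q, p = qp                # the composite's parameter is a pair (q, p)
        return g(q, f(p, a))     # run f with p first, then g with q
    return composite


def scale(p: float, a: float) -> float:
    return p * a


def shift(q: float, b: float) -> float:
    return q + b


def times(r: float, c: float) -> float:
    return r * c


shift_after_scale = compose(scale, shift)
value = shift_after_scale((10.0, 2.0), 3.0)     # q = 10, p = 2: 10 + 2 * 3

# The two bracketings of a triple composite take differently nested parameters.
left = compose(compose(scale, shift), times)    # parameter shape (r, (q, p))
right = compose(scale, compose(shift, times))   # parameter shape ((r, q), p)
left_value = left((5.0, (10.0, 2.0)), 3.0)
right_value = right(((5.0, 10.0), 2.0), 3.0)
```

The two triple composites compute the same function once the parameters are re-bracketed, but their parameter spaces are only isomorphic rather than equal. This is one way to see why the construction yields a bicategory rather than a category.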
The construction defined here works in the general setting of actegories.
When the base category is the category of optics, $\mathbf{Para}(\mathbf{Optic}(\mathcal{C}))$ recovers the category of neural networks defined in (Capucci et al. 2020).
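To give a feel for what a parameterised bidirectional map looks like, the following Python sketch uses lenses, a special case of optics, over real numbers. It is an illustrative assumption rather than the definition used in the cited papers: `ParaLens`, `compose`, `scale` and `shift` are hypothetical names, and the backward maps are written by hand as derivatives.

```python
from dataclasses import dataclass
from typing import Any, Callable, Tuple


@dataclass(frozen=True)
class ParaLens:
    """A parameterised lens: a forward pass and a backward pass."""
    forward: Callable[[Any, Any], Any]                     # (p, a) -> b
    backward: Callable[[Any, Any, Any], Tuple[Any, Any]]   # (p, a, db) -> (dp, da)


def compose(f: ParaLens, g: ParaLens) -> ParaLens:
    """Compose f then g; the composite's parameter is the pair (q, p)."""
    def forward(qp: Tuple[Any, Any], a: Any) -> Any:
        q, p = qp
        return g.forward(q, f.forward(p, a))

    def backward(qp: Tuple[Any, Any], a: Any, dc: Any) -> Tuple[Any, Any]:
        q, p = qp
        b = f.forward(p, a)              # recompute the intermediate value
        dq, db = g.backward(q, b, dc)    # pull the change back through g
        dp, da = f.backward(p, a, db)    # then back through f
        return (dq, dp), da

    return ParaLens(forward, backward)


# b = p * a, with derivative a in the parameter and p in the input.
scale = ParaLens(lambda p, a: p * a, lambda p, a, db: (a * db, p * db))
# c = q + b, with derivative 1 in both the parameter and the input.
shift = ParaLens(lambda q, b: q + b, lambda q, b, dc: (dc, dc))

layer = compose(scale, shift)
y = layer.forward((1.0, 2.0), 3.0)
(dq, dp), da = layer.backward((1.0, 2.0), 3.0, 1.0)
```

The backward pass returns a change for each parameter as well as for the input, which is the shape of information a gradient-based update needs.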
Brendan Fong, David Spivak, Rémy Tuyéras, Backprop as Functor: A compositional perspective on supervised learning, 34th Annual ACM/IEEE Symposium on Logic in Computer Science (LICS) 2019, pp. 1-13, 2019. (arXiv:1711.10455, LICS'19)
Bruno Gavranović, Compositional Deep Learning, (arXiv:1907.08292)
Matteo Capucci, Bruno Gavranović, Jules Hedges, Eigil Fjeldgren Rischel, Towards Foundations of Categorical Cybernetics, (arXiv:2105.06332)
G.S.H. Cruttwell, Bruno Gavranović, Neil Ghani, Paul Wilson, Fabio Zanasi, Categorical Foundations of Gradient-Based Learning, (arXiv:2103.01931)