From Alexnet to Transformers: Measuring the Non-linearity of Deep Neural Networks with Affine Optimal Transport

1Télécom Paris, Institut Polytechnique de Paris,   2Paris Noah’s Ark Lab,   3Smartly.io,   4Aalto University,   5University of Manchester
CVPR 2025

*Indicates Equal Contribution

Abstract

In the last decade, we have witnessed the introduction of several novel deep neural network (DNN) architectures exhibiting ever-increasing performance across diverse tasks. Explaining the upward trend of their performance, however, remains difficult, as different DNN architectures of comparable depth and width -- common factors associated with their expressive power -- may exhibit drastically different performance even when trained on the same dataset. In this paper, we introduce the concept of the non-linearity signature of a DNN, the first theoretically sound approach to approximately measuring the non-linearity of deep neural networks. Built upon a score derived from closed-form optimal transport mappings, this signature provides a better understanding of the inner workings of a wide range of DNN architectures and learning paradigms, with a particular emphasis on computer vision tasks. We provide extensive experimental results that highlight the practical usefulness of the proposed non-linearity signature and its potential for far-reaching implications.

Affinity score: the principal tool for measuring non-linearity

Affinity Score

The affinity score measures how much the output of a function differs from being a positive semi-definite (PSD) affine transformation of the input.
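For intuition, this can be sketched numerically: between Gaussian approximations of the input and output samples, the optimal transport (Monge) map has a closed form whose linear part is symmetric PSD by construction, and the residual this best PSD affine map leaves on paired samples quantifies the deviation from affinity. The sketch below uses a simplified normalization by the total output variance, not the paper's exact definition; all function and variable names are illustrative.

```python
import numpy as np

def psd_sqrt(M):
    """Symmetric PSD matrix square root via eigendecomposition."""
    w, V = np.linalg.eigh(M)
    return (V * np.sqrt(np.clip(w, 0.0, None))) @ V.T

def affinity_score(X, Y, eps=1e-9):
    """Illustrative affinity score between paired samples Y = f(X).

    Fits the closed-form OT map between Gaussian approximations of X and Y,
    T(x) = m_y + A (x - m_x) with A symmetric PSD, then returns 1 minus the
    normalized residual left by this best PSD affine map. The normalization
    here is a simplified stand-in for the one used in the paper.
    """
    m_x, m_y = X.mean(0), Y.mean(0)
    d = X.shape[1]
    Sx = np.atleast_2d(np.cov(X, rowvar=False)) + eps * np.eye(d)
    Sy = np.atleast_2d(np.cov(Y, rowvar=False)) + eps * np.eye(d)
    Sx_half = psd_sqrt(Sx)
    Sx_half_inv = np.linalg.inv(Sx_half)
    # Closed-form Gaussian OT map: A is symmetric PSD by construction.
    A = Sx_half_inv @ psd_sqrt(Sx_half @ Sy @ Sx_half) @ Sx_half_inv
    T_X = m_y + (X - m_x) @ A.T
    residual = np.mean(np.sum((T_X - Y) ** 2, axis=1))
    return 1.0 - np.sqrt(residual / np.trace(Sy))
```

On a map that is itself PSD affine, the Gaussian OT fit recovers it exactly and the score is numerically 1; on a non-linear map such as ReLU, a residual remains and the score drops below 1.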

Simple polynomial functions

Popular activation functions

Non-linearity signature of a DNN

Non-linearity Signature

The non-linearity signature of a DNN is the vector of affinity scores of each activation function across all layers of the network, computed for a given input distribution.
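This can be sketched on a toy network: record each activation's input and output over a batch and compute one affinity score per activation, yielding a vector with one entry per activation layer. The sketch below is self-contained, using the 1-D closed-form Gaussian OT map on flattened activations with a simplified normalization; the toy random-weight MLP and all names are hypothetical, not the paper's exact procedure.

```python
import numpy as np

def scalar_affinity(x, y):
    """1-D affinity sketch: residual left by the 1-D Gaussian OT map
    t(x) = m_y + (s_y / s_x)(x - m_x), with a simplified normalization."""
    mx, my, sx, sy = x.mean(), y.mean(), x.std(), y.std()
    t_x = my + (sy / (sx + 1e-12)) * (x - mx)
    residual = np.mean((t_x - y) ** 2)
    return 1.0 - np.sqrt(residual / (sy**2 + 1e-12))

rng = np.random.default_rng(0)
X = rng.normal(size=(1024, 16))

signature = []  # one affinity score per activation layer
h = X
for _ in range(3):  # toy 3-layer MLP with random weights
    W = rng.normal(scale=1 / np.sqrt(h.shape[1]), size=(h.shape[1], 16))
    pre = h @ W          # pre-activation: input of the activation function
    h = np.tanh(pre)     # post-activation: its output
    signature.append(scalar_affinity(pre.ravel(), h.ravel()))

print(signature)
```

Each entry lies strictly below 1, since tanh is not affine; tracking how these entries evolve across layers and architectures is what the signature is used for in the paper.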

Walking through DNNs’ history

Early Convolutional Neural Networks


Deeper Networks


Post-ViT Era


Statistics across architectures

Applications

Poster

BibTeX

@InProceedings{Bouniot_2025_CVPR,
    author    = {Bouniot, Quentin and Redko, Ievgen and Mallasto, Anton and Laclau, Charlotte and Struckmeier, Oliver and Arndt, Karol and Heinonen, Markus and Kyrki, Ville and Kaski, Samuel},
    title     = {From Alexnet to Transformers: Measuring the Non-linearity of Deep Neural Networks with Affine Optimal Transport},
    booktitle = {Proceedings of the Computer Vision and Pattern Recognition Conference (CVPR)},
    month     = {June},
    year      = {2025},
    pages     = {25250-25260}
}