From Alexnet to Transformers: Measuring the Non-linearity of Deep Neural Networks with Affine Optimal Transport

1Télécom Paris, Institut Polytechnique de Paris,   2Paris Noah’s Ark Lab,   3Smartly.io,   4Aalto University,   5University of Manchester
CVPR 2025

*Indicates Equal Contribution

Abstract

In the last decade, we have witnessed the introduction of several novel deep neural network (DNN) architectures exhibiting ever-increasing performance across diverse tasks. Explaining the upward trend of their performance, however, remains difficult, as different DNN architectures of comparable depth and width -- common factors associated with their expressive power -- may exhibit drastically different performance even when trained on the same dataset. In this paper, we introduce the concept of the non-linearity signature of a DNN, the first theoretically sound approach to approximately measuring the non-linearity of deep neural networks. Built upon a score derived from closed-form optimal transport mappings, this signature provides a better understanding of the inner workings of a wide range of DNN architectures and learning paradigms, with a particular emphasis on computer vision tasks. We provide extensive experimental results that highlight the practical usefulness of the proposed non-linearity signature and its potential for far-reaching implications.

Affinity score: the principal tool for measuring non-linearity

Affinity Score

The affinity score measures how much the output of a function differs from being a positive semi-definite (PSD) affine transformation of the input.
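For intuition, this can be sketched numerically: between Gaussian approximations of the input and output samples, the optimal transport (Monge) map has a closed form whose linear part is symmetric PSD by construction, and the residual this best PSD affine map leaves on paired samples quantifies the deviation from affinity. The sketch below uses a simplified normalization by the total output variance, not the paper's exact definition; all function and variable names are illustrative.

```python
import numpy as np

def psd_sqrt(M):
    """Symmetric PSD matrix square root via eigendecomposition."""
    w, V = np.linalg.eigh(M)
    return (V * np.sqrt(np.clip(w, 0.0, None))) @ V.T

def affinity_score(X, Y, eps=1e-9):
    """Illustrative affinity score between paired samples Y = f(X).

    Fits the closed-form OT map between Gaussian approximations of X and Y,
    T(x) = m_y + A (x - m_x) with A symmetric PSD, then returns 1 minus the
    normalized residual left by this best PSD affine map. The normalization
    here is a simplified stand-in for the one used in the paper.
    """
    m_x, m_y = X.mean(0), Y.mean(0)
    d = X.shape[1]
    Sx = np.atleast_2d(np.cov(X, rowvar=False)) + eps * np.eye(d)
    Sy = np.atleast_2d(np.cov(Y, rowvar=False)) + eps * np.eye(d)
    Sx_half = psd_sqrt(Sx)
    Sx_half_inv = np.linalg.inv(Sx_half)
    # Closed-form Gaussian OT map: A is symmetric PSD by construction.
    A = Sx_half_inv @ psd_sqrt(Sx_half @ Sy @ Sx_half) @ Sx_half_inv
    T_X = m_y + (X - m_x) @ A.T
    residual = np.mean(np.sum((T_X - Y) ** 2, axis=1))
    return 1.0 - np.sqrt(residual / np.trace(Sy))
```

On a map that is itself PSD affine, the Gaussian OT fit recovers it exactly and the score is numerically 1; on a non-linear map such as ReLU, a residual remains and the score drops below 1.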

Simple polynomial functions

Popular activation functions

Non-linearity signature of a DNN

Non-linearity Signature

The non-linearity signature of a DNN is the vector of affinity scores of each activation function across all layers of the network, computed for a given input distribution.
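This can be sketched on a toy network: record each activation's input and output over a batch and compute one affinity score per activation, yielding a vector with one entry per activation layer. The sketch below is self-contained, using the 1-D closed-form Gaussian OT map on flattened activations with a simplified normalization; the toy random-weight MLP and all names are hypothetical, not the paper's exact procedure.

```python
import numpy as np

def scalar_affinity(x, y):
    """1-D affinity sketch: residual left by the 1-D Gaussian OT map
    t(x) = m_y + (s_y / s_x)(x - m_x), with a simplified normalization."""
    mx, my, sx, sy = x.mean(), y.mean(), x.std(), y.std()
    t_x = my + (sy / (sx + 1e-12)) * (x - mx)
    residual = np.mean((t_x - y) ** 2)
    return 1.0 - np.sqrt(residual / (sy**2 + 1e-12))

rng = np.random.default_rng(0)
X = rng.normal(size=(1024, 16))

signature = []  # one affinity score per activation layer
h = X
for _ in range(3):  # toy 3-layer MLP with random weights
    W = rng.normal(scale=1 / np.sqrt(h.shape[1]), size=(h.shape[1], 16))
    pre = h @ W          # pre-activation: input of the activation function
    h = np.tanh(pre)     # post-activation: its output
    signature.append(scalar_affinity(pre.ravel(), h.ravel()))

print(signature)
```

Each entry lies strictly below 1, since tanh is not affine; tracking how these entries evolve across layers and architectures is what the signature is used for in the paper.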

Walking through DNNs’ history

Early Convolutional Neural Networks


Deeper Networks


Post-ViT Era


Statistics across architectures

Applications

Poster

BibTeX

@InProceedings{Bouniot_2025_CVPR,
    author    = {Bouniot, Quentin and Redko, Ievgen and Mallasto, Anton and Laclau, Charlotte and Struckmeier, Oliver and Arndt, Karol and Heinonen, Markus and Kyrki, Ville and Kaski, Samuel},
    title     = {From Alexnet to Transformers: Measuring the Non-linearity of Deep Neural Networks with Affine Optimal Transport},
    booktitle = {Proceedings of the Computer Vision and Pattern Recognition Conference (CVPR)},
    month     = {June},
    year      = {2025},
    pages     = {25250-25260}
}