From Shapley Values to Tensor Networks: The TN-SHAP Series

One idea — the multilinear structure of coalition games — ties together three projects on scalable, structured feature attribution across tabular models, graphs, and quantum neural networks.

1. Introduction

Feature attribution answers a deceptively simple question: which inputs drove this prediction, and by how much? Good answers matter for trust and debugging, for scientific discovery, and — increasingly — for safety, where we want to know what a model is actually keying on.

Shapley values, borrowed from cooperative game theory, are among the most principled ways to answer it. They are the unique attribution satisfying a short list of natural axioms (efficiency, symmetry, dummy, linearity), which is why they underpin methods like SHAP. But that principle comes at a price: computing a Shapley value exactly requires reasoning over all $2^n$ subsets of features, and higher-order interactions are worse still. Sampling estimators such as KernelSHAP trade accuracy for speed, and they are especially noisy for interactions.

The TN-SHAP series is built on one observation that turns this exponential problem into a tractable, linear-algebraic one:

The central idea

The mathematical object behind Shapley values — the multilinear extension of the coalition game — is a tensor. When that tensor is low-rank (or, for some models, exactly structured), a tensor network represents it compactly, and the exponential sum over coalitions becomes a cheap tensor contraction.

Three projects instantiate this backbone in three different worlds: TN-SHAP for tabular models, TN-SHAP-G for graphs, and TN-SHAP-Q for quantum neural networks.

Masked coalition game v(S) = f(x masked to S) Multilinear extension F(z), z in [0,1]ⁿ Tensor-network object bond dimension χ TN-SHAP Tabular models TN-SHAP-G Graph models TN-SHAP-Q Quantum neural nets Shapley values φᵢ · interaction indices Iᵢⱼ
The shared backbone: a masked coalition game becomes a multilinear extension, compressed into a tensor-network object, and instantiated for three domains — all yielding the same attribution quantities.

2. The common framework

Everything below rests on four objects.

Coalition games

Fix an input $x$ with $n$ features and a baseline (a reference "off" value for each feature). For a subset $S \subseteq [n]$, let $x_S$ be $x$ with the features outside $S$ replaced by their baseline. The coalition game is the set function

$$v : 2^{[n]} \to \mathbb{R}, \qquad v(S) = f(x_S),$$

the model's output when only the features in $S$ are "present." This is the game whose players are features (or, later, graph nodes or circuit gates).

Multilinear extensions

Owen's multilinear extension lifts this discrete game to a smooth function on the cube $[0,1]^n$, where each coordinate $z_i$ is the probability that feature $i$ is present:

$$F(z) = \sum_{S \subseteq [n]} v(S) \prod_{i \in S} z_i \prod_{i \notin S} (1 - z_i), \qquad z \in [0,1]^n.$$

$F$ is multilinear: affine in each coordinate separately. It agrees with the game on the cube's corners, $F(\mathbf{1}_S) = v(S)$, so it loses no information — it just makes the game differentiable.

Shapley values & interactions

The Shapley value has a clean closed form as a diagonal integral of this extension (Owen):

$$\phi_i = \int_0^1 \frac{\partial F}{\partial z_i}(t, t, \dots, t)\, dt.$$

Reading a feature's importance is thus differentiate the extension, then integrate along the diagonal. Second-order Shapley interaction indices $I_{ij}$ come from the mixed partials $\partial^2 F / \partial z_i \partial z_j$, and order-$k$ interactions from $k$-th mixed partials.

Tensor-network compression

Written out, $F$ has $2^n$ coefficients — one per coalition — so nothing has been gained yet. The key move is that $F$ can be viewed as a tensor, and for many models that tensor is (approximately) low-rank. Representing it as a tensor network with bond dimension $\chi$ makes the derivatives and diagonal integrals above into contractions of small cores.

Why this is fast

Once $F$ lives in a tensor network of cut-rank $\chi$, order-1 and order-2 Shapley quantities can be read off in roughly $O\!\left(n\,\mathrm{poly}(\chi) + n^2\right)$ time — instead of summing over $2^n$ coalitions. The exponential sum has become a contraction.

3. TN-SHAP — the tabular setting

TN-SHAP is the base case: a tabular predictor $f$. It fits a feature-mapped tensor-network surrogate (a binary tensor tree / MPS-style factorization) to the local coalition game around a point of interest, so the model's local behavior is written as a factorized multilinear map.

Core contribution. Once the surrogate is fit, order-1 and order-2 Shapley interactions are computed exactly on the surrogate in $O(n\,\mathrm{poly}(\chi) + n^2)$ time, where $\chi$ is the tensor network's maximal cut rank. On UCI benchmarks (Diabetes, Concrete, Energy, California Housing) this matches exact enumeration on the fitted surrogate while delivering 25–1000× wall-clock speedups over KernelSHAP-IQ, and it scales to dozens of features where enumeration is hopeless.

Training curves: overall surrogate fit (Train R squared) and Shapley-interaction fidelity for orders 1, 2 and 3 rising over epochs.
As the tensor-network surrogate trains, it recovers not just the prediction (Train $R^2$) but the order-1, order-2, and order-3 Shapley interaction indices (SII $R^2$). Higher-order interactions are learned last. (From the TN-SHAP / AISTATS 2026 materials.)

4. TN-SHAP-G — the graph setting

TN-SHAP-G moves to graph-structured inputs, where the model is $f(G, X)$ and the players are nodes. The masked graph game is $v(S) = f(G, X_S)$, with excluded nodes set to a baseline.

Graph-aware coalitions. Instead of a generic surrogate, TN-SHAP-G learns one whose topology mirrors the graph: a tensor core per node, with physical dimension 2 (in / out of the coalition) and a bond of dimension $\chi$ along each incident edge. The surrogate therefore respects the graph's connectivity by construction.

Node and edge attributions. Because the surrogate is a clean multilinear object, node Shapley values are computed deterministically — no Monte-Carlo sampling — via the same diagonal-derivative integral,

$$\phi_u = \int_0^1 \frac{\partial \hat{\nu}}{\partial z_u}(t\,\mathbf{1})\, dt,$$

evaluated exactly by polynomial interpolation at Chebyshev nodes (a Vandermonde solve). Order-2 interaction indices come from the same machinery, naturally read off on the graph's edges. On small graphs the method matches exact Shapley enumeration (order-1 cosine similarity above 0.98, order-2 above 0.95 on the toy benchmark) and it has been demonstrated on molecular benchmarks such as mutagenicity.

A · MASKED GRAPH GAMEB · GRAPH-ALIGNED TNC · SHAPLEY + INTERACTIONSν(S) = f(G, features of S kept)one core / node · bond χ / edgeφ_v & I_uvclosed-form Vandermonde
TN-SHAP-G in three steps — a masked graph game becomes a graph-aligned tensor network, read out in closed form for node and edge attributions.
A molecule with each atom coloured by its Shapley attribution.
TN-SHAP-G on a predicted-mutagenic molecule: each atom is coloured by its Shapley attribution. (From the TN-SHAP-G / ICML 2026 materials.)

5. TN-SHAP-Q — the quantum setting

TN-SHAP-Q is the sharpest case, because here the multilinear structure is not fitted — it is exact. A single-frequency, angle-encoded quantum neural network (a single $R_Y$ rotation per feature) is exactly multilinear in the lifted features

$$\phi(x_i) = [\,1,\ \cos x_i,\ \sin x_i\,],$$

the single-frequency case of the finite Fourier structure of such circuits (Schuld–Sweke–Meyer). When you mask features (or remove gates) against a baseline, the multilinear extension of the induced game is the very function the circuit already computes — no surrogate required.

Feature and gate attribution. First-order Shapley values are then the Owen diagonal integral

$$\phi_i = \int_0^1 \partial_i F(t\,\mathbf{1})\, dt,$$

whose integrand is a polynomial of degree at most $P-1$ in $t$ (for $P$ players), so $M$-point Gauss–Legendre quadrature is exact once $M \ge \lceil P/2 \rceil$. That costs $2PM$ circuit evaluations versus $2^P$ for coalition enumeration, and the same construction attributes gates as well as features. Order-$k$ interactions come from the same extension, with a threshold that decreases with interaction order.

Query cost versus dimension: the Owen-integral method grows polynomially while exact coalition enumeration grows as 2 to the P.
Query cost versus problem size: the Owen-integral route uses $2PM$ circuit evaluations, against $2^P$ for exact coalition enumeration. (From the public TN-SHAP-Q demo.)
Exactness

Because the QNN really is multilinear, TN-SHAP-Q's attributions agree with exact $2^P$ coalition enumeration to machine precision (about $10^{-16}$) across the demo's experiments, and its finite-shot estimates are unbiased in expectation with variance falling as $1/N$ shots.

6. Why this is one research program

The three projects look like they belong to different fields — tabular ML, graph learning, quantum computing — but they are the same idea seen three times.

ProjectDomainThe multilinear objectHow Shapley is computed
TN-SHAPTabular modelsFitted tensor-tree / MPS surrogateContractions on the surrogate, orders 1–2
TN-SHAP-GGraph modelsGraph-aligned tensor networkDeterministic diagonal integral (Chebyshev / Vandermonde)
TN-SHAP-QQuantum neural netsExact — the circuit itselfOwen integral by Gauss–Legendre, exact at $M \ge \lceil P/2\rceil$

The backbone is identical: coalition game → multilinear extension → a compact tensor object → contractions for values and interactions. What changes is only how that tensor object is obtained — fitted, graph-aligned, or exact — and which players it acts on — features, nodes, or gates. In every case the payoff is the same: scalable attribution and structured interactions, with interpretability recast as multilinear analysis.

7. Open directions

The multilinear lens does not stop at attribution. A few directions I am pursuing pick up the same interaction-centric view for interpretability and AI safety:

  • Multilinear steering — extending linear probes with low-rank quadratic (bilinear) terms to read and steer concepts that live in interactions between directions, not single directions.
  • Evaluation awareness — measuring whether language models behave differently when they believe they are being tested, in behaviour and in activation space.
  • Honesty and interaction effects — asking whether safety-relevant concepts such as honesty are encoded as interactions, where a purely linear probe would miss them.
  • AI-safety monitoring — turning scalable attribution and interaction indices into auditing tools for deployed models and agents.

8. References & links

The TN-SHAP series

  • TN-SHAPTractable Shapley Values and Interactions via Tensor Networks. F. Heidari, C. Li, G. Rabusseau. Accepted, AISTATS 2026. arXiv:2510.22138 · code · poster · project page
  • TN-SHAP-GGraph-Structured Tensor Network Surrogates for Shapley Values and Interactions. F. Heidari, G. Rabusseau. Accepted, ICML 2026. arXiv:2606.01540 · code · poster · project page
  • TN-SHAP-QExact Multilinear Extensions of Quantum Neural Networks for Shapley Attribution. Manuscript / QTML abstract. code & demo · project page

Background

  • L. S. Shapley. A Value for n-Person Games. 1953.
  • G. Owen. Multilinear Extensions of Games. Management Science, 1972.
  • M. Grabisch & M. Roubens. An axiomatic approach to the concept of interaction among players in cooperative games. IJGT, 1999.
  • M. Schuld, R. Sweke, J. J. Meyer. Effect of data encoding on the expressive power of variational quantum-machine-learning models. Phys. Rev. A, 2021.