Video tutorial · ~7 minutes · narrated & subtitled

A guided tour of SubspacePath

The problem, the idea, and the evidence, animated from the paper's own results. How to specialize a frozen LLM for one scenario at inference time, with no scenario-specific training.

In this tour you'll see

What the tutorial covers

1. Why global pruning is brittle. A single static importance ranking averages over all scenarios, so under a new scenario it can cut the very heads that scenario depends on.

2. Subspace–pathway coupling. DBS builds near-orthogonal domain axes; PSP maps them (via probes, head importance, and a whitelist) to a budgeted head mask.

3. That it works and is cheap. The pruned model beats the dense one on Qwen2.5-14B (47.8 / 44.1 / 31.3), and online compilation takes 0.027–0.068 s and is reused every turn.
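To make points 1 and 2 concrete before you watch, here is a toy sketch in Python. It is not the paper's DBS or PSP: the function name `budgeted_head_mask`, the six-head layout, and the importance scores are all made up for illustration. It only shows the general shape of the argument: a ranking averaged over scenarios can drop the heads one scenario needs, while a per-scenario ranking with a whitelist and a fixed budget keeps them.

```python
def budgeted_head_mask(importance, budget, whitelist=()):
    """Toy sketch: keep whitelisted heads, then fill up to `budget` by importance.

    Hypothetical helper for illustration only; not the paper's PSP procedure.
    """
    keep = set(whitelist)
    if len(keep) > budget:
        raise ValueError("whitelist is larger than the head budget")
    # Rank head indices from most to least important for this scenario.
    ranked = sorted(range(len(importance)), key=lambda h: importance[h], reverse=True)
    for head in ranked:
        if len(keep) >= budget:
            break
        keep.add(head)
    # True means the head is kept, False means it is pruned.
    return [head in keep for head in range(len(importance))]


# Made-up per-scenario importance scores for six attention heads.
scores = {
    "code": [0.9, 0.8, 0.1, 0.0, 0.6, 0.2],
    "chat": [0.0, 0.1, 0.9, 0.8, 0.6, 0.2],
}

# "Global" ranking: one static average over every scenario.
global_scores = [(c + t) / 2 for c, t in zip(scores["code"], scores["chat"])]

global_mask = budgeted_head_mask(global_scores, budget=2)
code_mask = budgeted_head_mask(scores["code"], budget=2)

print("global:", global_mask)  # [False, False, True, False, True, False]
print("code:  ", code_mask)    # [True, True, False, False, False, False]
```

With a budget of two heads, the averaged ranking keeps heads 2 and 4 and prunes heads 0 and 1, which are exactly the two heads the toy "code" scenario scores highest. Passing `whitelist=[5]` with `budget=3` would additionally pin head 5 regardless of its score.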