📄 Abstract
This work introduces Structured Linear Controlled Differential Equations
(SLiCEs), a unifying framework for sequence models with structured,
input-dependent state-transition matrices that retain the maximal expressivity
of dense matrices whilst being cheaper to compute. The framework encompasses
existing architectures, such as input-dependent block-diagonal linear recurrent
neural networks and DeltaNet's diagonal-plus-low-rank structure, as well as two
novel variants based on sparsity and the Walsh-Hadamard transform. We prove
that, unlike the diagonal state-transition matrices of S4D and Mamba, SLiCEs
employing block-diagonal, sparse, or Walsh-Hadamard matrices match the maximal
expressivity of dense matrices. Empirically, SLiCEs solve the $A_5$
state-tracking benchmark with a single layer, achieve best-in-class length
generalisation on regular language tasks among parallel-in-time models, and
match the performance of log neural controlled differential equations on six
multivariate time-series classification datasets while cutting the average time
per training step by a factor of twenty.
Authors
Benjamin Walker
Lingyi Yang
Nicola Muca Cirone
Cristopher Salvi
Terry Lyons
Proceedings of the Thirty-Ninth Annual Conference on Neural
Information Processing Systems, 2025
Key Contributions
This paper introduces Structured Linear CDEs (SLiCEs), a unifying framework for sequence models that uses structured, input-dependent state-transition matrices. SLiCEs achieve maximal expressivity comparable to dense matrices while being computationally cheaper. The framework encompasses existing architectures and introduces novel variants, proving that block-diagonal, sparse, or Walsh-Hadamard matrices match dense matrix expressivity, unlike diagonal matrices used in models like Mamba.
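To make the efficiency claim concrete, here is a minimal NumPy sketch (not the authors' implementation) of a linear recurrence $h_t = A(x_t)\,h_{t-1}$ with a block-diagonal, input-dependent state-transition matrix. The sizes, weight tensor `W`, and helper `block_diag_transition` are all hypothetical, chosen only to illustrate why block-diagonal structure is cheaper than a dense matrix.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical sizes: state dimension d = 8, split into 4 blocks of size 2,
# driven by a 3-dimensional input at each step over T = 5 steps.
d, block, input_dim, T = 8, 2, 3, 5
n_blocks = d // block

def block_diag_transition(x, W):
    """Map an input vector x to a block-diagonal transition matrix A(x).

    W has shape (n_blocks, block, block, input_dim); each diagonal block is a
    linear function of the input, making the recurrence input-dependent.
    """
    A = np.zeros((d, d))
    for i in range(n_blocks):
        # (block, block, input_dim) @ (input_dim,) -> (block, block)
        A[i * block:(i + 1) * block, i * block:(i + 1) * block] = W[i] @ x
    return A

x_seq = rng.standard_normal((T, input_dim))
W = 0.1 * rng.standard_normal((n_blocks, block, block, input_dim))

h = np.ones(d)
for x in x_seq:
    h = block_diag_transition(x, W) @ h

# Cost per step is O(n_blocks * block^2) = O(d * block), versus O(d^2)
# for a dense transition matrix.
```

Only the diagonal blocks of $A(x)$ are ever materialised or multiplied, which is the source of the computational saving; the paper's theoretical results concern when such structured matrices nevertheless retain the expressivity of dense ones.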
Business Value
Offers more efficient and expressive models for sequential data, leading to better performance in tasks like time series forecasting, natural language understanding, and control systems. This can improve accuracy and reduce computational costs.