
Structured Linear CDEs: Maximally Expressive and Parallel-in-Time Sequence Models

Abstract

This work introduces Structured Linear Controlled Differential Equations (SLiCEs), a unifying framework for sequence models with structured, input-dependent state-transition matrices that retain the maximal expressivity of dense matrices whilst being cheaper to compute. The framework encompasses existing architectures, such as input-dependent block-diagonal linear recurrent neural networks and DeltaNet's diagonal-plus-low-rank structure, as well as two novel variants based on sparsity and the Walsh-Hadamard transform. We prove that, unlike the diagonal state-transition matrices of S4D and Mamba, SLiCEs employing block-diagonal, sparse, or Walsh-Hadamard matrices match the maximal expressivity of dense matrices. Empirically, SLiCEs solve the $A_5$ state-tracking benchmark with a single layer, achieve best-in-class length generalisation on regular language tasks among parallel-in-time models, and match the performance of log neural controlled differential equations on six multivariate time-series classification datasets while cutting the average time per training step by a factor of twenty.
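To make the parallel-in-time claim concrete, here is a minimal JAX sketch of the underlying computation: a discretised linear recurrence $h_t = A(x_t)\,h_{t-1} + b(x_t)$ with block-diagonal, input-dependent transitions, evaluated with an associative scan so that all time steps are composed in parallel. The function name `slice_scan`, the tensor shapes, and the random inputs are illustrative assumptions, not the paper's implementation.

```python
import jax
import jax.numpy as jnp

def slice_scan(As, bs):
    """Parallel-in-time evaluation of h_t = A_t @ h_{t-1} + b_t with h_0 = 0.

    As: (T, n_blocks, k, k) -- input-dependent block-diagonal transitions
    bs: (T, n_blocks, k)    -- input-dependent additive terms
    Returns every hidden state, shape (T, n_blocks, k).
    """
    def combine(left, right):
        A1, b1 = left
        A2, b2 = right
        # Compose affine maps h -> A2 @ (A1 @ h + b1) + b2, block by block.
        A = jnp.einsum('...ij,...jk->...ik', A2, A1)
        b = jnp.einsum('...ij,...j->...i', A2, b1) + b2
        return A, b

    # Cumulative composition over time; with h_0 = 0 the offsets are the states.
    _, hs = jax.lax.associative_scan(combine, (As, bs), axis=0)
    return hs

# Tiny usage example with random "input-dependent" blocks.
T, n_blocks, k = 8, 4, 2
As = 0.1 * jax.random.normal(jax.random.PRNGKey(0), (T, n_blocks, k, k))
bs = jax.random.normal(jax.random.PRNGKey(1), (T, n_blocks, k))
print(slice_scan(As, bs).shape)  # (8, 4, 2)
```

Because each transition is block-diagonal, composing two steps costs O(n_blocks · k³) rather than the O((n_blocks · k)³) of dense matrices, while the associative scan keeps the sequence dimension parallel.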
Authors (5)
Benjamin Walker
Lingyi Yang
Nicola Muca Cirone
Cristopher Salvi
Terry Lyons
Submitted
May 23, 2025
arXiv Category
cs.LG
Venue
Proceedings of the Thirty-Ninth Annual Conference on Neural Information Processing Systems, 2025

Key Contributions

This paper introduces Structured Linear CDEs (SLiCEs), a unifying framework for sequence models built on structured, input-dependent state-transition matrices. The authors prove that block-diagonal, sparse, and Walsh-Hadamard structures match the maximal expressivity of dense state-transition matrices while being cheaper to compute, whereas the diagonal matrices used in S4D and Mamba do not. The framework encompasses existing architectures, such as input-dependent block-diagonal linear RNNs and DeltaNet's diagonal-plus-low-rank structure, and introduces two novel variants based on sparsity and the Walsh-Hadamard transform.
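As a rough illustration of why the Walsh-Hadamard variant is cheap, the sketch below applies a structured transition of the form $D(x_t)\,H$: a fixed Hadamard mixing followed by an input-dependent diagonal, in O(n log n) time. This parameterisation is an assumption made for illustration; the paper's exact construction may differ.

```python
import jax.numpy as jnp

def fwht(v):
    """Fast Walsh-Hadamard transform of a length-2^k vector, O(n log n)."""
    n = v.shape[0]
    h = 1
    while h < n:
        v = v.reshape(n // (2 * h), 2, h)
        # Butterfly: pair entries at distance h, take sums and differences.
        v = jnp.concatenate([v[:, 0] + v[:, 1], v[:, 0] - v[:, 1]], axis=-1)
        v = v.reshape(n)
        h *= 2
    return v

def wh_transition(diag_t, state):
    """One structured state update: input-dependent diagonal times Hadamard.

    Mixes every state entry with every other (dense-like behaviour) yet costs
    O(n log n) instead of the O(n^2) of a dense matrix-vector product.
    """
    n = state.shape[0]
    return diag_t * fwht(state) / jnp.sqrt(n)  # sqrt(n) keeps H orthogonal

state = jnp.arange(4.0)
diag_t = jnp.array([1.0, -0.5, 0.25, 2.0])  # would be computed from x_t
print(wh_transition(diag_t, state))
```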

Business Value

Offers sequence models that are as expressive as dense-matrix recurrences but substantially cheaper to train, improving accuracy per unit of compute in applications such as multivariate time-series classification, language modelling, and control. The reported twenty-fold reduction in average time per training step relative to log neural CDEs translates directly into lower training costs.