Spiking Patches: Asynchronous, Sparse, and Efficient Tokens for Event Cameras

Abstract

We propose tokenization of events and present a tokenizer, Spiking Patches, specifically designed for event cameras. Given a stream of asynchronous and spatially sparse events, our goal is to discover an event representation that preserves these properties. Prior works have represented events as frames or as voxels. However, while these representations yield high accuracy, both frames and voxels are synchronous and decrease the spatial sparsity. Spiking Patches gives the means to preserve the unique properties of event cameras and we show in our experiments that this comes without sacrificing accuracy. We evaluate our tokenizer using a GNN, PCN, and a Transformer on gesture recognition and object detection. Tokens from Spiking Patches yield inference times that are up to 3.4x faster than voxel-based tokens and up to 10.4x faster than frames. We achieve this while matching their accuracy and even surpassing in some cases with absolute improvements up to 3.8 for gesture recognition and up to 1.4 for object detection. Thus, tokenization constitutes a novel direction in event-based vision and marks a step towards methods that preserve the properties of event cameras.
Authors (3)
Christoffer Koo Øhrstrøm
Ronja Güldenring
Lazaros Nalpantidis
Submitted
October 30, 2025
arXiv Category
cs.CV

Key Contributions

Proposes Spiking Patches, a tokenizer for event camera data that preserves the asynchronous and spatially sparse nature of events, which frame and voxel representations do not. Inference with Spiking Patches tokens is up to 3.4x faster than with voxel-based tokens and up to 10.4x faster than with frames, with matching or better accuracy on gesture recognition and object detection when evaluated with a GNN, a PCN, and a Transformer.

Business Value

Enables the development of faster, more efficient AI systems for applications using event cameras, such as low-latency robotics, AR/VR, and high-speed tracking, potentially reducing hardware costs and power consumption.