Abstract
While recent sound event detection (SED) systems can identify baleen whale
calls in marine audio, challenges related to false positive and minority-class
detection persist. We propose the boundary proposal network (BPN), which
extends an existing lightweight SED system. The BPN is inspired by work in
image object detection and aims to reduce the number of false positive
detections. It achieves this by using intermediate latent representations
computed within the backbone classification model to gate the final output.
When added to an existing SED system, the BPN achieves a 16.8 % absolute
increase in precision, as well as 21.3 % and 9.4 % improvements in the F1-score
for minority-class d-calls and bp-calls, respectively. We further consider two
approaches to the selection of post-processing hyperparameters: a
forward-search and a backward-search. By separately optimising event-level and
frame-level hyperparameters, these two approaches lead to considerable
performance improvements over parameters selected using empirical methods. The
complete WhaleVAD-BPN system achieves a cross-validated development F1-score of
0.475, which is a 9.8 % absolute improvement over the baseline.
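
The gating idea described above can be illustrated with a minimal sketch. The abstract does not specify how the BPN output is combined with the classifier output, so everything here is an assumption made for illustration: the function name `gate_frame_probabilities`, the per-frame proposal score, the hard threshold of 0.5, and the toy numbers are all hypothetical and are not taken from the WhaleVAD-BPN implementation.

```python
import numpy as np


def gate_frame_probabilities(class_probs, proposal_probs, gate_threshold=0.5):
    """Suppress class probabilities on frames that the proposal branch rejects.

    class_probs:    array of shape (frames, classes) from the classifier.
    proposal_probs: array of shape (frames,) from a hypothetical proposal branch.
    gate_threshold: assumed hard cut-off; the real system may gate differently.
    """
    class_probs = np.asarray(class_probs, dtype=float)
    proposal_probs = np.asarray(proposal_probs, dtype=float)
    if class_probs.shape[0] != proposal_probs.shape[0]:
        raise ValueError("class_probs and proposal_probs must cover the same frames")

    # Boolean mask per frame: True where the proposal branch accepts the frame.
    keep = proposal_probs >= gate_threshold

    # Broadcast the mask over the class axis; rejected frames become all zeros.
    return class_probs * keep[:, None]


# Toy data (invented): 4 frames, 2 call classes.
class_probs = np.array([
    [0.9, 0.1],
    [0.7, 0.2],  # classifier fires, proposal branch disagrees
    [0.8, 0.1],
    [0.1, 0.6],  # classifier fires, proposal branch disagrees
])
proposal_probs = np.array([0.95, 0.20, 0.85, 0.10])

gated = gate_frame_probabilities(class_probs, proposal_probs)

# Frame-level detections before and after gating, using a 0.5 decision threshold.
detections_before = class_probs >= 0.5
detections_after = gated >= 0.5
print(int(detections_before.sum()), int(detections_after.sum()))
```

In this toy case the gate removes the two detections that the proposal branch does not support, which is the mechanism by which gating could trade a small amount of recall for higher precision.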
Authors (3)
Christiaan M. Geldenhuys
Günther Tonitz
Thomas R. Niesler
Submitted
October 24, 2025
Key Contributions
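This paper introduces the boundary proposal network (BPN) for sound event detection. Inspired by image object detection, the BPN uses intermediate latent representations from the backbone classifier to gate the final output, reducing false positives and improving minority-class detection of baleen whale calls. Added to an existing SED system, it raises precision by 16.8 % absolute and improves the F1-score by 21.3 % for d-calls and 9.4 % for bp-calls. With event-level and frame-level post-processing hyperparameters optimised separately, the complete WhaleVAD-BPN system reaches a cross-validated development F1-score of 0.475, a 9.8 % absolute improvement over the baseline.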
Business Value
More accurate monitoring of marine life is crucial for conservation efforts, environmental impact assessments, and understanding marine ecosystems, supporting regulatory compliance and research.