arxiv_cv · 85% Match · Research Paper · Audience: Remote sensing analysts, Computer vision researchers, AI ethicists, Geospatial data scientists · 2 months ago

Checkmate: interpretable and explainable RSVQA is the endgame

📄 Abstract

Remote Sensing Visual Question Answering (RSVQA) presents unique challenges in ensuring that model decisions are both understandable and grounded in visual content. Current models often suffer from a lack of interpretability and explainability, as well as from biases in dataset distributions that lead to shortcut learning. In this work, we tackle these issues by introducing a novel RSVQA dataset, Chessboard, designed to minimize biases through 3,123,253 questions and a balanced answer distribution. Each answer is linked to one or more cells within the image, enabling fine-grained visual reasoning. Building on this dataset, we develop an explainable and interpretable model called Checkmate that identifies the image cells most relevant to its decisions. Through extensive experiments across multiple model architectures, we show that our approach improves transparency and supports more trustworthy decision-making in RSVQA systems.
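
Since each answer in Chessboard is linked to one or more cells of the image, a cell-grounded sample could look roughly like the record below. The field names, grid size, and indexing convention are illustrative assumptions, not the published Chessboard schema.

```python
# Hypothetical cell-grounded RSVQA sample; field names and the 8x8 grid
# indexing are illustrative assumptions, not the actual Chessboard format.
sample = {
    "image_id": "tile_000042",
    "question": "Is there a road in the top-left quarter of the image?",
    "answer": "yes",
    # Cells grounding the answer, as (row, col) indices on an 8x8 grid.
    "relevant_cells": [(0, 1), (1, 1), (2, 0)],
}
```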

Key Contributions

- Introduces the Chessboard dataset for RSVQA, designed to minimize bias through a balanced answer distribution and answer-to-cell links that enable fine-grained visual reasoning.
- Develops the Checkmate model, which improves interpretability and explainability by identifying the image cells most relevant to its decisions, leading to more trustworthy RSVQA systems (a minimal sketch of this cell-relevance idea follows below).
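
As a concrete illustration of cell-level grounding, here is a minimal PyTorch sketch of how an RSVQA model could score the relevance of each cell on an 8x8 grid and answer from relevance-weighted evidence. The grid size, feature dimensions, encoders, and class name are assumptions made for illustration; this is not the Checkmate architecture described in the paper.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class CellRelevanceVQA(nn.Module):
    """Toy cell-relevance RSVQA model (illustrative sketch, not Checkmate)."""

    def __init__(self, cell_dim=256, question_dim=256, num_answers=10, grid=8):
        super().__init__()
        self.grid = grid
        # Toy encoders: a small conv stem for image cells, an embedding bag for the question.
        self.cell_encoder = nn.Sequential(
            nn.Conv2d(3, 32, 3, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(), nn.Linear(32, cell_dim),
        )
        self.question_encoder = nn.EmbeddingBag(1000, question_dim)
        self.classifier = nn.Linear(cell_dim + question_dim, num_answers)

    def forward(self, image, question_tokens):
        # image: (B, 3, H, W); split it into a grid x grid chessboard of non-overlapping cells.
        b, c, h, w = image.shape
        ch, cw = h // self.grid, w // self.grid
        cells = image.unfold(2, ch, ch).unfold(3, cw, cw)              # (B, 3, g, g, ch, cw)
        cells = cells.permute(0, 2, 3, 1, 4, 5).reshape(-1, c, ch, cw)
        cell_feats = self.cell_encoder(cells).view(b, self.grid ** 2, -1)  # (B, g*g, D)

        q = self.question_encoder(question_tokens)                     # (B, D)

        # Per-cell relevance: softmaxed dot product between the question and each cell feature.
        relevance = F.softmax((cell_feats @ q.unsqueeze(-1)).squeeze(-1), dim=-1)  # (B, g*g)

        # Predict the answer from relevance-weighted visual evidence plus the question.
        pooled = (relevance.unsqueeze(-1) * cell_feats).sum(dim=1)
        logits = self.classifier(torch.cat([pooled, q], dim=-1))
        # The relevance vector exposes which cells drove the answer.
        return logits, relevance


# Usage: a 512x512 tile gives 64 cells of 64x64 pixels; question_tokens are toy word IDs.
model = CellRelevanceVQA()
logits, relevance = model(torch.rand(1, 3, 512, 512), torch.randint(0, 1000, (1, 12)))
```

Returning the relevance vector alongside the prediction is what makes the decision inspectable: an analyst can check whether the highlighted cells actually contain the evidence the answer relies on.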

Business Value

Enables more reliable and understandable analysis of remote sensing imagery, which is valuable for applications such as environmental monitoring, urban planning, and disaster response, where understanding a model's reasoning is critical.