Unbiased Scene Graph Generation from Biased Training

Abstract

Today's scene graph generation (SGG) task is still far from practical, mainly due to the severe training bias, e.g., collapsing diverse "human walk on / sit on / lay on beach" into "human on beach". Given such SGG, the down-stream tasks such as VQA can hardly infer better scene structures than merely a bag of objects. However, debiasing in SGG is not trivial because traditional debiasing methods cannot distinguish between the good and bad bias, e.g., good context prior (e.g., "person read book" rather than "eat") and bad long-tailed bias (e.g., "near" dominating "behind / in front of"). In this paper, we present a novel SGG framework based on causal inference but not the conventional likelihood. We first build a causal graph for SGG, and perform traditional biased training with the graph. Then, we propose to draw the counterfactual causality from the trained graph to infer the effect from the bad bias, which should be removed. In particular, we use Total Direct Effect (TDE) as the proposed final predicate score for unbiased SGG. Note that our framework is agnostic to any SGG model and thus can be widely applied in the community who seeks unbiased predictions. By using the proposed Scene Graph Diagnosis toolkit on the SGG benchmark Visual Genome and several prevailing models, we observed significant improvements over the previous state-of-the-art methods.
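
The abstract names Total Direct Effect as the final predicate score but does not spell it out. Under the usual mediation-analysis convention, it can be written roughly as below. The symbols are assumptions introduced here for illustration and are not taken from this page: u is an input image, x the observed visual features of an object pair, \bar{x} an uninformative replacement for those features, z the context variable held at the value it takes under x, and Y the vector of predicate logits.

```latex
\mathrm{TDE} \;=\; Y_{x}(u) \;-\; Y_{\bar{x},\,z}(u),
\qquad
\hat{p} \;=\; \arg\max_{p}\ \mathrm{TDE}_{p}
```

Read this way, the second term is what the trained model would still predict from context alone, so subtracting it keeps only the part of the score that depends on the actual visual evidence.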

Authors (5)

Kaihua Tang

Yulei Niu

Jianqiang Huang

Jiaxin Shi

Hanwang Zhang

Submitted

February 27, 2020

arXiv Category

cs.CV

Key Contributions

This paper introduces a novel framework for unbiased scene graph generation (SGG) by leveraging causal inference. It addresses the critical issue of training bias in SGG, which hinders downstream tasks like VQA, by distinguishing between beneficial context priors and detrimental long-tailed biases. The proposed method uses a causal graph to perform biased training and then applies counterfactual causality (Total Direct Effect) to remove the negative effects of bad bias, leading to more accurate scene structure inference.
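
To make the counterfactual subtraction concrete, here is a minimal toy sketch in Python. It is not the authors' implementation: the linear scoring function, the weights `W`, the `context_bias` vector, and the all-zero visual baseline are hypothetical choices made only to show the mechanics. In a purely additive toy like this one, the subtraction simply cancels the bias term; real SGG models fuse visual and context features non-linearly, so the effect is less trivial there.

```python
import numpy as np

PREDICATES = ["on", "sit on", "near"]

def predicate_logits(visual, context_bias, W):
    """Toy predicate scorer (hypothetical): visual evidence plus a context prior."""
    return W @ visual + context_bias

def total_direct_effect(visual, context_bias, W, visual_baseline=None):
    """Factual logits minus counterfactual logits.

    The counterfactual pass replaces the visual features with an
    uninformative baseline (zeros by default, an assumption of this sketch)
    while keeping the context term unchanged.
    """
    if visual_baseline is None:
        visual_baseline = np.zeros_like(visual)
    factual = predicate_logits(visual, context_bias, W)
    counterfactual = predicate_logits(visual_baseline, context_bias, W)
    return factual - counterfactual

# Hypothetical numbers: the prior strongly favours the frequent predicate "on",
# while the visual features favour the more specific "sit on".
W = np.array([[0.5, 0.0],
              [1.0, 0.2],
              [0.0, 0.3]])
visual = np.array([1.0, 0.5])
context_bias = np.array([2.0, 0.5, 1.0])

biased_scores = predicate_logits(visual, context_bias, W)
tde_scores = total_direct_effect(visual, context_bias, W)

biased_pred = PREDICATES[int(np.argmax(biased_scores))]
tde_pred = PREDICATES[int(np.argmax(tde_scores))]

print("biased prediction:", biased_pred)
print("TDE prediction:   ", tde_pred)
```

With these made-up numbers the biased scorer picks the frequent predicate, while the TDE score picks the one supported by the visual features. Because the subtraction happens only at prediction time, the underlying model is trained in the usual biased way, which matches the "biased training, unbiased inference" description above.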

Business Value

Improved accuracy in image understanding systems can lead to better performance in applications like autonomous driving, content moderation, and visual search, by enabling more reliable interpretation of visual scenes.

Paper Metadata

Innovation Type

Algorithmic

Deployment Feasibility

Moderate. Requires careful implementation of causal inference techniques and may need significant computational resources for training.

Limitations Addressed

Severe training bias in SGG, Inability of traditional debiasing methods to distinguish good/bad bias, Poor performance of downstream tasks due to biased SGG

Technical Tags

scene graph generation, causal inference, debiasing, counterfactual causality, total direct effect, long-tailed bias, context prior, visual question answering

Research Topics

Computer Vision, Causal Inference, Machine Learning, Data Bias, Scene Understanding

Methods & Architectures

Causal Graph Construction, Counterfactual Inference, Total Direct Effect (TDE)

Applications & Tasks

Image Understanding, Robotics, Autonomous Systems, Bias Mitigation, Data Imbalance, Scene Graph Generation, Visual Question Answering

Related Fields

Causal Machine Learning, Computer Vision, Natural Language Processing, Robotics

Keywords

scene graph generation, causal inference, debiasing, long-tailed distribution, visual understanding, counterfactual reasoning, bias mitigation, deep learning, image analysis, VQA

Academic Context

#Computer Vision, #Causal Inference, #Machine Learning, #Data Bias, #Scene Understanding

Commercial Potential

Potential Products

Enhanced image recognition APIs, Smarter visual search engines, More capable autonomous driving perception systems

Target Industries

Technology, Automotive, E-commerce, Security

Use Case Examples

Accurate scene understanding for autonomous vehicles, Improved object relationship detection for image retrieval, More robust visual question answering systems

Competitive Edge

Offers a novel causal inference approach to address SGG bias, potentially outperforming traditional debiasing methods by explicitly modeling causal relationships.

Market Opportunity

Growing market for AI-powered image analysis and understanding.

Revenue Models

Licensing of technology, API services

Resource Requirements

Compute Needs

High (for training deep learning models with causal inference)

Data Requirements

Large-scale image datasets with scene graph annotations.

Deployment Constraints

Computational cost, Need for accurate causal graph construction

Scalability

Scalability depends on the efficiency of the causal inference algorithms and the underlying SGG model.

Production Readiness

Maturity Level

Research

Time to Market

1-3 years

Patent Potential

Low to Moderate (algorithmic innovation)
