Abstract
Generative image models produce striking visuals yet often misrepresent
culture. Prior work has examined cultural bias mainly in text-to-image (T2I)
systems, leaving image-to-image (I2I) editors underexplored. We bridge this gap
with a unified evaluation across six countries, an 8-category/36-subcategory
schema, and era-aware prompts, auditing both T2I generation and I2I editing
under a standardized protocol that yields comparable diagnostics. Using open
models with fixed settings, we conduct cross-country, cross-era, and
cross-category evaluations. Our framework combines standard automatic metrics,
a culture-aware retrieval-augmented VQA, and expert human judgments collected
from native reviewers. To enable reproducibility, we release the complete image
corpus, prompts, and configurations. Our study reveals three findings: (1)
under country-agnostic prompts, models default to Global-North, modern-leaning
depictions that flatten cross-country distinctions; (2) iterative I2I editing
erodes cultural fidelity even when conventional metrics remain flat or improve;
and (3) I2I models apply superficial cues (palette shifts, generic props)
rather than era-consistent, context-aware changes, often retaining source
identity for Global-South targets. These results highlight that
culture-sensitive edits remain unreliable in current systems. By releasing
standardized data, prompts, and human evaluation protocols, we provide a
reproducible, culture-centered benchmark for diagnosing and tracking cultural
bias in generative image models.
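The abstract does not specify how the three evaluation signals (automatic metrics, culture-aware VQA, and expert human judgments) are combined across countries, eras, and categories. As an illustrative sketch only, the Python below shows one way per-sample scores keyed by country, era, and category could be aggregated while keeping the signals separate; all names here (CulturalSample, aggregate_by_country, the unweighted means) are hypothetical and not taken from the paper.

```python
from dataclasses import dataclass
from collections import defaultdict
from statistics import mean

@dataclass
class CulturalSample:
    """One generated or edited image, scored by three signals (hypothetical schema)."""
    country: str        # e.g. one of the six audited countries
    era: str            # era label used by the era-aware prompts
    category: str       # one of the 8 top-level schema categories
    automatic: float    # conventional automatic metric, normalized to [0, 1]
    vqa: float          # culture-aware retrieval-augmented VQA score, [0, 1]
    human: float        # expert native-reviewer rating, normalized to [0, 1]

def aggregate_by_country(samples: list[CulturalSample]) -> dict[str, dict[str, float]]:
    """Group samples by country and report each signal separately.

    Keeping the signals separate (rather than collapsing them into one number)
    is what lets divergences surface, e.g. automatic metrics staying flat while
    human-judged cultural fidelity drops under iterative editing.
    """
    grouped: dict[str, list[CulturalSample]] = defaultdict(list)
    for s in samples:
        grouped[s.country].append(s)
    return {
        country: {
            "automatic": mean(s.automatic for s in group),
            "vqa": mean(s.vqa for s in group),
            "human": mean(s.human for s in group),
        }
        for country, group in grouped.items()
    }

if __name__ == "__main__":
    demo = [
        CulturalSample("CountryA", "1980s", "clothing", 0.82, 0.64, 0.55),
        CulturalSample("CountryA", "modern", "food", 0.85, 0.71, 0.60),
        CulturalSample("CountryB", "1980s", "clothing", 0.80, 0.45, 0.38),
    ]
    print(aggregate_by_country(demo))
```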
Authors (11)
Huichan Seo
Sieun Choi
Minki Hong
Yi Zhou
Junseo Kim
Lukman Ismaila
+5 more
Submitted
October 22, 2025
Key Contributions
Introduces a unified evaluation framework to assess cultural bias in both text-to-image (T2I) and image-to-image (I2I) generative models across six countries and different eras. The framework combines automatic metrics, a culture-aware VQA system, and expert human judgments, revealing that models often default to Global-North, modern depictions.
Business Value
Helps developers and users understand and mitigate cultural biases in generative AI, leading to more equitable and representative AI-generated content, which is crucial for global brand consistency and ethical AI deployment.