Today's AI Safety & Ethics Research Top Papers

Wednesday, November 5, 2025
Proposes an automated framework that discovers, retrieves, and evolves jailbreak strategies for LLMs by extracting information from failed attacks. Demonstrates strategies that evade current defenses and continue to self-evolve, informing LLM security research.
Identifies the 'Alignment-Reality Gap' in LLMs, where deployed models drift out of alignment with evolving norms, and proposes a framework to update LLMs efficiently without costly re-annotation, aiming for more reliable long-term use.
Introduces ValueCompass, a framework grounded in psychological theory for measuring contextual value alignment between humans and LLMs. Enables systematic assessment of AI alignment with diverse individual and societal values.
Analyzes persuasion and anti-social behavior of LLM agents in hierarchical multi-agent settings. Investigates emergent phenomena and potential risks through simulated interactions, offering insights into AI agent behavior.
Introduces CytoNet, a foundation model encoding high-resolution cerebral cortex images into expressive features using self-supervised learning. Enables comprehensive brain analyses by capturing cellular architecture and spatial proximity.
Explains adversarial fragility in neural networks by identifying feature compression as the root cause. Provides a matrix-theoretic explanation showing how robustness degrades with input compression.
Presents MammoClean, a framework for standardizing mammography datasets and quantifying biases. Addresses heterogeneity in data quality and metadata to improve generalizability and clinical deployment of AI models.
Proposes LiveSecBench, a dynamic benchmark for Chinese-context LLM safety evaluation. Covers legality, ethics, factuality, privacy, adversarial robustness, and reasoning safety rooted in Chinese frameworks.