
LiveSecBench: A Dynamic and Culturally-Relevant AI Safety Benchmark for LLMs in Chinese Context

📄 Abstract

In this work, we propose LiveSecBench, a dynamic and continuously updated safety benchmark designed specifically for Chinese-language LLM application scenarios. LiveSecBench evaluates models across six critical dimensions (Legality, Ethics, Factuality, Privacy, Adversarial Robustness, and Reasoning Safety) rooted in Chinese legal and social frameworks. The benchmark maintains relevance through a dynamic update schedule that incorporates new threat vectors, such as the planned addition of Text-to-Image Generation Safety and Agentic Safety in the next release. To date, LiveSecBench (v251030) has evaluated 18 LLMs, providing a landscape of AI safety in the Chinese-language context. The leaderboard is publicly accessible at https://livesecbench.intokentech.cn/.

Key Contributions

This paper introduces LiveSecBench, a dynamic, continuously updated AI safety benchmark tailored to Chinese-language LLM applications. It evaluates models across six critical dimensions (Legality, Ethics, Factuality, Privacy, Adversarial Robustness, and Reasoning Safety) grounded in Chinese legal and social frameworks, and it stays current through scheduled updates that incorporate emerging threat vectors such as text-to-image generation safety and agentic safety.
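To make the six-dimension structure concrete, here is a minimal sketch of how a leaderboard entry might aggregate per-dimension safety scores. Only the dimension names come from the paper; the score values, the unweighted-mean aggregation, and the function name `overall_safety` are illustrative assumptions, not LiveSecBench's actual scoring method.

```python
from statistics import mean

# The six evaluation dimensions named in the paper.
DIMENSIONS = [
    "Legality",
    "Ethics",
    "Factuality",
    "Privacy",
    "Adversarial Robustness",
    "Reasoning Safety",
]

def overall_safety(scores: dict[str, float]) -> float:
    """Aggregate per-dimension scores into one number.

    An unweighted mean is an illustrative choice; a real benchmark
    might weight dimensions or report them separately.
    """
    missing = set(DIMENSIONS) - scores.keys()
    if missing:
        raise ValueError(f"missing dimensions: {sorted(missing)}")
    return mean(scores[d] for d in DIMENSIONS)

# Hypothetical model entry with made-up scores in [0, 1].
example = {d: 0.9 for d in DIMENSIONS}
print(round(overall_safety(example), 3))
```

A per-dimension breakdown like this is what lets a leaderboard surface that a model is, say, strong on Factuality but weak on Adversarial Robustness, rather than hiding that behind a single score.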

Business Value

Enables developers and deployers of LLMs in China to verify compliance with local regulations and societal norms, reducing deployment risk and building user trust.