RAISE: A Unified Framework for Responsible AI Scoring and Evaluation

Abstract

As AI systems enter high-stakes domains, evaluation must extend beyond predictive accuracy to include explainability, fairness, robustness, and sustainability. We introduce RAISE (Responsible AI Scoring and Evaluation), a unified framework that quantifies model performance across these four dimensions and aggregates them into a single, holistic Responsibility Score. We evaluated three deep learning models (a Multilayer Perceptron (MLP), a Tabular ResNet, and a Feature Tokenizer Transformer) on structured datasets from finance, healthcare, and socioeconomics. Our findings reveal critical trade-offs: the MLP demonstrated strong sustainability and robustness, the Transformer excelled in explainability and fairness at a very high environmental cost, and the Tabular ResNet offered a balanced profile. These results underscore that no single model dominates across all responsibility criteria, highlighting the necessity of multi-dimensional evaluation for responsible model selection. Our implementation is available at: https://github.com/raise-framework/raise.
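
To make the idea of aggregating four dimensions into one Responsibility Score concrete, here is a minimal Python sketch. It is not the RAISE implementation: the abstract does not state the aggregation rule, so the weighted mean, the assumption that each dimension is already normalized to [0, 1], the names `DimensionScores` and `responsibility_score`, and the example numbers are all hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass

# The four dimensions named in the abstract.
DIMENSIONS = ("explainability", "fairness", "robustness", "sustainability")


@dataclass(frozen=True)
class DimensionScores:
    """Per-dimension scores, assumed (not confirmed by the paper) to lie in [0, 1]."""
    explainability: float
    fairness: float
    robustness: float
    sustainability: float


def responsibility_score(scores, weights=None):
    """Weighted mean of the four dimension scores.

    Assumption: a weighted mean stands in for whatever aggregation
    the actual framework uses. Equal weights are the default.
    """
    if weights is None:
        weights = {name: 1.0 for name in DIMENSIONS}
    total_weight = sum(weights[name] for name in DIMENSIONS)
    if total_weight <= 0:
        raise ValueError("weights must sum to a positive number")
    for name in DIMENSIONS:
        value = getattr(scores, name)
        if not 0.0 <= value <= 1.0:
            raise ValueError(f"{name} must be in [0, 1], got {value}")
    weighted_sum = sum(weights[name] * getattr(scores, name) for name in DIMENSIONS)
    return weighted_sum / total_weight


# Hypothetical models with made-up scores (not results from the paper).
model_a = DimensionScores(explainability=0.5, fairness=0.5, robustness=1.0, sustainability=1.0)
model_b = DimensionScores(explainability=1.0, fairness=1.0, robustness=0.5, sustainability=0.0)

equal_a = responsibility_score(model_a)
equal_b = responsibility_score(model_b)

# Ignoring sustainability changes which hypothetical model ranks higher.
no_sustainability = {"explainability": 1.0, "fairness": 1.0, "robustness": 1.0, "sustainability": 0.0}
alt_a = responsibility_score(model_a, no_sustainability)
alt_b = responsibility_score(model_b, no_sustainability)

print(equal_a, equal_b)
print(alt_a, alt_b)
```

With equal weights the first hypothetical model scores higher; with sustainability weighted at zero the ranking reverses. That sensitivity is one reason to report the per-dimension scores next to any single aggregate.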
Authors (2)
Loc Phuc Truong Nguyen
Hung Thanh Do
Submitted
October 21, 2025
arXiv Category
cs.LG

Key Contributions

Introduces RAISE, a unified framework that quantifies AI model performance across explainability, fairness, robustness, and sustainability, and aggregates the four into a single Responsibility Score. The framework supports model selection in high-stakes domains by giving a holistic view that goes beyond predictive accuracy.

Business Value

Enables organizations to make more informed and ethical decisions when deploying AI in critical sectors such as finance and healthcare, reducing the risks associated with biased or unreliable AI systems.