Research Paper • Traffic safety researchers • Automotive safety engineers • Insurance analysts • Data scientists in transportation

From Narratives to Probabilistic Reasoning: Predicting and Interpreting Drivers' Hazardous Actions in Crashes Using Large Language Model

📄 Abstract

Vehicle crashes involve complex interactions between road users, split-second decisions, and challenging environmental conditions. Among these, two-vehicle crashes are the most prevalent, accounting for approximately 70% of roadway crashes and posing a significant challenge to traffic safety. Identifying Driver Hazardous Action (DHA) is essential for understanding crash causation, yet the reliability of DHA data in large-scale databases is limited by inconsistent and labor-intensive manual coding practices. Here, we present an innovative framework that leverages a fine-tuned large language model to automatically infer DHAs from textual crash narratives, thereby improving the validity and interpretability of DHA classifications. Using five years of two-vehicle crash data from MTCF, we fine-tuned the Llama 3.2 1B model on detailed crash narratives and benchmarked its performance against conventional machine learning classifiers, including Random Forest, XGBoost, CatBoost, and a neural network. The fine-tuned LLM achieved an overall accuracy of 80%, surpassing all baseline models and demonstrating pronounced improvements in scenarios with imbalanced data. To increase interpretability, we developed a probabilistic reasoning approach, analyzing model output shifts across original test sets and three targeted counterfactual scenarios: variations in driver distraction and age. Our analysis revealed that introducing distraction for one driver substantially increased the likelihood of "General Unsafe Driving"; distraction for both drivers maximized the probability of "Both Drivers Took Hazardous Actions"; and assigning a teen driver markedly elevated the probability of "Speed and Stopping Violations." Our framework and analytical methods provide a robust and interpretable solution for large-scale automated DHA detection, offering new opportunities for traffic safety analysis and intervention.
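The probabilistic reasoning step described in the abstract compares model outputs on the original test narratives with outputs on counterfactual versions of the same narratives. The following is a minimal sketch of that comparison, not the authors' implementation. The logits are made up, the "Other" class is a placeholder, and the helper names (`softmax`, `mean_probabilities`, `probability_shift`) are hypothetical.

```python
import math

# Three labels are quoted in the abstract; "Other" is a placeholder (assumption).
LABELS = [
    "General Unsafe Driving",
    "Both Drivers Took Hazardous Actions",
    "Speed and Stopping Violations",
    "Other",
]

def softmax(logits):
    """Convert one row of class logits into probabilities."""
    m = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def mean_probabilities(logit_rows):
    """Average the per-class probabilities over a set of narratives."""
    probs = [softmax(row) for row in logit_rows]
    return [sum(p[k] for p in probs) / len(probs) for k in range(len(LABELS))]

def probability_shift(original_logits, counterfactual_logits):
    """Per-class change in mean probability: counterfactual minus original."""
    base = mean_probabilities(original_logits)
    cf = mean_probabilities(counterfactual_logits)
    return {label: cf[k] - base[k] for k, label in enumerate(LABELS)}

# Made-up logits for two narratives, before and after a hypothetical
# "one driver distracted" edit. Real values would come from the classifier.
original = [[1.0, 0.2, 0.5, 0.1], [0.3, 0.1, 1.2, 0.4]]
distracted = [[2.0, 0.4, 0.3, 0.1], [1.5, 0.3, 0.9, 0.2]]

shift = probability_shift(original, distracted)
for label, delta in shift.items():
    print(f"{label}: {delta:+.3f}")
```

Because each row of probabilities sums to one, the shifts across classes sum to zero, so an increase for one DHA class is always offset by decreases elsewhere.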

Authors (9)

Boyou Chen

Gerui Xu

Zifei Wang

Huizhong Guo

Ananna Ahmed

Zhaonan Sun

+3 more

Submitted

October 14, 2025

arXiv Category

cs.AI

Key Contributions

Presents a framework using a fine-tuned LLM (Llama 3.2 1B) to automatically infer Driver Hazardous Actions (DHAs) from textual crash narratives, improving data validity and interpretability. It benchmarks this approach against traditional ML classifiers for two-vehicle crashes.

Business Value

Enhances traffic safety by providing more accurate and reliable data on crash causes, enabling better prevention strategies, insurance risk assessment, and policy making.

Paper Metadata

Innovation Type

Application/Algorithmic

Deployment Feasibility

High. Fine-tuning LLMs is a common practice. The framework can be integrated into existing accident analysis workflows.

Limitations Addressed

Limited reliability and labor-intensive nature of manual DHA coding • Inconsistent classification of DHAs • Difficulty in extracting nuanced causal information from text

Performance Gains

The fine-tuned LLM reached 80% overall accuracy, surpassing all conventional machine learning baselines (Random Forest, XGBoost, CatBoost, and a neural network), with the most pronounced improvements on imbalanced data.

Technical Tags

driver hazardous action • crash narratives • large language model • probabilistic reasoning • textual analysis • fine-tuning • interpretability • two-vehicle crashes • crash causation • Llama 3.2 1B

Research Topics

Traffic Safety • Causality Analysis • Natural Language Processing • Machine Learning Applications

Methods & Architectures

Fine-tuned Large Language Model (LLM) • Textual Analysis • Probabilistic Inference • Benchmarking against ML classifiers • Llama 3.2 1B (fine-tuned)

Applications & Tasks

Traffic Safety Analysis • Automotive Industry • Insurance • Limited Reliability of DHA Data • Inconsistent Manual Coding • Understanding Crash Causation • Automatically inferring Driver Hazardous Actions (DHA) from crash narratives • Improving validity and interpretability of DHA classifications • Predicting hazardous actions in crashes

Datasets & Benchmarks

Datasets

MTCF (two-vehicle crash data)

Benchmarks

Random Forest • XGBoost • CatBoost • Neural network

Accuracy • Interpretability • Validity of DHA classifications • Performance comparison with traditional ML classifiers
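Overall accuracy alone can hide weak performance on rare DHA classes, which is why gains on imbalanced data matter. The sketch below contrasts accuracy with macro-averaged recall on toy labels. The class names "common" and "rare" and the function names are illustrative assumptions, and the numbers are not results from the paper.

```python
from collections import Counter

def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true label."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)

def per_class_recall(y_true, y_pred):
    """Recall for each class that appears in y_true."""
    support = Counter(y_true)
    hits = Counter(t for t, p in zip(y_true, y_pred) if t == p)
    return {label: hits[label] / support[label] for label in support}

def macro_recall(y_true, y_pred):
    """Unweighted mean of per-class recall, so rare classes count equally."""
    recalls = per_class_recall(y_true, y_pred)
    return sum(recalls.values()) / len(recalls)

# Toy imbalanced data: nine "common" cases and one "rare" case.
y_true = ["common"] * 9 + ["rare"]
# A classifier that always predicts the majority class.
y_pred = ["common"] * 10

print(accuracy(y_true, y_pred))      # 0.9
print(macro_recall(y_true, y_pred))  # 0.5
```

The majority-class predictor looks strong on accuracy but never detects the rare class, and macro recall exposes that gap.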

Related Fields

Natural Language Processing • Traffic Safety • Machine Learning • Data Analysis • Automotive Engineering

Keywords

traffic safety • LLM • driver behavior • crash analysis • hazardous action • natural language processing • text analysis • causation • probabilistic reasoning • automotive • insurance

Academic Context

#Traffic Safety • #Causality Analysis • #Natural Language Processing • #Machine Learning Applications

Technology Stack

Frameworks & Libraries

Llama 3.2 1B

Commercial Potential

Potential Products

Automated crash analysis software • Traffic safety data analytics platforms • Insurance risk assessment tools

Target Industries

Automotive • Insurance • Transportation • Government (Transportation Departments)

Use Case Examples

Automatically classifying driver errors from police reports • Identifying common hazardous actions leading to specific crash types • Providing insights for developing targeted safety campaigns

Competitive Edge

Leverages the advanced NLP capabilities of LLMs to automate and improve the accuracy and interpretability of driver hazardous action classification, outperforming traditional ML methods on textual data.

Market Opportunity

Significant market for automotive safety solutions, data analytics, and insurance technology.

Revenue Models

Licensing of the analysis tool • Providing data analytics services to insurance companies and automotive manufacturers

Resource Requirements

Compute Needs

Moderate, for fine-tuning and inference of the LLM.

Data Requirements

Large corpus of textual crash narratives with associated ground-truth DHA labels for fine-tuning and evaluation.
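One way such narrative and label pairs could be packaged for supervised fine-tuning is as prompt/completion records in JSON Lines form. This is a sketch under stated assumptions: the prompt wording, the `make_record` helper, the field names, and the example narrative are all hypothetical, since the summary does not describe the authors' actual data format.

```python
import json

def make_record(narrative, dha_label):
    """Build one prompt/completion training record (hypothetical format)."""
    prompt = (
        "Classify the driver hazardous action (DHA) in this "
        "two-vehicle crash narrative.\n"
        f"Narrative: {narrative.strip()}\n"
        "DHA:"
    )
    # A leading space separates the label from the prompt's trailing colon.
    return {"prompt": prompt, "completion": " " + dha_label}

# Invented narrative for illustration only; the label pairing is an assumption.
examples = [
    (
        "  Unit 1 was stopped at a red light when Unit 2 struck it from behind. ",
        "Speed and Stopping Violations",
    ),
]

# One JSON object per line, as in the JSON Lines convention.
lines = [json.dumps(make_record(text, label)) for text, label in examples]
for line in lines:
    print(line)
```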

Deployment Constraints

Accuracy depends on the quality and detail of crash narratives • Potential for LLM biases • Need for domain-specific fine-tuning

Scalability

Scalable to large volumes of crash data, limited primarily by computational resources for LLM processing.

Regulatory Considerations

Can inform transportation safety regulations and standards.

Production Readiness

Maturity Level

Research/Development

Time to Market

1-2 years for integration into industry workflows.

Patent Potential

Moderate, for the novel application of LLMs to crash narrative analysis and the specific fine-tuning methodology.
