
What Prompts Don't Say: Understanding and Managing Underspecification in LLM Prompts

📄 Abstract

Prompt underspecification is a common challenge when interacting with LLMs. In this paper, we present an in-depth analysis of this problem, showing that while LLMs can often infer unspecified requirements by default (41.1%), such behavior is fragile: underspecified prompts are 2x as likely to regress across model or prompt changes, sometimes with accuracy drops exceeding 20%. This instability makes it difficult to build reliable LLM applications. Moreover, simply specifying all requirements does not consistently help, as models have limited instruction-following ability and requirements can conflict. Standard prompt optimizers likewise provide little benefit. To address these issues, we propose requirements-aware prompt optimization mechanisms that improve performance by 4.8% on average over baselines. We further advocate for a systematic process of proactive requirements discovery, evaluation, and monitoring to better manage prompt underspecification in practice.
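
To make the failure mode concrete, here is a minimal illustrative sketch (not code from the paper): the output-format and length requirements are left implicit in the first prompt and stated explicitly in the second, paired with a simple programmatic check of whether a model response satisfies them. The task, templates, and requirements are hypothetical examples.

```python
import json

# Hypothetical prompt templates for a ticket-summarization task.
# In the underspecified version, the output format and length limit are
# implicit; the paper reports models often infer such requirements by
# default, but that this behavior can regress across model/prompt changes.
UNDERSPECIFIED = "Summarize the following support ticket: {ticket}"

SPECIFIED = (
    "Summarize the following support ticket: {ticket}\n"
    "Requirements:\n"
    '- Respond with valid JSON: {{"summary": str, "priority": str}}\n'
    "- Keep the summary under 50 words.\n"
)

def meets_requirements(output: str) -> bool:
    """Check the two example requirements against a model response."""
    try:
        data = json.loads(output)
    except ValueError:
        return False
    return (
        isinstance(data, dict)
        and set(data) == {"summary", "priority"}
        and isinstance(data["summary"], str)
        and len(data["summary"].split()) <= 50
    )
```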

Key Contributions

This paper analyzes prompt underspecification in LLMs, showing that default requirement inference is fragile and harms application reliability: underspecified prompts are 2x as likely to regress across model or prompt changes, with accuracy drops that can exceed 20%. It proposes requirements-aware prompt optimization mechanisms that improve performance by 4.8% on average over baselines, and advocates a systematic process of requirements discovery, evaluation, and monitoring to manage underspecification. A sketch of the general idea follows.
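
The paper's actual optimization mechanisms are not reproduced here, but the general idea of scoring prompt candidates against explicit requirement checks, rather than end-task accuracy alone, can be sketched as follows. `call_model` is a hypothetical stand-in for any LLM API call, and the `{ticket}` placeholder matches the example templates above.

```python
from typing import Callable

def requirement_pass_rate(
    prompt_template: str,
    inputs: list[str],
    checks: list[Callable[[str], bool]],
    call_model: Callable[[str], str],
) -> float:
    """Fraction of (input, requirement-check) pairs the prompt satisfies."""
    passed = total = 0
    for ticket in inputs:
        output = call_model(prompt_template.format(ticket=ticket))
        for check in checks:
            passed += bool(check(output))
            total += 1
    return passed / total if total else 0.0

def pick_best_prompt(candidates, inputs, checks, call_model):
    """Keep the candidate prompt with the highest requirement pass rate."""
    return max(
        candidates,
        key=lambda p: requirement_pass_rate(p, inputs, checks, call_model),
    )
```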

Business Value

Enables the development of more robust and reliable LLM-powered applications: explicitly discovering, evaluating, and monitoring prompt requirements reduces costly regressions when models or prompts change, lowering development and maintenance costs and improving user experience.