Abstract
Low-Rank Adaptation (LoRA) enables efficient fine-tuning of large language
models but suffers from catastrophic forgetting when learned updates interfere
with the dominant singular directions that encode essential pre-trained
knowledge. We propose Orthogonal Projection LoRA (OPLoRA), a theoretically
grounded approach that prevents this interference through double-sided
orthogonal projections. By decomposing frozen weights via SVD, OPLoRA
constrains LoRA updates to lie entirely within the orthogonal complement of the
top-$k$ singular subspace using projections $P_L = I - U_k U_k^\top$ and $P_R =
I - V_k V_k^\top$. We prove that this construction exactly preserves the
top-$k$ singular triples, providing mathematical guarantees for knowledge
retention. To quantify subspace interference, we introduce $\rho_k$, a metric
measuring update alignment with dominant directions. Extensive experiments
across commonsense reasoning, mathematics, and code generation demonstrate that
OPLoRA significantly reduces forgetting while maintaining competitive
task-specific performance on LLaMA-2 7B and Qwen2.5 7B, establishing orthogonal
projection as an effective mechanism for knowledge preservation in
parameter-efficient fine-tuning.
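As a concrete illustration of the construction described above, the sketch below (not the authors' implementation) builds the double-sided projection in PyTorch: the frozen weight $W$ is decomposed by SVD, the projectors $P_L = I - U_k U_k^\top$ and $P_R = I - V_k V_k^\top$ are formed, and the raw LoRA update $BA$ is projected into the orthogonal complement of the top-$k$ singular subspace. Names such as `oplora_delta`, `lora_A`, and `lora_B`, as well as the chosen shapes and rank, are illustrative assumptions.

```python
# Minimal sketch of the double-sided orthogonal projection (assumed interface,
# not the paper's code), using PyTorch.
import torch

def oplora_delta(W: torch.Tensor, lora_B: torch.Tensor, lora_A: torch.Tensor, k: int) -> torch.Tensor:
    """Project the raw LoRA update B @ A away from the top-k singular subspace of W."""
    U, S, Vh = torch.linalg.svd(W, full_matrices=False)
    U_k = U[:, :k]                              # top-k left singular vectors
    V_k = Vh[:k, :].T                           # top-k right singular vectors
    P_L = torch.eye(W.shape[0]) - U_k @ U_k.T   # P_L = I - U_k U_k^T
    P_R = torch.eye(W.shape[1]) - V_k @ V_k.T   # P_R = I - V_k V_k^T
    return P_L @ (lora_B @ lora_A) @ P_R        # update confined to the orthogonal complement

# Sanity check: with a small projected update, the top-k singular values of
# W + delta coincide with those of W (the preservation property claimed above).
W = torch.randn(64, 32)
B, A = 0.01 * torch.randn(64, 4), torch.randn(4, 32)
delta = oplora_delta(W, B, A, k=8)
S = torch.linalg.svdvals(W)
S_new = torch.linalg.svdvals(W + delta)
print(torch.allclose(S[:8], S_new[:8], atol=1e-5))  # expected: True
```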
Submitted
October 14, 2025
Key Contributions
Proposes Orthogonal Projection LoRA (OPLoRA), a theoretically grounded method to prevent catastrophic forgetting during parameter-efficient fine-tuning. OPLoRA uses double-sided orthogonal projections based on SVD to constrain LoRA updates, mathematically guaranteeing preservation of the top-$k$ singular triples and thus essential pre-trained knowledge. Experiments across commonsense reasoning, mathematics, and code generation show that OPLoRA significantly reduces forgetting while maintaining competitive task-specific performance.
Business Value
Enables more stable and reliable fine-tuning of LLMs for specific tasks, reducing the risk of performance degradation while preserving valuable pre-trained capabilities, resulting in more robust and efficient model adaptation.