MoORE: SVD-based Model MoE-ization for Conflict- and Oblivion-Resistant Multi-Task Adaptation

Abstract

Adapting large-scale foundation models in multi-task scenarios often suffers from task conflict and oblivion. To mitigate such issues, we propose a novel "model MoE-ization" strategy that leads to a conflict- and oblivion-resistant multi-task adaptation method. Given a weight matrix of a pre-trained model, our method applies SVD to it and introduces a learnable router to adjust its singular values based on tasks and samples. Accordingly, the weight matrix becomes a Mixture of Orthogonal Rank-one Experts (MoORE), in which each expert corresponds to the outer product of a left singular vector and the corresponding right one. We can further improve the model capacity by imposing a learnable orthogonal transform on the right singular vectors. Unlike low-rank adaptation (LoRA) and its MoE-driven variants, MoORE guarantees the experts' orthogonality and maintains the column space of the original weight matrix. These two properties make the adapted model resistant to conflicts among the new tasks and to the oblivion of its original tasks, respectively. Experiments on various datasets demonstrate that MoORE consistently outperforms existing multi-task adaptation methods, showing its superiority in terms of conflict- and oblivion-resistance. The code of the experiments is available at https://github.com/DaShenZi721/MoORE.
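
The abstract packs the whole mechanism into a few sentences, so a concrete sketch may help. The PyTorch module below is a minimal reading of it, not the authors' reference implementation (see their repository for that): the two-layer router, the sigmoid gating of singular values, and the `router_hidden` size are all assumptions introduced for illustration.

```python
import torch
import torch.nn as nn


class MoORELayer(nn.Module):
    """Minimal sketch of a Mixture of Orthogonal Rank-one Experts layer.

    One plausible reading of the abstract: SVD the frozen pre-trained
    weight once, treat each rank-one term u_i v_i^T as an expert, let a
    learnable router rescale the singular values per input sample, and
    mix the right singular vectors with a learnable orthogonal map.
    The router architecture and the sigmoid gating are assumptions.
    """

    def __init__(self, weight: torch.Tensor, router_hidden: int = 64):
        super().__init__()
        U, S, Vh = torch.linalg.svd(weight, full_matrices=False)
        r, in_dim = Vh.shape
        # Frozen SVD factors of the pre-trained weight.
        self.register_buffer("U", U)    # (out_dim, r)
        self.register_buffer("S", S)    # (r,)
        self.register_buffer("Vh", Vh)  # (r, in_dim)
        # Learnable router: maps a sample to r singular-value gates
        # (hypothetical two-layer MLP; the paper's router may differ).
        self.router = nn.Sequential(
            nn.Linear(in_dim, router_hidden),
            nn.ReLU(),
            nn.Linear(router_hidden, r),
        )
        # Learnable orthogonal transform acting on the right singular
        # vectors, via PyTorch's orthogonal reparametrization.
        self.right_mix = nn.utils.parametrizations.orthogonal(
            nn.Linear(r, r, bias=False)
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, in_dim)
        gates = torch.sigmoid(self.router(x))  # (batch, r), in (0, 1)
        h = x @ self.Vh.T                      # coordinates in the right singular basis
        h = self.right_mix(h)                  # orthogonal, norm-preserving mixing
        return (self.S * gates * h) @ self.U.T # output stays in span(U)
```

Because the output is always a linear combination of the columns of U, the adapted layer cannot leave the column space of the original weight, which is the property the abstract ties to oblivion-resistance.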
Authors (5)
Shen Yuan
Yin Zheng
Taifeng Wang
Binbin Liu
Hongteng Xu
Submitted
June 17, 2025
arXiv Category
cs.LG

Key Contributions

Introduces MoORE, an SVD-based "model MoE-ization" strategy for conflict- and oblivion-resistant multi-task adaptation. It decomposes each weight matrix into orthogonal rank-one experts whose contributions a learnable router rescales per task and sample; unlike LoRA and its MoE-driven variants, this guarantees expert orthogonality and preserves the original column space. The orthogonality property is cheap to verify numerically, as the check below shows.
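
Under the Frobenius inner product, the inner product of two rank-one experts factorizes as ⟨u_i v_iᵀ, u_j v_jᵀ⟩ = (u_i · u_j)(v_i · v_j), which an SVD makes 1 if i = j and 0 otherwise. A quick sanity check with arbitrary toy sizes:

```python
import torch

# Numerical check of the experts' mutual orthogonality (toy sizes).
W = torch.randn(8, 6)
U, S, Vh = torch.linalg.svd(W, full_matrices=False)
experts = [torch.outer(U[:, i], Vh[i]) for i in range(S.numel())]
# Gram matrix of Frobenius inner products between all expert pairs.
gram = torch.stack([torch.stack([(Ei * Ej).sum() for Ej in experts])
                    for Ei in experts])
print(torch.allclose(gram, torch.eye(len(experts)), atol=1e-5))  # True
```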

Business Value

Enables more efficient and effective adaptation of large pre-trained models to multiple downstream tasks, reducing the need for separate task-specific models and mitigating performance degradation on the model's original tasks.