Abstract
The fine-tuning of pre-trained models has become ubiquitous in generative AI,
computer vision, and robotics. Although much attention has been paid to
improving the efficiency of model fine-tuning, there has been less scholarship
on fine-tuning specifically for improved model performance. To remedy this
gap, we present PROFIT, one of the first optimizers designed to incrementally
fine-tune converged models on new tasks and/or datasets. Unlike traditional
optimizers such as SGD or Adam, which make minimal assumptions due to random
initializations, PROFIT takes the properties of a converged model into account
explicitly to regularize the optimization process. Employing a temporal
gradient-orthogonalization process, PROFIT outperforms traditional fine-tuning methods in
various tasks, from image classification to multimodal language model training
to large-scale motion prediction. Moreover, PROFIT is encapsulated as a modular
optimizer, which makes it easy to integrate directly into any training pipeline
with minimal engineering effort.
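The abstract does not specify the internals of the temporal gradient-orthogonalization step, so the following is only a minimal sketch of the general idea: at each step, the current gradient is projected orthogonal to a running memory of past gradient directions, so that new-task updates interfere less with what the converged model already learned. The class name `TemporalOrthogonalSGD`, the EMA-based gradient memory, and all hyperparameters are illustrative assumptions, not the paper's actual method.

```python
# Hypothetical sketch of temporal gradient orthogonalization (not PROFIT itself).
import torch
from torch.optim import Optimizer


class TemporalOrthogonalSGD(Optimizer):
    """SGD variant that projects each update orthogonal to a running gradient memory."""

    def __init__(self, params, lr=1e-3, memory_decay=0.9):
        defaults = dict(lr=lr, memory_decay=memory_decay)
        super().__init__(params, defaults)

    @torch.no_grad()
    def step(self, closure=None):
        loss = None
        if closure is not None:
            with torch.enable_grad():
                loss = closure()
        for group in self.param_groups:
            for p in group["params"]:
                if p.grad is None:
                    continue
                g = p.grad
                state = self.state[p]
                if "memory" not in state:
                    state["memory"] = torch.zeros_like(g)
                m = state["memory"]

                # Project out the component of the current gradient that lies
                # along the remembered temporal direction, so the new-task
                # update does not undo what earlier training converged to.
                denom = m.flatten().dot(m.flatten())
                if denom > 1e-12:
                    coeff = g.flatten().dot(m.flatten()) / denom
                    g = g - coeff * m

                p.add_(g, alpha=-group["lr"])

                # Fold the raw gradient into the temporal memory (an EMA,
                # assumed here as a stand-in for the converged direction).
                m.mul_(group["memory_decay"]).add_(p.grad, alpha=1 - group["memory_decay"])
        return loss
```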
Authors (7)
Anirudh S Chakravarthy
Shuai Kyle Zheng
Xin Huang
Sachithra Hemachandra
Xiao Zhang
Yuning Chai
+1 more
Submitted
December 2, 2024
Key Contributions
PROFIT is a novel optimizer designed specifically for incremental fine-tuning of converged models. Unlike traditional optimizers, it explicitly accounts for the properties of a converged model and regularizes optimization through a temporal gradient-orthogonalization process, addressing the gap in research on fine-tuning for improved model performance.
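To illustrate the "modular optimizer" claim, here is a hypothetical usage sketch reusing the `TemporalOrthogonalSGD` class from above: because it exposes the standard `torch.optim` interface, only the optimizer construction line differs from an ordinary SGD or Adam fine-tuning loop. The model, dataset, and loss below are placeholders.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

# Stand-ins for a converged pre-trained model and a new-task dataset.
model = nn.Linear(128, 10)
dataloader = [(torch.randn(32, 128), torch.randint(0, 10, (32,)))]

# Only this construction line changes relative to a standard training loop.
optimizer = TemporalOrthogonalSGD(model.parameters(), lr=1e-3)

for inputs, targets in dataloader:
    optimizer.zero_grad()
    loss = F.cross_entropy(model(inputs), targets)
    loss.backward()
    optimizer.step()
```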
Business Value
Enables more efficient and effective deployment of pre-trained models for specific downstream tasks, leading to better performance in applications like image recognition, language generation, and robotics control.