Dominic Rigby

Ultimate Guide to Fine-Tuning

Date: 1st June 2p25 arXiv Link: (Blog post / guide, not a formal paper)

Comprehensive overview of fine‐tuning techniques for large language models across various domains.
Stresses the critical importance of data management: cleaning, deduplication, diversity, representativeness, and stratification.
Surveys many popular fine‐tuning methods—LoRA, adapter‐based tuning, full‐parameter fine‐tuning, prompt tuning, etc. and compares their trade‐offs.
Emphasizes that fine‐tuning is not just an optimization problem but a data curation and pipeline engineering challenge.

LoRA (Low-Rank Adaptation): only train low‐rank update matrices on top of frozen pre‐trained weights to save compute/memory.
Adapter Tuning: insert small “adapter” modules between model layers that are the only parameters updated.
Prompt Tuning & Prefix Tuning: keep the backbone frozen; prepend trainable tokens/embeddings to prompts to steer behavior.
Data‐centric best practices:
- Deduplication: remove near duplicates to reduce overfitting.
- Augmentation: generate paraphrases/back‐translations to boost tail examples.
- Stratified sampling: ensure balanced representation of classes/tasks.
- Continual monitoring: track out‐of‐distribution (OOD) drift and retrain as needed.