Updated 2026-06-21 · Version 1.0

What is Fine-tuning?

Fine-tuning continues training a pretrained model on a smaller, targeted dataset to specialize its behavior, style or domain knowledge. It is far cheaper than pretraining and changes the model's weights — unlike prompting or retrieval, which leave the model unchanged. Use it to lock in a consistent format, tone or skill; use retrieval instead when you need fresh or private facts.

Evidence: Benchmark · Confidence: High · Source: Benchmark · Source: Paper

Definition

Fine-tuning is the process of further training a pretrained model on a focused dataset to adapt its weights toward a specific behavior, style, format or domain.

Key takeaways

Fine-tuning updates model weights; prompting and RAG do not.
Best for consistent behavior, style or format — not for fresh facts.
Parameter-efficient methods (LoRA) make it cheap and practical.
RLHF is a form of fine-tuning using human preferences.
Default to prompting and retrieval first; fine-tune when they plateau.

Context

A pretrained foundation model is a generalist. Fine-tuning narrows it: shown enough examples of the target behavior, the model internalizes it, so you no longer need to specify it in every prompt.

It is one of three adaptation levers, alongside prompting and retrieval. The art is choosing the right one: fine-tune for how the model should behave, retrieve for what it should know.
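
The rule of thumb above can be written down as a small decision helper. This is only a sketch of the article's guidance: the function name, the three boolean inputs and the lever labels are hypothetical, and real decisions usually weigh cost, latency and data availability as well.

```python
def choose_levers(
    needs_fresh_or_private_facts: bool,
    needs_consistent_behavior: bool,
    prompting_and_retrieval_plateaued: bool,
) -> list[str]:
    """Return the adaptation levers suggested by the article's rule of thumb."""
    levers = ["prompting"]  # cheapest and most flexible, so always the starting point
    if needs_fresh_or_private_facts:
        levers.append("retrieval")  # retrieve for what the model should know
    if needs_consistent_behavior and prompting_and_retrieval_plateaued:
        levers.append("fine-tuning")  # fine-tune for how the model should behave
    return levers


# A strict output format that prompting alone has not achieved reliably.
print(choose_levers(False, True, True))
```

The levers are returned as a list rather than a single choice because they are complementary: a system can prompt, retrieve and fine-tune at the same time.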

Architecture

Full fine-tuning updates all weights, which is powerful but expensive. Parameter-efficient fine-tuning (PEFT), notably LoRA, freezes the base model and trains small low-rank adapter weights alongside it, capturing most of the benefit at a fraction of the cost.
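
To make the cost difference concrete, the sketch below counts trainable parameters for a single weight matrix under full fine-tuning and under a LoRA adapter. It assumes the standard LoRA formulation, in which a frozen d_out x d_in matrix is augmented with a low-rank product of two small matrices of rank r. The layer size and rank are illustrative values, not figures from this article.

```python
def full_params(d_out: int, d_in: int) -> int:
    """Trainable parameters when the whole weight matrix is updated."""
    return d_out * d_in


def lora_params(d_out: int, d_in: int, rank: int) -> int:
    """Trainable parameters for a LoRA adapter: B is d_out x rank, A is rank x d_in."""
    return rank * (d_out + d_in)


# Illustrative (assumed) sizes: one 4096 x 4096 projection, adapter rank 8.
d_out, d_in, rank = 4096, 4096, 8
full = full_params(d_out, d_in)
lora = lora_params(d_out, d_in, rank)
fraction = lora / full
print(f"full: {full:,}  lora: {lora:,}  fraction: {fraction:.4%}")
```

With these assumed sizes the adapter trains 65,536 parameters against 16,777,216 for the full matrix, which is 1/256 of the total (about 0.4%). The ratio shifts with the rank and with how many layers receive adapters.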

Instruction tuning and RLHF are specialized fine-tuning stages that turn a raw base model into a helpful, aligned assistant. Quality of the dataset matters far more than its size.
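
One common way preference judgments become a training signal is a pairwise loss on a reward model: the response a human preferred should score higher than the one they rejected. The sketch below assumes the Bradley-Terry style loss, -log sigmoid(score_chosen - score_rejected), which is widely used for reward modelling. The article does not specify a loss, and the scores here are made-up numbers.

```python
import math


def pairwise_preference_loss(score_chosen: float, score_rejected: float) -> float:
    """-log(sigmoid(chosen - rejected)), computed in a numerically stable form."""
    margin = score_chosen - score_rejected
    # Equivalent to log(1 + exp(-margin)) but safe for large |margin|.
    return max(-margin, 0.0) + math.log1p(math.exp(-abs(margin)))


# Hypothetical reward-model scores for (preferred, rejected) response pairs.
pairs = [(2.0, 0.5), (0.1, 0.3), (1.2, 1.2)]
losses = [pairwise_preference_loss(c, r) for c, r in pairs]
mean_loss = sum(losses) / len(losses)
print([round(x, 4) for x in losses], round(mean_loss, 4))
```

The loss is small when the preferred response already scores well above the rejected one, equals log 2 when the two scores tie, and grows roughly linearly when the ranking is wrong.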

Components

Pretrained base model
Curated training dataset
Training objective
PEFT / LoRA adapters
Evaluation set

Benefits

Bakes in consistent behavior, style or format.
Reduces prompt length and per-call cost.
Can teach narrow skills a base model lacks.
PEFT makes it affordable and fast.

Risks

Does not add fresh or private facts — use retrieval for that.
Risk of catastrophic forgetting (losing general capabilities) or overfitting to the training set.
Needs a quality, well-labeled dataset and an eval set.
Couples you to a model version; migration costs on upgrades.
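
An eval set only detects overfitting if its examples never leak into training. A minimal sketch of one way to keep the two apart is a deterministic, hash-based split, so that the same example always lands on the same side even when the dataset is regenerated. The ID scheme and the 10% eval fraction are assumptions for illustration.

```python
import hashlib


def assign_split(example_id: str, eval_fraction: float = 0.1) -> str:
    """Deterministically map an example ID to 'train' or 'eval'."""
    digest = hashlib.sha256(example_id.encode("utf-8")).digest()
    # Interpret the first 8 bytes as an integer and scale it to [0, 1).
    bucket = int.from_bytes(digest[:8], "big") / 2**64
    return "eval" if bucket < eval_fraction else "train"


# Hypothetical example IDs; in practice these would come from your dataset.
ids = [f"example-{i}" for i in range(1000)]
splits = {example_id: assign_split(example_id) for example_id in ids}
eval_count = sum(1 for split in splits.values() if split == "eval")
print(f"{eval_count} of {len(ids)} examples held out for evaluation")
```

Hashing a stable ID, rather than shuffling with a random seed, keeps an example's assignment fixed when new data is appended.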

Tools & technologies

LoRA / PEFT libraries
Provider fine-tuning APIs
RLHF / preference-tuning pipelines
Evaluation suites

Examples

Fine-tuning a model to always output a strict company JSON format.
Teaching a consistent brand voice for generated copy.
Adapting a model to a specialized domain's terminology.
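
For the strict JSON example, a format-compliance rate is a natural metric to track before and after fine-tuning. The sketch below checks whether each model output is a JSON object with exactly the expected fields and types. The field names and sample outputs are hypothetical stand-ins for a real company schema.

```python
import json

# Hypothetical required fields for the "strict company JSON format".
REQUIRED_FIELDS = {"ticket_id": str, "priority": str, "summary": str}


def is_compliant(output: str) -> bool:
    """True if output is a JSON object with exactly the required fields and types."""
    try:
        parsed = json.loads(output)
    except json.JSONDecodeError:
        return False
    if not isinstance(parsed, dict) or set(parsed) != set(REQUIRED_FIELDS):
        return False
    return all(isinstance(parsed[key], kind) for key, kind in REQUIRED_FIELDS.items())


def compliance_rate(outputs: list[str]) -> float:
    """Fraction of model outputs that satisfy the format (0.0 for an empty list)."""
    return sum(is_compliant(o) for o in outputs) / len(outputs) if outputs else 0.0


# Made-up model outputs for illustration.
samples = [
    '{"ticket_id": "T-1", "priority": "high", "summary": "Login fails"}',
    'Sure! Here is the JSON: {"ticket_id": "T-2"}',
    '{"ticket_id": "T-3", "priority": 2, "summary": "Slow page"}',
]
rate = compliance_rate(samples)
print(f"compliance rate: {rate:.2f}")
```

Only the first sample passes: the second wraps the JSON in chatty text and the third uses a number where a string is required. Running the same check on a held-out eval set gives a direct measure of whether fine-tuning locked in the format.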

FAQs

Fine-tuning or RAG? Fine-tune to change how the model behaves (style, format, skill); use retrieval (RAG) to give it fresh or private knowledge. They are complementary, not competing.
Is fine-tuning expensive? Full fine-tuning can be, but parameter-efficient methods like LoRA train tiny adapters and make it cheap and fast for most use cases.
What is RLHF? Reinforcement learning from human feedback is a fine-tuning stage that uses human preference judgments to make a model more helpful, harmless and honest.
When should I fine-tune? After prompting and retrieval plateau. If you can solve it with a better prompt or relevant context, do that first, because it is cheaper and more flexible.