ConceptsUpdated 2026-06-21 · Version 1.0

What are Foundation Models?

A foundation model is a large model pretrained on broad data at scale that can be adapted to a wide range of downstream tasks. Large language models (LLMs) and large multimodal models are the canonical examples. The term, coined at Stanford in 2021, captures a shift: instead of training a bespoke model per task, organizations build on a shared, general-purpose base — then specialize it through prompting, retrieval or fine-tuning.

Evidence: BenchmarkConfidence: HighSource: BenchmarkSource: Paper

Definition

A foundation model is a large, general-purpose model pretrained on broad data that serves as a base which can be adapted — via prompting, retrieval or fine-tuning — to many downstream tasks.

Key takeaways

  • Foundation models are general bases adapted to many tasks.
  • LLMs and multimodal models are the leading examples.
  • Most are built on the transformer architecture.
  • Capabilities emerge with scale of data, parameters and compute.
  • You adapt them by prompting, retrieval (RAG) or fine-tuning — rarely by training from scratch.

Context

Before foundation models, teams trained narrow models for each task. The foundation-model paradigm flips this: one large model is pretrained once on broad data, then reused everywhere. That reuse is why a handful of models now underpin most AI products.

It also concentrates capability — and risk. Because so much is built on a few bases, their biases, failures and security properties propagate downstream, which is part of why governance and evaluation matter.

Architecture

Pretraining: a model with millions to trillions of parameters learns general patterns from massive datasets, typically with self-supervised objectives like next-token prediction. The transformer's attention mechanism makes this scalable.

Adaptation: the same base is specialized for use — zero/few-shot prompting, retrieval-augmented generation for fresh or private knowledge, or fine-tuning for behavior and domain. Agents wrap the model in tools and a harness.

Components

Transformer architecturePretraining data & objectiveParameters (weights)TokenizerAdaptation layer (prompt / RAG / fine-tune)

Benefits

  • One base reused across many tasks.
  • Strong general capability out of the box.
  • Rapid adaptation without training from scratch.
  • Multimodal variants span text, image, audio and more.

Risks

  • Concentrated risk: flaws propagate to everything built on them.
  • Costly to pretrain; few organizations can.
  • Inherit biases and gaps from training data.
  • Knowledge is frozen at training time without retrieval.

Tools & technologies

Frontier LLMs (Claude, GPT, Gemini)Open-weight models (Llama, Mistral)Multimodal modelsModel hosting / inference platforms

Examples

  • Using one LLM for summarization, classification and drafting across an org.
  • Adapting a base model to a domain with retrieval instead of retraining.
  • Building an agent on a frontier model plus tools and memory.

FAQs

Is a foundation model the same as an LLM?
An LLM is the most common type of foundation model, specialized to language. Foundation models also include multimodal and other general-purpose models.
Why are they called 'foundation' models?
Because they serve as a shared base that many applications are built on, rather than a model trained for a single task.
Do I need to train one?
Almost never. Pretraining is extremely costly; nearly all value comes from adapting an existing base via prompting, retrieval or fine-tuning.
How do agents relate to foundation models?
An agent uses a foundation model as its reasoning core, wrapped in tools, memory and a control loop — the harness — to take actions.

References