What are Foundation Models?
A foundation model is a large model pretrained on broad data at scale that can be adapted to a wide range of downstream tasks. Large language models (LLMs) and large multimodal models are the canonical examples. The term, coined at Stanford in 2021, captures a shift: instead of training a bespoke model per task, organizations build on a shared, general-purpose base — then specialize it through prompting, retrieval or fine-tuning.
Definition
A foundation model is a large, general-purpose model pretrained on broad data that serves as a base which can be adapted — via prompting, retrieval or fine-tuning — to many downstream tasks.
Key takeaways
- Foundation models are general bases adapted to many tasks.
- LLMs and multimodal models are the leading examples.
- Most are built on the transformer architecture.
- Capabilities emerge with scale of data, parameters and compute.
- You adapt them by prompting, retrieval (RAG) or fine-tuning — rarely by training from scratch.
Context
Before foundation models, teams trained narrow models for each task. The foundation-model paradigm flips this: one large model is pretrained once on broad data, then reused everywhere. That reuse is why a handful of models now underpin most AI products.
It also concentrates capability — and risk. Because so much is built on a few bases, their biases, failures and security properties propagate downstream, which is part of why governance and evaluation matter.
Architecture
Pretraining: a model with millions to trillions of parameters learns general patterns from massive datasets, typically with self-supervised objectives like next-token prediction. The transformer's attention mechanism makes this scalable.
Adaptation: the same base is specialized for use — zero/few-shot prompting, retrieval-augmented generation for fresh or private knowledge, or fine-tuning for behavior and domain. Agents wrap the model in tools and a harness.
Components
Benefits
- One base reused across many tasks.
- Strong general capability out of the box.
- Rapid adaptation without training from scratch.
- Multimodal variants span text, image, audio and more.
Risks
- Concentrated risk: flaws propagate to everything built on them.
- Costly to pretrain; few organizations can.
- Inherit biases and gaps from training data.
- Knowledge is frozen at training time without retrieval.
Tools & technologies
Examples
- Using one LLM for summarization, classification and drafting across an org.
- Adapting a base model to a domain with retrieval instead of retraining.
- Building an agent on a frontier model plus tools and memory.
FAQs
- Is a foundation model the same as an LLM?
- An LLM is the most common type of foundation model, specialized to language. Foundation models also include multimodal and other general-purpose models.
- Why are they called 'foundation' models?
- Because they serve as a shared base that many applications are built on, rather than a model trained for a single task.
- Do I need to train one?
- Almost never. Pretraining is extremely costly; nearly all value comes from adapting an existing base via prompting, retrieval or fine-tuning.
- How do agents relate to foundation models?
- An agent uses a foundation model as its reasoning core, wrapped in tools, memory and a control loop — the harness — to take actions.