Agentic AI Knowledge Base

Self-contained, citable entries on applied and agentic AI, each built as a reusable knowledge unit for people, search engines and AI agents alike. Every entry carries a definition, key takeaways, FAQs, references, a date and a version.

Concepts (10)

What is Agentic AI?

Agentic AI refers to systems that pursue goals over multiple steps — planning, calling tools, acting on an environment and reacting to feedback — instead of producing a single response. It turns a language model from a text generator into an actor that can complete tasks. The shift it represents is from do-it-yourself software, where the human drives every step, to do-it-for-me software, where the system carries out the work and reports back.

What is Agentic AI Evaluation?

Agentic AI evaluation is the practice of measuring how well an agent completes multi-step, tool-using tasks in an environment — not just the quality of a single answer. As models saturate static knowledge benchmarks, evaluation is shifting from measuring capability (what a model knows) to measuring agency (what a system can actually get done). Good evals are the feedback loop that makes harness engineering possible.
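To make the capability-versus-agency distinction concrete, here is a minimal sketch that scores an agent by task success rate over a small suite, rather than grading a single answer. The `Task` type, the `success_rate` helper and the stand-in `toy_agent` are hypothetical names chosen for illustration, not part of any particular evaluation framework.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class Task:
    name: str
    goal: str
    check: Callable[[str], bool]  # True if the agent's final result counts as success


def success_rate(tasks: list[Task], run_agent: Callable[[str], str]) -> float:
    """Fraction of tasks whose final result passes the task's own check."""
    if not tasks:
        return 0.0
    passed = sum(1 for t in tasks if t.check(run_agent(t.goal)))
    return passed / len(tasks)


# Stand-in agent (assumption): a real agent would plan, call tools and act
# in an environment before returning a final result.
def toy_agent(goal: str) -> str:
    return goal.upper()


tasks = [
    Task("shout", "hello", lambda out: out == "HELLO"),
    Task("digits", "abc", lambda out: out.isdigit()),
]
rate = success_rate(tasks, toy_agent)
```

Each task carries its own check, so the score reflects whether the job got done, not how the output was phrased.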

What is an AI Agent?

An AI agent is a system that combines a model with tools, memory and a control loop to take actions toward a goal, rather than just answering a single prompt. It perceives a situation, decides what to do, acts through tools, observes the result and repeats until done. Autonomy ranges from a single tool call to long-horizon execution of complex tasks.
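The perceive, decide, act, observe cycle described above can be sketched as a small control loop. This is an illustrative toy under stated assumptions: the state is a plain integer, `decide` stands in for a model call, and the `inc` and `dec` tools are hypothetical.

```python
from typing import Callable


def run_agent(
    goal: int,
    state: int,
    decide: Callable[[int, int], str],
    tools: dict[str, Callable[[int], int]],
    max_steps: int = 20,
) -> tuple[int, int]:
    """Loop until the goal is reached or the step budget runs out.

    Returns the final state and the number of tool calls made.
    """
    for step in range(max_steps):
        if state == goal:                 # perceive: is the task done?
            return state, step
        action = decide(state, goal)      # decide: a model call in a real agent
        state = tools[action](state)      # act through a tool, observe the new state
    return state, max_steps


# Hypothetical tools and a hand-written policy standing in for the model.
tools = {"inc": lambda x: x + 1, "dec": lambda x: x - 1}
policy = lambda state, goal: "inc" if state < goal else "dec"

final_state, steps = run_agent(goal=3, state=0, decide=policy, tools=tools)
```

The `max_steps` cap is one simple guardrail: it bounds how long the loop can run if the policy never reaches the goal.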

What are Embeddings & Vector Search?

An embedding is a numeric vector that represents the meaning of text (or images, audio, code) so that semantically similar items sit close together in vector space. Vector search finds the nearest embeddings to a query, enabling search by meaning rather than keywords. Embeddings are the backbone of retrieval-augmented generation, semantic search, clustering and recommendation.
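As a rough sketch of search by meaning, the snippet below ranks stored vectors by cosine similarity to a query vector. The three-dimensional vectors are made up for illustration; real embeddings come from an embedding model and have hundreds or thousands of dimensions, and production systems typically use an approximate nearest-neighbor index rather than this brute-force scan.

```python
import math


def cosine(a: list[float], b: list[float]) -> float:
    """Cosine similarity: the dot product of the two vectors divided by their norms."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def nearest(query: list[float], index: dict[str, list[float]], k: int = 1) -> list[str]:
    """Return the k item names whose vectors are most similar to the query."""
    ranked = sorted(index, key=lambda name: cosine(query, index[name]), reverse=True)
    return ranked[:k]


# Toy vectors (assumption): hand-picked so related items point in a similar direction.
index = {
    "cat": [0.9, 0.1, 0.0],
    "kitten": [0.8, 0.2, 0.1],
    "invoice": [0.0, 0.1, 0.9],
}
query = [0.85, 0.15, 0.05]
top = nearest(query, index, k=2)
```

Here the query lands near "cat" and "kitten" and far from "invoice", even though no keywords are compared.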

What is Fine-tuning?

Fine-tuning continues training a pretrained model on a smaller, targeted dataset to specialize its behavior, style or domain knowledge. It is far cheaper than pretraining and changes the model's weights — unlike prompting or retrieval, which leave the model unchanged. Use it to lock in a consistent format, tone or skill; use retrieval instead when you need fresh or private facts.

What are Foundation Models?

A foundation model is a large model pretrained on broad data at scale that can be adapted to a wide range of downstream tasks. Large language models (LLMs) and large multimodal models are the canonical examples. The term, coined at Stanford in 2021, captures a shift: instead of training a bespoke model per task, organizations build on a shared, general-purpose base — then specialize it through prompting, retrieval or fine-tuning.

What is the Model Context Protocol (MCP)?

The Model Context Protocol (MCP) is an open standard for connecting AI models and agents to external tools, data sources and systems through a single, uniform interface. Introduced by Anthropic in late 2024, it standardizes how an application exposes context and capabilities to a model — acting like a universal adapter so any compliant client can talk to any compliant server.

What is Prompt Engineering?

Prompt engineering is the practice of designing the inputs given to a language model so it produces the desired output reliably. A good prompt specifies the role, the task, the constraints, the output format and, when useful, examples. It is the most accessible lever for steering model behavior — and one layer of the broader harness around a model — but on its own it does not make a system reliable at scale.

What are Reasoning Models?

Reasoning models are language models trained to spend extra computation 'thinking' before they answer — generating internal reasoning steps to solve harder problems in math, code and logic. They trade latency and cost for accuracy on complex, multi-step tasks. The key idea is test-time compute: letting a model reason longer at inference, rather than only making the model bigger, can substantially improve results.

What is Tool Use (Function Calling)?

Tool use, also called function calling, lets a language model invoke external functions, APIs or code to fetch information or take actions in the real world. The model decides which tool to call and with what arguments; the application runs the tool and returns the result, which the model uses to continue. Tool use is the bridge that turns a text generator into an agent that can actually do things.
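The division of labor described above, where the model picks the tool and arguments and the application runs it, can be sketched as a small dispatcher. The JSON shape of the call, the `get_weather` tool and the `execute_tool_call` helper are assumptions for illustration; each model provider defines its own tool-call format.

```python
import json


def get_weather(city: str) -> str:
    # Hypothetical tool: a real implementation would call an external API.
    return f"Sunny in {city}"


# Registry of tools the application is willing to run.
TOOLS = {"get_weather": get_weather}


def execute_tool_call(raw_call: str) -> str:
    """Parse a model-emitted tool call, run the tool, return a JSON result string."""
    call = json.loads(raw_call)
    name = call["name"]
    arguments = call.get("arguments", {})
    if name not in TOOLS:
        return json.dumps({"error": f"unknown tool: {name}"})
    return json.dumps({"result": TOOLS[name](**arguments)})


# Stand-in for what a model might emit when it decides to call a tool.
model_output = '{"name": "get_weather", "arguments": {"city": "Oslo"}}'
tool_result = execute_tool_call(model_output)
```

The result string would then be passed back to the model so it can continue. Only tools in the registry can run, which keeps the application in control of what the model is able to do.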

Harness Engineering (4)

What are Agent Memory Systems?

Agent memory is how an AI agent retains and recalls information beyond a single context window — across steps, sessions and tasks. It typically separates short-term working memory (the current context) from long-term memory (durable stores the agent reads from and writes to). Memory is what lets an agent carry state through a long task, remember a user over time, and avoid repeating work. It is a core layer of harness engineering.

What is AI Agent Observability?

AI observability is the practice of instrumenting AI systems — especially agents — so you can see what they did and why. It captures traces of each step: prompts, tool calls, retrieved context, model outputs, tokens, latency and cost. Because agents are non-deterministic and multi-step, observability is what makes failures diagnosable and improvement systematic. It is the layer that feeds evaluation and closes the harness-engineering loop.
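As a rough illustration of step-level tracing, the sketch below wraps each step in a recorder that captures its kind, name, latency and token count. The `Span` and `Trace` classes are hypothetical and far simpler than a real tracing library; token counts are passed in by hand here, whereas a real system would read them from the model response.

```python
import time
from dataclasses import dataclass, field
from typing import Any, Callable


@dataclass
class Span:
    kind: str         # e.g. "model_call" or "tool_call"
    name: str
    latency_s: float
    tokens: int = 0


@dataclass
class Trace:
    spans: list[Span] = field(default_factory=list)

    def record(self, kind: str, name: str, fn: Callable[..., Any], *args: Any, tokens: int = 0) -> Any:
        """Run one step, time it, and append a span describing what happened."""
        start = time.perf_counter()
        result = fn(*args)
        self.spans.append(Span(kind, name, time.perf_counter() - start, tokens))
        return result

    def total_tokens(self) -> int:
        return sum(span.tokens for span in self.spans)


trace = Trace()
summary = trace.record("model_call", "summarize", lambda text: text[:5], "hello world", tokens=12)
total = trace.record("tool_call", "add", lambda a, b: a + b, 2, 3)
```

After a run, `trace.spans` holds an ordered record of every step, which is the raw material for diagnosing a failure or feeding an evaluation.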

What is Context Engineering?

Context engineering is the discipline of deciding what information enters a model's limited context window at each step, and what stays out. As agents run over many steps, naively stuffing everything into context degrades output quality and drives up cost. Context engineering curates the right instructions, retrieved knowledge, tool results and memory so the model has exactly what it needs, when it needs it. It is a core part of harness engineering.
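One simple way to decide what enters the window is to rank candidate items by priority and admit them until a token budget is spent. The sketch below is a greedy selection under stated assumptions: priorities are assigned by hand, and token cost is approximated by counting whitespace-separated words, which a real system would replace with the model's tokenizer.

```python
def build_context(items: list[tuple[int, str]], budget: int) -> tuple[list[str], int]:
    """Greedily admit the highest-priority items that still fit within the budget.

    items: (priority, text) pairs, where a higher priority is more important.
    Returns the chosen texts and the estimated tokens used.
    """
    chosen: list[str] = []
    used = 0
    for _priority, text in sorted(items, key=lambda item: item[0], reverse=True):
        cost = len(text.split())  # crude token estimate (assumption)
        if used + cost <= budget:
            chosen.append(text)
            used += cost
    return chosen, used


candidates = [
    (3, "system: answer briefly"),
    (2, "doc: refund policy is 30 days"),
    (1, "old chat: user said hello and talked about weather"),
]
context, tokens_used = build_context(candidates, budget=10)
```

With a budget of 10, the instructions and the retrieved document fit, and the low-priority chat history stays out.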

What is Harness Engineering?

Harness engineering is the discipline of designing and optimizing the scaffolding around an AI model — the prompts, tools, memory, environment, control loop and guardrails — so the model performs reliably on real tasks. Its core premise: as base models converge in raw capability, competitive advantage shifts from the model itself to the harness built around it. The same model can pass or fail a task depending almost entirely on its harness.

Patterns (2)

Architectures (1)

Governance (3)