
Context engineering is the right question. Here are the terms for the answers.

In 2025, Andrej Karpathy endorsed a name for what engineers working with AI agents had been circling around for two years: context engineering. The problem is not the prompt. The problem is the entire context the model receives, and whether that context is designed to make the task solvable. Tobi Lutke's definition is the compact version: context engineering is "the art of providing all the context for the task to be plausibly solvable by the LLM."

That framing is right. It is also incomplete.

Naming the problem as "context engineering" tells you where to look. It does not tell you what breaks, why it breaks, or what to build to prevent it from breaking again. The vocabulary for that part does not exist in the context engineering literature, because that literature is about the frame, not the components.

Cognitive Interface Architecture is the vocabulary for the components.

What context engineering requires you to build

When engineers say they are doing "context engineering," they are managing six distinct problems simultaneously. Each one has a name in the Cognitive Interface Architecture vocabulary. Here is the mapping.

Problem 1: You do not know what the model is actually receiving

You write a system prompt. You add tool definitions. You retrieve documents. You have a conversation history. The model receives all of that at once, as a single undifferentiated input. You have no direct view of it.

The Control Surface is the term for this totality: every token the model receives at the moment of generation. System prompt, conversation history, tool schemas, retrieved context, everything. Context engineering is, at its core, the discipline of designing the Control Surface deliberately rather than letting it accumulate by accident.

The failure mode when you ignore the Control Surface: context pollution. Retrieved documents contradict the system prompt. Tool schemas consume token budget you needed for examples. The model resolves the contradictions by guessing. Production output is inconsistent in ways you cannot trace to a single cause because you were not reasoning about the full Control Surface.
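One way to get a direct view of the Control Surface is to assemble it in a single place and inspect it before every call. The sketch below is a minimal illustration, not a reference implementation: the `ControlSurface` class, its field names, and the bracketed section labels are all hypothetical, and the word count is only a crude stand-in for a real tokenizer.

```python
from dataclasses import dataclass, field


@dataclass
class ControlSurface:
    """Everything the model receives for one generation, kept in one inspectable object."""
    system_prompt: str
    tool_schemas: list[str] = field(default_factory=list)
    retrieved_docs: list[str] = field(default_factory=list)
    history: list[str] = field(default_factory=list)

    def sections(self) -> dict[str, str]:
        # Flatten each component to the text that would actually be sent.
        return {
            "system_prompt": self.system_prompt,
            "tool_schemas": "\n".join(self.tool_schemas),
            "retrieved_docs": "\n".join(self.retrieved_docs),
            "history": "\n".join(self.history),
        }

    def render(self) -> str:
        # The single input the model sees; empty sections are omitted.
        return "\n\n".join(
            f"[{name}]\n{text}" for name, text in self.sections().items() if text
        )

    def report(self) -> dict[str, int]:
        # Approximate size per section (whitespace-separated words, NOT real tokens).
        return {name: len(text.split()) for name, text in self.sections().items()}


surface = ControlSurface(
    system_prompt="You are a billing assistant.",
    tool_schemas=["lookup_invoice(id: str)"],
    retrieved_docs=["Refunds take 5 days."],
)
print(surface.render())
print(surface.report())
```

Logging `render()` alongside each model response is what makes an inconsistent output traceable to a specific section rather than to "the prompt" in general.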

Problem 2: You cannot fit everything into the window

Context windows have grown. The failure mode has not disappeared. It has shifted from "cannot fit" to "attends poorly to content buried in the middle of a long context" and "costs scale with token count in ways that compound across agentic loops."

Context Window Budgeting is the term for the practice of treating the context window as a resource with a hard budget: assigning token allocations to system prompt, examples, retrieved context, and conversation history before starting a task, and enforcing those allocations rather than letting them drift.

The failure mode when you skip budgeting: a multi-step agent loop works on steps 1 and 2, then keeps accumulating context until the window overflows on step 7, mid-task, with no fallback behaviour defined. The agent either truncates silently or errors. With no fallback defined in advance, neither outcome is recoverable.
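A budget like the one described above can be sketched in a few lines. Everything here is an assumption made for illustration: the section names, the numbers in `BUDGET`, the `estimate_tokens` word-count proxy (swap in your model's real tokenizer), and the choice to drop the oldest history turns as the defined fallback.

```python
class BudgetExceeded(Exception):
    """Raised when a section exceeds its allocation, instead of overflowing silently."""


def estimate_tokens(text: str) -> int:
    # Crude proxy: one token per whitespace-separated word.
    return len(text.split())


# Hypothetical allocations, decided before the task starts.
BUDGET = {"system_prompt": 50, "examples": 100, "retrieved": 200, "history": 150}


def enforce(section: str, text: str, budget: dict[str, int] = BUDGET) -> str:
    """Return text unchanged if it fits its allocation, otherwise fail loudly."""
    used = estimate_tokens(text)
    limit = budget[section]
    if used > limit:
        raise BudgetExceeded(f"{section}: {used} tokens exceeds budget of {limit}")
    return text


def trim_history(turns: list[str], limit: int) -> list[str]:
    """Explicit fallback: keep the most recent turns that fit within the limit."""
    kept: list[str] = []
    total = 0
    for turn in reversed(turns):
        cost = estimate_tokens(turn)
        if total + cost > limit:
            break
        kept.append(turn)
        total += cost
    return list(reversed(kept))


recent = trim_history(["a b c", "d e", "f"], limit=3)
print(recent)
```

The point of the design is that overflow becomes a decision made at design time (raise, or trim by a known rule) rather than something discovered on step 7.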

Problem 3: Your constraints conflict with each other

You write a prompt with twelve constraints. Eleven of them are fine. One of them is mutually exclusive with another one, and you do not find out until the model starts producing output that satisfies neither.

Constraint Satisfaction Problem is the framing tool: before deploying a constraint set, check every constraint pair for mutual exclusivity. This is design-time work, not runtime work. The model cannot resolve conflicting constraints reliably; it will satisfy one and violate the other, or hedge and satisfy neither, and its choice will not be consistent across runs.
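The pairwise check can be partly mechanised when constraints are reducible to numeric ranges. The sketch below assumes exactly that: each constraint is modelled as an allowed interval on a named dimension, which is a simplification introduced here for illustration. Constraints written in natural language still need a human to read each pair.

```python
from dataclasses import dataclass
from itertools import combinations


@dataclass(frozen=True)
class Constraint:
    name: str
    dimension: str  # what the constraint measures, e.g. "word_count"
    low: float      # smallest allowed value
    high: float     # largest allowed value


def conflicts(a: Constraint, b: Constraint) -> bool:
    """Two constraints are mutually exclusive if their allowed ranges do not overlap."""
    if a.dimension != b.dimension:
        return False
    return max(a.low, b.low) > min(a.high, b.high)


def find_conflicts(constraints: list[Constraint]) -> list[tuple[str, str]]:
    # Check every pair once, at design time.
    return [(a.name, b.name) for a, b in combinations(constraints, 2) if conflicts(a, b)]


constraints = [
    Constraint("be brief", "word_count", 0, 50),
    Constraint("cover all five sections in depth", "word_count", 300, float("inf")),
    Constraint("at most 3 citations", "citations", 0, 3),
]
print(find_conflicts(constraints))
```

With twelve constraints there are 66 pairs, which is small enough to check exhaustively before anything is deployed.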

Constraint Architecture is the practice of organising those constraints into a coherent system: what must be true, what must never happen, what the model should prefer when all mandatory constraints are satisfied. Context engineering without Constraint Architecture produces context that is rich in information and underspecified in what to do with it.
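The three tiers described above (what must be true, what must never happen, what to prefer) map naturally onto a small data structure. This is a sketch under stated assumptions: `ConstraintSet`, its field names, and the example checks are hypothetical, and each check is a plain Python predicate over the output text.

```python
from dataclasses import dataclass, field
from typing import Callable

Check = Callable[[str], bool]


@dataclass
class ConstraintSet:
    must: dict[str, Check] = field(default_factory=dict)    # mandatory conditions
    never: dict[str, Check] = field(default_factory=dict)   # prohibited conditions
    prefer: dict[str, Check] = field(default_factory=dict)  # tie-breakers only

    def evaluate(self, output: str) -> dict:
        # A "must" is violated when its check fails; a "never" when its check fires.
        violated = [name for name, check in self.must.items() if not check(output)]
        violated += [name for name, check in self.never.items() if check(output)]
        met = [name for name, check in self.prefer.items() if check(output)]
        return {"valid": not violated, "violated": violated, "preferences_met": met}


rules = ConstraintSet(
    must={"mentions order id": lambda out: "ORDER-" in out},
    never={"promises a refund": lambda out: "refund guaranteed" in out.lower()},
    prefer={"under 40 words": lambda out: len(out.split()) < 40},
)
print(rules.evaluate("ORDER-123 has shipped."))
```

Preferences never affect validity here; they only matter once every mandatory and prohibited condition is already satisfied.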

Problem 4: You have not defined what the model should do when it cannot answer

A model with no fallback behaviour resolves uncertainty with confident output. It does not say "I cannot determine this from the context provided." It produces the most plausible-sounding answer it can construct and returns it with no signal that it is uncertain.

The Fallback Cascade is a tiered policy: at each level of scope ambiguity, the agent has an explicit instruction. Refuse, narrow to the in-scope portion, confirm with the user, or execute. Each tier is more permissive than the last. The agent moves to a more permissive tier only when the request clearly qualifies for it; otherwise it stays at the most restrictive tier that handles the request safely.

Without a Fallback Cascade, context engineering produces a model that knows a great deal about the task and has no instruction for what to do when the task exceeds what the context makes answerable. It guesses.
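One way to read the four tiers is as an ordered decision function. The sketch below is illustrative only: the inputs `in_scope` and `ambiguous` are hypothetical signals that some upstream classifier or rule would have to supply, and the thresholds are assumptions.

```python
from enum import Enum


class Tier(Enum):
    REFUSE = 1   # nothing in the request is in scope
    NARROW = 2   # answer only the in-scope portion
    CONFIRM = 3  # in scope, but check intent with the user first
    EXECUTE = 4  # in scope and unambiguous


def choose_tier(in_scope: float, ambiguous: bool) -> Tier:
    """Pick the most restrictive tier that still handles the request.

    in_scope: fraction of the request inside the agent's scope, from 0.0 to 1.0.
    ambiguous: whether the in-scope request could be read more than one way.
    """
    if in_scope <= 0.0:
        return Tier.REFUSE
    if in_scope < 1.0:
        return Tier.NARROW
    if ambiguous:
        return Tier.CONFIRM
    return Tier.EXECUTE


print(choose_tier(in_scope=0.5, ambiguous=True))
```

The same ordering can be written into the system prompt as explicit instructions; expressing it as code first makes gaps in the policy easier to spot.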

Problem 5: You have not specified what requires human review before execution

Context engineering makes agents capable. Capable agents take actions. Some of those actions are irreversible: deleting files, sending messages, committing code, making API calls that write to external systems.

The Circuit Breaker is the prompt-level gate: certain operations require an explicit unlock phrase before the agent proceeds. The Circuit Breaker is not a guardrail against bad output. It is a gate against irreversible action. It is the part of context engineering that the "context" framing does not naturally surface, because it is not about what information the model has. It is about what the model is permitted to do with that information.
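The text above describes the Circuit Breaker as a prompt-level gate. The sketch below mirrors the same rule in the tool-dispatch layer, which is an addition made here for illustration rather than part of the definition. The tool names, the `UNLOCK_PHRASE` value, and the `dispatch` function are all hypothetical.

```python
from typing import Callable

# Hypothetical set of operations that cannot be undone.
IRREVERSIBLE = {"delete_file", "send_message", "commit_code"}

# Hypothetical unlock phrase the user must type for an irreversible action.
UNLOCK_PHRASE = "CONFIRM IRREVERSIBLE"


class CircuitOpen(Exception):
    """Raised when an irreversible tool is requested without the unlock phrase."""


def dispatch(tool_name: str, last_user_message: str, tools: dict[str, Callable[[], str]]) -> str:
    # Gate on the action, not on the quality of the output.
    if tool_name in IRREVERSIBLE and UNLOCK_PHRASE not in last_user_message:
        raise CircuitOpen(f"{tool_name} requires the unlock phrase before it can run")
    return tools[tool_name]()


tools = {"read_file": lambda: "contents", "delete_file": lambda: "deleted"}
print(dispatch("read_file", "show me the file", tools))
```

Reversible tools pass straight through; only the operations listed as irreversible wait for the phrase.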

Problem 6: You do not have a structured test for whether your context design is correct

Context engineering without validation is optimism. You design the context, run the task, the output looks right, and you ship it. Then it fails on inputs that vary slightly from the ones you tested, because the context design had assumptions you did not document.

The Ground Truth Contract is that documentation: an explicit statement, written before testing begins, of what correct output looks like, what incorrect output looks like, and which errors are acceptable vs. blocking. It converts the implicit success criterion in your head into a testable specification.
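A Ground Truth Contract can be written down as data before any testing starts. The sketch below is one possible shape, with hypothetical names throughout: `ContractCase`, `run_contract`, and the `fake_model` stub, which stands in for a real model call.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class ContractCase:
    prompt: str
    is_correct: Callable[[str], bool]  # what correct output looks like
    blocking: bool                     # True: failure blocks release; False: acceptable


def run_contract(cases: list[ContractCase], model: Callable[[str], str]) -> dict:
    blocking: list[str] = []
    acceptable: list[str] = []
    for case in cases:
        if not case.is_correct(model(case.prompt)):
            (blocking if case.blocking else acceptable).append(case.prompt)
    return {"ship": not blocking, "blocking": blocking, "acceptable": acceptable}


def fake_model(prompt: str) -> str:
    # Stand-in for a real model call, so the sketch runs on its own.
    return "Total: 42 EUR" if "invoice" in prompt else "I am not sure."


cases = [
    ContractCase("Sum the invoice", lambda out: "42" in out, blocking=True),
    ContractCase("Greet the user", lambda out: out.startswith("Hello"), blocking=False),
]
print(run_contract(cases, fake_model))
```

The `blocking` flag is where the "acceptable vs. blocking" distinction gets recorded, so it is decided before the results are seen rather than after.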

The Three-Constraint Rule is the diagnostic check for why a context design fails: every prompt that produces unreliable output is missing a complete specification of Intent (what you want), Context (the world the model is operating in), or Guardrails (what it must not do). The Three-Constraint Rule is a structured audit, not a style preference. Missing any one of the three produces a specific and predictable failure pattern.
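The audit itself is simple enough to automate as a first pass. This sketch assumes a prompt is drafted as a dictionary with `intent`, `context`, and `guardrails` entries, which is a hypothetical convention adopted here. It only detects missing sections, not incomplete ones.

```python
REQUIRED_PARTS = ("intent", "context", "guardrails")


def audit(spec: dict[str, str]) -> list[str]:
    """Return the parts of the Three-Constraint Rule that are absent or blank."""
    return [part for part in REQUIRED_PARTS if not spec.get(part, "").strip()]


draft = {"intent": "Summarise the support ticket", "context": "   "}
print(audit(draft))
```

Judging whether a section that is present is also complete remains a human review step.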

Why the field frame and the Cognitive Interface Architecture (CIA) vocabulary are different things

Context engineering is a frame. It tells you that the unit of design is not the prompt but the full context the model receives. That reframe is correct and important.

What the frame does not provide is a vocabulary for the components of that context, the failure modes that emerge from each component, and the architectural patterns that prevent each failure mode. The frame tells you where to look. The vocabulary tells you what you are looking at and what to name it when it breaks.

This is not a critique of context engineering as a concept. It is a description of what it is and what it is not. The Cognitive Interface Architecture vocabulary exists in the space the frame does not fill.

The full vocabulary for designing, debugging, and governing AI agent systems (all 25 precision terms) is on the Method page, free and public. For engineers who want the complete applied toolkit, the Agent Control Architecture Pack contains 12 deployable system prompts, AGENTS.md templates, and fully worked diagnostic rebuilds. The Constraint newsletter covers one term, one production failure, and one architectural pattern per issue. Subscribe free.
