Claude Code Foundations of Context Engineering Created: 13 Apr 2026 Updated: 13 Apr 2026

Intelligent Context Retrieval

In the previous article we explored how you write context — CLAUDE.md files, auto memory entries, and hooks that fire automatically. Writing context is only half the story. The other half is how Claude Code retrieves that context at the right moment, without you having to load it manually. This article covers the retrieval side: how Claude discovers, prioritises, and propagates context so the model always has the information it needs.

This is context engineering at the application level. The Claude Code developers built a retrieval system that automatically walks your directory tree, loads the most relevant instructions, and feeds tool-specific context into the model. Understanding how this works helps you organise your project so Claude always finds what it needs — and never wastes context window space on what it does not.

1. What Loads Before You Type Anything

Before you even type your first prompt, Claude Code silently loads a set of files into the context window. This is the automatic startup payload — and it sets the baseline for every session.

The Startup Payload

| Source | What Loads | When |
| --- | --- | --- |
| System prompt | Built-in instructions and output style | Always, before anything else |
| CLAUDE.md hierarchy | Every CLAUDE.md and CLAUDE.local.md from CWD up to the root | Always, at session start |
| Auto memory | First 200 lines or 25 KB of MEMORY.md | Always, at session start |
| MCP tool names | Only tool names and short descriptions (deferred loading) | Always, at session start |
| Skill descriptions | Short descriptions only; full skill bodies load on demand | Always, at session start |

Notice the pattern: Claude loads lightweight metadata up front (tool names, skill descriptions) and defers the heavy content (full tool definitions, skill bodies) until they are actually needed. This is a deliberate design choice to preserve context window space.

You can inspect your own startup payload at any time:

# See a live breakdown of context usage by category
/context

# Check which CLAUDE.md and auto memory files loaded
/memory

2. Dynamic Context Discovery — The Directory Tree Walk

The most important retrieval mechanism in Claude Code is the directory tree walk. When you start a session, Claude Code walks up the directory tree from your current working directory, collecting every CLAUDE.md and CLAUDE.local.md file it finds along the way.

How It Works

Suppose your project has the following layout and you run claude from foo/bar/:

~/projects/
├── CLAUDE.md ← loaded (ancestor)
├── foo/
│ ├── CLAUDE.md ← loaded (parent)
│ └── bar/
│ ├── CLAUDE.md ← loaded (CWD)
│ └── src/
│ └── CLAUDE.md ← NOT loaded yet (subdirectory)

Claude Code loads all three files on the path from the CWD up to ~/projects/. The discovered files are concatenated rather than overriding one another, so every instruction is included. At each level, a CLAUDE.local.md, if present, is appended after the CLAUDE.md at that level, which puts it in last-read position.
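The walk can be pictured with a short Python sketch. It illustrates the behaviour described above and is not Claude Code's actual implementation; the helper name `collect_startup_files` and the root-first ordering of the result are assumptions made for the example.

```python
from pathlib import Path

# At each level the local file follows its paired CLAUDE.md.
INSTRUCTION_FILES = ("CLAUDE.md", "CLAUDE.local.md")


def collect_startup_files(cwd: Path, root: Path) -> list[Path]:
    """Return instruction files between `root` and `cwd`, broadest first.

    Hypothetical helper: the ordering is an assumption for illustration.
    """
    cwd, root = cwd.resolve(), root.resolve()
    chain = [cwd, *cwd.parents]               # cwd first, filesystem root last
    if root in chain:
        chain = chain[: chain.index(root) + 1]  # stop at `root`, inclusive
    found = []
    for directory in reversed(chain):         # root first, cwd last
        for name in INSTRUCTION_FILES:
            candidate = directory / name
            if candidate.is_file():
                found.append(candidate)
    return found
```

Run against the layout above with `cwd` set to foo/bar/, the sketch returns the three ancestor files and skips src/CLAUDE.md, because it never looks below the working directory.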

Subdirectory Files Load On Demand

Subdirectory CLAUDE.md files — like src/CLAUDE.md in the tree above — are not loaded at startup. They load automatically when Claude reads or edits a file inside that subdirectory. This is on-demand loading: the context appears exactly when it is relevant, and not before.

This design means you can place folder-specific instructions (coding standards for src/, test conventions for tests/) right where they belong, without paying a context cost until Claude actually works in that folder.
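On-demand loading can be sketched the same way. The function below is a hypothetical model, not Claude Code's code: given a file that was just read, it returns the CLAUDE.md files that sit below the working directory on the way to that file and have not been loaded yet. The names `on_demand_files` and `loaded` are assumptions.

```python
from pathlib import Path


def on_demand_files(file_read: Path, cwd: Path, loaded: set[Path]) -> list[Path]:
    """Return not-yet-loaded CLAUDE.md files between `cwd` and `file_read`."""
    file_read, cwd = file_read.resolve(), cwd.resolve()
    # Directories strictly below cwd on the way to the file, shallowest first.
    below_cwd = [d for d in list(file_read.parents)[::-1] if cwd in d.parents]
    new_files = []
    for directory in below_cwd:
        candidate = directory / "CLAUDE.md"
        if candidate.is_file() and candidate not in loaded:
            loaded.add(candidate)       # remember it so it is not loaded twice
            new_files.append(candidate)
    return new_files
```

The `loaded` set is what makes the second read of a file in the same folder free: the instructions are already in context, so nothing new is returned.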

Practical Example

# Project root CLAUDE.md — broad instructions
# ~/projects/my-app/CLAUDE.md

This is a TypeScript monorepo using pnpm workspaces.
Always run `pnpm test` before committing.
Use conventional commits.

# Frontend subfolder CLAUDE.md — specific instructions
# ~/projects/my-app/packages/frontend/CLAUDE.md

This package uses React 19 with server components.
Prefer the `use()` hook over `useEffect` for data fetching.
Run `pnpm --filter frontend dev` to start the dev server.

When Claude reads a file in packages/frontend/, the frontend-specific CLAUDE.md loads alongside the root-level one. Claude now knows both the monorepo conventions and the frontend-specific patterns.

3. Specificity and Priority

Since multiple CLAUDE.md files can be loaded simultaneously, a natural question arises: what happens when they conflict? Claude Code uses a specificity model — more specific context takes priority over broader context.

Priority Order

  1. Subdirectory CLAUDE.md: most specific, loaded on demand when Claude reads files in that folder.
  2. CWD CLAUDE.md: your working directory's instructions.
  3. Parent/ancestor CLAUDE.md: loaded during the tree walk, broadest project scope.
  4. User-level ~/.claude/CLAUDE.md: personal preferences, loaded first but lowest priority.

Within each level, CLAUDE.local.md is appended after CLAUDE.md, so local overrides take effect. This makes CLAUDE.local.md ideal for personal or machine-specific settings that should not be committed to version control.

Path-Specific Rules

For fine-grained control, you can use path-specific rules in the .claude/rules/ directory. Each rule file can have a paths: frontmatter that specifies which files it applies to:

# .claude/rules/react-components.md
---
paths:
- "src/components/**/*.tsx"
- "src/components/**/*.ts"
---

Use functional components with hooks.
Always add displayName for debugging.
Export components as named exports, not default.

This rule only loads into context when Claude reads a file matching the glob pattern. It never pollutes context for unrelated work. Think of path-specific rules as surgical context injections — they fire precisely when needed.
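To make the matching step concrete, here is a minimal Python sketch of how a path could be tested against a rule's `paths:` globs. The translation of `**` and `*` into a regular expression is an assumption about typical glob semantics; Claude Code's own matcher may differ in edge cases, and both function names are hypothetical.

```python
import re


def glob_to_regex(pattern: str) -> re.Pattern:
    """Translate a glob with `**`, `*` and `?` into an anchored regex."""
    i, out = 0, []
    while i < len(pattern):
        if pattern.startswith("**/", i):
            out.append(r"(?:.*/)?")     # zero or more whole directories
            i += 3
        elif pattern.startswith("**", i):
            out.append(r".*")           # anything, including slashes
            i += 2
        elif pattern[i] == "*":
            out.append(r"[^/]*")        # anything within one path segment
            i += 1
        elif pattern[i] == "?":
            out.append(r"[^/]")         # one character within a segment
            i += 1
        else:
            out.append(re.escape(pattern[i]))
            i += 1
    return re.compile("".join(out) + r"\Z")


def rule_applies(rule_paths: list[str], file_path: str) -> bool:
    """True when any glob in the rule's `paths:` list matches the file."""
    return any(glob_to_regex(p).match(file_path) for p in rule_paths)


paths = ["src/components/**/*.tsx", "src/components/**/*.ts"]
print(rule_applies(paths, "src/components/ui/Button.tsx"))  # True
print(rule_applies(paths, "src/utils/date.ts"))             # False
```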

Comparison Table

| Mechanism | Scope | When It Loads | Survives Compaction? |
| --- | --- | --- | --- |
| Project-root CLAUDE.md | Whole project | Session start | Yes, re-injected from disk |
| Subdirectory CLAUDE.md | Files in that folder | On demand | No, reloads when Claude reads a matching file |
| Path-scoped .claude/rules/ | Files matching glob | On demand | No, reloads when Claude reads a matching file |
| Unscoped .claude/rules/ | Whole project | Session start | Yes, re-injected from disk |
| CLAUDE.local.md | Same as paired CLAUDE.md | Same as paired CLAUDE.md | Same as paired CLAUDE.md |
| Auto memory (MEMORY.md) | Project or user | Session start (first 200 lines or 25 KB) | Yes, re-injected from disk |

4. Recency and Frequency Prioritisation — Auto Memory

Auto memory is Claude Code's mechanism for learning from your sessions and carrying those learnings forward. It lives in ~/.claude/projects/<project-hash>/memory/ and consists of an index file (MEMORY.md) plus optional topic files.

How Auto Memory Loads

  1. At session start, Claude reads the first 200 lines or 25 KB of MEMORY.md — whichever limit is reached first.
  2. Topic files referenced in MEMORY.md are read on demand when Claude determines they are relevant to the current task.
  3. During the session, Claude may add new memories when it learns something worth persisting — a project pattern, a user preference, a correction you made.

MEMORY.md is loaded from the top: the first 200 lines or 25 KB, whichever limit is reached first. An entry's position in the file therefore decides whether it is in context from the first prompt. If MEMORY.md grows past the limit, entries beyond the cutoff are not loaded unless Claude decides to read the full file, so recent and frequently used memories keep their priority only while they sit inside that window.
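The startup cutoff is easy to reason about with a small Python sketch. It assumes that 25 KB means 25 × 1024 bytes of UTF-8 and that a line which would cross the byte limit is dropped whole; neither detail is confirmed by the source, and `startup_slice` is a hypothetical name.

```python
MAX_LINES = 200
MAX_BYTES = 25 * 1024  # assumption: 1 KB = 1024 bytes


def startup_slice(memory_text: str) -> str:
    """Return the part of MEMORY.md that fits the startup limits."""
    kept, used = [], 0
    for line in memory_text.splitlines(keepends=True)[:MAX_LINES]:
        size = len(line.encode("utf-8"))
        if used + size > MAX_BYTES:
            break                 # byte limit reached before the line limit
        kept.append(line)
        used += size
    return "".join(kept)


index = "".join(f"- memory entry {i}\n" for i in range(300))
print(startup_slice(index).count("\n"))  # 200
```

With short index lines the 200-line limit is the one that bites; the byte limit only matters when individual entries are long, which is one reason to keep MEMORY.md as a terse index and push detail into topic files.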

The /memory Command

You can interact with the memory system at any point during a session:

# List all loaded memory files and their status
/memory

# Ask Claude to remember something
/memory add Always use descriptive variable names

# Claude will prompt: project-level or user-level?
# Then update the appropriate file automatically.

When you run /memory add, Claude asks whether the memory should be stored at the project level (in MEMORY.md inside the project's memory folder) or the user level (in a personal MEMORY.md that loads across all projects). This distinction matters:

  1. Project-level memory: "This project uses Vitest, not Jest" — relevant only to one codebase.
  2. User-level memory: "Always use descriptive variable names" — a personal preference that applies everywhere.

You can also ask Claude to remember things conversationally:

> Remember that we use Prettier with single quotes in this project.

Claude: I'll save that as a project-level memory.

5. Tool-Specific Context Propagation

One of the most sophisticated aspects of Claude Code's retrieval system is tool-specific context propagation. When Claude uses different tools, the system provides different context to guide the model's behaviour. This is not random — it is carefully designed context engineering at the application level.

The Edit Tool

When Claude is about to edit a file, the system encourages it to:

  1. Check the existing code style first — indentation, naming conventions, import patterns.
  2. Look for existing functions before creating new ones — avoid duplication.
  3. Read surrounding code to understand the patterns already in use.

This is why Claude often reads a file before editing it, even when you might think it already "knows" the content. The tool context explicitly pushes Claude toward understanding before acting.

The Terminal Tool

When Claude is about to run a shell command, the system encourages it to:

  1. Check if there is an existing npm script (or equivalent) before running raw commands.
  2. Verify the file path exists before executing operations on it.
  3. Use project-specific tooling: pnpm vs npm, pytest vs unittest.

This is how Claude "knows" to run pnpm test instead of npm test in a pnpm workspace — the combination of your CLAUDE.md instructions and the terminal tool's built-in context produces the right command.

The Search Tool

When Claude searches your codebase, the system propagates context about:

  1. File patterns to search — respecting .gitignore and project structure.
  2. Current working directory — scoping the search appropriately.
  3. Recently read files — avoiding redundant searches for content already in context.

Why This Matters

Tool-specific context propagation means that the same Claude model behaves differently depending on what tool it is using. When editing, it is careful and style-aware. When running commands, it is cautious and project-aware. When searching, it is efficient and scope-aware. This is context engineering — shaping model behaviour through strategic information injection rather than through model fine-tuning.

6. The Agentic Loop and Context Flow

All of these retrieval mechanisms feed into Claude Code's agentic loop — the three-phase cycle of gather context → take action → verify results. Understanding the loop helps you see how context flows through a real task.

The Three Phases

  1. Gather context: Claude reads files, searches the codebase, checks git status, loads subdirectory CLAUDE.md files, and fires path-specific rules. Every tool use adds information to the context window.
  2. Take action: Claude edits files, runs commands, or delegates to subagents. The tool-specific context propagation guides each action.
  3. Verify results: Claude runs tests, checks for errors, reads output, and decides whether the task is complete or needs another iteration.

These phases blend together. Claude might gather more context while taking action, or verify a partial result before continuing. The loop adapts to what you ask — a simple question might only need phase 1, while a complex refactor cycles through all three phases repeatedly.
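As a rough mental model, the loop can be written as a few lines of Python. This is a simplified sketch under stated assumptions: the three phases are passed in as plain callables, context is a list of strings, and the iteration cap is invented for the example. The real loop interleaves the phases far more freely.

```python
from typing import Callable


def agentic_loop(
    gather: Callable[[list[str]], str],
    act: Callable[[list[str]], str],
    verify: Callable[[list[str]], bool],
    max_iterations: int = 5,  # hypothetical safety cap
) -> list[str]:
    """Cycle gather -> act -> verify until verification passes."""
    context: list[str] = []
    for _ in range(max_iterations):
        context.append(gather(context))  # phase 1: results land in context
        context.append(act(context))     # phase 2: edits and commands
        if verify(context):              # phase 3: tests and error checks
            break
    return context
```

The point of the sketch is the shared `context` list: every phase reads from it and appends to it, which is why each iteration leaves the context window fuller than before.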

Example: Fixing a Bug

> Fix the authentication bug in the login flow

Phase 1 — Gather context:
• Claude searches for "auth" and "login" across the codebase
• Reads src/auth/login.ts → subdirectory CLAUDE.md in src/auth/ loads
• Path-specific rule for "src/auth/**" fires
• Reads test files to understand expected behaviour

Phase 2 — Take action:
• Edits src/auth/login.ts (edit tool context: check existing style)
• Edits src/auth/session.ts (looks for existing helper functions first)

Phase 3 — Verify results:
• Runs the test suite (terminal tool context: uses project's test script)
• Reads test output, confirms all tests pass
• Commits the fix if you asked for it

Notice how context retrieval happens throughout. The subdirectory CLAUDE.md loaded because Claude read a file in src/auth/. The path-specific rule fired because the file matched its glob. The terminal tool knew which test command to use because of the project-root CLAUDE.md. None of this required manual intervention.

7. Context Window Management

The context window is a finite resource. As you work, it fills with conversation history, file contents, command outputs, and loaded instructions. Claude Code provides several mechanisms to manage this space effectively.

/clear — Reset the Conversation

# Start fresh within the same session
/clear

Use /clear between unrelated tasks. It resets the conversation history while keeping the session alive. CLAUDE.md and auto memory reload automatically.

/compact — Summarise the Conversation

# Compact with default behaviour
/compact

# Compact with a focus instruction
/compact focus on the API changes we discussed

/compact replaces the full conversation history with a structured summary. You can provide a focus instruction to guide what is preserved. This is useful when you have been working on multiple things and want to narrow context to the most relevant work.

Auto Compaction

When the context window approaches its limit, Claude Code automatically compacts. It clears older tool outputs first, then summarises the conversation if needed. Your requests and key code snippets are preserved; detailed instructions from early in the conversation may be lost.

This is why persistent rules belong in CLAUDE.md, not in conversation — CLAUDE.md reloads from disk after compaction, while conversational instructions are summarised away.

What Survives Compaction

| Content | After Compaction |
| --- | --- |
| System prompt and output style | Unchanged; not part of message history |
| Project-root CLAUDE.md and unscoped rules | Re-injected from disk |
| Auto memory (MEMORY.md) | Re-injected from disk |
| Path-scoped rules | Lost until a matching file is read again |
| Subdirectory CLAUDE.md files | Lost until a file in that directory is read again |
| Invoked skill bodies | Re-injected (capped at 5,000 tokens per skill, 25,000 total) |
| Hooks | Not applicable; hooks run as code, not context |
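The survival rules can be modelled as a filter over labelled context items. The Python sketch below is illustrative only: the source labels and the `ContextItem` type are invented for the example, and skill bodies with their token caps are left out to keep it short.

```python
from dataclasses import dataclass

# Hypothetical labels for the rows of the table above.
REINJECTED = {"root_claude_md", "unscoped_rule", "auto_memory"}
ON_DEMAND = {"subdir_claude_md", "path_scoped_rule"}


@dataclass(frozen=True)
class ContextItem:
    source: str
    text: str


def compact(items: list[ContextItem], summary: str) -> list[ContextItem]:
    """Keep re-injected sources and replace everything else with a summary."""
    kept = [item for item in items if item.source in REINJECTED]
    return kept + [ContextItem("summary", summary)]


def lost_until_reread(items: list[ContextItem]) -> list[ContextItem]:
    """Items that return only when a matching file is read again."""
    return [item for item in items if item.source in ON_DEMAND]
```

Anything labelled as conversation falls into neither set, so it survives only as far as the summary captures it.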

/btw — Side Questions Without Growing Context

# Ask a quick side question
/btw What's the syntax for a TypeScript mapped type?

/btw lets you ask a quick, unrelated question without adding it to the main conversation context. This is useful when you need a quick reference but do not want to pollute the context window.

8. Subagents — Separate Context Windows

When Claude needs to do extensive research or work on a large subtask, it can delegate to a subagent. Subagents run in their own separate context window — completely independent from your main conversation.

Why Subagents Matter for Context

  1. Large file reads and search results stay in the subagent's context, not yours.
  2. Only the summary comes back to your main conversation.
  3. Your context window stays clean for the work you are doing.

> Research how our authentication system works and summarise it

Claude delegates to a subagent which:
• Reads 15 files across src/auth/
• Searches for session handling patterns
• Analyses the middleware chain

Only a concise summary returns to your main context.
The 15 file reads stay in the subagent's context, not yours.

This is why subagents are especially valuable in long sessions — they prevent context bloat from research-heavy tasks. The subagent does the heavy lifting in its own context, and your main conversation receives only the distilled result.
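The isolation can be shown with a toy Python model. Everything here is a stand-in: `run_subagent` is a hypothetical function, the "summary" is a one-line string rather than a model-written digest, and the file contents are dummy data.

```python
def run_subagent(task: str, files: dict[str, str]) -> str:
    """Read everything in a private context and return only a short summary."""
    private_context = [f"TASK: {task}"]
    for path, content in files.items():
        private_context.append(f"{path}\n{content}")  # never leaves this call
    total_chars = sum(len(entry) for entry in private_context)
    return f"Read {len(files)} files ({total_chars} chars) for: {task}"


main_context = ["user: Research how our authentication system works"]
files = {f"src/auth/file{i}.ts": "export const x = 1;\n" * 50 for i in range(15)}
main_context.append(run_subagent("summarise auth", files))

print(len(main_context))  # 2
```

The main context grows by one short entry, while the fifteen file bodies exist only inside the function call and are discarded when it returns.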

9. Skills — On-Demand Context Loading

Skills are another on-demand retrieval mechanism. At session start, Claude sees only the skill descriptions — a few lines per skill. The full skill body (the SKILL.md file) loads only when Claude decides the skill is relevant to your request.

Skills vs CLAUDE.md

| Feature | CLAUDE.md | Skills |
| --- | --- | --- |
| Loads at session start | Yes, full content | No, description only |
| Context cost at startup | Full file size | ~1–2 lines per skill |
| When full content loads | Immediately | When the skill is invoked |
| After compaction | Re-injected from disk | Re-injected (capped at 5K tokens/skill) |
| Best for | Rules every session needs | Specialised workflows triggered occasionally |

Use CLAUDE.md for instructions that apply to nearly every session. Use skills for specialised workflows — deployment procedures, migration guides, review checklists — that are only relevant occasionally. This keeps your startup context lean while still making the knowledge available when needed.
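The description-first pattern is ordinary lazy loading. A minimal Python sketch, assuming a hypothetical `Skill` class and a `startup_context` helper that are not part of Claude Code:

```python
from dataclasses import dataclass, field
from pathlib import Path
from typing import Optional


@dataclass
class Skill:
    name: str
    description: str   # cheap metadata, present from session start
    body_path: Path    # SKILL.md, read only on first use
    _body: Optional[str] = field(default=None, repr=False)

    def body(self) -> str:
        """Load the full skill body on first call, then reuse the cached text."""
        if self._body is None:
            self._body = self.body_path.read_text(encoding="utf-8")
        return self._body


def startup_context(skills: list[Skill]) -> str:
    """What the model sees at session start: one line per skill."""
    return "\n".join(f"{s.name}: {s.description}" for s in skills)
```

Startup cost scales with the number of skills, not with the length of their bodies, and the same shape applies to any metadata-first, content-on-demand mechanism.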

10. MCP Tool Search — Deferred Tool Loading

MCP (Model Context Protocol) servers can expose dozens or hundreds of tools. Loading all tool definitions at startup would be wasteful. Instead, Claude Code uses deferred tool loading:

  1. At session start, only tool names and short descriptions are loaded.
  2. When Claude decides it needs a specific tool, the full tool definition (parameters, schema) loads on demand.
  3. This keeps the startup cost proportional to the number of tools, not their complexity.

# Check per-server context costs
/mcp

# See overall context breakdown
/context

This is the same pattern as skills: lightweight metadata up front, heavy content on demand. If you have many MCP servers configured, their tool names still consume context, but the full definitions do not load until needed.

11. Best Practices for Context Retrieval

Now that you understand how context retrieval works, here are practical guidelines for organising your project to take full advantage of it.

Structure Your CLAUDE.md Hierarchy

  1. Put universal rules in the project-root CLAUDE.md — they load every session and survive compaction.
  2. Put folder-specific rules in subdirectory CLAUDE.md files — they load on demand, saving context for sessions that do not touch those folders.
  3. Put personal overrides in CLAUDE.local.md — they are appended after CLAUDE.md at each level.

Use Path-Specific Rules for Precision

  1. Create rules in .claude/rules/ with glob patterns for file-type-specific conventions.
  2. These rules load only when Claude reads a matching file — zero context cost otherwise.
  3. Good for: component patterns, test conventions, API style guides.

Keep Auto Memory Lean

  1. Only the first 200 lines of MEMORY.md load at startup. Keep the most important memories near the top.
  2. Periodically review and clean up auto memory: /memory opens the memory folder.
  3. Use project-level memory for project-specific facts, user-level for personal preferences.

Manage Context Window Actively

  1. Run /clear between unrelated tasks.
  2. Run /compact when context fills up during long sessions.
  3. Use subagents for research-heavy tasks that would bloat your main context.
  4. Use /btw for quick side questions.
  5. Run /context to see what is consuming space.

Move Specialised Knowledge to Skills

  1. Instructions that you use in fewer than half your sessions should be skills, not CLAUDE.md entries.
  2. Skills load on demand, so they do not consume context until needed.
  3. Set disable-model-invocation: true for skills you invoke manually — this keeps even the description out of context until you trigger it.

12. Summary

Intelligent context retrieval is the hidden engine that makes Claude Code effective. You write the context once — in CLAUDE.md files, rules, auto memory, and skills — and the retrieval system ensures it reaches the model at the right moment. The key mechanisms are:

  1. Directory tree walk: Claude walks up from CWD, loading every CLAUDE.md along the way.
  2. On-demand loading: Subdirectory CLAUDE.md, path-specific rules, skills, and MCP tool definitions load only when relevant.
  3. Specificity priority: More specific context (subdirectory) takes precedence over broader context (root).
  4. Auto memory with a startup cutoff: The top of MEMORY.md (first 200 lines or 25 KB) loads at startup; entries past the cutoff may not load until Claude reads the full file.
  5. Tool-specific propagation: Different tools receive different context to guide the model's behaviour.
  6. Subagent isolation: Heavy research runs in separate context windows, keeping yours clean.
  7. Compaction resilience: Root CLAUDE.md and auto memory survive compaction; path-scoped content reloads on demand.

When you organise your project with this retrieval system in mind, Claude always has the right context at the right time — without manual loading and without wasting context window space.


