Intelligent Context Retrieval
In the previous article we explored how you write context — CLAUDE.md files, auto memory entries, and hooks that fire automatically. Writing context is only half the story. The other half is how Claude Code retrieves that context at the right moment, without you having to load it manually. This article covers the retrieval side: how Claude discovers, prioritises, and propagates context so the model always has the information it needs.
This is context engineering at the application level. The Claude Code developers built a retrieval system that automatically walks your directory tree, loads the most relevant instructions, and feeds tool-specific context into the model. Understanding how this works helps you organise your project so Claude always finds what it needs — and never wastes context window space on what it does not.
1. What Loads Before You Type Anything
Before you even type your first prompt, Claude Code silently loads a set of files into the context window. This is the automatic startup payload — and it sets the baseline for every session.
The Startup Payload
| Source | What Loads | When |
|---|---|---|
| System prompt | Built-in instructions and output style | Always, before anything else |
| CLAUDE.md hierarchy | Every CLAUDE.md and CLAUDE.local.md from CWD up to the root | Always, at session start |
| Auto memory | First 200 lines or 25 KB of MEMORY.md | Always, at session start |
| MCP tool names | Only tool names and short descriptions (deferred loading) | Always, at session start |
| Skill descriptions | Short descriptions only; full skill bodies load on demand | Always, at session start |
Notice the pattern: Claude loads lightweight metadata up front (tool names, skill descriptions) and defers the heavy content (full tool definitions, skill bodies) until they are actually needed. This is a deliberate design choice to preserve context window space.
You can inspect your own startup payload at any time by running /context, which shows what is currently consuming context window space.
2. Dynamic Context Discovery — The Directory Tree Walk
The most important retrieval mechanism in Claude Code is the directory tree walk. When you start a session, Claude Code walks up the directory tree from your current working directory, collecting every CLAUDE.md and CLAUDE.local.md file it finds along the way.
How It Works
Suppose your project has the following layout and you run claude from foo/bar/:
Claude Code loads all three ancestor files — from the CWD up to the project root. All discovered files are concatenated, not overridden. Every piece of instruction is included. At each level, if a CLAUDE.local.md exists it is appended after the CLAUDE.md at that level, giving it last-read position.
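The upward walk can be sketched in a few lines of Python. This is an illustration only, not Claude Code's actual implementation: the function name `collect_instruction_files`, the broadest-first ordering, and the sample layout built in a temporary directory are all assumptions made for the example.

```python
from pathlib import Path
import tempfile

def collect_instruction_files(cwd: Path, root: Path) -> list[Path]:
    """Walk from cwd up to root and return the instruction files that exist.

    Assumed ordering: broadest directory first, and CLAUDE.local.md
    directly after CLAUDE.md at each level.
    """
    cwd, root = cwd.resolve(), root.resolve()
    levels = []
    current = cwd
    while True:
        levels.append(current)
        # Stop at the chosen root, or at the filesystem root as a safety net.
        if current == root or current == current.parent:
            break
        current = current.parent

    found = []
    for directory in reversed(levels):  # root first, cwd last
        for name in ("CLAUDE.md", "CLAUDE.local.md"):
            candidate = directory / name
            if candidate.is_file():
                found.append(candidate)
    return found

# Hypothetical layout, created only for this demonstration.
with tempfile.TemporaryDirectory() as tmp:
    project_root = Path(tmp)
    working_dir = project_root / "foo" / "bar"
    working_dir.mkdir(parents=True)
    for path in (project_root / "CLAUDE.md",
                 project_root / "foo" / "CLAUDE.md",
                 working_dir / "CLAUDE.md",
                 working_dir / "CLAUDE.local.md"):
        path.write_text(f"# instructions from {path.parent.name}\n")

    walk_result = [p.relative_to(project_root.resolve()).as_posix()
                   for p in collect_instruction_files(working_dir, project_root)]
    print(walk_result)
    # ['CLAUDE.md', 'foo/CLAUDE.md', 'foo/bar/CLAUDE.md', 'foo/bar/CLAUDE.local.md']
```

Nothing is overridden in this sketch: every file found is returned, which mirrors the concatenation behaviour described above.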
Subdirectory Files Load On Demand
Subdirectory CLAUDE.md files — like src/CLAUDE.md in the tree above — are not loaded at startup. They load automatically when Claude reads or edits a file inside that subdirectory. This is on-demand loading: the context appears exactly when it is relevant, and not before.
This design means you can place folder-specific instructions (coding standards for src/, test conventions for tests/) right where they belong, without paying a context cost until Claude actually works in that folder.
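One way to picture on-demand loading is a small tracker that reacts to file reads. The class below is a hypothetical sketch, not Claude Code's internals: `OnDemandLoader`, its method names, and the `/repo` paths are invented for illustration, and it works on path strings only so it never touches the filesystem.

```python
from pathlib import PurePosixPath

class OnDemandLoader:
    """Toy model: a subdirectory's CLAUDE.md loads the first time a file under it is read."""

    def __init__(self, cwd, dirs_with_instructions):
        self.cwd = PurePosixPath(cwd)
        # Directories assumed to contain a CLAUDE.md file.
        self.available = {PurePosixPath(d) for d in dirs_with_instructions}
        self.loaded = []

    def on_file_read(self, file_path):
        """Return the instruction files newly loaded because file_path was read."""
        file_dir = PurePosixPath(file_path).parent
        try:
            relative = file_dir.relative_to(self.cwd)
        except ValueError:
            return []  # file is outside the working directory
        newly_loaded = []
        current = self.cwd
        # Visit each directory below cwd on the way down to the file.
        for part in relative.parts:
            current = current / part
            if current in self.available and current not in self.loaded:
                self.loaded.append(current)
                newly_loaded.append(str(current / "CLAUDE.md"))
        return newly_loaded

loader = OnDemandLoader("/repo", ["/repo/src", "/repo/tests"])
first_read = loader.on_file_read("/repo/src/auth/login.py")
second_read = loader.on_file_read("/repo/src/auth/token.py")
third_read = loader.on_file_read("/repo/tests/test_login.py")
print(first_read, second_read, third_read)
```

The first read under `/repo/src` triggers a load, the second read in the same folder loads nothing new, and `/repo/tests` pays its context cost only when a test file is touched.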
Practical Example
When Claude reads a file in packages/frontend/, the frontend-specific CLAUDE.md loads alongside the root-level one. Claude now knows both the monorepo conventions and the frontend-specific patterns.
3. Specificity and Priority
Since multiple CLAUDE.md files can be loaded simultaneously, a natural question arises: what happens when they conflict? Claude Code uses a specificity model — more specific context takes priority over broader context.
Priority Order
- Subdirectory CLAUDE.md: most specific, loaded on demand when Claude reads files in that folder.
- CWD CLAUDE.md: your working directory's instructions.
- Parent/ancestor CLAUDE.md: loaded during the tree walk, broadest scope.
- User-level ~/.claude/CLAUDE.md: personal preferences, loaded first but lowest priority.
Within each level, CLAUDE.local.md is appended after CLAUDE.md, so local overrides take effect. This makes CLAUDE.local.md ideal for personal or machine-specific settings that should not be committed to version control.
Path-Specific Rules
For fine-grained control, you can use path-specific rules in the .claude/rules/ directory. Each rule file can have a paths: frontmatter that specifies which files it applies to:
This rule only loads into context when Claude reads a file matching the glob pattern. It never pollutes context for unrelated work. Think of path-specific rules as surgical context injections — they fire precisely when needed.
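The matching step can be approximated with Python's standard `fnmatch` module. This is a rough sketch under stated assumptions: the rule names and patterns are hypothetical, rules are modelled as plain dictionaries rather than parsed from rule files, and `fnmatchcase` lets `*` cross `/` boundaries, so its semantics may differ from the glob matching Claude Code actually uses.

```python
from fnmatch import fnmatchcase

# Hypothetical rules. "paths" stands in for the patterns listed in each
# rule file's `paths:` frontmatter; an empty list models an unscoped rule.
RULES = [
    {"name": "api-style", "paths": ["src/api/*.ts"]},
    {"name": "test-conventions", "paths": ["tests/*.test.ts", "src/*.test.ts"]},
    {"name": "general", "paths": []},
]

def rules_triggered_by(file_path, rules):
    """Return the names of path-scoped rules with a pattern matching file_path."""
    return [rule["name"] for rule in rules
            if any(fnmatchcase(file_path, pattern) for pattern in rule["paths"])]

api_hit = rules_triggered_by("src/api/users.ts", RULES)
both_hit = rules_triggered_by("src/api/users.test.ts", RULES)
no_hit = rules_triggered_by("README.md", RULES)
print(api_hit, both_hit, no_hit)
# ['api-style'] ['api-style', 'test-conventions'] []
```

The unscoped `general` rule is never returned here because it has no patterns to match; in the model described in this article it would load at session start instead.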
Comparison Table
| Mechanism | Scope | When It Loads | Survives Compaction? |
|---|---|---|---|
| Project-root CLAUDE.md | Whole project | Session start | Yes — re-injected from disk |
| Subdirectory CLAUDE.md | Files in that folder | On demand | No — reloads when Claude reads a matching file |
| Path-scoped .claude/rules/ | Files matching glob | On demand | No — reloads when Claude reads a matching file |
| Unscoped .claude/rules/ | Whole project | Session start | Yes — re-injected from disk |
| CLAUDE.local.md | Same as paired CLAUDE.md | Same as paired CLAUDE.md | Same as paired CLAUDE.md |
| Auto memory (MEMORY.md) | Project or user | Session start (first 200 lines) | Yes — re-injected from disk |
4. Recency and Frequency Prioritisation — Auto Memory
Auto memory is Claude Code's mechanism for learning from your sessions and carrying those learnings forward. It lives in ~/.claude/projects/<project-hash>/memory/ and consists of an index file (MEMORY.md) plus optional topic files.
How Auto Memory Loads
- At session start, Claude reads the first 200 lines or 25 KB of MEMORY.md, whichever limit is reached first.
- Topic files referenced in MEMORY.md are read on demand when Claude determines they are relevant to the current task.
- During the session, Claude may add new memories when it learns something worth persisting: a project pattern, a user preference, a correction you made.
Only the first 200 lines (or 25 KB) of MEMORY.md load at startup, so position in the file decides what the model sees by default. If your MEMORY.md grows beyond that limit, entries below the cutoff are not loaded unless Claude decides to read the full file. Recent information is therefore prioritised only while the file stays within the limit or while the important entries sit near the top.
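The "200 lines or 25 KB, whichever comes first" rule is easy to express as a function. The sketch below is a guess at the behaviour, not the real implementation: it assumes 25 KB means 25 × 1024 bytes of UTF-8 and that truncation happens at whole-line boundaries, and the name `startup_slice` is invented.

```python
def startup_slice(text: str, max_lines: int = 200, max_bytes: int = 25 * 1024) -> str:
    """Return the leading part of text that fits within both limits."""
    kept = []
    used_bytes = 0
    for line in text.splitlines(keepends=True)[:max_lines]:
        line_bytes = len(line.encode("utf-8"))
        if used_bytes + line_bytes > max_bytes:
            break  # the byte limit was reached before the line limit
        kept.append(line)
        used_bytes += line_bytes
    return "".join(kept)

# 300 short entries: the line limit applies, so entries 200-299 are cut off.
short_entries = "".join(f"- entry {i}\n" for i in range(300))
loaded_short = startup_slice(short_entries)

# 100 lines of exactly 1024 bytes each: the byte limit applies after 25 lines.
long_entries = ("x" * 1023 + "\n") * 100
loaded_long = startup_slice(long_entries)

print(len(loaded_short.splitlines()), len(loaded_long.splitlines()))  # 200 25
```

Under these assumptions, anything written below the cutoff is invisible at startup, which is why the ordering of entries within MEMORY.md matters.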
The /memory Command
You can interact with the memory system at any point during a session through the /memory command.
When you run /memory add, Claude asks whether the memory should be stored at the project level (in MEMORY.md inside the project's memory folder) or the user level (in a personal MEMORY.md that loads across all projects). This distinction matters:
- Project-level memory: "This project uses Vitest, not Jest" — relevant only to one codebase.
- User-level memory: "Always use descriptive variable names" — a personal preference that applies everywhere.
You can also ask Claude to remember things conversationally, for example by telling it to remember that this project uses Vitest, not Jest.
5. Tool-Specific Context Propagation
One of the most sophisticated aspects of Claude Code's retrieval system is tool-specific context propagation. When Claude uses different tools, the system provides different context to guide the model's behaviour. This is not random — it is carefully designed context engineering at the application level.
The Edit Tool
When Claude is about to edit a file, the system encourages it to:
- Check the existing code style first — indentation, naming conventions, import patterns.
- Look for existing functions before creating new ones — avoid duplication.
- Read surrounding code to understand the patterns already in use.
This is why Claude often reads a file before editing it, even when you might think it already "knows" the content. The tool context explicitly pushes Claude toward understanding before acting.
The Terminal Tool
When Claude is about to run a shell command, the system encourages it to:
- Check if there is an existing npm script (or equivalent) before running raw commands.
- Verify the file path exists before executing operations on it.
- Use project-specific tooling: pnpm vs npm, pytest vs unittest.
This is how Claude "knows" to run pnpm test instead of npm test in a pnpm workspace — the combination of your CLAUDE.md instructions and the terminal tool's built-in context produces the right command.
The Search Tool
When Claude searches your codebase, the system propagates context about:
- File patterns to search: respecting .gitignore and project structure.
- Current working directory: scoping the search appropriately.
- Recently read files: avoiding redundant searches for content already in context.
Why This Matters
Tool-specific context propagation means that the same Claude model behaves differently depending on what tool it is using. When editing, it is careful and style-aware. When running commands, it is cautious and project-aware. When searching, it is efficient and scope-aware. This is context engineering — shaping model behaviour through strategic information injection rather than through model fine-tuning.
6. The Agentic Loop and Context Flow
All of these retrieval mechanisms feed into Claude Code's agentic loop — the three-phase cycle of gather context → take action → verify results. Understanding the loop helps you see how context flows through a real task.
The Three Phases
- Gather context: Claude reads files, searches the codebase, checks git status, loads subdirectory CLAUDE.md files, and fires path-specific rules. Every tool use adds information to the context window.
- Take action: Claude edits files, runs commands, or delegates to subagents. The tool-specific context propagation guides each action.
- Verify results: Claude runs tests, checks for errors, reads output, and decides whether the task is complete or needs another iteration.
These phases blend together. Claude might gather more context while taking action, or verify a partial result before continuing. The loop adapts to what you ask — a simple question might only need phase 1, while a complex refactor cycles through all three phases repeatedly.
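As a mental model, the loop can be written as a short driver function. Everything here is a toy: `agentic_loop` and the three callbacks are hypothetical stand-ins, and the real loop is driven by the model's decisions, not by a fixed Python control flow.

```python
def agentic_loop(gather, act, verify, max_iterations=5):
    """Repeat gather -> act -> verify until verify succeeds or the budget runs out."""
    context = []
    for iteration in range(1, max_iterations + 1):
        context.extend(gather(context))      # phase 1: add information to context
        result = act(context)                # phase 2: do something with it
        context.append(f"action: {result}")  # tool output also lands in context
        if verify(result):                   # phase 3: decide whether to stop
            return iteration, context
    return None, context

# Toy task: the first attempt fails its tests, the second one passes.
attempts = iter(["tests fail", "tests pass"])

def gather(context):
    return [f"note {len(context) + 1}"]

def act(context):
    return next(attempts)

def verify(result):
    return result == "tests pass"

iterations, final_context = agentic_loop(gather, act, verify)
print(iterations, final_context)
# 2 ['note 1', 'action: tests fail', 'note 3', 'action: tests pass']
```

Note how the context list grows on every pass: each gather step and each action result adds to it, which is why longer tasks consume more of the context window.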
Example: Fixing a Bug
Notice how context retrieval happens throughout. The subdirectory CLAUDE.md loaded because Claude read a file in src/auth/. The path-specific rule fired because the file matched its glob. The terminal tool knew which test command to use because of the project-root CLAUDE.md. None of this required manual intervention.
7. Context Window Management
The context window is a finite resource. As you work, it fills with conversation history, file contents, command outputs, and loaded instructions. Claude Code provides several mechanisms to manage this space effectively.
/clear — Reset the Conversation
Use /clear between unrelated tasks. It resets the conversation history while keeping the session alive. CLAUDE.md and auto memory reload automatically.
/compact — Summarise the Conversation
/compact replaces the full conversation history with a structured summary. You can provide a focus instruction to guide what is preserved. This is useful when you have been working on multiple things and want to narrow context to the most relevant work.
Auto Compaction
When the context window approaches its limit, Claude Code automatically compacts. It clears older tool outputs first, then summarises the conversation if needed. Your requests and key code snippets are preserved; detailed instructions from early in the conversation may be lost.
This is why persistent rules belong in CLAUDE.md, not in conversation — CLAUDE.md reloads from disk after compaction, while conversational instructions are summarised away.
What Survives Compaction
| Content | After Compaction |
|---|---|
| System prompt and output style | Unchanged — not part of message history |
| Project-root CLAUDE.md and unscoped rules | Re-injected from disk |
| Auto memory (MEMORY.md) | Re-injected from disk |
| Path-scoped rules | Lost until a matching file is read again |
| Subdirectory CLAUDE.md files | Lost until a file in that directory is read again |
| Invoked skill bodies | Re-injected (capped at 5,000 tokens per skill, 25,000 total) |
| Hooks | Not applicable — hooks run as code, not context |
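The table can be read as a filter applied to the context window. The following sketch models only that filtering idea and is not how compaction is implemented: the `kind` labels are invented, the summary is a placeholder string, and skill bodies and their token caps are left out for brevity.

```python
# Kinds that the table lists as re-injected from disk after compaction.
REINJECTED = {"root_claude_md", "unscoped_rule", "auto_memory"}

def compact(items):
    """Keep re-injected sources, drop on-demand ones, and summarise the conversation."""
    kept = [item for item in items if item["kind"] in REINJECTED]
    conversation = [item for item in items if item["kind"] == "conversation"]
    if conversation:
        kept.append({"kind": "summary",
                     "text": f"summary of {len(conversation)} messages"})
    return kept

before = [
    {"kind": "root_claude_md", "text": "Use pnpm."},
    {"kind": "auto_memory", "text": "Project uses Vitest."},
    {"kind": "path_scoped_rule", "text": "API style guide."},
    {"kind": "subdirectory_claude_md", "text": "Auth conventions."},
    {"kind": "conversation", "text": "Please always add type hints."},
    {"kind": "conversation", "text": "Fix the login bug."},
]
after = compact(before)
kinds_after = [item["kind"] for item in after]
print(kinds_after)  # ['root_claude_md', 'auto_memory', 'summary']
```

The instruction "Please always add type hints" survives only as part of a summary in this model, which illustrates why persistent rules belong in CLAUDE.md and not in conversation.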
/btw — Side Questions Without Growing Context
/btw lets you ask a quick, unrelated question without adding it to the main conversation context. This is useful when you need a quick reference but do not want to pollute the context window.
8. Subagents — Separate Context Windows
When Claude needs to do extensive research or work on a large subtask, it can delegate to a subagent. Subagents run in their own separate context window — completely independent from your main conversation.
Why Subagents Matter for Context
- Large file reads and search results stay in the subagent's context, not yours.
- Only the summary comes back to your main conversation.
- Your context window stays clean for the work you are doing.
This is why subagents are especially valuable in long sessions — they prevent context bloat from research-heavy tasks. The subagent does the heavy lifting in its own context, and your main conversation receives only the distilled result.
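The isolation property can be shown with two separate lists. This is a simplified sketch with invented names (`run_subagent`, the sample documents); a real subagent is a separate model conversation, not a Python function.

```python
def run_subagent(task, documents):
    """Do research in a private context and hand back only a short summary."""
    sub_context = [f"task: {task}"]
    sub_context.extend(documents)  # large reads stay in the subagent's context
    matches = [doc for doc in documents if "TODO" in doc]
    summary = f"{len(matches)} of {len(documents)} files contain TODO"
    return summary, len(sub_context)

main_context = ["user: find files with TODO markers"]
documents = [
    "def login(): pass  # TODO: rate limiting",
    "def logout(): pass",
    "def refresh(): pass  # TODO: token expiry",
]
summary, subagent_context_size = run_subagent("find TODOs", documents)
main_context.append(summary)  # only the distilled result comes back

print(main_context[-1], len(main_context), subagent_context_size)
# 2 of 3 files contain TODO 2 4
```

The main context grows by a single entry regardless of how many documents the subagent read.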
9. Skills — On-Demand Context Loading
Skills are another on-demand retrieval mechanism. At session start, Claude sees only the skill descriptions — a few lines per skill. The full skill body (the SKILL.md file) loads only when Claude decides the skill is relevant to your request.
Skills vs CLAUDE.md
| Feature | CLAUDE.md | Skills |
|---|---|---|
| Loads at session start | Yes — full content | No — description only |
| Context cost at startup | Full file size | ~1–2 lines per skill |
| When full content loads | Immediately | When the skill is invoked |
| After compaction | Re-injected from disk | Re-injected (capped at 5K tokens/skill) |
| Best for | Rules every session needs | Specialised workflows triggered occasionally |
Use CLAUDE.md for instructions that apply to nearly every session. Use skills for specialised workflows — deployment procedures, migration guides, review checklists — that are only relevant occasionally. This keeps your startup context lean while still making the knowledge available when needed.
10. MCP Tool Search — Deferred Tool Loading
MCP (Model Context Protocol) servers can expose dozens or hundreds of tools. Loading all tool definitions at startup would be wasteful. Instead, Claude Code uses deferred tool loading:
- At session start, only tool names and short descriptions are loaded.
- When Claude decides it needs a specific tool, the full tool definition (parameters, schema) loads on demand.
- This keeps the startup cost proportional to the number of tools, not their complexity.
This is the same pattern as skills: lightweight metadata up front, heavy content on demand. If you have many MCP servers configured, their tool names still consume context, but the full definitions do not load until needed.
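The "metadata first, definition later" pattern looks roughly like the registry below. It is a sketch of the general lazy-loading technique, not of Claude Code or the MCP protocol: the class name, method names, and the two sample tools are all hypothetical.

```python
class DeferredToolRegistry:
    """Expose names and descriptions up front; load full schemas on first use."""

    def __init__(self, tools):
        self._tools = tools            # name -> {"description": ..., "schema": ...}
        self.loaded_definitions = {}   # schemas that have entered context so far

    def startup_payload(self):
        """Lightweight metadata that is always loaded."""
        return [(name, spec["description"]) for name, spec in self._tools.items()]

    def use(self, name):
        """Load the full definition the first time a tool is needed."""
        if name not in self.loaded_definitions:
            self.loaded_definitions[name] = self._tools[name]["schema"]
        return self.loaded_definitions[name]

registry = DeferredToolRegistry({
    "search_issues": {"description": "Search the issue tracker",
                      "schema": {"query": "string", "limit": "integer"}},
    "create_issue": {"description": "Open a new issue",
                     "schema": {"title": "string", "body": "string"}},
})

payload = registry.startup_payload()
loaded_before = dict(registry.loaded_definitions)
schema = registry.use("search_issues")
loaded_after = sorted(registry.loaded_definitions)
print(payload, loaded_before, loaded_after)
```

The startup payload has one entry per tool, so its cost scales with the number of tools, while schemas enter context only for the tools that are actually used. Skill descriptions and skill bodies follow the same split.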
11. Best Practices for Context Retrieval
Now that you understand how context retrieval works, here are practical guidelines for organising your project to take full advantage of it.
Structure Your CLAUDE.md Hierarchy
- Put universal rules in the project-root CLAUDE.md — they load every session and survive compaction.
- Put folder-specific rules in subdirectory CLAUDE.md files — they load on demand, saving context for sessions that do not touch those folders.
- Put personal overrides in CLAUDE.local.md — they are appended after CLAUDE.md at each level.
Use Path-Specific Rules for Precision
- Create rules in .claude/rules/ with glob patterns for file-type-specific conventions.
- These rules load only when Claude reads a matching file, with zero context cost otherwise.
- Good for: component patterns, test conventions, API style guides.
Keep Auto Memory Lean
- Only the first 200 lines of MEMORY.md load at startup. Keep the most important memories near the top.
- Periodically review and clean up auto memory: /memory opens the memory folder.
- Use project-level memory for project-specific facts, user-level for personal preferences.
Manage Context Window Actively
- Run /clear between unrelated tasks.
- Run /compact when context fills up during long sessions.
- Use subagents for research-heavy tasks that would bloat your main context.
- Use /btw for quick side questions.
- Run /context to see what is consuming space.
Move Specialised Knowledge to Skills
- Instructions that you use in fewer than half your sessions should be skills, not CLAUDE.md entries.
- Skills load on demand, so they do not consume context until needed.
- Set disable-model-invocation: true for skills you invoke manually; this keeps even the description out of context until you trigger it.
12. Summary
Intelligent context retrieval is the hidden engine that makes Claude Code effective. You write the context once — in CLAUDE.md files, rules, auto memory, and skills — and the retrieval system ensures it reaches the model at the right moment. The key mechanisms are:
- Directory tree walk: Claude walks up from CWD, loading every CLAUDE.md along the way.
- On-demand loading: Subdirectory CLAUDE.md, path-specific rules, skills, and MCP tool definitions load only when relevant.
- Specificity priority: More specific context (subdirectory) takes precedence over broader context (root).
- Auto memory with a startup limit: Only the first 200 lines or 25 KB of MEMORY.md load at startup; entries beyond that limit and topic files load on demand.
- Tool-specific propagation: Different tools receive different context to guide the model's behaviour.
- Subagent isolation: Heavy research runs in separate context windows, keeping yours clean.
- Compaction resilience: Root CLAUDE.md and auto memory survive compaction; path-scoped content reloads on demand.
When you organise your project with this retrieval system in mind, Claude always has the right context at the right time — without manual loading and without wasting context window space.