Search Before Read

A context-efficiency protocol that cuts token usage by 60–70% when navigating large codebases by searching for specific symbols before loading files.

Overview

The default pattern, "read the file to understand what's happening", is the most expensive habit in AI-assisted code navigation. Loading a 500-line file to find one 10-line function wastes 490 lines of context. This skill teaches a systematic alternative: search first, then read only targeted ranges.

Inspired by the QMD (Query-Map-Dive) pattern from community codebase efficiency research.

Core Protocol

Rule 1: Never Read a File Over 200 Lines Without Searching First

If you don't know where in a file the relevant code lives, searching is always cheaper than reading.

# Instead of:
Read(file_path="/src/auth/middleware.ts")  # 847 lines loaded

# Do this:
Grep(pattern="validateToken", path="/src/auth/middleware.ts", output_mode="content")
# Returns: lines 147-163 — now read only those
Read(file_path="/src/auth/middleware.ts", offset=145, limit=25)

Token cost comparison (rough estimates):

Read entire file: ~2,000 tokens
Grep + targeted Read: ~150 tokens
Savings: ~1,850 tokens per lookup
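
The search-then-read flow above can be sketched in plain Python to show how little of the file ends up loaded. This is only an illustration under stated assumptions: `grep_lines` and `read_range` are hypothetical stand-ins for the Grep and Read tools, and the 847-line file is synthetic.

```python
import re

def grep_lines(text: str, pattern: str) -> list[int]:
    """Return 1-based line numbers whose text matches the regex pattern."""
    regex = re.compile(pattern)
    return [n for n, line in enumerate(text.splitlines(), start=1) if regex.search(line)]

def read_range(text: str, offset: int, limit: int) -> list[str]:
    """Return up to `limit` lines starting at 1-based line `offset`."""
    lines = text.splitlines()
    start = max(offset - 1, 0)
    return lines[start:start + limit]

# Synthetic 847-line file with the symbol defined on line 147 (hypothetical content).
lines = [f"// filler line {n}" for n in range(1, 848)]
lines[146] = "export function validateToken(token: string) {"
source = "\n".join(lines)

hits = grep_lines(source, r"validateToken")
# Start two lines before the first match and read a 25-line window.
window = read_range(source, offset=hits[0] - 2, limit=25)

print(hits)                                           # [147]
print(len(window), "of", len(lines), "lines loaded")  # 25 of 847 lines loaded
```

In this sketch the targeted read loads about 3% of the file; the exact saving depends on where the match falls and how large a window is needed.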

Rule 2: Search for Specific Symbols, Not Understanding

"Understanding the codebase" is not a search goal. Specific symbols are.

# Vague (leads to over-reading):
"Let me understand how authentication works"
→ Reading 5 files × 300 lines = 1,500 lines loaded

# Specific (targeted search):
"Find where verifyJWT is called"
Grep(pattern="verifyJWT", output_mode="files_with_matches")
→ Returns 2 files → read only the relevant function in each

Before starting any task, identify the specific symbol, function name, or error message you need to locate.

Rule 3: Build a Symbol Map Before Implementation

For any task touching more than one file, spend 5 searches building a mental map before touching any code.

Symbol map protocol:

1. Find the entry point (route handler, component, or function the task relates to)
2. Find the data types it uses
3. Find where those types are defined
4. Find the test file for the target
5. Grep for any existing similar implementation

Total: 5 targeted searches, ~200 tokens. Far cheaper than reading 3–5 files in full.
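
A symbol map can be as simple as a dictionary from symbol name to the locations where it appears. The following is a minimal sketch, assuming a small in-memory project; `build_symbol_map`, the file paths, and the file contents are all hypothetical and exist only for illustration.

```python
import re
from collections import defaultdict

def build_symbol_map(files: dict[str, str], symbols: list[str]) -> dict[str, list[tuple[str, int]]]:
    """Map each symbol to the (path, 1-based line number) pairs where it appears."""
    symbol_map = defaultdict(list)
    for symbol in symbols:
        # Word boundaries keep `UserProfile` from matching `UserProfileSettings`.
        regex = re.compile(rf"\b{re.escape(symbol)}\b")
        for path, text in files.items():
            for line_no, line in enumerate(text.splitlines(), start=1):
                if regex.search(line):
                    symbol_map[symbol].append((path, line_no))
    return dict(symbol_map)

# Hypothetical project contents: an entry point, its type, and its test.
files = {
    "src/routes/profile.ts": "\n".join([
        "import { UserProfile } from '../types';",
        "export function getProfile(id: string): UserProfile {",
        "  return load(id);",
        "}",
    ]),
    "src/types.ts": "\n".join([
        "export interface UserProfile {",
        "  id: string;",
        "}",
    ]),
    "tests/profile.test.ts": "test('getProfile returns a profile', () => {});",
}

symbol_map = build_symbol_map(files, ["getProfile", "UserProfile"])
for symbol, locations in symbol_map.items():
    print(symbol, locations)
```

The resulting map answers the protocol's questions (entry point, types, type definitions, tests) without loading any file in full.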

Rule 4: Use Glob to Find Files First

When you don't know which file contains what you need, glob before grepping.

# Step 1: Find candidate files
Glob(pattern="**/auth*.ts")
# Returns: auth.ts, auth-helpers.ts, auth-middleware.ts

# Step 2: Grep the specific symbol in candidates only
Grep(pattern="refreshToken", path="/src/lib/supabase/auth-helpers.ts", output_mode="content")
# Returns: the matching lines, which locate the range to read

# Step 3: Read only that range
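
The same glob-then-grep sequence can be reproduced with Python's standard library. This is a sketch, not the tools' implementation: `glob_then_grep` is a hypothetical helper, and the temporary directory and its three files are invented for the example.

```python
import re
import tempfile
from pathlib import Path

def glob_then_grep(root: Path, file_glob: str, pattern: str) -> list[tuple[str, int]]:
    """Glob for candidate files, then return (relative path, line number) per match."""
    regex = re.compile(pattern)
    results = []
    for path in sorted(root.glob(file_glob)):
        if not path.is_file():
            continue
        for line_no, line in enumerate(path.read_text().splitlines(), start=1):
            if regex.search(line):
                results.append((path.relative_to(root).as_posix(), line_no))
    return results

with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    (root / "src").mkdir()
    (root / "src" / "auth.ts").write_text("export const login = () => {};\n")
    (root / "src" / "auth-helpers.ts").write_text("// helpers\nexport function refreshToken() {}\n")
    # Not an auth file, so the glob step never opens it.
    (root / "src" / "billing.ts").write_text("const refreshToken = null;\n")
    matches = glob_then_grep(root, "**/auth*.ts", r"refreshToken")

print(matches)  # [('src/auth-helpers.ts', 2)]
```

Narrowing by filename first means unrelated files that happen to mention the symbol never enter the search.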

Rule 5: Read Targeted Line Ranges When Structure is Known

Once you know approximately where something is, use offset+limit to load only what you need.

# Good pattern:
# You know the function starts around line 80 from the Grep result
Read(file_path="route.ts", offset=78, limit=40)

# Bad pattern (same file, 350 lines long):
Read(file_path="route.ts")  # loads 310 unnecessary lines

Heuristics for range sizing:

A typical function: 20–50 lines → limit=60 to be safe
A route handler: 50–100 lines → limit=120
A class: varies — Grep for method names first, then read each
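
These heuristics can be turned into a small helper that converts a match line into read arguments. A minimal sketch: `read_window` and the two-line lead-in are assumptions made for illustration, while the limits come from the heuristics above.

```python
# Limits taken from the range-sizing heuristics; classes have no fixed limit.
LIMITS = {"function": 60, "route_handler": 120}

def read_window(match_line: int, kind: str, lead_in: int = 2) -> dict[str, int]:
    """Turn a 1-based match line into offset/limit arguments for a targeted read."""
    if kind not in LIMITS:
        raise ValueError(f"no fixed limit for {kind!r}; search for its members first")
    # Start slightly before the match to capture decorators or doc comments.
    return {"offset": max(match_line - lead_in, 1), "limit": LIMITS[kind]}

print(read_window(80, "function"))      # {'offset': 78, 'limit': 60}
print(read_window(1, "route_handler"))  # {'offset': 1, 'limit': 120}
```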

Rule 6: Token Budget Awareness

Before loading any file, estimate context cost:

File size	Estimated token cost	Load it?
< 50 lines	< 200 tokens	Yes, always
50–200 lines	200–800 tokens	Yes if relevant
200–500 lines	800–2,000 tokens	Only targeted range
500–2,000 lines	2,000–8,000 tokens	Grep first, always
> 2,000 lines	> 8,000 tokens	Never load in full

For files over 500 lines, full reads are almost never justified; there is nearly always a more targeted search.
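
The table can be encoded as a simple lookup. The sketch below assumes roughly 4 tokens per line, the ratio the table's ranges imply; real files vary with line length and language. `load_decision` and the handling of the shared boundaries (200 and 500 lines) are assumptions.

```python
TOKENS_PER_LINE = 4  # rough ratio implied by the table; an estimate, not a measurement

def load_decision(line_count: int) -> tuple[int, str]:
    """Return (estimated tokens, advice) for a file with the given line count."""
    estimated = line_count * TOKENS_PER_LINE
    if line_count < 50:
        advice = "Yes, always"
    elif line_count <= 200:
        advice = "Yes if relevant"
    elif line_count <= 500:
        advice = "Only targeted range"
    elif line_count <= 2000:
        advice = "Grep first, always"
    else:
        advice = "Never load in full"
    return estimated, advice

print(load_decision(30))    # (120, 'Yes, always')
print(load_decision(350))   # (1400, 'Only targeted range')
print(load_decision(2500))  # (10000, 'Never load in full')
```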

Search Patterns Reference

Finding a function definition

Grep(pattern="function getUserById|getUserById = |getUserById:", type="ts")

Finding all callers of a function

Grep(pattern="getUserById\(", output_mode="files_with_matches")
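
One caveat: a call pattern such as `getUserById\(` also matches the definition line `function getUserById(`. The sketch below uses Python's `re` module to separate true callers from definitions; the sample lines are hypothetical.

```python
import re

CALL = re.compile(r"getUserById\(")
DEFINITION = re.compile(r"function getUserById|getUserById = |getUserById:")

sample_lines = [
    "export function getUserById(id: string) {",               # definition, also matches CALL
    "const user = await getUserById(session.userId);",         # caller
    "return ids.map((id) => getUserById(id));",                # caller
    "const getUserById = async (id: string) => db.find(id);",  # definition only
]

# Keep lines that contain a call but are not a definition.
callers = [line for line in sample_lines
           if CALL.search(line) and not DEFINITION.search(line)]

for line in callers:
    print(line)
```

Only the two caller lines survive the filter, so the definition site does not inflate the caller count.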

Finding a type/interface

Grep(pattern="^(export )?(interface|type) UserProfile", type="ts")
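
As written, this pattern also matches longer names that begin with `UserProfile`. In regex engines that support `\b`, adding a trailing word boundary tightens it. A quick check with Python's `re` module, using hypothetical sample lines:

```python
import re

LOOSE = re.compile(r"^(export )?(interface|type) UserProfile")
STRICT = re.compile(r"^(export )?(interface|type) UserProfile\b")

sample_lines = [
    "export interface UserProfile {",
    "type UserProfile = {",
    "export interface UserProfileSettings {",  # prefix match only
    "  const profile: UserProfile = load();",  # usage, not a declaration
]

loose_hits = [line for line in sample_lines if LOOSE.search(line)]
strict_hits = [line for line in sample_lines if STRICT.search(line)]

print(len(loose_hits))   # 3
print(len(strict_hits))  # 2
```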

Finding all usages of a constant

Grep(pattern="SKILLS_COLUMNS", output_mode="content", -C=2)

Finding a test for a specific route

Glob(pattern="tests/**/*tips*.test.ts")

Finding where an error is thrown

Grep(pattern="Skill not found", output_mode="content")

Anti-Patterns to Avoid

"Let me read the whole project structure first" → Waste. Read structure only for the specific area you're modifying.

Reading a file "to understand" then not using 80% of it → Stop. Define what you need before you load anything.

Re-reading files you already searched → Keep a mental or written note of what each file contains. Don't reload.

Grepping for a concept, not a symbol → Grep(pattern="authentication") returns 200 hits. Grep(pattern="verifyJWT\(") returns 3.

Loading test files to understand production code → Test files describe behavior but are noisy for understanding structure. Use production code + types first.

Session Start Ritual

For any non-trivial coding task, before writing code:

1. State the specific goal in one sentence
2. Identify the 1–3 files most likely affected
3. Grep for the specific entry point or function
4. Read only the relevant range
5. Identify any types or helpers needed (glob/grep those too)
6. Now write code

Total search investment: 5–10 targeted operations, ~300–500 tokens. This prevents the 3,000-token "read everything first" approach.
