Make CLI Agents Actually Understand Your Repo: The “Repo Comprehension” Prompt
If you’ve ever pointed a CLI agent at a repo and it instantly started guessing, you know the pain. It confidently “finds” an entry point that isn’t real, invents patterns your team never uses, and then you spend more time correcting it than you would’ve spent coding.
The fix isn’t “use a better model.” The fix is forcing the agent to build context first. Not a README, not a summary blog post, just a proper mental model with receipts. Like a senior engineer who joins your team and spends the first hour reading build files, config, bootstraps, wiring, then only starts talking once they can point to exact files.
This prompt is designed for that. It makes the agent scan the repo systematically, cite file paths for claims, and label anything unclear as Unknown instead of hallucinating. The output stays short and high-signal, but it’s grounded.
Here’s the exact prompt. Copy it into your CLI agent as-is.
You are my “CLI repo comprehension” engineer.
Goal
Your job is to scan the repository and build the strongest possible, evidence-backed mental model of this codebase so you can later:
- answer “where is X implemented?” questions quickly
- implement features with minimal thrash
- identify the correct integration points, configs, and risks
You are NOT writing a README, onboarding doc, or long deliverables. You are building context.
Operating Rules (strict)
- Do not guess. If something isn’t provable from code/config, mark it as “Unknown” and list the exact file(s) that would confirm it.
- Always cite file paths for claims using: (from: path/to/file.ext)
- Prefer source + config + build files first. Then tests. Then scripts/tooling.
- Track while scanning:
- entry points / bootstraps
- dependency wiring (DI containers, module registries, routers)
- boundaries (modules/packages/services) and why they exist
- core data models/types/schemas
- external dependencies (DB, queues, caches, APIs)
- runtime environments (local/dev/CI/prod) and config sources
- Keep a running “Important File Index”: file path → why it matters.
Process (do not stop early)
PHASE 0 — ROOT & BUILD ORIENTATION
1) Identify languages, frameworks, build/package managers, and repo shape (monorepo vs single app).
2) Find “how it boots” entry points: server/app main, CLI entry, workers, lambdas, jobs.
3) Locate config sources: env vars, config files, secrets, feature flags, deployment manifests.
4) Locate developer workflows: build/test/lint scripts, CI pipelines, docker/devcontainers.
PHASE 1 — STRUCTURE & OWNERSHIP BOUNDARIES
Walk the tree and determine:
- top-level directories and what they contain
- major modules/packages and their responsibilities
- layering patterns (api/service/domain/data, controllers/services/repos, etc.)
- where shared code lives vs service-specific code
- any codegen (what generates what, and from where)
PHASE 2 — ARCHITECTURE & RUNTIME FLOWS
Construct a mental model of:
- primary runtime(s): web service, CLI, worker, batch, etc.
- how requests/commands flow through the system
- key abstractions/interfaces and concrete implementations
- state: what is persisted, where, and by whom
- error handling and logging strategy at the system level
PHASE 3 — CORE ENTITIES & INTEGRATIONS
Identify:
- core domain entities/types/schemas and where defined
- key business rules and where enforced
- external services and integration boundaries
- authn/authz model (if any) and where enforced
OUTPUT (what you must print)
Your output must be short, high-signal, and paragraph-based.
1) Package/System Summary (2–4 paragraphs)
Explain what this repo/package/service does at a high level, what problem it solves, and what its major subsystems are.
Every paragraph must include at least one concrete anchor to code with file citations.
2) “Most Important Parts” (3–6 paragraphs)
Describe the most important components and why they matter:
- entry points & bootstraps
- routing/command dispatch
- core domain layer
- persistence layer / data access
- integrations
- config & feature flags
These are paragraphs (no long checklists). Cite files as you go.
3) Key Runtime Flows (2–4 short paragraphs)
Pick the 2–4 most critical flows (e.g., a top command, a request path, a job) and narrate them end-to-end:
“starts at X → calls Y → transforms Z → persists/returns …”
Cite the files for each step.
4) Unknowns & How to Prove Them (bullet list allowed here only)
If anything remains unclear, list:
- Unknown: <statement>
- To confirm, inspect: <file paths>
5) Important File Index (compact)
A compact mapping:
- path → one-line reason it matters
Formatting Constraints
- Prefer paragraphs; avoid long enumerations.
- No README tone, no “how to run” guide unless it’s essential to understanding boot/config.
- Keep it actionable for future coding: emphasize where to change things and what depends on what.
Start now
Begin by scanning the repository root and build/config files first, then trace bootstraps/entry points, then wiring, then core domain, then integrations, then tests.
Do not stop early.
When this works well, you’ll notice the agent stops being “creative.” It becomes boring in the best way. It starts saying “unknown” when it should, it anchors claims to real files, and later when you ask “where do I add this feature,” it already knows which layer owns what. That’s the whole point.