2026-04-15

Part 1: Perception — How Agents See Code

agents · architecture · perception · tree-sitter · series

*This is Part 1 of a six-part series on the layers of an autonomous coding agent. Start with the introduction.*

---

The question no one asks

When an AI agent "reads" your codebase, what does it actually see?

This question gets skipped a lot. The common mental model is that the agent opens files, reads the text, and somehow understands what's there. In practice, that's almost exactly what most agents do — they read files as sequences of lines, stuff those lines into a context window, and hope the model underneath can figure out what matters.

It works, sort of, for small tasks. It collapses quickly for anything non-trivial. And the failure mode is usually misdiagnosed: people blame the model's reasoning when the real problem is that the model never had anything coherent to reason about.

Perception is the first layer of a serious agent architecture. Get it wrong and everything above it compounds the error.

The line-based default

Most coding agents in production right now use some version of this pipeline: read a file as text, optionally chunk it by token count, embed the chunks, retrieve relevant chunks based on similarity to the current task, and paste those chunks into a prompt.
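To make the default concrete, here is a minimal sketch of the first step of that pipeline — greedy line packing by a crude whitespace token count. (Real systems use a model tokenizer and an embedding store, both elided here; the function name and budget are illustrative.) Note what it cannot do: nothing stops a chunk boundary from landing in the middle of a function body.

```python
def chunk_by_tokens(text: str, max_tokens: int = 50) -> list[str]:
    """Naive line-based chunking: split on lines, pack greedily under a
    crude whitespace token budget. Structure-blind by design — a function
    body can be split across chunks mid-statement."""
    chunks, current, count = [], [], 0
    for line in text.splitlines():
        line_tokens = len(line.split()) or 1  # blank lines still cost one
        if current and count + line_tokens > max_tokens:
            chunks.append("\n".join(current))
            current, count = [], 0
        current.append(line)
        count += line_tokens
    if current:
        chunks.append("\n".join(current))
    return chunks
```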

This is a search engine wearing a programming hat. It treats code as prose. A function is a run of lines between a def or function keyword and the next blank line. A dependency is a word that appeared near another word somewhere in the file. A class is a block of text with indentation.

It works for surface-level tasks. "Write a function that does X" doesn't require the agent to understand the codebase — it just needs to produce plausible code in roughly the right style. "Explain what this file does" is mostly a summarization task; line-based reading handles it fine.

The cracks show up when the agent needs to reason about structure. Change this function's signature and update every caller. Trace what happens when a user clicks this button. Find every place that reads from this database column and evaluate the impact of removing it. These tasks aren't hard in principle — they're hard because line-based perception has no way to see the structural relationships that make them tractable.

When a line-based agent fails these tasks, it usually fails silently. It edits the function, misses two callers, and hands back work that looks complete but breaks the build. This is the quiet failure mode that makes coding agents feel unreliable at scale.

What structural understanding means

Structural understanding starts from a different premise: code is not text. Code is text in the way that a contract is text, or a legal brief is text. The words are necessary but they're not the structure. The structure is the entities, the relationships, and the scopes.

Concretely, a structural view of code gives an agent access to:

  • Entities as first-class objects. A function, a class, a variable, an import — each is a thing with a name, a location, a scope, and a set of relationships. The agent can ask "what is this function?" and get back a structured answer instead of a chunk of text.
  • Call graphs. If the agent sees function authenticate_user, it can ask what calls it (to evaluate impact) and what it calls (to evaluate dependencies). Not as a text search — as a traversal of an actual graph.
  • Scope and resolution. When the agent encounters a reference to config, it can ask which config — the one imported at the top of the file, the one passed as a parameter, the one defined in the enclosing class. Name resolution is structural, not textual.
  • Dependency sets. For any given symbol, the agent can trace what needs to change when that symbol changes. This is the difference between "edit this function" and "refactor this function safely."
  • Dead code detection. The agent can see what's defined but never used, what's imported but never referenced, what's exported but never consumed.
  • Typed context windows. Instead of chunking by token count, the agent can request "this function and everything it calls, recursively, up to three levels." The context is scoped to the task, not sized to the transport.

None of this is speculative. It's what happens when you parse code into an abstract syntax tree, index the entities, and expose a query interface over the result.
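The parse-index-query loop above fits in a few lines. This sketch uses Python's stdlib ast module as a stand-in for a tree-sitter runtime (the function names and the dict-based index are mine, not any product's API): parse the file, record each function as an entity, and derive call-graph edges that can be traversed in either direction.

```python
import ast

def build_index(source: str) -> dict:
    """Parse source into an entity index: each function becomes a
    first-class object with a location and a set of outgoing calls.
    Stdlib ast here; tree-sitter plays the same role across languages."""
    tree = ast.parse(source)
    entities = {}
    for node in ast.walk(tree):
        if isinstance(node, ast.FunctionDef):
            calls = {
                c.func.id
                for c in ast.walk(node)
                if isinstance(c, ast.Call) and isinstance(c.func, ast.Name)
            }
            entities[node.name] = {"line": node.lineno, "calls": calls}
    return entities

def callers_of(entities: dict, name: str) -> set[str]:
    """Reverse edge lookup: who calls `name`? A graph traversal,
    not a text search."""
    return {f for f, info in entities.items() if name in info["calls"]}
```

With this index, "what does authenticate_user call, and who calls it?" is a structured query rather than a grep.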

Structural code intelligence as a category

Structural code understanding is a distinct category of tooling, not a feature of any one product. To be useful as an agent's perception layer, an implementation has to deliver a specific bundle of properties:

  • Multi-language. Agents work across codebases that mix languages. The tooling has to handle them through a uniform grammar framework, not as a series of separate integrations.
  • Query-able. The parse tree has to expose structured queries — calls, references, symbols, scopes — not just a tree the caller has to walk manually.
  • Incremental. Reparsing a full file on every edit doesn't scale. The tooling has to update its index incrementally as code changes.
  • Portable. Structural perception belongs in every tool that touches code. That means no CGo dependencies, no per-platform build matrix, no installation story that breaks when the target environment isn't what the tooling assumed.
  • Fast enough to be invisible. If querying the index is slower than running grep, agents will use grep. The perception layer has to be a primitive, not a service call.

Tree-sitter is the foundation most serious implementations build on — an incremental parser generator that handles hundreds of programming languages through a uniform grammar framework. What matters for an agent architecture is what gets layered on top.

The leading implementation we've found in this category is a pure-Go runtime with 206 grammars embedded in the binary — no CGo, no external toolchain, the same artifact running on Linux, macOS, Windows, and WebAssembly. A toolkit on top of it exposes the queries agents actually need: call graphs, reference resolution, scope analysis, dead code detection, chunking with token budgets. When an agent needs to know what authenticate_user does and who uses it, it doesn't grep — it queries. The result is structured, typed, and precise. The repositories are on GitHub.

Naming the category matters because the choice isn't really between tools. It's between perception paradigms — text-based or structural. Once you've made that choice, the implementation follows.

---

Reading this because you're trying to build?


For custom architecture and consulting, work with ResonanceWorks — Talk to Consulting. For a ready-made install, start with Torque Engineering.

---

What this changes

The capabilities that become tractable with structural perception are not the flashy ones. They're the boring, essential ones that make autonomous systems trustworthy.

Reliable refactoring. Rename a function across a codebase and every reference gets updated. Not because the agent is smart — because the agent can enumerate every reference and change them programmatically. This is something IDEs have done for decades, but most coding agents still approximate it with search-and-replace.
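A minimal sketch of what "enumerate every reference and change them programmatically" means, again using Python's stdlib ast as a stand-in (the Renamer class is illustrative, not a real library's API): the transformer visits definition and reference nodes, so comments, docstrings, and identifiers that merely contain the name are untouched — exactly where search-and-replace goes wrong.

```python
import ast

class Renamer(ast.NodeTransformer):
    """Rename a function definition and every call site by walking the
    AST, not by text search-and-replace."""
    def __init__(self, old: str, new: str):
        self.old, self.new = old, new

    def visit_FunctionDef(self, node):
        if node.name == self.old:
            node.name = self.new
        self.generic_visit(node)  # still rewrite calls inside the body
        return node

    def visit_Name(self, node):
        if node.id == self.old:
            node.id = self.new
        return node

def rename(source: str, old: str, new: str) -> str:
    return ast.unparse(Renamer(old, new).visit(ast.parse(source)))
```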

Impact analysis that actually reflects impact. When the agent plans a change, it can compute the blast radius: every function that calls this one, every test that exercises it, every file that imports it. The plan includes the full surface area of the change, not just the line edits.
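Computing the blast radius is a transitive closure over the reverse call graph. A sketch, assuming a structural index has already supplied the caller edges (here a plain dict mapping each function to its direct callers; names are illustrative):

```python
from collections import deque

def blast_radius(callers: dict[str, set[str]], target: str) -> set[str]:
    """Every function transitively affected by changing `target`:
    breadth-first traversal of the reverse call graph."""
    affected, queue = set(), deque([target])
    while queue:
        fn = queue.popleft()
        for caller in callers.get(fn, set()):
            if caller not in affected:
                affected.add(caller)
                queue.append(caller)
    return affected
```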

Accurate context retrieval. Instead of fetching text chunks that contain keywords similar to the task, the agent fetches the functions, classes, and symbols that are structurally relevant to the task. The context window carries information that matters instead of text that happens to match.

Dead code and unused imports. An agent that can see what's defined but never referenced can clean up a codebase as part of its normal work. The before/after state is verifiable — either the symbol has references or it doesn't.
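The "defined but never referenced" check reduces to set difference over the index. A deliberately simple sketch using stdlib ast (it treats any name load as a reference, so a recursive function counts as used; scope-aware resolution is left out):

```python
import ast

def unused_functions(source: str) -> set[str]:
    """Top-level functions that are defined but never referenced —
    a verifiable before/after property: either the symbol has
    references or it doesn't."""
    tree = ast.parse(source)
    defined = {n.name for n in tree.body if isinstance(n, ast.FunctionDef)}
    used = {n.id for n in ast.walk(tree) if isinstance(n, ast.Name)}
    return defined - used
```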

Safer code generation. When the agent generates new code, it can validate that the symbols it references actually exist, that the types align, that the imports resolve. The difference between plausible code and compiling code shows up at the structural level, not the textual one.
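One cheap structural validity check an agent can run before accepting generated output: do all loaded names actually resolve to something bound? A sketch with deliberately flat scope handling (a real checker resolves nested scopes; the function name is mine):

```python
import ast
import builtins

def undefined_references(source: str) -> set[str]:
    """Names the code loads but never binds — catches references to
    symbols that don't exist before the code ever runs."""
    tree = ast.parse(source)
    bound = set(dir(builtins))
    for node in ast.walk(tree):
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef,
                             ast.ClassDef)):
            bound.add(node.name)
            if hasattr(node, "args"):  # classes have no parameters
                bound.update(a.arg for a in node.args.args)
        elif isinstance(node, ast.Name) and isinstance(node.ctx, ast.Store):
            bound.add(node.id)
        elif isinstance(node, (ast.Import, ast.ImportFrom)):
            bound.update((a.asname or a.name).split(".")[0]
                         for a in node.names)
    loaded = {
        n.id for n in ast.walk(tree)
        if isinstance(n, ast.Name) and isinstance(n.ctx, ast.Load)
    }
    return loaded - bound
```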

Better token economics. This one is underrated. Structural context is more information-dense than textual context. An agent that can fit "this function's full dependency graph, three levels deep" into 4,000 tokens can reason about things that a line-based agent would need 40,000 tokens to approximate. The compression factor is roughly 10x in our experience — and that's what makes longer-horizon tasks possible on reasonable budgets.

Where perception fails

Being honest about this layer's limits matters, because they constrain what the layers above can do.

Parse errors. Tree-sitter parsers are error-tolerant, but they're not magical. Syntactically broken code produces partial trees with gaps. An agent needs to detect these cases and know that its perception is incomplete.
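Whatever parser sits underneath, the important move is the same: surface incompleteness instead of hiding it. A sketch of that contract (Python's ast raises on broken syntax, whereas tree-sitter returns a partial tree containing ERROR nodes that the caller must check for; the dict shape here is illustrative):

```python
import ast

def perceive(source: str) -> dict:
    """Parse, but tell the layers above whether perception is complete
    and, if not, where it broke."""
    try:
        return {"tree": ast.parse(source), "complete": True,
                "error_line": None}
    except SyntaxError as e:
        # Perception is incomplete: record the break point so the agent
        # knows which region of the file it cannot reason about.
        return {"tree": None, "complete": False, "error_line": e.lineno}
```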

Cross-language boundaries. A TypeScript file that imports a compiled Python extension via a Rust wrapper is still one call graph, but the tooling to see it as one is rarely available. Agents that work across language boundaries tend to fall back to textual reasoning at the seams.

Semantic vs syntactic. The parser knows authenticate_user calls check_password. It doesn't know that check_password has security-critical properties that constrain how authenticate_user should behave. Semantic understanding is layered on top of syntactic understanding, not delivered by it. An agent with structural perception still needs domain knowledge to reason well.

Runtime behavior. Call graphs derived from static analysis miss dynamic dispatch, reflection, metaprogramming, and anything else that happens at runtime. An agent that assumes the static graph is the complete graph will be wrong in languages where it isn't.

Index freshness. A structural index has to stay in sync with the code. Stale indexes produce wrong answers confidently — one of the worst failure modes. Incremental reparsing helps, but someone has to own keeping the index current.
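One way to own freshness is to make staleness unrepresentable at the read path: key each entry by a content hash and reparse on mismatch. A minimal sketch (the class and its shape are mine; a real index would reparse incrementally rather than whole-file):

```python
import ast
import hashlib

class StructuralIndex:
    """Index that stays honest about freshness: reads go through
    `query`, which reparses only when the content actually changed."""
    def __init__(self):
        self._entries = {}  # path -> (content_hash, symbols)

    def query(self, path: str, content: str) -> set[str]:
        digest = hashlib.sha256(content.encode()).hexdigest()
        cached = self._entries.get(path)
        if cached and cached[0] == digest:
            return cached[1]          # fresh: serve from the index
        symbols = {                   # stale or missing: reparse
            n.name for n in ast.walk(ast.parse(content))
            if isinstance(n, ast.FunctionDef)
        }
        self._entries[path] = (digest, symbols)
        return symbols
```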

These aren't reasons to avoid structural perception. They're reasons to build the layer carefully and to give the layers above it a way to know when perception is incomplete.

What it enables for the layers above

Each layer in an agent architecture depends on the layer below. Planning with structural perception looks nothing like planning with line-based perception. The agent can decompose work by entity — "modify function A, update callers, add tests covering the new behavior" — instead of by vague scope — "edit the authentication logic." The plan is concrete because the perception is concrete.

Execution gets the same benefit. An agent that generates code against a structural view can validate its output before committing: do the imports resolve, do the types line up, are there dangling references. This moves entire classes of errors from "discovered at runtime" to "caught before the agent ever declares the task done."

Memory, which we'll cover in Part 4, benefits from structural embeddings. A vector that represents "this function and its signature" is more useful than a vector that represents "this 200-line blob of text that happens to contain this function." The retrieval becomes intentional instead of lucky.

Governance, which we'll cover in Part 5, gets a cleaner target. A rule like "don't delete files" is hard to enforce over text diffs. A rule like "don't delete any exported symbol that has external references" is straightforward to enforce over structural diffs.
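That kind of rule checks a set difference between two structural snapshots rather than a text diff. A sketch, using "top-level name without a leading underscore" as a stand-in for "exported" and assuming the external-reference set comes from the index (all names here are illustrative):

```python
import ast

def exported_symbols(source: str) -> set[str]:
    """Top-level functions and classes without a leading underscore —
    a simple stand-in for 'exported' in this sketch."""
    tree = ast.parse(source)
    return {
        n.name for n in tree.body
        if isinstance(n, (ast.FunctionDef, ast.ClassDef))
        and not n.name.startswith("_")
    }

def deleted_with_refs(before: str, after: str,
                      external_refs: set[str]) -> set[str]:
    """Violations of 'don't delete any exported symbol that has
    external references', computed over the structural diff."""
    deleted = exported_symbols(before) - exported_symbols(after)
    return deleted & external_refs
```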

Perception is the foundation. Every improvement in the layers above compounds through it. Every weakness in the layers above eventually traces back to it.

Next: Planning

In Part 2 we look at the planning layer — how an agent turns a task into an executable sequence of steps, where decomposition succeeds, and why "good execution of a wrong plan" is the single most common failure mode in autonomous systems.

---

Get the rest in your inbox

The series lands roughly every week or two. Six parts in total, each standing alone and building on the last. Subscribe to get new parts the day they ship, plus occasional technical notes on what we're learning from running these systems in production.

Need custom help designing your stack?

ResonanceWorks works with founders and small teams on architecture, governance, and private AI system design. We take a small number of engagements at a time and work closely with founders and technical leads. Talk to Consulting.

Want a ready-made local-first system instead?

Torque Engineering installs a proven capture, vault, and routing system for independent operators. Explore Torque Engineering.