How I built a code knowledge graph with FalkorDB and tree-sitter
A walkthrough of Loom's architecture — how semantic edges between code symbols and documentation nodes make codebases queryable.
Six months ago I asked a simple question: can you query a codebase the way you query a database?
Not “find all files that import X” — that’s grep. I mean: “what breaks if I change this function?” or “does this implementation actually satisfy this spec section?”
That question became Loom.
The core idea
Most code intelligence tools treat a codebase as a flat file system. They parse syntax, build call graphs, and stop there. What they miss is the relationship between code and the documentation that describes it.
Loom builds a property graph with two kinds of nodes — code symbols (functions, classes, modules) and documentation nodes (README sections, spec headings, ADRs) — and connects them with typed edges:
- IMPLEMENTS — this function fulfils this spec section
- SPECIFIES — this doc node describes this symbol
- VIOLATES — detected drift between implementation and spec
That third edge type is the interesting one.
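In Cypher, the shape of the graph can be sketched like this. The node labels and property names here are illustrative guesses, not Loom's actual schema:

```cypher
// Illustrative only: labels, properties, and values are assumptions.
CREATE (f:Symbol {name: 'calculate_drift', kind: 'function'})
CREATE (d:Doc {heading: 'Drift detection', path: 'docs/spec.md'})
CREATE (f)-[:IMPLEMENTS {score: 0.87}]->(d)
CREATE (d)-[:SPECIFIES]->(f)
CREATE (f)-[:VIOLATES {reason: 'signature mismatch'}]->(d)
```

A symbol can carry both an IMPLEMENTS and a VIOLATES edge to the same doc node: it implements the section, but has drifted from it.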
Why FalkorDB
I evaluated Neo4j, Memgraph, and FalkorDB. FalkorDB won for three reasons:
- Redis-native — the graph lives in memory, queries are fast, persistence is handled by Redis AOF/RDB. No separate server.
- Cypher — standard query language, no lock-in.
- Single-binary deploy — no cluster to manage for a self-hosted tool.
The tradeoff is that FalkorDB is less battle-tested than Neo4j for very large graphs. For Loom’s target (single-repo, tens of thousands of nodes), it’s more than sufficient.
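With the graph in place, the opening question ("what breaks if I change this function?") becomes a short traversal. A hedged sketch, again with illustrative labels; the CALLS edge type and the traversal depth are my assumptions:

```cypher
// Everything that transitively calls calculate_drift, plus the spec
// sections those callers claim to implement. Schema is illustrative.
MATCH (caller:Symbol)-[:CALLS*1..5]->(target:Symbol {name: 'calculate_drift'})
OPTIONAL MATCH (caller)-[:IMPLEMENTS]->(d:Doc)
RETURN DISTINCT caller.name, d.heading
```

Because FalkorDB speaks Cypher, queries like this run unchanged against Neo4j or Memgraph if you ever migrate.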
Parsing with tree-sitter
Tree-sitter gives you a concrete syntax tree for 40+ languages in a single Python library. The call tracer walks the tree and extracts:
- Function definitions and their docstrings
- Call expressions (who calls whom)
- Import statements (module dependency graph)
```python
from pathlib import Path

# PY_PARSER is a tree_sitter.Parser loaded with the Python grammar;
# CallVisitor collects CallEdge records as it walks the tree.
def trace_calls_for_py_file(file_path: Path) -> list[CallEdge]:
    with open(file_path) as f:
        source = f.read()
    tree = PY_PARSER.parse(source.encode())
    visitor = CallVisitor(file_path)
    visitor.visit(tree.root_node)
    return visitor.edges
```
The hardest part was getting call resolution right across files — tree-sitter gives you a concrete syntax tree, not a symbol table. We resolve symbols by building a module map first, then doing a second pass to link call sites to their definitions.
The semantic linker
This is where it gets interesting. The semantic linker takes every code node and every doc node, embeds them with nomic-embed-text, and creates IMPLEMENTS edges for pairs whose cosine similarity clears a threshold.
It sounds blunt, but in practice it works well. A function called calculate_drift that has a docstring mentioning “spec compliance” will correctly link to the documentation section about drift detection — without any manual annotation.
False positives are filtered by a second pass that checks structural signals: does the function signature match the parameters described in the doc? Does it appear in the same module the doc refers to?
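The threshold step itself is small. A pure-Python sketch with toy vectors — the 0.75 default and the function names are illustrative, not Loom's actual values, and real nomic-embed-text vectors are much higher-dimensional:

```python
import math

def cosine(a: list[float], b: list[float]) -> float:
    # Cosine similarity between two embedding vectors.
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def link_candidates(code_vecs: dict[str, list[float]],
                    doc_vecs: dict[str, list[float]],
                    threshold: float = 0.75) -> list[tuple[str, str, float]]:
    # Propose an IMPLEMENTS edge for every (code, doc) pair whose
    # embedding similarity clears the threshold.
    edges = []
    for code_id, cv in code_vecs.items():
        for doc_id, dv in doc_vecs.items():
            score = cosine(cv, dv)
            if score >= threshold:
                edges.append((code_id, doc_id, score))
    return edges
```

The output of this pass is only a candidate set; the structural second pass described above decides which candidates become edges.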
What’s next
Loom v0.1 is live on GitHub. The next milestone is a hosted Cortex tier — a managed MCP server that gives your Claude/Cursor session live access to the graph without running anything locally.
If you’re building on top of codebases with any meaningful complexity, give it a try. I’m actively looking for feedback from teams using it in real repos.