Figure: RAG workflow (retrieve relevant context, augment the prompt, generate the answer)

GraphRAG is the upgrade that plain retrieval-augmented generation has been waiting for. Ordinary RAG embeds a question, finds the most similar chunks, and stuffs them into the prompt. It works until the answer depends on how facts connect rather than on any single chunk — and then it fails quietly, returning plausible context that misses the relationship that mattered. GraphRAG fixes this by adding a graph step: after finding seed chunks by similarity, it walks the relationships outward to pull in connected context the vector search alone would never surface. The catch is that GraphRAG needs two capabilities at once — vector search and graph traversal — and how you provision them is the architectural decision that makes or breaks the project.

This post argues for doing it in one engine. For the conceptual grounding, read "What Is GraphRAG?" and "GraphRAG vs Traditional RAG"; for the head-to-head with a graph database, see "GraphRAG vs Neo4j".

The engine here is SynapCores: vector, graph, SQL, and in-database AutoML in a single self-hosted binary, with native MCP support and an OpenClaw long-term-memory plugin. The Community Edition is free.

A note on scope. Capabilities marked (Enterprise / roadmap) are part of the Enterprise tier or roadmap and are not in the free Community Edition today. Everything unmarked is in the free CE.

Why plain RAG runs out of road

Vector similarity answers "what text is most like this question?" That is the right question for "what does our refund policy say?", because the answer is in one passage and similarity finds it. It is the wrong question for "which customers are affected by the outage in the payments service that depends on the database we are migrating?" That answer lives in the connections between entities, and no single chunk contains it. We dig into this failure mode in "Why Vector Search Fails Complex Reasoning" and "Why Vector Search Needs Graph Relationships".

GraphRAG addresses it by treating retrieval as two moves: similarity to find where to start, traversal to gather what connects.

The GraphRAG pattern, step by step

1. Embed the question. Turn the user's query into a vector.
2. Find seeds by similarity. Retrieve the top-k chunks or nodes closest to the question vector.
3. Traverse the graph. Walk N hops out from those seeds along typed edges to collect related entities and chunks.
4. Assemble and generate. Feed the combined subgraph (seeds plus connected context) into the language model.

Steps two and three are the crux. Step two is a vector workload; step three is a graph workload. A GraphRAG system must run both quickly against the same data and keep the two views of that data consistent.
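
The four steps can be sketched in a few lines of plain Python. This is a toy illustration under stated assumptions, not SynapCores code: the node names, the hand-written 3-dimensional "embeddings", the edge list, and the helper functions (cosine, find_seeds, traverse, retrieve, build_prompt) are all hypothetical, edges are walked in both directions, and the final model call is left out.

```python
import math
from collections import deque

# Hypothetical toy data: hand-written vectors stand in for real embeddings.
NODES = {
    "outage":    {"text": "Payments service outage",         "embedding": [0.9, 0.1, 0.0]},
    "payments":  {"text": "Payments service",                "embedding": [0.8, 0.2, 0.1]},
    "ledger_db": {"text": "Ledger database under migration", "embedding": [0.1, 0.9, 0.0]},
    "acme":      {"text": "Customer Acme Corp",              "embedding": [0.0, 0.1, 0.9]},
    "picnic":    {"text": "Company picnic announcement",     "embedding": [0.0, 0.0, 1.0]},
}
# Typed edges as (source, edge_type, target) triples.
EDGES = [
    ("outage", "AFFECTS", "payments"),
    ("payments", "DEPENDS_ON", "ledger_db"),
    ("acme", "DEPENDS_ON", "payments"),
]


def cosine(a, b):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def find_seeds(question_vec, k):
    """Step 2: the k node ids whose embeddings are closest to the question."""
    ranked = sorted(
        NODES,
        key=lambda nid: cosine(NODES[nid]["embedding"], question_vec),
        reverse=True,
    )
    return ranked[:k]


def traverse(seeds, hops, edge_types):
    """Step 3: breadth-first walk up to `hops` edges out from the seeds."""
    adjacency = {}
    for src, etype, dst in EDGES:
        if etype in edge_types:
            adjacency.setdefault(src, []).append(dst)
            adjacency.setdefault(dst, []).append(src)  # undirected for this sketch
    ordered, seen = list(seeds), set(seeds)
    frontier = deque((seed, 0) for seed in seeds)
    while frontier:
        node, depth = frontier.popleft()
        if depth == hops:
            continue
        for neighbor in adjacency.get(node, []):
            if neighbor not in seen:
                seen.add(neighbor)
                ordered.append(neighbor)
                frontier.append((neighbor, depth + 1))
    return ordered


def retrieve(question_vec, k=1, hops=2, edge_types=("AFFECTS", "DEPENDS_ON")):
    """Steps 2 and 3 together: similarity seeds, then graph expansion."""
    return traverse(find_seeds(question_vec, k), hops, set(edge_types))


def build_prompt(question, node_ids):
    """Step 4 (assembly only): turn the subgraph into prompt context."""
    context = "\n".join(f"- {NODES[nid]['text']}" for nid in node_ids)
    return f"Context:\n{context}\n\nQuestion: {question}"


question_vec = [1.0, 0.0, 0.0]  # step 1: stand-in for an embedded question
context_ids = retrieve(question_vec)
prompt = build_prompt("Which customers are affected by the payments outage?", context_ids)
```

In this toy data, similarity alone with k=1 returns only the outage node. The two-hop traversal is what brings in the payments service, the ledger database, and the customer, while the unconnected picnic node stays out of the prompt.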

The dual-store tax

The most common GraphRAG architecture splits those two steps across two systems: a vector database for the embeddings and a graph database for the relationships. It works, and it is the path you land on if you started with one specialist and added the other. But it carries a tax that grows with the system.

Every document now lives in two places, as an embedding in the vector store and as a node in the graph, and the two must be kept aligned. When content changes, both must update together; when they drift, retrieval degrades in ways no unit test catches, because the query still returns something. You also pay the network cost twice on every retrieval: one call to the vector store for seeds, then one call to the graph store for expansion, with your application code marshaling IDs between them. At low volume this is invisible. Under production load and constant writes, it is the part of the system that pages someone.
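
To make the drift problem concrete, here is a minimal sketch of the kind of reconciliation check a dual-store setup tends to need. Everything in it is an assumption for illustration: the two stores are plain dictionaries, the source_hash field and the find_drift helper are hypothetical, and a real deployment would query two separate services instead.

```python
import hashlib


def content_hash(text):
    """Fingerprint of the text an embedding was computed from."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


def find_drift(vector_store, graph_store):
    """Return ids that are missing from one store or whose embedding is stale."""
    drifted = []
    for doc_id in sorted(set(vector_store) | set(graph_store)):
        vec = vector_store.get(doc_id)
        node = graph_store.get(doc_id)
        if vec is None or node is None:
            drifted.append(doc_id)  # present in only one store
        elif vec["source_hash"] != content_hash(node["content"]):
            drifted.append(doc_id)  # embedding built from older content
    return drifted


# Hypothetical graph store: id -> node content.
graph_store = {
    "doc-1": {"content": "Refunds are issued within 14 days."},
    "doc-2": {"content": "Payments depends on the ledger database."},
}
# Hypothetical vector store, built from the same content (placeholder vectors).
vector_store = {
    doc_id: {"embedding": [0.0, 0.0], "source_hash": content_hash(node["content"])}
    for doc_id, node in graph_store.items()
}

drift_before = find_drift(vector_store, graph_store)

# A write path that updates the graph but forgets the vector store.
graph_store["doc-1"]["content"] = "Refunds are issued within 30 days."
graph_store["doc-3"] = {"content": "Acme Corp uses the payments service."}

drift_after = find_drift(vector_store, graph_store)
```

After the one-sided write, doc-1 is stale and doc-3 has no embedding, yet a similarity query against the vector store would still return results. That is why this class of bug usually surfaces through a scheduled check like this one and not through a failing request.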

One engine, one retrieval

When the embeddings and the graph share a storage engine, the seed search and the traversal are one statement against one store. There is no ID marshaling across services and no embedding-to-node drift, because the embedding is a property of the node.

-- GraphRAG in a single query: vector seeds, then graph expansion
SELECT n.title, n.content, n.entity_type
FROM GRAPH_TRAVERSE(
       seeds => (
         SELECT id FROM knowledge_nodes
         ORDER BY COSINE_SIMILARITY(embedding, EMBED(:question)) DESC
         LIMIT 6
       ),
       hops => 2,
       edge_types => ['DEPENDS_ON', 'AFFECTS', 'OWNED_BY']
     ) AS n;

One round-trip, one consistency boundary, one system to operate. If you want the model to produce the final answer inside the same engine, an in-database GENERATE() call can take the assembled subgraph and return the response without a separate model-serving hop — keeping the whole retrieve-and-reason loop in one place.
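
The "embedding is a property of the node" idea can also be sketched in plain Python. This is an in-memory illustration only, not how SynapCores is implemented: the UnifiedStore class, the Node fields, and toy_embed (a letter-count stand-in for a real embedding model) are hypothetical, and a threading.Lock plays the role of the transaction boundary.

```python
import threading
from dataclasses import dataclass, field


def toy_embed(text):
    """Hypothetical stand-in for an embedding model: vowel counts as a vector."""
    lowered = text.lower()
    return [float(lowered.count(ch)) for ch in "aeiou"]


@dataclass
class Node:
    content: str
    embedding: list
    neighbors: set = field(default_factory=set)


class UnifiedStore:
    """One store holds content, embedding, and edges for each node."""

    def __init__(self):
        self._nodes = {}
        self._lock = threading.Lock()  # stand-in for a transaction boundary

    def upsert(self, node_id, content):
        """Write content and its embedding together, keeping existing edges."""
        with self._lock:
            existing = self._nodes.get(node_id)
            neighbors = existing.neighbors if existing else set()
            self._nodes[node_id] = Node(content, toy_embed(content), neighbors)

    def link(self, a, b):
        """Add an undirected edge between two existing nodes."""
        with self._lock:
            self._nodes[a].neighbors.add(b)
            self._nodes[b].neighbors.add(a)

    def get(self, node_id):
        return self._nodes[node_id]


store = UnifiedStore()
store.upsert("doc-1", "Refunds are issued within 14 days.")
store.upsert("doc-2", "Payments depends on the ledger database.")
store.link("doc-1", "doc-2")

# A content change re-embeds in the same write; there is no second store to sync.
store.upsert("doc-1", "Refunds are issued within 30 days.")
node = store.get("doc-1")
```

Because the only way to change a node's content is through upsert, the embedding cannot lag behind the text, and the edge to doc-2 survives the update.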

What you give up, honestly

A unified engine does not match a dedicated graph database's depth on pure, enormous traversals, nor a dedicated vector database's last-mile recall tuning at extreme scale. If your GraphRAG corpus is in the billions of nodes and traversal latency at that scale is your single hardest constraint, a specialist graph engine paired with a specialist vector store may still win on that one axis — at the cost of the sync problem. For the large majority of GraphRAG systems, the unified engine's removal of dual-store drift and network hops is the better trade, and a far simpler thing to operate.

Comparison at a glance

Concern	Dual-store GraphRAG	Single-engine GraphRAG
Seed search + traversal	Two systems, app-side glue	One query
Data duplication	Embedding + node copies	Embedding is a node property
Consistency	Sync jobs, silent drift risk	One transaction boundary
Retrieval round-trips	Two network calls per query	One
Operational surface	Vector DB + graph DB	One binary

Build GraphRAG on it for free

The free Community Edition gives you native vector indexing and a graph engine in one binary, installable in about 30 seconds — enough to stand up a real GraphRAG pipeline on your own machine and see the single-query retrieval for yourself.

AI-Native Database for GraphRAG: One Engine for Retrieval That Reasons