Objective
A research agent has to do more than fetch a passage: it must read a corpus, connect the facts into a knowledge graph, answer questions that span several documents — with citations — and synthesize a briefing. Pure vector RAG can't follow relationships across documents; a graph alone can't read prose. Here you'll build a research agent that does both on one database, composing this cluster's RAG, citation, and knowledge-graph blocks. The same brain drives any framework or a voice agent — see Use it from your agent at the end.
Step 1: Ingest the document corpus (for RAG, with citations)
Each chunk carries a citation handle and an embedding so answers stay traceable.
CREATE TABLE IF NOT EXISTS recipe_research_docs (
    chunk_id INTEGER PRIMARY KEY,
    doc_ref TEXT,
    content TEXT,
    embedding VECTOR(384)
);
INSERT INTO recipe_research_docs (chunk_id, doc_ref, content) VALUES
(1,'PaperA §1','GraphRAG combines vector retrieval with graph traversal to answer multi-hop questions.'),
(2,'PaperA §3','In GraphRAG, embeddings select an entry node and edges supply the connected context.'),
(3,'PaperB §2','Citation-grounded answers reduce hallucination by tying each claim to a source.'),
(4,'PaperB §4','Self-checking passes verify a draft answer against its retrieved sources.'),
(5,'PaperC §1','Knowledge graphs encode entities and relations that flat text cannot express.'),
(6,'PaperC §5','Multi-hop reasoning over a graph answers questions no single passage contains.');
UPDATE recipe_research_docs SET embedding = EMBED(content);
Step 2: Retrieve the most relevant passages for a query
The research agent pulls the top sources by meaning for the question it's investigating.
SELECT doc_ref, content,
COSINE_SIMILARITY(embedding, EMBED('how do graphs help answer multi-hop questions?')) AS relevance
FROM recipe_research_docs
ORDER BY relevance DESC
LIMIT 3;
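To make the ranking in Step 2 concrete, here is a minimal Python sketch of what a cosine-similarity top-k query computes. It is an illustration only: the three-dimensional vectors are made-up stand-ins for real 384-dimensional embeddings, and cosine_similarity and top_k are hypothetical helper names, not part of the database.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length, non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def top_k(query_vec, rows, k=3):
    """Mirror ORDER BY relevance DESC LIMIT k over (doc_ref, vector) rows."""
    scored = [(doc_ref, cosine_similarity(vec, query_vec)) for doc_ref, vec in rows]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]

# Toy vectors, assumed for illustration; real embeddings come from EMBED().
rows = [
    ("PaperA §1", [0.9, 0.1, 0.0]),
    ("PaperB §2", [0.0, 1.0, 0.0]),
    ("PaperC §5", [0.8, 0.2, 0.1]),
    ("PaperB §4", [0.1, 0.9, 0.1]),
]
query = [1.0, 0.0, 0.0]

ranked = top_k(query, rows, k=2)
print([doc_ref for doc_ref, _ in ranked])
# → ['PaperA §1', 'PaperC §5']
```

The rows whose vectors point in nearly the same direction as the query score close to 1 and sort first, which is the behavior the SQL relies on.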
Step 3: Answer with inline citations
Generate an answer that cites the sources it used, so the research output is auditable. First freeze the top-3 retrieved passages with their citation handles, then concatenate them into one context string in which each passage is prefixed by its source name in parentheses, and finally generate the cited answer from that string.
CREATE TABLE IF NOT EXISTS recipe_research_top (
    chunk_id INTEGER PRIMARY KEY,
    doc_ref TEXT,
    content TEXT,
    relevance DOUBLE
);
INSERT INTO recipe_research_top (chunk_id, doc_ref, content, relevance)
SELECT chunk_id, doc_ref, content,
COSINE_SIMILARITY(embedding, EMBED('how do graphs help answer multi-hop questions?')) AS relevance
FROM recipe_research_docs
ORDER BY relevance DESC
LIMIT 3;
CREATE TABLE IF NOT EXISTS recipe_research_ctx (id INTEGER PRIMARY KEY, sources TEXT);
INSERT INTO recipe_research_ctx (id, sources)
SELECT 1, GROUP_CONCAT('(' || doc_ref || ') ' || content, ' ') FROM recipe_research_top;
SELECT GENERATE(
'Answer using ONLY these sources and cite the source name in parentheses inline, e.g. (PaperA §1). Sources: ' ||
sources ||
' Question: How do knowledge graphs help answer multi-hop questions? Answer with citations:') AS cited_answer
FROM recipe_research_ctx;
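The context-building step, plus a basic check on the citations that come back, can be sketched in Python. This is an illustration under assumptions: build_context imitates the GROUP_CONCAT expression above, unknown_citations is a hypothetical post-check that is not part of the recipe, the two rows merely stand in for whatever Step 2 actually retrieves, and the sample answer is hand-written, not model output.

```python
import re

# (doc_ref, content) pairs standing in for rows of recipe_research_top.
top_rows = [
    ("PaperC §5", "Multi-hop reasoning over a graph answers questions no single passage contains."),
    ("PaperA §1", "GraphRAG combines vector retrieval with graph traversal to answer multi-hop questions."),
]

def build_context(rows):
    """Prefix each passage with its source name in parentheses, joined by spaces."""
    return " ".join(f"({doc_ref}) {content}" for doc_ref, content in rows)

def unknown_citations(answer, rows):
    """Return parenthesized names in the answer that match no retrieved doc_ref."""
    known = {doc_ref for doc_ref, _ in rows}
    cited = re.findall(r"\(([^()]+)\)", answer)
    return [name for name in cited if name not in known]

context = build_context(top_rows)

# Hand-written sample answer (an assumption, not real GENERATE output).
sample_answer = (
    "Graphs link facts across passages (PaperC §5), which GraphRAG exploits "
    "(PaperA §1), as also argued elsewhere (PaperZ §9)."
)
print(unknown_citations(sample_answer, top_rows))
# → ['PaperZ §9']
```

The check is deliberately naive: it treats every parenthetical as a citation, so an ordinary aside in parentheses would also be flagged. It is meant only to show how citing by source name makes an answer mechanically auditable.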
Step 4: Build the concept knowledge graph (Cypher)
Connect the concepts the corpus discusses so the agent can reason across documents.
MERGE (rag:Concept {name: 'RAG'})
MERGE (graphrag:Concept {name: 'GraphRAG'})
MERGE (kg:Concept {name: 'KnowledgeGraph'})
MERGE (multihop:Concept {name: 'MultiHopReasoning'})
MERGE (cite:Concept {name: 'CitationGrounding'})
MERGE (graphrag)-[:EXTENDS]->(rag)
MERGE (graphrag)-[:USES]->(kg)
MERGE (kg)-[:ENABLES]->(multihop)
MERGE (cite)-[:IMPROVES]->(rag);
Step 5: Multi-hop query — what does GraphRAG ultimately enable?
Walk USES → ENABLES to connect GraphRAG to multi-hop reasoning through the knowledge-graph concept.
MATCH (g:Concept {name: 'GraphRAG'})-[:USES]->(:Concept)-[:ENABLES]->(cap:Concept)
RETURN g.name AS technique, cap.name AS enables;
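For readers new to Cypher, the two-hop pattern above can be mirrored in plain Python. This is a minimal sketch rather than how the database executes the query: the edge list is copied from the MERGE statements in Step 4, and follow and two_hop are hypothetical helper names.

```python
# Edges copied from Step 4, as (source, relation, target) triples.
EDGES = [
    ("GraphRAG", "EXTENDS", "RAG"),
    ("GraphRAG", "USES", "KnowledgeGraph"),
    ("KnowledgeGraph", "ENABLES", "MultiHopReasoning"),
    ("CitationGrounding", "IMPROVES", "RAG"),
]

def follow(start, relation, edges=EDGES):
    """Return every node reachable from start over one edge of the given type."""
    return [dst for src, rel, dst in edges if src == start and rel == relation]

def two_hop(start, first_rel, second_rel, edges=EDGES):
    """Chain two typed hops, like (start)-[:first]->()-[:second]->(cap)."""
    paths = []
    for middle in follow(start, first_rel, edges):
        for cap in follow(middle, second_rel, edges):
            paths.append((start, middle, cap))
    return paths

print(two_hop("GraphRAG", "USES", "ENABLES"))
# → [('GraphRAG', 'KnowledgeGraph', 'MultiHopReasoning')]
```

Neither edge on its own links GraphRAG to multi-hop reasoning; the answer appears only when the two hops are chained through the intermediate concept.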
Step 6: Find every concept that improves the base technique
This query surfaces a relationship that the prose never states in one place: which concepts improve or extend RAG.
MATCH (c:Concept)-[:IMPROVES|EXTENDS]->(base:Concept {name: 'RAG'})
RETURN c.name AS related_concept;
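The IMPROVES|EXTENDS pattern matches an incoming edge of either type. A minimal Python sketch of the same filter follows; the edge list is copied from Step 4, and incoming is a hypothetical helper name used only for illustration.

```python
# Edges copied from Step 4, as (source, relation, target) triples.
EDGES = [
    ("GraphRAG", "EXTENDS", "RAG"),
    ("GraphRAG", "USES", "KnowledgeGraph"),
    ("KnowledgeGraph", "ENABLES", "MultiHopReasoning"),
    ("CitationGrounding", "IMPROVES", "RAG"),
]

def incoming(target, relations, edges=EDGES):
    """Return (source, relation) pairs for edges of the given types pointing at target."""
    return [(src, rel) for src, rel, dst in edges if dst == target and rel in relations]

print(incoming("RAG", {"IMPROVES", "EXTENDS"}))
# → [('GraphRAG', 'EXTENDS'), ('CitationGrounding', 'IMPROVES')]
```

Unlike Step 5, this is a single hop, but it fans in from several sources. That is why it returns concepts that originate in different papers.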
Step 7: Synthesize a research briefing (GENERATE)
Combine the whole corpus into a short, structured briefing the agent hands back. Flatten the notes into one context row first, then generate from it.
CREATE TABLE IF NOT EXISTS recipe_research_corpus (id INTEGER PRIMARY KEY, notes TEXT);
INSERT INTO recipe_research_corpus (id, notes)
SELECT 1, GROUP_CONCAT(doc_ref || ': ' || content, ' ') FROM recipe_research_docs;
SELECT GENERATE(
'Write a 3-bullet research briefing on graph-augmented retrieval, based only on these notes: ' || notes) AS briefing
FROM recipe_research_corpus;
Cleanup (Optional)
DROP TABLE IF EXISTS recipe_research_docs;
DROP TABLE IF EXISTS recipe_research_top;
DROP TABLE IF EXISTS recipe_research_ctx;
DROP TABLE IF EXISTS recipe_research_corpus;
MATCH (n:Concept) DETACH DELETE n;
Expected Outcomes
- Step 2 retrieves the graph/multi-hop passages by meaning.
- Step 3 returns an answer with inline parenthesized source citations, such as (PaperA §1), pointing at the right papers.
- Step 5 connects GraphRAG → KnowledgeGraph → MultiHopReasoning across two hops.
- Step 6 finds both CitationGrounding (improves) and GraphRAG (extends) as related to RAG — a cross-document relationship.
- Step 7 produces a 3-bullet briefing synthesized from the whole corpus.
You've built a research agent that reads, connects, answers with citations, and synthesizes — RAG and a knowledge graph in one database.
Use it from your agent (framework-agnostic — this is the whole point)
The research brain is just a cited doc index + a concept graph, so any agent shell drives it with no framework lock-in:
- REST / SDK: POST /v1/query/execute (any language), or @synapcores/sdk client.executeQuery(...). Your agent ingests sources, retrieves + cites (Steps 2–3), walks the concept graph (Steps 5–6), and synthesizes the briefing (Step 7).
- MCP (native, on by default): point any MCP client (Claude Code, Cursor, a custom loop, a voice runtime) at ws://<your-instance>/mcp?token=<jwt> (JWT from one POST /v1/auth/login → access_token). The query tool retrieves/cites/synthesizes; the execute tool runs Cypher for the concept graph, so the research loop becomes a series of tool calls.
- Any framework: OpenClaw, LangChain / LlamaIndex research pipelines, a custom loop, or a voice research assistant that reads the briefing aloud all call the same brain. The database is the brain; the framework is swappable.
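As a rough illustration of the REST route, the Python sketch below builds, but does not send, a request to POST /v1/query/execute using only the standard library. Only the endpoint path comes from this recipe. The JSON body with a "query" field, the Bearer-token header, the instance address, and the build_query_request helper are all assumptions made for the example, so check the API reference for the actual request shape.

```python
import json
import urllib.request

def build_query_request(base_url, token, statement):
    """Build (without sending) a POST to /v1/query/execute.

    Assumptions: the body is JSON with a "query" field, and the JWT
    travels as a Bearer token. Neither is confirmed by this recipe.
    """
    body = json.dumps({"query": statement}).encode("utf-8")
    return urllib.request.Request(
        url=base_url.rstrip("/") + "/v1/query/execute",
        data=body,
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {token}",
        },
        method="POST",
    )

req = build_query_request(
    "http://localhost:8080",  # hypothetical instance address
    "<jwt>",                  # placeholder; obtain a real token via POST /v1/auth/login
    "SELECT doc_ref, content FROM recipe_research_docs LIMIT 3;",
)
print(req.get_method(), req.full_url)
# → POST http://localhost:8080/v1/query/execute
```

Passing the request to urllib.request.urlopen would send it. An agent loop in any framework can wrap the same call, which is the sense in which the backend is swappable.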
Key Concepts Learned
- A research agent composes cited RAG (read + answer + trace) with a knowledge graph (connect + reason).
- Vector retrieval answers "what does a passage say"; the graph answers "how do the facts relate."
- One GENERATE() call over the flattened corpus produces a briefing synthesized from every ingested passage.
- Because it's plain data ops (SQL + Cypher + GENERATE / REST / MCP), the research agent works from any framework, which is the agent-agnostic backend pattern this cluster builds on.