A few weeks ago I wrote about conversation memory with a rolling summary — two tables, store every turn, keep one running summary, feed the model summary + recent turns. That's the floor of decent chatbot memory. Production-ready for most cases, way better than re-sending the whole transcript.
But that post left out something important, and I want to fix that now. The single rolling summary is structurally the lower-recall option for chatbots whose conversations have durable structure — preferences, constraints, goals, the things a user states explicitly and expects the bot to remember.
This isn't just my opinion. LangChain's own memory docs make the same argument (paraphrased here):
A collection of narrow, individually-updatable memory documents yields higher recall than a single continuously-updated profile/summary — because it's easier for an LLM to generate new objects than reconcile new information with an existing profile.
If your bot is helping users plan a trip, manage a diet, configure a SaaS account, take a support ticket — anything where they state facts the bot is supposed to act on later — a single prose summary is the wrong shape. You'll watch contradictions stack up inside the paragraph until someone notices.
What goes wrong with a single summary
Take the trip-planning example from the original post. The user says "I'm vegetarian" in turn 5. Twenty turns later they say "actually I eat fish now, I'm pescatarian." Depending on when the rolling summary was last regenerated, it lands in one of three failure modes:
- Old belief sticks: the latest regeneration happened to compress away the new statement. The bot still thinks vegetarian.
- Both beliefs coexist: the summary now reads "the user is vegetarian and also eats fish." The model picks one when generating the next reply. You don't know which.
- Silent drift: every regeneration loses a little fidelity. Six turns later "pescatarian" has become "flexible about diet" because the model interpreted, generalized, and re-summarized.
You can't SELECT * FROM a paragraph to find out what the bot believes. You can't write a test against it. You can't show the user "here's what I have about you, fix anything wrong." The summary hides all of that inside one prose blob.
The next-step pattern: typed facts
Same shape as the original — two tables — but the second table changes:
CREATE TABLE IF NOT EXISTS recipe_chat_facts (
    fact_id     INTEGER PRIMARY KEY,
    session_id  TEXT,
    category    TEXT,       -- 'dietary', 'budget', 'destination', 'health', 'goal'
    fact_key    TEXT,       -- 'pescatarian', 'max_usd', 'cities', 'motion_sick', 'trip_length_days'
    fact_value  TEXT,       -- 'true', '3000', 'Tokyo,Kyoto', 'true', '10'
    confidence  REAL,       -- 0.0–1.0
    updated_at  TIMESTAMP DEFAULT CURRENT_TIMESTAMP,
    embedding   VECTOR(384)
);
One row per durable attribute. The (session_id, category, fact_key) triple is the natural key: the runtime upserts on it and never appends a second row for the same attribute. So when "I eat fish now" arrives, the existing vegetarian row is rewritten in place:
UPDATE recipe_chat_facts
SET fact_key   = 'pescatarian',
    fact_value = 'true',
    confidence = 0.95,
    updated_at = CURRENT_TIMESTAMP,
    embedding  = EMBED('dietary: pescatarian=true')
WHERE session_id = 's1'
  AND category   = 'dietary'
  AND fact_key   = 'vegetarian';  -- match the old row only, so any other dietary facts stay untouched
The old vegetarian belief is gone. Not summarized over. Not coexisting. Gone. The table has one dietary row.
That's the win. Surgical updates instead of compression cycles.
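If you want to poke at the update without a SynapCores instance, here is a minimal sketch using Python's built-in sqlite3 as a stand-in. The embedding column and the EMBED() call are left out because they are engine-specific, the seed rows are made up for illustration, and matching on the old fact_key in the WHERE clause is an assumption of this sketch so that unrelated dietary rows would be left alone.

```python
import sqlite3

# SQLite is only a stand-in for the real engine; the vector column and
# EMBED() are omitted because they are engine-specific.
conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE recipe_chat_facts (
        fact_id    INTEGER PRIMARY KEY,
        session_id TEXT,
        category   TEXT,
        fact_key   TEXT,
        fact_value TEXT,
        confidence REAL,
        updated_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP
    )
""")

INSERT_SQL = """
    INSERT INTO recipe_chat_facts
        (session_id, category, fact_key, fact_value, confidence)
    VALUES (?, ?, ?, ?, ?)
"""

# Illustrative seed data: turn 5 ("I'm vegetarian") plus an unrelated fact.
conn.execute(INSERT_SQL, ("s1", "dietary", "vegetarian", "true", 0.9))
conn.execute(INSERT_SQL, ("s1", "budget", "max_usd", "3000", 0.8))

# Twenty turns later: "actually I eat fish now". Rewrite the row in place.
conn.execute(
    """
    UPDATE recipe_chat_facts
    SET fact_key = ?, fact_value = ?, confidence = ?,
        updated_at = CURRENT_TIMESTAMP
    WHERE session_id = ? AND category = ? AND fact_key = ?
    """,
    ("pescatarian", "true", 0.95, "s1", "dietary", "vegetarian"),
)

dietary_rows = conn.execute(
    "SELECT fact_key, fact_value FROM recipe_chat_facts "
    "WHERE session_id = ? AND category = ?",
    ("s1", "dietary"),
).fetchall()
total_rows = conn.execute("SELECT COUNT(*) FROM recipe_chat_facts").fetchone()[0]

print(dietary_rows)  # [('pescatarian', 'true')]
```

The row count stays at two: the budget fact is untouched and the dietary belief was replaced rather than appended.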
Three reasons this is the next step, not the only step
I want to be careful not to oversell, so here is where this pattern is and isn't the right fit.
Reason it's the next step: higher recall on the things users explicitly state, observable beliefs (you can SELECT * FROM them and read out loud what the bot remembers), no contradiction accumulation, surgical updates.
Reason it's not the only step: it requires the conversation to have durable structure to extract. If your chatbot is mostly small talk, brainstorming, or open-ended exploration, there's nothing typed to land in a fact table — and the rolling-summary recipe works fine.
Reason to ship both: the recent-turn semantic recall from the original recipe (Step 5 — cosine top-k over the last N turns) is still the right anchor for "what specifically was the user asking about a minute ago." The fact collection handles durable belief. The two layers are complementary, not competing. Production deployments use both.
Most teams I've seen flip between the two as if they're a choice. They're not. They're layers.
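To make the layering concrete, here is a rough sketch of how the two layers could be stitched into one context block. It uses Python's sqlite3 as a stand-in engine. The recipe_chat_turns columns, the build_context helper, and the sample rows are all hypothetical, and plain recency stands in for the cosine top-k recall so the example stays self-contained.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE recipe_chat_facts (
        fact_id    INTEGER PRIMARY KEY,
        session_id TEXT, category TEXT, fact_key TEXT,
        fact_value TEXT, confidence REAL
    );
    -- Hypothetical turn table: these columns are assumed for the sketch.
    CREATE TABLE recipe_chat_turns (
        turn_id    INTEGER PRIMARY KEY,
        session_id TEXT, role TEXT, content TEXT
    );
""")
conn.executemany(
    "INSERT INTO recipe_chat_facts "
    "(session_id, category, fact_key, fact_value, confidence) "
    "VALUES (?, ?, ?, ?, ?)",
    [
        ("s1", "dietary", "pescatarian", "true", 0.95),
        ("s1", "budget", "max_usd", "3000", 0.8),
    ],
)
conn.executemany(
    "INSERT INTO recipe_chat_turns (session_id, role, content) VALUES (?, ?, ?)",
    [
        ("s1", "user", "I'm planning ten days in Japan."),
        ("s1", "assistant", "Great, which cities?"),
        ("s1", "user", "Tokyo and Kyoto. Any dinner ideas?"),
    ],
)


def build_context(conn, session_id, recent_turns=2):
    """Combine the durable-fact layer with the most recent turns."""
    facts = conn.execute(
        "SELECT category, fact_key, fact_value FROM recipe_chat_facts "
        "WHERE session_id = ? ORDER BY category, fact_key",
        (session_id,),
    ).fetchall()
    turns = conn.execute(
        "SELECT role, content FROM recipe_chat_turns "
        "WHERE session_id = ? ORDER BY turn_id DESC LIMIT ?",
        (session_id, recent_turns),
    ).fetchall()

    lines = ["Known facts about the user:"]
    lines += [f"- {category}: {key} = {value}" for category, key, value in facts]
    lines.append("Recent turns:")
    # The query returns newest first; flip it back to chronological order.
    lines += [f"{role}: {content}" for role, content in reversed(turns)]
    return "\n".join(lines)


context = build_context(conn, "s1")
print(context)
```

The facts half of that string is also what you could show the user as "here's what I have about you", and it is plain data you can assert against in a test.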
What the runtime actually does
In a production agent loop, the fact-extraction step isn't a SQL prompt the user runs — it's a tool call the runtime makes after every (or every few) user turns:
- User says something.
- Runtime appends the turn (recipe_chat_turns).
- Runtime asks the model: "given the last N turns, has any durable fact changed? Output CSV: category,fact_key,fact_value,confidence. If nothing changed, output empty."
- Runtime parses the CSV and runs UPSERTs against recipe_chat_facts keyed on (session_id, category, fact_key).
- Next reply uses facts + last 2 turns as context. No summarization step, no compression cycle.
That's a clean separation: the storage is structured, the extraction is the LLM call that turns prose into rows. The model is good at "did anything change here." It is much worse at "reconcile a 2,000-character paragraph with this new statement and produce a 2,000-character paragraph that gets every detail right." Use it for the thing it's good at.
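The parse-and-upsert step in that loop could look roughly like the sketch below. It is not the recipe's actual runtime: apply_extraction is a hypothetical helper, SQLite (3.24 or newer, for ON CONFLICT) stands in for the real engine, the embedding column is omitted, and the UNIQUE constraint is an assumption added so the upsert has a key to conflict on.

```python
import csv
import io
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE recipe_chat_facts (
        fact_id    INTEGER PRIMARY KEY,
        session_id TEXT,
        category   TEXT,
        fact_key   TEXT,
        fact_value TEXT,
        confidence REAL,
        updated_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP,
        -- Assumed constraint: the upsert needs a unique key to conflict on.
        UNIQUE (session_id, category, fact_key)
    )
""")

UPSERT_SQL = """
    INSERT INTO recipe_chat_facts
        (session_id, category, fact_key, fact_value, confidence)
    VALUES (?, ?, ?, ?, ?)
    ON CONFLICT (session_id, category, fact_key) DO UPDATE SET
        fact_value = excluded.fact_value,
        confidence = excluded.confidence,
        updated_at = CURRENT_TIMESTAMP
"""


def apply_extraction(conn, session_id, model_output):
    """Parse 'category,fact_key,fact_value,confidence' lines and upsert each."""
    applied = 0
    for row in csv.reader(io.StringIO(model_output.strip())):
        if len(row) != 4:
            continue  # blank or malformed line: skip rather than guess
        category, fact_key, fact_value, confidence = (cell.strip() for cell in row)
        try:
            score = float(confidence)
        except ValueError:
            continue  # e.g. a header line the model added on its own
        conn.execute(UPSERT_SQL, (session_id, category, fact_key, fact_value, score))
        applied += 1
    conn.commit()
    return applied


# Values that contain commas must arrive quoted, so the prompt should say so.
first = apply_extraction(
    conn, "s1", 'budget,max_usd,3000,0.8\ndestination,cities,"Tokyo,Kyoto",0.9\n'
)
second = apply_extraction(conn, "s1", "budget,max_usd,2500,0.9\n")
nothing = apply_extraction(conn, "s1", "")  # model reported no changes

rows = conn.execute(
    "SELECT category, fact_key, fact_value FROM recipe_chat_facts ORDER BY category"
).fetchall()
print(rows)  # [('budget', 'max_usd', '2500'), ('destination', 'cities', 'Tokyo,Kyoto')]
```

One thing this sketch does not handle: a changed key (vegetarian to pescatarian) arrives as a brand-new row under a triple-keyed upsert, so retiring the old row still needs the kind of UPDATE shown earlier.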
The whole recipe
The full step-by-step is up as a free recipe — same format as the rolling-summary version, importable into any running SynapCores instance, runs end-to-end in about 16 minutes. It includes the "I eat fish now" payoff moment where you can watch the UPDATE happen in one query and the next reply come out respecting the new belief, not the old one.
👉 Conversation Memory + Structured Facts for a Chatbot
And if you missed the first one, that's still the place to start — get the rolling summary working, then add the fact table on top.
👉 Conversation Memory + Rolling Summary for a Chatbot
Same engine, same SQL, same database-is-the-brain story. Just one more layer when your conversation has things worth typing.