Find similar patients by symptom embedding

Use SIMILAR_TO over symptom embeddings on patient nodes to surface clinical lookalikes

Objective

Clinicians constantly ask "what worked for patients like this one?" ICD-10 codes match too coarsely (every diabetic looks the same), while the free-text presentation carries the actual signal. With a symptom_embedding on each patient node, SIMILAR_TO returns a cohort of clinically similar patients, and Cypher then walks to their treatments and outcomes. The payoff: a single query returns patients with similar presentations and what they responded to.

Step 1: Set up patients, treatments, outcomes

MERGE (p1:Patient {mrn: "MRN-201", age: 67, sex: "F",
       presentation: "fatigue, polyuria, weight loss over 3 months, fasting glucose 240",
       symptom_embedding: [0.61, 0.18, -0.04, 0.42, 0.21]})
MERGE (p2:Patient {mrn: "MRN-202", age: 58, sex: "M",
       presentation: "polyuria, polydipsia, weight loss, A1c 11.4, BMI 32",
       symptom_embedding: [0.62, 0.17, -0.03, 0.43, 0.22]})
MERGE (p3:Patient {mrn: "MRN-203", age: 71, sex: "F",
       presentation: "intermittent claudication, pulses absent in left foot, smoker",
       symptom_embedding: [-0.31, 0.55, 0.10, -0.04, 0.16]})
MERGE (p4:Patient {mrn: "MRN-204", age: 49, sex: "M",
       presentation: "fatigue, polyuria, fasting glucose 220, family history of diabetes",
       symptom_embedding: [0.60, 0.19, -0.05, 0.41, 0.20]})
MERGE (p5:Patient {mrn: "MRN-205", age: 62, sex: "F",
       presentation: "headache, blurred vision, BP 198/110, no neuro deficit",
       symptom_embedding: [0.04, -0.41, 0.62, 0.10, 0.18]})
MERGE (p6:Patient {mrn: "MRN-206", age: 68, sex: "M",
       presentation: "fatigue, weight loss, polyuria, A1c 12.1, microalbuminuria",
       symptom_embedding: [0.59, 0.20, -0.06, 0.42, 0.19]})

MERGE (rx1:Treatment {name: "Metformin 1000mg BID + lifestyle"})
MERGE (rx2:Treatment {name: "Insulin glargine + metformin"})
MERGE (rx3:Treatment {name: "Cilostazol + smoking cessation referral"})
MERGE (rx4:Treatment {name: "Empagliflozin + ACE inhibitor"})
MERGE (rx5:Treatment {name: "Lisinopril + amlodipine"})

MERGE (out1:Outcome {label: "A1c < 7 at 6 months", positive: true})
MERGE (out2:Outcome {label: "Glycemic control inadequate", positive: false})
MERGE (out3:Outcome {label: "Symptom resolution", positive: true})
MERGE (out4:Outcome {label: "Hospital readmission", positive: false})
MERGE (out5:Outcome {label: "BP < 140/90 at 3 months", positive: true})

MERGE (p1)-[:RECEIVED]->(rx1)-[:RESULTED_IN]->(out1)
MERGE (p2)-[:RECEIVED]->(rx2)-[:RESULTED_IN]->(out1)
MERGE (p4)-[:RECEIVED]->(rx1)-[:RESULTED_IN]->(out2)
MERGE (p6)-[:RECEIVED]->(rx4)-[:RESULTED_IN]->(out1)
MERGE (p3)-[:RECEIVED]->(rx3)-[:RESULTED_IN]->(out3)
MERGE (p5)-[:RECEIVED]->(rx5)-[:RESULTED_IN]->(out5);

Step 2: Find similar patients to a new admission

// New admission p1 — surface clinically similar past patients and what worked.
MATCH (target:Patient {mrn: "MRN-201"})-[:SIMILAR_TO > 0.85]->(similar:Patient)
MATCH (similar)-[:RECEIVED]->(rx:Treatment)-[:RESULTED_IN]->(o:Outcome)
RETURN similar.mrn      AS mrn,
       similar.presentation AS presentation,
       rx.name          AS treatment,
       o.label          AS outcome,
       o.positive       AS was_positive
ORDER BY was_positive DESC;
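
To see which patients the SIMILAR_TO > 0.85 filter should pick up on the sample data, the comparison can be reproduced outside the database. The Python sketch below assumes SIMILAR_TO scores pairs by cosine similarity and excludes the target itself; the recipe does not state the metric, so treat both as assumptions. EMBEDDINGS, cosine, and similar_to are illustrative names, not part of SynapCores.

```python
import math

# Toy embeddings copied from Step 1, keyed by MRN.
EMBEDDINGS = {
    "MRN-201": [0.61, 0.18, -0.04, 0.42, 0.21],
    "MRN-202": [0.62, 0.17, -0.03, 0.43, 0.22],
    "MRN-203": [-0.31, 0.55, 0.10, -0.04, 0.16],
    "MRN-204": [0.60, 0.19, -0.05, 0.41, 0.20],
    "MRN-205": [0.04, -0.41, 0.62, 0.10, 0.18],
    "MRN-206": [0.59, 0.20, -0.06, 0.42, 0.19],
}


def cosine(a, b):
    """Cosine similarity of two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def similar_to(target, threshold):
    """Return {mrn: score} for every other patient scoring above threshold.

    Assumption: mirrors SIMILAR_TO > threshold with cosine scoring.
    """
    target_vec = EMBEDDINGS[target]
    scores = {}
    for mrn, vec in EMBEDDINGS.items():
        if mrn == target:
            continue  # assumption: a patient is not its own match
        score = cosine(target_vec, vec)
        if score > threshold:
            scores[mrn] = round(score, 3)
    return scores


cohort = similar_to("MRN-201", 0.85)
print(sorted(cohort))  # ['MRN-202', 'MRN-204', 'MRN-206']
```

Under that assumption, the three hyperglycemia presentations score above 0.99 against MRN-201, while the claudication and hypertension cases fall near or below zero, so the 0.85 threshold separates them cleanly.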

What's happening

  • The symptom_embedding is built from the presentation text: embedded once on admission and stored on the node. When the notes change, re-embedding is a single property update.
  • SIMILAR_TO > 0.85 returns cohort members whose presentations are semantically similar regardless of age or sex. The embedding captures the presentation itself, which ICD-10 buckets flatten.
  • Walking :RECEIVED and :RESULTED_IN chains the cohort to outcome data without an extra query. The clinician sees "patients like this one received metformin and reached A1c < 7" or "patients like this one were re-hospitalized", one row per patient, treatment, and outcome. Because outcomes hang off shared Treatment nodes in this sample graph, a row lists every outcome recorded for that treatment, not only the outcome of that particular patient.
  • Outcome polarity (positive: true/false) lets the UI flag treatments to favor or avoid.
  • The same primitive supports clinical-trial cohort discovery, post-market surveillance, and case conferences: each "patients like this" workflow starts from the same SIMILAR_TO match.
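
The walk from cohort to treatments to outcomes can be traced by hand on the Step 1 data. The Python sketch below hard-codes the :RECEIVED and :RESULTED_IN edges and assumes two things the recipe does not spell out: that the cohort for MRN-201 is MRN-202, MRN-204, and MRN-206 (the three patients whose sample embeddings sit closest to it), and that MERGE follows standard Cypher semantics, so rx1 is one shared node with two outgoing :RESULTED_IN edges. The names RECEIVED, RESULTED_IN, and cohort_rows are illustrative.

```python
# Edges created by the MERGE patterns in Step 1.
RECEIVED = {
    "MRN-201": "rx1", "MRN-202": "rx2", "MRN-203": "rx3",
    "MRN-204": "rx1", "MRN-205": "rx5", "MRN-206": "rx4",
}
RESULTED_IN = [
    ("rx1", "out1"), ("rx2", "out1"), ("rx1", "out2"),
    ("rx4", "out1"), ("rx3", "out3"), ("rx5", "out5"),
]
POSITIVE = {"out1": True, "out2": False, "out3": True, "out4": False, "out5": True}


def cohort_rows(cohort):
    """Emulate MATCH (similar)-[:RECEIVED]->(rx)-[:RESULTED_IN]->(o)."""
    rows = []
    for mrn in cohort:
        rx = RECEIVED[mrn]
        for treatment, outcome in RESULTED_IN:
            if treatment == rx:
                rows.append((mrn, rx, outcome, POSITIVE[outcome]))
    # ORDER BY was_positive DESC (Python's sort is stable).
    rows.sort(key=lambda row: row[3], reverse=True)
    return rows


rows = cohort_rows(["MRN-202", "MRN-204", "MRN-206"])
for row in rows:
    print(row)
```

The trace yields four rows, not three: MRN-204 appears once with out1 and once with out2, because both outcomes are attached to the shared rx1 node. If per-patient outcomes matter, one option is to hang the outcome off a per-patient treatment record instead of the shared Treatment node.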

Try this next

// Treatments that consistently produce positive outcomes for a cohort.
MATCH (target:Patient {mrn: "MRN-201"})-[:SIMILAR_TO > 0.85]->(s:Patient)
MATCH (s)-[:RECEIVED]->(rx:Treatment)-[:RESULTED_IN]->(o:Outcome {positive: true})
WITH rx, count(s) AS positive_uses
RETURN rx.name AS treatment, positive_uses
ORDER BY positive_uses DESC;

// Cohort discovery for a registry: every patient near a seed presentation.
MATCH (seed:Patient {mrn: "MRN-202"})-[:SIMILAR_TO > 0.8]->(member:Patient)
RETURN member.mrn, member.age, member.presentation;

// Outliers: patients with no clinically similar peers in the cohort.
MATCH (p:Patient)
OPTIONAL MATCH (p)-[:SIMILAR_TO > 0.8]->(near:Patient)
WITH p, count(near) AS peers
WHERE peers = 0
RETURN p.mrn, p.presentation AS unique_presentation;
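
The outlier query above can be checked against the sample data in the same way. This Python sketch again assumes cosine scoring and that a patient never matches itself; if SIMILAR_TO did match a node to itself, every patient would have at least one peer and the query would return nothing. EMBEDDINGS, cosine, and peer_count are illustrative names.

```python
import math

# Toy embeddings copied from Step 1, keyed by MRN.
EMBEDDINGS = {
    "MRN-201": [0.61, 0.18, -0.04, 0.42, 0.21],
    "MRN-202": [0.62, 0.17, -0.03, 0.43, 0.22],
    "MRN-203": [-0.31, 0.55, 0.10, -0.04, 0.16],
    "MRN-204": [0.60, 0.19, -0.05, 0.41, 0.20],
    "MRN-205": [0.04, -0.41, 0.62, 0.10, 0.18],
    "MRN-206": [0.59, 0.20, -0.06, 0.42, 0.19],
}


def cosine(a, b):
    """Cosine similarity of two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.hypot(*a) * math.hypot(*b))


def peer_count(mrn, threshold):
    """Count other patients scoring above threshold (the OPTIONAL MATCH step)."""
    return sum(
        1
        for other, vec in EMBEDDINGS.items()
        if other != mrn and cosine(EMBEDDINGS[mrn], vec) > threshold
    )


# WHERE peers = 0
outliers = sorted(mrn for mrn in EMBEDDINGS if peer_count(mrn, 0.8) == 0)
print(outliers)  # ['MRN-203', 'MRN-205']
```

On this data the claudication and hypertensive-urgency patients come back as outliers, while each of the four hyperglycemia presentations has three peers.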

Tags

graph, cypher, similar_to, healthcare, beginner

Run this on your own machine

Install SynapCores Community Edition free, paste the SQL or Cypher above into the bundled web UI, and watch it run.
