Drug repurposing via mechanism similarity

Objective

Repurposing an approved drug for a new indication typically takes 3-5 years, versus roughly 12 years for a novel molecule. The search problem is: find drugs whose mechanism resembles that of a known winner and that act on the same target. Modeling drugs, targets, and diseases as a graph, with a mechanism embedding (the embedding property) on each drug node, lets SIMILAR_TO find mechanism-similar candidates in one hop. The payoff: a single Cypher query returns drugs that are not currently indicated for X but whose mechanism resembles that of a drug that works for X.

Step 1: Build the small drug-target-disease graph

// Drugs (embedding is a toy 5-dimensional mechanism vector)
MERGE (d1:Drug {name: "Metformin", indication: "Type 2 Diabetes",
       mechanism: "Activates AMPK, reduces hepatic glucose output",
       embedding: [0.61, 0.18, -0.04, 0.42, 0.21]})
MERGE (d2:Drug {name: "Phenformin", indication: "Withdrawn",
       mechanism: "Activates AMPK pathway, increases insulin sensitivity",
       embedding: [0.62, 0.17, -0.03, 0.43, 0.22]})
MERGE (d3:Drug {name: "Berberine", indication: "Diarrhea",
       mechanism: "Plant alkaloid, activates AMPK, lowers blood glucose",
       embedding: [0.60, 0.19, -0.05, 0.41, 0.20]})
MERGE (d4:Drug {name: "Sitagliptin", indication: "Type 2 Diabetes",
       mechanism: "DPP-4 inhibitor, prolongs incretin action",
       embedding: [-0.21, 0.55, 0.10, 0.05, -0.30]})
MERGE (d5:Drug {name: "Linagliptin", indication: "Type 2 Diabetes",
       mechanism: "DPP-4 inhibitor, similar to sitagliptin",
       embedding: [-0.20, 0.56, 0.11, 0.04, -0.31]})
MERGE (d6:Drug {name: "Atorvastatin", indication: "Hyperlipidemia",
       mechanism: "HMG-CoA reductase inhibitor, lowers LDL",
       embedding: [-0.55, -0.08, 0.41, -0.12, 0.07]})
MERGE (d7:Drug {name: "Empagliflozin", indication: "Type 2 Diabetes",
       mechanism: "SGLT2 inhibitor, increases urinary glucose excretion",
       embedding: [0.04, -0.41, 0.62, 0.10, 0.18]})

// Targets
MERGE (t1:Target {name: "AMPK"})
MERGE (t2:Target {name: "DPP-4"})
MERGE (t3:Target {name: "HMG-CoA reductase"})
MERGE (t4:Target {name: "SGLT2"})

// Diseases
MERGE (dz1:Disease {name: "Type 2 Diabetes"})
MERGE (dz2:Disease {name: "Polycystic Ovary Syndrome"})
MERGE (dz3:Disease {name: "Cancer (multiple solid tumors)"})

// Which target each drug acts on
MERGE (d1)-[:ACTS_ON]->(t1)
MERGE (d2)-[:ACTS_ON]->(t1)
MERGE (d3)-[:ACTS_ON]->(t1)
MERGE (d4)-[:ACTS_ON]->(t2)
MERGE (d5)-[:ACTS_ON]->(t2)
MERGE (d6)-[:ACTS_ON]->(t3)
MERGE (d7)-[:ACTS_ON]->(t4)

// Which diseases each target is implicated in
MERGE (t1)-[:IMPLICATED_IN]->(dz1)
MERGE (t1)-[:IMPLICATED_IN]->(dz2)
MERGE (t1)-[:IMPLICATED_IN]->(dz3)
MERGE (t2)-[:IMPLICATED_IN]->(dz1)
MERGE (t4)-[:IMPLICATED_IN]->(dz1)

// Current indications (only Type 2 Diabetes is modeled as a Disease node)
MERGE (d1)-[:TREATS]->(dz1)
MERGE (d4)-[:TREATS]->(dz1)
MERGE (d5)-[:TREATS]->(dz1)
MERGE (d7)-[:TREATS]->(dz1);

Step 2: Find candidates for repurposing

// Strategy: take a drug that already TREATS a disease via a target.
// Find OTHER drugs whose mechanism is similar AND that act on the same target,
// but are NOT yet indicated for that disease — repurposing candidates.
MATCH (winner:Drug {name: "Metformin"})-[:ACTS_ON]->(t:Target)
MATCH (winner)-[:SIMILAR_TO > 0.9]->(candidate:Drug)
MATCH (candidate)-[:ACTS_ON]->(t)
WHERE NOT (candidate)-[:TREATS]->(:Disease {name: "Type 2 Diabetes"})
RETURN candidate.name      AS repurpose_candidate,
       candidate.indication AS current_indication,
       t.name              AS shared_target,
       candidate.mechanism AS mechanism;
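
To see what this query is expected to return, the three conditions can be mirrored in plain Python over the Step 1 data. This is a sketch, not the engine's implementation: it assumes SIMILAR_TO scores pairs by cosine similarity over the embedding property, which the text does not state, and the names DRUGS, cosine, and repurposing_candidates are made up for illustration.

```python
import math

# Toy data copied from the Step 1 graph: embedding, target, diseases treated.
DRUGS = {
    "Metformin":     ([0.61, 0.18, -0.04, 0.42, 0.21], "AMPK", {"Type 2 Diabetes"}),
    "Phenformin":    ([0.62, 0.17, -0.03, 0.43, 0.22], "AMPK", set()),
    "Berberine":     ([0.60, 0.19, -0.05, 0.41, 0.20], "AMPK", set()),
    "Sitagliptin":   ([-0.21, 0.55, 0.10, 0.05, -0.30], "DPP-4", {"Type 2 Diabetes"}),
    "Linagliptin":   ([-0.20, 0.56, 0.11, 0.04, -0.31], "DPP-4", {"Type 2 Diabetes"}),
    "Atorvastatin":  ([-0.55, -0.08, 0.41, -0.12, 0.07], "HMG-CoA reductase", set()),
    "Empagliflozin": ([0.04, -0.41, 0.62, 0.10, 0.18], "SGLT2", {"Type 2 Diabetes"}),
}


def cosine(u, v):
    """Cosine similarity of two equal-length vectors (assumed metric)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def repurposing_candidates(winner, disease, threshold=0.9):
    """Drugs similar to `winner`, sharing its target, not yet treating `disease`."""
    winner_vec, winner_target, _ = DRUGS[winner]
    hits = []
    for name, (vec, target, treats) in DRUGS.items():
        if name == winner:
            continue
        score = cosine(winner_vec, vec)
        if score > threshold and target == winner_target and disease not in treats:
            hits.append((name, target, score))
    # Highest similarity first.
    return sorted(hits, key=lambda hit: hit[2], reverse=True)


for name, target, score in repurposing_candidates("Metformin", "Type 2 Diabetes"):
    print(f"{name}\t{target}\t{score:.4f}")
```

Under the cosine assumption, only Phenformin and Berberine pass all three conditions on the toy data; every other drug scores far below the 0.9 threshold against Metformin.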

Step 3: Reach all diseases the shared target is implicated in

// Broaden: which diseases could this candidate plausibly help with, given its target?
MATCH (winner:Drug {name: "Metformin"})-[:SIMILAR_TO > 0.9]->(candidate:Drug)
MATCH (candidate)-[:ACTS_ON]->(t:Target)-[:IMPLICATED_IN]->(disease:Disease)
WHERE NOT (candidate)-[:TREATS]->(disease)
RETURN candidate.name AS candidate,
       t.name        AS target,
       disease.name  AS plausible_indication;
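
The broadening step is a two-hop walk from each candidate through its target to every implicated disease. A minimal Python sketch of that walk follows. It assumes the similarity match returns Phenformin and Berberine (the cluster this walkthrough describes), and the dictionary and function names are illustrative rather than part of any API.

```python
# Edges copied from the Step 1 graph.
ACTS_ON = {
    "Metformin": "AMPK", "Phenformin": "AMPK", "Berberine": "AMPK",
    "Sitagliptin": "DPP-4", "Linagliptin": "DPP-4",
    "Atorvastatin": "HMG-CoA reductase", "Empagliflozin": "SGLT2",
}
IMPLICATED_IN = {
    "AMPK": ["Type 2 Diabetes", "Polycystic Ovary Syndrome",
             "Cancer (multiple solid tumors)"],
    "DPP-4": ["Type 2 Diabetes"],
    "SGLT2": ["Type 2 Diabetes"],
}
TREATS = {
    "Metformin": {"Type 2 Diabetes"}, "Sitagliptin": {"Type 2 Diabetes"},
    "Linagliptin": {"Type 2 Diabetes"}, "Empagliflozin": {"Type 2 Diabetes"},
}

# Assumption: the drugs the SIMILAR_TO > 0.9 match returns for Metformin.
SIMILAR_TO_METFORMIN = ["Phenformin", "Berberine"]


def plausible_indications(candidates):
    """Candidate -> target -> disease rows, skipping existing indications."""
    rows = []
    for drug in candidates:
        target = ACTS_ON[drug]
        for disease in IMPLICATED_IN.get(target, []):
            if disease not in TREATS.get(drug, set()):
                rows.append((drug, target, disease))
    return rows


for row in plausible_indications(SIMILAR_TO_METFORMIN):
    print(row)
```

Note that Type 2 Diabetes itself appears among the plausible indications, because neither candidate has a TREATS edge to it in the toy graph.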

What's happening

SIMILAR_TO > 0.9 over mechanism embeddings clusters Metformin, Phenformin, and Berberine: two biguanides and a structurally unrelated plant alkaloid that all activate AMPK. Vector similarity picks up the semantic equivalence that keyword search over the mechanism text misses ("plant alkaloid" vs "biguanide").
The structural pattern (:Drug)-[:ACTS_ON]->(:Target)-[:IMPLICATED_IN]->(:Disease) ensures candidates have a plausible biological pathway, not just a similar description.
NOT (candidate)-[:TREATS]->(disease) is the key filter: only return drugs that aren't already indicated. This is the actual repurposing question.
Real pipelines use the same primitives: target overlap (graph) + chemoinformatic similarity (vector) + literature evidence (extracted with /v2/graph/extract). All three live in one engine here.
Pharmaceutical companies invest heavily in this kind of query over Hetionet/PrimeKG-style knowledge graphs; at this scale you can prototype it on a laptop.
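
The clustering described at the start of this section can be checked outside the engine by scoring every pair of toy embeddings. The sketch below assumes cosine similarity as the metric (the text does not say which metric SIMILAR_TO uses); EMBEDDINGS, cosine, and pairs_above are illustrative names.

```python
import math
from itertools import combinations

# Mechanism embeddings copied from the Step 1 graph.
EMBEDDINGS = {
    "Metformin":     [0.61, 0.18, -0.04, 0.42, 0.21],
    "Phenformin":    [0.62, 0.17, -0.03, 0.43, 0.22],
    "Berberine":     [0.60, 0.19, -0.05, 0.41, 0.20],
    "Sitagliptin":   [-0.21, 0.55, 0.10, 0.05, -0.30],
    "Linagliptin":   [-0.20, 0.56, 0.11, 0.04, -0.31],
    "Atorvastatin":  [-0.55, -0.08, 0.41, -0.12, 0.07],
    "Empagliflozin": [0.04, -0.41, 0.62, 0.10, 0.18],
}


def cosine(u, v):
    """Cosine similarity of two equal-length vectors (assumed metric)."""
    dot = sum(a * b for a, b in zip(u, v))
    return dot / (math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v)))


def pairs_above(threshold):
    """All unordered drug pairs whose similarity exceeds `threshold`."""
    pairs = []
    for a, b in combinations(EMBEDDINGS, 2):
        score = cosine(EMBEDDINGS[a], EMBEDDINGS[b])
        if score > threshold:
            pairs.append((a, b, score))
    return pairs


for a, b, score in pairs_above(0.9):
    print(f"{a} ~ {b}: {score:.4f}")
```

With these vectors, two groups clear the 0.9 threshold: the three AMPK activators and the two DPP-4 inhibitors. Atorvastatin and Empagliflozin pair with nothing.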

Try this next

// Drugs targeting the same disease but via different targets (combination-therapy candidates).
MATCH (d1:Drug)-[:TREATS]->(dz:Disease {name: "Type 2 Diabetes"})<-[:TREATS]-(d2:Drug)
MATCH (d1)-[:ACTS_ON]->(t1:Target),
      (d2)-[:ACTS_ON]->(t2:Target)
WHERE t1 <> t2 AND id(d1) < id(d2)
RETURN d1.name AS drug_a, t1.name AS target_a,
       d2.name AS drug_b, t2.name AS target_b;

// Reverse search: given a target, what mechanism-similar drugs exist beyond the obvious?
MATCH (t:Target {name: "AMPK"})<-[:ACTS_ON]-(d:Drug)
MATCH (d)-[:SIMILAR_TO > 0.85]->(near:Drug)
WHERE NOT (near)-[:ACTS_ON]->(t)
RETURN d.name AS known, near.name AS off_target_lookalike, near.indication;

// Coverage: how many drugs reach each disease through an implicated target?
MATCH (d:Drug)-[:ACTS_ON]->(t:Target)-[:IMPLICATED_IN]->(dz:Disease)
RETURN dz.name AS disease, count(DISTINCT d) AS candidate_drugs
ORDER BY candidate_drugs DESC;
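
The combination-therapy and coverage queries above use only graph structure, so their expected results on the toy data can be worked out with plain dictionaries. This is a sketch with illustrative names (combination_pairs, drugs_per_disease); it mirrors the query logic and makes no claim about how the engine evaluates it.

```python
from collections import defaultdict
from itertools import combinations

# Edges copied from the Step 1 graph.
ACTS_ON = {
    "Metformin": "AMPK", "Phenformin": "AMPK", "Berberine": "AMPK",
    "Sitagliptin": "DPP-4", "Linagliptin": "DPP-4",
    "Atorvastatin": "HMG-CoA reductase", "Empagliflozin": "SGLT2",
}
IMPLICATED_IN = {
    "AMPK": ["Type 2 Diabetes", "Polycystic Ovary Syndrome",
             "Cancer (multiple solid tumors)"],
    "DPP-4": ["Type 2 Diabetes"],
    "SGLT2": ["Type 2 Diabetes"],
}
TREATS = {
    "Metformin": {"Type 2 Diabetes"}, "Sitagliptin": {"Type 2 Diabetes"},
    "Linagliptin": {"Type 2 Diabetes"}, "Empagliflozin": {"Type 2 Diabetes"},
}


def combination_pairs(disease):
    """Pairs of drugs that treat `disease` through different targets."""
    treating = sorted(d for d, treated in TREATS.items() if disease in treated)
    return [
        (a, ACTS_ON[a], b, ACTS_ON[b])
        for a, b in combinations(treating, 2)  # each unordered pair once
        if ACTS_ON[a] != ACTS_ON[b]
    ]


def drugs_per_disease():
    """Count distinct drugs that reach each disease via an implicated target."""
    reach = defaultdict(set)
    for drug, target in ACTS_ON.items():
        for disease in IMPLICATED_IN.get(target, []):
            reach[disease].add(drug)
    counts = [(disease, len(drugs)) for disease, drugs in reach.items()]
    return sorted(counts, key=lambda row: row[1], reverse=True)


for row in combination_pairs("Type 2 Diabetes"):
    print(row)
for disease, count in drugs_per_disease():
    print(disease, count)
```

Sitagliptin and Linagliptin never pair with each other because they share DPP-4, and Atorvastatin is absent from the coverage counts because its target has no IMPLICATED_IN edge in the toy graph.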
