Semantic product recommendations with SIMILAR_TO

Objective

Co-purchase recommendations work well for "people who bought X also bought Y" but fail on cold start: a brand-new SKU has no purchase history yet. Embedding-based similarity addresses cold start by recommending products whose descriptions are semantically close, even if no one has ever bought them together. The key idea is a single Cypher pattern, (:Product {name: "..."})-[:SIMILAR_TO > 0.85]->(other), that returns similar products directly, with no separate vector-database call.

Step 1: Set up a small catalog with embeddings

// Embedding values are 5-dim demo vectors; in production these come from EMBED('description').
MERGE (h1:Product {sku: "USB-301", name: "USB-C 4-port hub",
                   description: "Compact aluminum USB-C hub with 4 data ports for laptops",
                   price: 39.99,  embedding: [0.81,  0.04, 0.12, -0.05, 0.21]})
MERGE (h2:Product {sku: "USB-307", name: "USB-C charging hub",
                   description: "USB-C hub with 3 data ports plus 65W passthrough charging",
                   price: 54.99,  embedding: [0.79,  0.06, 0.10, -0.04, 0.23]})
MERGE (h3:Product {sku: "USB-410", name: "Thunderbolt 4 dock",
                   description: "Thunderbolt 4 docking station with dual 4K display output",
                   price: 249.0,  embedding: [0.77,  0.02, 0.15, -0.07, 0.25]})

MERGE (k1:Product {sku: "KBD-100", name: "Mechanical keyboard 75%",
                   description: "Hot-swap mechanical keyboard, 75% layout, RGB backlight",
                   price: 129.0,  embedding: [-0.34, 0.58, 0.21, 0.04,  0.12]})
MERGE (k2:Product {sku: "KBD-105", name: "Wireless mechanical keyboard",
                   description: "Bluetooth mechanical keyboard with hot-swap switches",
                   price: 159.0,  embedding: [-0.32, 0.60, 0.19, 0.06,  0.11]})

MERGE (m1:Product {sku: "MIC-220", name: "USB condenser microphone",
                   description: "Cardioid USB condenser mic for podcasting and streaming",
                   price: 99.0,   embedding: [0.07, -0.41, 0.66, 0.10,  -0.18]})
MERGE (m2:Product {sku: "MIC-225", name: "XLR studio microphone",
                   description: "Large-diaphragm XLR studio condenser microphone",
                   price: 219.0,  embedding: [0.09, -0.43, 0.64, 0.08,  -0.16]})

MERGE (b1:Product {sku: "BAG-501", name: "Tech messenger bag",
                   description: "Padded laptop messenger bag with cable organizer",
                   price: 79.0,   embedding: [-0.10, 0.18, -0.55, 0.42,  0.30]});

Step 2: Recommend semantically similar products

// "A shopper viewed the Mechanical keyboard 75%. What else would they like?"
MATCH (seed:Product {sku: "KBD-100"})-[:SIMILAR_TO > 0.85]->(rec:Product)
RETURN rec.sku   AS recommend_sku,
       rec.name  AS recommend_name,
       rec.price AS price;
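
To make the result of this query easier to predict, the following Python sketch recomputes similarity scores for the Step 1 demo vectors outside the database. It assumes the engine scores SIMILAR_TO with cosine similarity, which this tutorial does not state; the names CATALOG, cosine, and similar_to are illustrative and not part of any engine API.

```python
import math

# Demo vectors copied from Step 1 (sku -> embedding).
CATALOG = {
    "USB-301": [0.81, 0.04, 0.12, -0.05, 0.21],
    "USB-307": [0.79, 0.06, 0.10, -0.04, 0.23],
    "USB-410": [0.77, 0.02, 0.15, -0.07, 0.25],
    "KBD-100": [-0.34, 0.58, 0.21, 0.04, 0.12],
    "KBD-105": [-0.32, 0.60, 0.19, 0.06, 0.11],
    "MIC-220": [0.07, -0.41, 0.66, 0.10, -0.18],
    "MIC-225": [0.09, -0.43, 0.64, 0.08, -0.16],
    "BAG-501": [-0.10, 0.18, -0.55, 0.42, 0.30],
}


def cosine(a, b):
    """Cosine similarity of two equal-length vectors (assumed metric)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def similar_to(seed_sku, threshold):
    """Return (sku, score) pairs scoring above threshold, best first.

    The seed itself is skipped, mirroring the behaviour described for
    the SIMILAR_TO pattern.
    """
    seed = CATALOG[seed_sku]
    scored = [
        (sku, cosine(seed, vec))
        for sku, vec in CATALOG.items()
        if sku != seed_sku
    ]
    hits = [(sku, score) for sku, score in scored if score > threshold]
    return sorted(hits, key=lambda pair: pair[1], reverse=True)


if __name__ == "__main__":
    for sku, score in similar_to("KBD-100", 0.85):
        print(sku, round(score, 3))
```

Under the cosine assumption, KBD-105 is the only product above 0.85 for the KBD-100 seed (score about 0.998), which matches the single recommendation the Cypher query is expected to return.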

What's happening

[:SIMILAR_TO > 0.85] is a synthetic Cypher edge backed by the HNSW vector index over the embedding property. Engines without this feature need an out-of-band vector lookup followed by a second query that joins the results back to the graph: two systems, two round trips.
The threshold operator (>, >=, <, <=, =) lets you tune precision/recall directly in the query. > 0.85 is a strict "near neighbor"; > 0.7 casts a wider net.
The seed node never appears in its own neighbor set: the engine applies its cycle-handling rule (SkipRevisits), so the keyboard does not recommend itself.
Cold-start friendly: the new SKU KBD-105 has zero purchase history but its embedding is close to KBD-100, so it is recommended on day one.
This is the same primitive that powers GraphRAG, semantic dedup, and look-alike audiences. Master one query, reuse the pattern across personalisation, search, and matching.
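
The threshold operators and the seed-exclusion rule described above can be sketched in a few lines of Python. This is a minimal illustration, not the engine's implementation: it assumes cosine similarity as the score, uses a subset of the Step 1 vectors, and the names OPS, VECTORS, and neighbors are hypothetical.

```python
import math
import operator

# The five comparison operators listed above, mapped to Python functions.
OPS = {
    ">": operator.gt,
    ">=": operator.ge,
    "<": operator.lt,
    "<=": operator.le,
    "=": operator.eq,
}

# Subset of the Step 1 demo vectors (sku -> embedding).
VECTORS = {
    "USB-301": [0.81, 0.04, 0.12, -0.05, 0.21],
    "USB-307": [0.79, 0.06, 0.10, -0.04, 0.23],
    "USB-410": [0.77, 0.02, 0.15, -0.07, 0.25],
    "KBD-100": [-0.34, 0.58, 0.21, 0.04, 0.12],
    "BAG-501": [-0.10, 0.18, -0.55, 0.42, 0.30],
}


def cosine(a, b):
    """Cosine similarity (assumed metric; the tutorial does not name one)."""
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b)))


def neighbors(seed_sku, op, threshold):
    """SKUs whose score against the seed satisfies `score <op> threshold`.

    The seed is always skipped, so it can never recommend itself.
    """
    compare = OPS[op]
    seed = VECTORS[seed_sku]
    return sorted(
        sku
        for sku, vec in VECTORS.items()
        if sku != seed_sku and compare(cosine(seed, vec), threshold)
    )


if __name__ == "__main__":
    print(neighbors("USB-301", ">", 0.85))
    print(neighbors("USB-301", ">", 0.7))
    print(neighbors("USB-301", "<", 0.2))
```

With these demo vectors the > 0.85 and > 0.7 calls return the same two hubs, because the product clusters are widely separated under cosine similarity. The difference between a strict and a loose threshold only becomes visible on a catalog that contains intermediate scores.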

Try this next

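// Mix semantic and property filters: semantically close AND priced below 1.5x the seed.
// Note: in the demo catalog MIC-225 (219.0) exceeds the cap of 99.0 * 1.5 = 148.5, so it is filtered out.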
MATCH (seed:Product {sku: "MIC-220"})-[:SIMILAR_TO > 0.8]->(rec:Product)
WHERE rec.price < seed.price * 1.5
RETURN rec.sku, rec.name, rec.price;

// A looser threshold admits more candidates. The query has no ORDER BY, so it does not itself guarantee result order.
MATCH (seed:Product {sku: "USB-301"})-[:SIMILAR_TO > 0.7]->(rec:Product)
RETURN rec.sku, rec.name;

// "Find products NOT similar to anything else in catalog" — outliers worth featuring.
MATCH (p:Product)
OPTIONAL MATCH (p)-[:SIMILAR_TO > 0.8]->(near:Product)
WITH p, count(near) AS nearby
WHERE nearby = 0
RETURN p.sku, p.name AS unique_item;
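
As a cross-check for the outlier query, the sketch below applies the same "no neighbor above 0.8" rule to the Step 1 vectors in plain Python. It again assumes cosine similarity as the score, and the function name outliers is illustrative only.

```python
import math

# Demo vectors copied from Step 1 (sku -> embedding).
CATALOG = {
    "USB-301": [0.81, 0.04, 0.12, -0.05, 0.21],
    "USB-307": [0.79, 0.06, 0.10, -0.04, 0.23],
    "USB-410": [0.77, 0.02, 0.15, -0.07, 0.25],
    "KBD-100": [-0.34, 0.58, 0.21, 0.04, 0.12],
    "KBD-105": [-0.32, 0.60, 0.19, 0.06, 0.11],
    "MIC-220": [0.07, -0.41, 0.66, 0.10, -0.18],
    "MIC-225": [0.09, -0.43, 0.64, 0.08, -0.16],
    "BAG-501": [-0.10, 0.18, -0.55, 0.42, 0.30],
}


def cosine(a, b):
    """Cosine similarity (assumed metric; the tutorial does not name one)."""
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b)))


def outliers(threshold):
    """SKUs with no other product scoring above threshold.

    Mirrors OPTIONAL MATCH + count(near) = 0: a product is an outlier
    when its neighbor count, excluding itself, is zero.
    """
    result = []
    for sku, vec in CATALOG.items():
        nearby = sum(
            1
            for other, other_vec in CATALOG.items()
            if other != sku and cosine(vec, other_vec) > threshold
        )
        if nearby == 0:
            result.append(sku)
    return result


if __name__ == "__main__":
    print(outliers(0.8))
```

Under the cosine assumption, BAG-501 is the only product in the demo catalog without a neighbor above 0.8, since every hub, keyboard, and microphone has a near-duplicate in its own group.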
