Movie recommendation by shared genre
Objective
Power a "because you watched X" recommender entirely from the database. The pattern — user → movie → genre → other movie — is awkward in SQL because it requires an aggregating join across three tables. In Cypher, it's the natural shape of the data and runs in milliseconds.
Step 1: Create the graph
// Genres
MERGE (sci:Genre {name: "Sci-Fi"})
MERGE (thr:Genre {name: "Thriller"})
MERGE (drm:Genre {name: "Drama"})
MERGE (act:Genre {name: "Action"})
// Movies
MERGE (m1:Movie {title: "Arrival", year: 2016})
MERGE (m2:Movie {title: "Inception", year: 2010})
MERGE (m3:Movie {title: "Interstellar", year: 2014})
MERGE (m4:Movie {title: "Blade Runner 2049", year: 2017})
MERGE (m5:Movie {title: "The Social Network", year: 2010})
MERGE (m6:Movie {title: "Mad Max: Fury Road", year: 2015})
MERGE (m7:Movie {title: "Dune", year: 2021})
MERGE (m1)-[:HAS_GENRE]->(sci)
MERGE (m1)-[:HAS_GENRE]->(drm)
MERGE (m2)-[:HAS_GENRE]->(sci)
MERGE (m2)-[:HAS_GENRE]->(thr)
MERGE (m3)-[:HAS_GENRE]->(sci)
MERGE (m3)-[:HAS_GENRE]->(drm)
MERGE (m4)-[:HAS_GENRE]->(sci)
MERGE (m4)-[:HAS_GENRE]->(thr)
MERGE (m5)-[:HAS_GENRE]->(drm)
MERGE (m6)-[:HAS_GENRE]->(act)
MERGE (m6)-[:HAS_GENRE]->(sci)
MERGE (m7)-[:HAS_GENRE]->(sci)
MERGE (m7)-[:HAS_GENRE]->(act)
// Users + viewing history
MERGE (sarah:User {name: "Sarah Chen"})
MERGE (sarah)-[:WATCHED {rating: 5}]->(m1)
MERGE (sarah)-[:WATCHED {rating: 4}]->(m3)
MERGE (raj:User {name: "Raj Patel"})
MERGE (raj)-[:WATCHED {rating: 5}]->(m2)
MERGE (raj)-[:WATCHED {rating: 5}]->(m6);
Step 2: Recommend movies for Sarah
// For each genre Sarah's watched movies belong to, find other movies in those
// genres she has NOT watched, ranked by genre overlap.
MATCH (sarah:User {name: "Sarah Chen"})-[:WATCHED]->(seen:Movie)-[:HAS_GENRE]->(g:Genre)
MATCH (g)<-[:HAS_GENRE]-(rec:Movie)
WHERE NOT (sarah)-[:WATCHED]->(rec)
RETURN rec.title AS recommendation,
rec.year AS year,
count(DISTINCT g) AS genre_overlap
ORDER BY genre_overlap DESC, recommendation;
What's happening
- Two consecutive pattern matches express "movies in genres of movies Sarah liked" as a literal graph walk — no GROUP BY gymnastics.
- The anti-pattern
NOT (sarah)-[:WATCHED]->(rec)removes already-seen movies cleanly. - Ranking by
count(DISTINCT g)rewards movies that share multiple genres with her history, approximating taste fit without any ML. - In SQL this needs a 3-way join (
watched JOIN movie_genres JOIN movie_genres) plus a NOT EXISTS; the Cypher version reads like the business question and is index-friendly per relationship type.
Try this next
MATCH (u:User)-[w:WATCHED]->(m:Movie)
RETURN u.name AS user, avg(w.rating) AS avg_rating;
MATCH (a:User)-[:WATCHED]->(m:Movie)<-[:WATCHED]-(b:User)
WHERE a.name < b.name
RETURN a.name, b.name, count(m) AS shared_movies
ORDER BY shared_movies DESC;
MATCH (g:Genre)<-[:HAS_GENRE]-(m:Movie)
RETURN g.name AS genre, count(m) AS catalog_size
ORDER BY catalog_size DESC;