Citation network, most influential paper
Objective
Ranking papers by raw citation count is shallow — being cited by influential papers should count more than being cited by obscure ones. We approximate PageRank-style influence with a two-hop weighted score on a citation graph, surfacing the seminal works in a research field.
Step 1: Create the graph
// Papers in the deep-learning lineage
MERGE (alex:Paper {title: "AlexNet (2012)", year: 2012, field: "Deep Learning"})
MERGE (vgg:Paper {title: "VGG Networks (2014)", year: 2014, field: "Deep Learning"})
MERGE (resnet:Paper {title: "ResNet (2015)", year: 2015, field: "Deep Learning"})
MERGE (incept:Paper {title: "GoogLeNet (2014)", year: 2014, field: "Deep Learning"})
MERGE (batch:Paper {title: "Batch Normalization (2015)", year: 2015, field: "Deep Learning"})
MERGE (drop:Paper {title: "Dropout (2014)", year: 2014, field: "Deep Learning"})
MERGE (att:Paper {title: "Attention is All You Need (2017)", year: 2017, field: "NLP"})
MERGE (bert:Paper {title: "BERT (2018)", year: 2018, field: "NLP"})
MERGE (gpt:Paper {title: "GPT-3 (2020)", year: 2020, field: "NLP"})
MERGE (dall:Paper {title: "DALL-E (2021)", year: 2021, field: "Multimodal"})
MERGE (clip:Paper {title: "CLIP (2021)", year: 2021, field: "Multimodal"})
// Citation edges (citer)-[:CITES]->(cited)
MERGE (vgg)-[:CITES]->(alex)
MERGE (resnet)-[:CITES]->(alex)
MERGE (resnet)-[:CITES]->(vgg)
MERGE (resnet)-[:CITES]->(batch)
MERGE (incept)-[:CITES]->(alex)
MERGE (batch)-[:CITES]->(alex)
MERGE (drop)-[:CITES]->(alex)
MERGE (att)-[:CITES]->(resnet)
MERGE (att)-[:CITES]->(drop)
MERGE (bert)-[:CITES]->(att)
MERGE (bert)-[:CITES]->(drop)
MERGE (gpt)-[:CITES]->(att)
MERGE (gpt)-[:CITES]->(bert)
MERGE (clip)-[:CITES]->(att)
MERGE (clip)-[:CITES]->(resnet)
MERGE (clip)-[:CITES]->(bert)
MERGE (dall)-[:CITES]->(att)
MERGE (dall)-[:CITES]->(clip);
Step 2: Rank papers by influence
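// Influence = direct citations + 0.5 * second-order citations.
// second_order counts two-hop citation paths, so a paper that cites two of
// p's citers is counted twice. Papers with no citations are not returned.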
MATCH (p:Paper)<-[:CITES]-(citer:Paper)
WITH p, count(citer) AS direct_citations,
collect(citer) AS citers
OPTIONAL MATCH (mid:Paper)<-[:CITES]-(grand:Paper)
WHERE mid IN citers
WITH p, direct_citations,
count(grand) AS second_order
RETURN p.title AS paper,
p.year AS year,
direct_citations,
second_order,
direct_citations + 0.5 * second_order AS influence_score
ORDER BY influence_score DESC
LIMIT 5;
What's happening
- The first MATCH counts direct citations and collects the citing papers.
- The OPTIONAL MATCH walks one more hop backward to count "papers that cite my citers" — a cheap, transparent approximation of PageRank's iterative idea.
- A weighted sum (direct_citations + 0.5 * second_order) blends both signals; tweak the weight or extend to three hops to taste.
- This shows graphs as live analytics, not just storage: rerun the query whenever new citations land and the rankings update, with no batch job needed.
- In SQL you would unroll citations into self-joins; in Cypher each hop is one relationship walk.
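The three-hop extension mentioned in the list above could look roughly like the sketch below. It is one possible variant, not part of the original query: the third-hop weight of 0.25 and the names w2, w3, hop2, hop3, and third_order are assumptions chosen for illustration. Like the two-hop query, it counts citation paths rather than distinct papers.

```cypher
// Hypothetical weights: w2 for the second hop, w3 for the third hop.
WITH 0.5 AS w2, 0.25 AS w3
MATCH (p:Paper)<-[:CITES]-(citer:Paper)
WITH p, w2, w3, count(citer) AS direct_citations
// Exactly two CITES hops back from p; a null from OPTIONAL MATCH counts as 0.
OPTIONAL MATCH (p)<-[:CITES*2]-(hop2:Paper)
WITH p, w2, w3, direct_citations, count(hop2) AS second_order
// Exactly three CITES hops back from p.
OPTIONAL MATCH (p)<-[:CITES*3]-(hop3:Paper)
WITH p, w2, w3, direct_citations, second_order, count(hop3) AS third_order
RETURN p.title AS paper,
       direct_citations,
       second_order,
       third_order,
       direct_citations + w2 * second_order + w3 * third_order AS influence_score
ORDER BY influence_score DESC
LIMIT 5;
```

Setting w3 to 0 reproduces the two-hop score, which makes it easy to compare how much the extra hop changes the ranking.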
Try this next
// 1. Citation counts for NLP papers, including those with no citations yet
MATCH (p:Paper {field: "NLP"})
OPTIONAL MATCH (p)<-[:CITES]-(citer)
RETURN p.title, count(citer) AS citations
ORDER BY citations DESC;
// 2. Every paper that reaches AlexNet within three citation hops
MATCH (a:Paper)-[:CITES*1..3]->(seminal:Paper {title: "AlexNet (2012)"})
RETURN DISTINCT a.title AS descendants, a.year;
// 3. The three longest citation chains from DALL-E back to a paper that cites nothing
MATCH path = (recent:Paper {title: "DALL-E (2021)"})-[:CITES*]->(root:Paper)
WHERE NOT (root)-[:CITES]->()
RETURN [n IN nodes(path) | n.title] AS lineage, length(path) AS depth
ORDER BY depth DESC LIMIT 3;