DAG, find critical build dependencies

Topologically order a software dependency graph and find the longest critical path

All recipes· graph· 7 minutesintermediatecypher

DAG, find critical build dependencies

Objective

Every CI system needs to know "if I touch package A, what gets rebuilt and in what order, and where is the longest path that determines build wall-clock time?" The build graph is a DAG; the longest path through it is the critical path. Cypher computes both directly with no pre-baked topology table.

Step 1: Create the graph

// Build packages — durations in seconds
MERGE (core:Package    {name: "core",    build_secs: 30})
MERGE (utils:Package   {name: "utils",   build_secs: 12})
MERGE (auth:Package    {name: "auth",    build_secs: 45})
MERGE (db:Package      {name: "db",      build_secs: 60})
MERGE (api:Package     {name: "api",     build_secs: 90})
MERGE (cli:Package     {name: "cli",     build_secs: 25})
MERGE (worker:Package  {name: "worker",  build_secs: 50})
MERGE (web:Package     {name: "web",     build_secs: 120})
MERGE (mobile:Package  {name: "mobile",  build_secs: 110})
MERGE (gateway:Package {name: "gateway", build_secs: 80})

// Edge X -> Y means "X depends on Y" (Y must build first)
MERGE (utils)-[:DEPENDS_ON]->(core)
MERGE (auth)-[:DEPENDS_ON]->(core)
MERGE (auth)-[:DEPENDS_ON]->(utils)
MERGE (db)-[:DEPENDS_ON]->(core)
MERGE (api)-[:DEPENDS_ON]->(auth)
MERGE (api)-[:DEPENDS_ON]->(db)
MERGE (cli)-[:DEPENDS_ON]->(api)
MERGE (worker)-[:DEPENDS_ON]->(db)
MERGE (worker)-[:DEPENDS_ON]->(auth)
MERGE (web)-[:DEPENDS_ON]->(api)
MERGE (mobile)-[:DEPENDS_ON]->(api)
MERGE (gateway)-[:DEPENDS_ON]->(api)
MERGE (gateway)-[:DEPENDS_ON]->(worker);

Step 2: Find the critical path

// A 'leaf' is a package nothing depends on (top of the build graph).
// A 'root' is a package with no dependencies (bottom).
// The critical path is the leaf->root path with the largest total build_secs.
MATCH (leaf:Package), (root:Package)
WHERE NOT ()-[:DEPENDS_ON]->(leaf)
  AND NOT (root)-[:DEPENDS_ON]->()
MATCH path = (leaf)-[:DEPENDS_ON*1..10]->(root)
WITH path,
     reduce(s = 0, n IN nodes(path) | s + n.build_secs) AS total_secs
RETURN [n IN nodes(path) | n.name] AS critical_path,
       length(path) + 1            AS step_count,
       total_secs
ORDER BY total_secs DESC
LIMIT 1;

What's happening

  • We anchor at "leaves" (no incoming DEPENDS_ON) and "roots" (no outgoing) — the start and end of any build chain — using anti-patterns in WHERE.
  • reduce() walks the path's nodes summing each package's build duration to score it.
  • ORDER BY total_secs DESC LIMIT 1 returns the single critical path; bumping the LIMIT shows the next-worst paths, which is what you'd parallelize first.
  • In SQL this requires a recursive CTE plus a window function over path lengths — easily 30 lines.
  • The same DAG model works for Airflow tasks, Bazel targets, npm install order, or service startup graphs — anywhere you need topo-sort + longest path.

Try this next

MATCH (p:Package)<-[:DEPENDS_ON*]-(downstream:Package)
RETURN p.name AS package, count(DISTINCT downstream) AS impact_radius
ORDER BY impact_radius DESC;
MATCH (p:Package)-[:DEPENDS_ON*]->(dep:Package)
WHERE p.name = "web"
RETURN DISTINCT dep.name, dep.build_secs
ORDER BY dep.build_secs DESC;
MATCH (p:Package)
WHERE NOT (p)-[:DEPENDS_ON]->()
RETURN p.name AS root_packages;

Tags

graphcypherdevopsintermediate

Run this on your own machine

Install SynapCores Community Edition free, paste the SQL or Cypher above into the bundled web UI, and watch it run.

Download Free CE