Contract clause graph with LLM-judged risk

Objective

Legal teams spend most of their review time answering the same questions on every deal: who promised what, by when, and with what penalty. Extract that structure into a graph once, and every downstream review becomes a Cypher query. We use /v2/graph/extract to lift the obligations out of one paragraph of an MSA, then LLM_SCORE to grade each obligation's risk in place. The wow moment: a Cypher query returns "every uncapped indemnity in deals signed this quarter" without any LLM round trips in your application code.

Step 1: Extract clauses from a contract paragraph

curl -X POST https://localhost:8443/v2/graph/extract \
  -H "Authorization: Bearer $AIDB_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{
    "text": "Under the Master Services Agreement dated March 14, 2026, Helia Energy Inc. agrees to deliver 12 megawatts of renewable diesel capacity to Northwind Logistics by August 1, 2026. Helia indemnifies Northwind for any third-party claim arising from emissions non-compliance, with no cap on liability. Northwind shall pay $4.2M upon delivery and an additional $0.8M upon Phase 2 commissioning by December 15, 2026. Either party may terminate with 90 days notice; early termination by Northwind triggers a $1.6M kill fee.",
    "default_node_label": "ContractEntity",
    "node_provenance": {"contract": "MSA-2026-003", "deal": "Helia-Northwind"},
    "edge_provenance": {"contract": "MSA-2026-003"},
    "min_confidence": 0.55
  }'

The extractor typically creates nodes such as Helia Energy Inc., Northwind Logistics, MSA, and Phase 2, and edges such as DELIVERS, INDEMNIFIES, PAYS, TERMINATES_WITH_NOTICE, and KILL_FEE_OF.
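The same request can also be built from application code. The sketch below is a minimal Python translation of the curl call using only the standard library. The helper name build_extract_request is hypothetical, and because the response schema is not shown in this recipe, the sketch stops at constructing the request and does not send or parse anything.

```python
import json
import os
import urllib.request

# Endpoint copied from the curl call in Step 1.
EXTRACT_URL = "https://localhost:8443/v2/graph/extract"


def build_extract_request(text: str, contract: str, deal: str,
                          token: str) -> urllib.request.Request:
    """Hypothetical helper: mirror the curl payload for one contract paragraph."""
    payload = {
        "text": text,
        "default_node_label": "ContractEntity",
        "node_provenance": {"contract": contract, "deal": deal},
        "edge_provenance": {"contract": contract},
        "min_confidence": 0.55,
    }
    return urllib.request.Request(
        EXTRACT_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


request = build_extract_request(
    text=("Either party may terminate with 90 days notice; early termination "
          "by Northwind triggers a $1.6M kill fee."),
    contract="MSA-2026-003",
    deal="Helia-Northwind",
    token=os.environ.get("AIDB_TOKEN", ""),
)
# urllib.request.urlopen(request) would send it; that step is left out here.
```

Parameterizing the contract and deal identifiers keeps the provenance stamps consistent when the same helper is reused across many paragraphs of one agreement.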

Step 2: Pre-seed the obligation nodes for the demo

The extractor's exact predicate names depend on the LLM run. So that the later steps do not depend on what a particular extraction run returned, here are the same obligations as deterministic Cypher that you can MERGE directly:

MERGE (helia:Party    {name: "Helia Energy Inc.",   contract: "MSA-2026-003"})
MERGE (north:Party    {name: "Northwind Logistics", contract: "MSA-2026-003"})

MERGE (deliver:Obligation {id: "OBL-1",
       text: "Deliver 12 MW renewable diesel capacity by Aug 1 2026",
       due:  "2026-08-01",  category: "delivery",  capped: true})
// OBL-2 and OBL-5 have no due date, so the property is omitted: a missing
// property reads as null, and some Cypher engines reject null inside MERGE.
MERGE (indem:Obligation   {id: "OBL-2",
       text: "Indemnify Northwind for emissions non-compliance, no cap on liability",
       category: "indemnity", capped: false})
MERGE (pay1:Obligation    {id: "OBL-3",
       text: "Pay $4.2M to Helia on delivery",
       due:  "2026-08-01",  category: "payment",   capped: true})
MERGE (pay2:Obligation    {id: "OBL-4",
       text: "Pay $0.8M on Phase 2 commissioning",
       due:  "2026-12-15",  category: "payment",   capped: true})
MERGE (kill:Obligation    {id: "OBL-5",
       text: "Pay $1.6M kill fee if Northwind terminates early",
       category: "termination", capped: true})

MERGE (helia)-[:OWES {contract: "MSA-2026-003"}]->(deliver)
MERGE (helia)-[:OWES {contract: "MSA-2026-003"}]->(indem)
MERGE (north)-[:OWES {contract: "MSA-2026-003"}]->(pay1)
MERGE (north)-[:OWES {contract: "MSA-2026-003"}]->(pay2)
MERGE (north)-[:OWES {contract: "MSA-2026-003"}]->(kill);

Step 3: LLM-grade each obligation for legal risk

// "Score each obligation 0..1 for legal risk and surface the top concerns."
MATCH (party:Party)-[:OWES]->(o:Obligation)
WITH party, o,
     llm_score(
       "On a 0-to-1 scale, rate this contract obligation's risk to the obligor. " +
       "Uncapped liability, vague due dates, and asymmetric kill fees should score high.",
       o
     ) AS risk
WHERE risk > 0.6
RETURN party.name AS owed_by,
       o.id       AS obligation,
       o.text     AS clause,
       o.category AS category,
       risk
ORDER BY risk DESC;

What's happening

/v2/graph/extract turns prose into nodes and edges in one call. The node_provenance and edge_provenance properties (here contract and deal) are stamped onto the extracted nodes and edges, so you can filter by document later.
LLM_SCORE(prompt, n) runs the LLM as a scalar function inside the query. Each obligation gets graded in the same engine — no separate microservice, no hand-rolled batching.
The parser tolerates untidy model output: "Score: 0.85", "85%", "0.85 because ...", and "**0.85**" all parse to 0.85 (see crates/aidb-query/tests/graph_llm_score_test.rs). Out-of-range values are clamped rather than rejected.
A Cypher pattern like (party:Party)-[:OWES]->(o:Obligation) is simpler to reason about than a JOIN-heavy SQL query over parties × obligations × deals × clause_types.
Adding a clause type (e.g. non_compete, right_of_first_refusal) requires no schema change — the property bag accepts new fields, and the LLM_SCORE prompt can change to grade them.
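To make the tolerant parsing of LLM_SCORE output concrete, here is a minimal Python sketch of one way such a parser could behave. It is not the engine's implementation (that lives in the Rust crate referenced above). The function name parse_score, the "first number wins" rule, the handling of a trailing percent sign, and the clamp to the 0..1 range are all assumptions chosen to reproduce the four examples listed.

```python
import re
from typing import Optional

# First decimal number in the text, optionally followed by a percent sign.
# Assumption: the first number in the model's reply is the score.
_NUMBER = re.compile(r"(-?(?:\d+(?:\.\d*)?|\.\d+))\s*(%)?")


def parse_score(raw: str) -> Optional[float]:
    """Hypothetical helper: pull a 0..1 score out of free-form model output."""
    match = _NUMBER.search(raw)
    if match is None:
        return None  # no number at all; the caller decides what to do
    value = float(match.group(1))
    if match.group(2):  # "85%" is read as 0.85
        value /= 100.0
    # Clamp instead of rejecting, mirroring the behavior described above.
    return min(1.0, max(0.0, value))


for reply in ["Score: 0.85", "85%", "0.85 because ...", "**0.85**", "1.7"]:
    print(repr(reply), "->", parse_score(reply))
```

Clamping trades strictness for availability: a reply of "1.7" still yields a usable score of 1.0, so a single odd model response does not fail the whole query.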

Try this next

MATCH (p:Party)-[:OWES]->(o:Obligation)
WHERE o.capped = false
RETURN p.name, o.id, o.text;

MATCH (p:Party)-[:OWES]->(o:Obligation)
WHERE o.due IS NOT NULL AND o.due < "2026-09-01"
RETURN p.name AS owed_by, o.text AS due_soon, o.due
ORDER BY o.due;
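
The due-date filter above compares strings, not date values. That works because the seed data stores dates as ISO 8601 strings (YYYY-MM-DD), which sort lexicographically in calendar order. The Python sketch below checks that claim against the Step 2 seed data; the dictionary is just an illustrative stand-in for the graph, not part of the recipe.

```python
from datetime import date

# Due dates copied from the Step 2 seed data (None means no due date).
due_by_obligation = {
    "OBL-1": "2026-08-01",
    "OBL-2": None,
    "OBL-3": "2026-08-01",
    "OBL-4": "2026-12-15",
    "OBL-5": None,
}
cutoff = "2026-09-01"

# Plain string comparison, as in the Cypher WHERE clause.
due_soon_by_string = sorted(
    oid for oid, due in due_by_obligation.items()
    if due is not None and due < cutoff
)

# Same filter using real date objects, for comparison.
due_soon_by_date = sorted(
    oid for oid, due in due_by_obligation.items()
    if due is not None and date.fromisoformat(due) < date.fromisoformat(cutoff)
)

print(due_soon_by_string)

# A non-ISO format breaks the trick: "8" sorts after "1" as a character.
non_iso_is_earlier = "8/1/2026" < "12/15/2026"
```

If dates are ever written in another format, such as "8/1/2026", string comparison stops matching calendar order, so the ISO format in the seed data is what keeps this query correct.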

// Aggregate first so llm_score runs once per party rather than once per matched obligation.
MATCH (p:Party)-[:OWES]->(o:Obligation)
WITH p, count(o) AS total
WITH p, total,
     llm_score("rate aggregate contract risk for this party 0..1", p) AS portfolio_risk
RETURN p.name, total, portfolio_risk
ORDER BY portfolio_risk DESC;
