Support Ticket Semantic Anomaly Detection
Tested against SynapCores CE v1.7.0.1-ce (the currently-shipped release on Docker Hub:
synapcores/community:v1.7.0.1-ce).
Objective
Most support tickets cluster around a handful of high-volume topics (login issues, billing, password reset). The useful ones are the outliers — the procurement RFP, the P0 security disclosure, the high-value churn-risk customer. Embed the "common cluster" centroid and rank every ticket by distance from it.
Why this matters: tier-1 ticket triage misroutes high-value tickets because their templates only recognize the common cluster. A single SQL query gives you the inverse — the tickets that don't fit the common shapes.
Step 1 — Schema + recent tickets
7 routine tickets (4 login + 3 billing) + 2 outliers (enterprise SOC2, P0 security).
DROP TABLE IF EXISTS support_tix;
CREATE TABLE support_tix (
id INTEGER PRIMARY KEY,
body TEXT
);
INSERT INTO support_tix VALUES
(1,'Cannot log in keeps showing invalid password tried reset link no email arrived'),
(2,'Login broken says invalid credentials password reset email never received'),
(3,'Password reset link not arriving in my inbox have checked spam folder'),
(4,'Account access issue invalid password error even after using reset flow'),
(5,'My subscription renewed twice this month please refund the duplicate charge'),
(6,'Double charged for monthly plan need refund for the extra charge on credit card'),
(7,'Billing showed two charges for same month requesting refund of duplicate'),
(8,'Hi I would like to know if the platform supports row level security and SOC2 compliance for our enterprise procurement review'),
(9,'CRITICAL SECURITY VULNERABILITY production data leak our customer PII is exposed through your public API please respond immediately this is a P0');
Step 2 — Define the "common cluster" centroid
-- A short summary of the high-volume topics — your centroid prompt
SELECT EMBED('cannot log in password reset email not arriving billing duplicate charge refund');
Step 3 — Rank every ticket by distance from the centroid
SELECT id, body,
COSINE_SIMILARITY(
EMBED(body),
EMBED('cannot log in password reset email not arriving billing duplicate charge refund')
) AS sim_to_common
FROM support_tix
ORDER BY sim_to_common ASC -- lowest similarity first = most anomalous
LIMIT 5;
-- → id=8 (SOC2 procurement) sim = 0.008 ← top outlier
-- → id=9 (P0 security) sim = 0.142 ← second outlier
-- → routine tickets: sim = 0.52–0.62
Step 4 — Auto-route the outliers
-- Route anything below similarity 0.30 to senior triage
SELECT id, body,
COSINE_SIMILARITY(
EMBED(body),
EMBED('cannot log in password reset email not arriving billing duplicate charge refund')
) AS sim
FROM support_tix
WHERE COSINE_SIMILARITY(
EMBED(body),
EMBED('cannot log in password reset email not arriving billing duplicate charge refund')
) < 0.30
ORDER BY sim ASC;
-- → both anomalies surface; nothing else does.
Productionizing
Refresh the centroid weekly from your top-3 topic clusters (group
tickets, embed each cluster's representative, average the vectors).
Pair with a keyword tripwire (UPPER(body) LIKE '%CRITICAL%' OR LIKE '%SOC2%') so a low embedding similarity AND a tripwire keyword
becomes an instant P0 page.