Insurance Claim Fraud Auto-Flag (AutoML)

Train a binary classifier on past claims to auto-flag suspect submissions, turning the legacy red-flag rules engine into a learning model.

Tested against SynapCores CE v1.7.0.1-ce (the currently-shipped release on Docker Hub: synapcores/community:v1.7.0.1-ce).

Objective

Auto-flag suspect claims at intake. The model learns how four signals combine: claim amount, days between policy start and claim, prior claim count, and document completeness (measured here as a supporting-document count). These are signals a Special Investigations Unit (SIU) commonly checks.

Why this matters: the Coalition Against Insurance Fraud puts the annual cost of insurance fraud in the US at roughly $308B. Carriers run hand-coded red-flag rules, and new fraud rings learn to route around them within weeks. A trained classifier can be refreshed by re-running one CREATE EXPERIMENT call and redeploying the model.

Step 1 — Schema + labelled claims

160 claims: 144 paid-as-normal + 16 SIU-confirmed fraud.

DROP TABLE IF EXISTS ins_claim;
CREATE TABLE ins_claim (
    id                  INTEGER PRIMARY KEY,
    claim_amt           DOUBLE,
    days_after_policy   INTEGER,
    prior_claims        INTEGER,
    doc_count           INTEGER,
    is_suspect          INTEGER
);

INSERT INTO ins_claim VALUES
(3,718.55,1394,1,9,0),
(11,1603.67,745,1,4,0),
(25,329.66,830,1,4,0),
(5,1573.64,1291,0,7,0),
(153,26544.45,22,5,2,1),
(126,2478.28,905,2,5,0),
(100,1503.96,550,2,8,0),
(105,3935.43,1662,0,4,0),
(48,1844.45,1596,2,8,0),
(51,878.51,1382,1,3,0),
(107,2663.33,1292,0,7,0),
(143,3349.76,812,1,3,0),
(90,3645.96,1093,1,3,0),
(7,2681.72,103,2,5,0),
(70,1580.48,1629,0,3,0),
(117,4481.81,747,0,6,0),
(116,1321.97,78,0,6,0),
(2,2836.27,975,0,8,0),
(137,4497.74,284,0,6,0),
(148,52938.38,17,7,0,1),
(80,1482.88,108,1,3,0),
(95,1097.34,1468,0,3,0),
(19,2674.79,1209,2,7,0),
(58,3222.47,1305,2,3,0),
(43,3435.89,1249,1,7,0),
(158,21981.9,10,9,1,1),
(60,2983.31,631,2,5,0),
(113,667.48,1712,2,7,0),
(74,1503.5,700,1,7,0),
(33,4462.52,1583,1,6,0),
(136,672.15,1211,0,5,0),
(109,3670.88,312,2,8,0),
(138,3712.27,706,1,9,0),
(27,2107.41,188,2,4,0),
(88,2112.43,586,0,9,0),
(26,1611.64,791,2,4,0),
(112,2801.21,690,2,9,0),
(97,3546.62,504,2,6,0),
(57,1671.85,1734,2,3,0),
(52,328.91,384,2,5,0),
(39,3209.76,423,0,3,0),
(79,1225.45,177,0,5,0),
(98,3602.55,607,2,4,0),
(92,4132.39,1253,1,6,0),
(81,643.63,971,1,4,0),
(20,2063.93,1438,2,6,0),
(69,2598.05,1285,0,7,0),
(120,2248.5,419,2,3,0),
(111,4351.07,342,0,7,0),
(104,3635.99,930,1,9,0),
(140,283.41,292,0,5,0),
(55,2917.56,987,2,4,0),
(115,639.29,1094,2,8,0),
(160,30063.86,5,6,1,1),
(152,31019.81,12,7,0,1),
(128,1194.25,1704,1,5,0),
(63,3195.65,1733,0,7,0),
(130,1375.76,1545,1,6,0),
(127,4295.19,1630,2,3,0),
(149,19432.71,5,9,1,1),
(73,486.42,425,1,7,0),
(145,42054.6,5,4,1,1),
(86,2534.3,250,0,9,0),
(56,2105.13,940,0,5,0),
(110,1179.35,911,1,4,0),
(72,2919.03,193,0,7,0),
(151,49188.61,8,4,2,1),
(21,578.87,968,1,3,0),
(99,288.28,390,2,9,0),
(91,3101.45,482,1,4,0),
(23,4294.27,1108,0,6,0),
(71,4278.82,1465,0,9,0),
(53,381.7,1056,2,3,0),
(37,1706.07,191,1,3,0),
(134,2024.77,798,2,3,0),
(139,981.78,545,0,3,0),
(124,972.68,1365,0,7,0),
(10,4427.08,1219,0,8,0),
(135,3876.04,856,0,5,0),
(84,1209.58,525,2,5,0),
(44,1093.17,764,0,8,0),
(85,4085.32,1010,1,7,0),
(16,2377.0,1119,1,6,0),
(101,1053.06,1683,1,8,0),
(142,1404.73,1304,2,5,0),
(122,2110.67,983,1,8,0),
(96,695.97,1787,1,9,0),
(93,2911.87,159,1,8,0),
(8,1535.93,449,0,9,0),
(38,936.93,1477,2,6,0),
(102,1911.71,1200,2,4,0),
(159,49657.86,8,5,2,1),
(123,3572.22,719,0,9,0),
(32,3242.83,337,2,5,0),
(28,3622.4,1385,2,4,0),
(154,45441.4,14,4,1,1),
(144,2599.56,514,0,4,0),
(78,4296.09,1450,1,3,0),
(64,1019.14,890,1,9,0),
(141,2013.87,1383,1,8,0),
(47,1082.29,1603,1,8,0),
(157,38231.43,23,6,2,1),
(118,4245.07,1100,2,4,0),
(66,3019.81,459,2,4,0),
(54,1391.42,1633,0,4,0),
(46,1222.11,1782,2,3,0),
(14,3175.12,944,0,5,0),
(50,683.99,193,2,7,0),
(49,2003.34,579,0,4,0),
(67,568.44,837,1,4,0),
(103,3536.24,1427,1,7,0),
(40,4358.68,781,2,3,0),
(131,2398.73,121,2,5,0),
(4,4025.07,534,1,3,0),
(121,3188.8,1429,1,8,0),
(75,2876.38,1575,2,5,0),
(9,679.74,1147,2,9,0),
(65,3964.58,218,1,8,0),
(147,64260.96,30,7,2,1),
(89,856.91,1205,1,5,0),
(87,1227.72,695,2,4,0),
(42,3924.64,429,0,5,0),
(77,4076.15,1700,2,8,0),
(36,4253.73,97,0,7,0),
(17,3042.3,438,1,7,0),
(146,60999.18,12,9,0,1),
(30,1966.21,1755,0,7,0),
(94,2173.43,1433,1,5,0),
(129,381.3,763,2,7,0),
(61,1009.89,628,2,4,0),
(114,3270.14,537,1,4,0),
(15,2486.81,126,1,5,0),
(22,2056.57,884,0,9,0),
(41,4470.47,1154,1,6,0),
(82,887.21,642,1,6,0),
(12,4313.88,699,2,7,0),
(68,433.58,1134,0,4,0),
(83,1722.42,1725,1,5,0),
(45,4105.13,1134,0,3,0),
(62,1695.99,1686,2,6,0),
(132,2513.56,383,1,9,0),
(35,4305.42,286,2,5,0),
(133,4154.32,985,2,6,0),
(106,2966.7,613,2,5,0),
(119,3236.08,595,1,3,0),
(155,25759.45,5,4,1,1),
(6,3146.62,940,2,3,0),
(1,4320.3,347,0,3,0),
(24,229.06,999,0,4,0),
(108,2581.83,786,1,9,0),
(31,2180.48,1793,0,6,0),
(29,3018.98,266,0,4,0),
(76,1585.54,1444,2,3,0),
(156,73299.69,30,9,2,1),
(125,729.53,203,0,9,0),
(150,20532.13,1,7,0,1),
(13,811.56,1750,1,8,0),
(59,954.58,1703,2,7,0),
(18,1807.25,1253,0,5,0),
(34,3327.19,1377,2,5,0)
;

SELECT COUNT(*) AS total, SUM(is_suspect) AS confirmed_fraud FROM ins_claim;
-- → total = 160, confirmed_fraud = 16
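
Before training, it can help to look at how the four features differ between paid and confirmed-fraud claims. The query below is a minimal sketch in standard SQL. It assumes SynapCores accepts GROUP BY with the usual aggregate functions, which this recipe does not itself demonstrate.

```sql
-- Per-class profile of the four model features
-- (is_suspect = 0: paid as normal, is_suspect = 1: SIU-confirmed fraud)
SELECT is_suspect,
       COUNT(*)               AS claims,
       MIN(claim_amt)         AS min_amt,
       MAX(claim_amt)         AS max_amt,
       AVG(days_after_policy) AS avg_days_after_policy,
       AVG(prior_claims)      AS avg_prior_claims,
       AVG(doc_count)         AS avg_doc_count
FROM ins_claim
GROUP BY is_suspect
ORDER BY is_suspect;
```

If the two rows look very different on every column, the classification task is easy and a near-perfect validation score should not be read as evidence of real-world accuracy.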

Step 2 — Train + deploy

CREATE EXPERIMENT ins_clf AS
SELECT claim_amt, days_after_policy, prior_claims, doc_count, is_suspect AS target
FROM ins_claim
WITH (
    task_type = 'binary_classification',
    target_column = 'target',
    optimization_metric = 'auc',
    max_trials = 8,
    time_budget_seconds = 120,
    algorithms = ['logistic_regression', 'random_forest', 'gradient_boosting'],
    validation_strategy = 'kfold',
    n_folds = 3,
    feature_engineering = false,
    hyperparameter_strategy = 'random'
);

DEPLOY MODEL ins_flagger FROM EXPERIMENT ins_clf;
-- best_score = 1.0 (AUC, the optimization_metric set above)
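
For comparison with the legacy approach, a hand-coded red-flag rule can be scored against the same labels. This is a sketch rather than part of the recipe as tested: the 10,000 threshold is an illustrative value picked for this sample, and SUM over a CASE expression is assumed to be supported.

```sql
-- Hypothetical legacy rule: flag any claim above 10,000
SELECT is_suspect,
       COUNT(*) AS claims,
       SUM(CASE WHEN claim_amt > 10000 THEN 1 ELSE 0 END) AS rule_flagged
FROM ins_claim
GROUP BY is_suspect
ORDER BY is_suspect;
-- → is_suspect 0: 144 claims, 0 flagged; is_suspect 1: 16 claims, 16 flagged
```

On this sample a single rule already separates the two classes, which is consistent with a best_score of 1.0. Real claim data is rarely this clean, so a perfect score here says little about production performance.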

Step 3 — Score new claims

-- A modest claim 540 days into the policy, 1 prior claim, 6 supporting docs
SELECT AUTOML.PREDICT('ins_flagger', 1200.0, 540, 1, 6) AS risk;
-- → 0.056  (PAY)

-- $42K claim 8 days after policy start, 6 prior claims, 1 supporting doc
SELECT AUTOML.PREDICT('ins_flagger', 42000.0, 8, 6, 1) AS risk;
-- → 0.944  (SIU REVIEW)

Step 4 — Sweep the queue

SELECT id, claim_amt, days_after_policy, prior_claims, doc_count,
       AUTOML.PREDICT('ins_flagger', claim_amt, days_after_policy, prior_claims, doc_count) AS risk
FROM ins_claim
ORDER BY risk DESC
LIMIT 10;
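
To see how a cut-off would behave on the labelled data, the scores can be tallied against the known outcomes. The sketch below assumes the 0.7 threshold mentioned under Productionizing, and assumes that a subquery in FROM and CASE expressions are available in SynapCores SQL. Because these are the same rows the model was trained on, the counts are optimistic; a held-out set would give a fairer picture.

```sql
-- Tally flagged vs. passed claims per true label at a 0.7 cut-off
SELECT is_suspect,
       SUM(CASE WHEN risk > 0.7 THEN 1 ELSE 0 END) AS flagged,
       SUM(CASE WHEN risk > 0.7 THEN 0 ELSE 1 END) AS passed
FROM (
    SELECT is_suspect,
           AUTOML.PREDICT('ins_flagger', claim_amt, days_after_policy,
                          prior_claims, doc_count) AS risk
    FROM ins_claim
) AS scored
GROUP BY is_suspect
ORDER BY is_suspect;
```

The `flagged` count on the is_suspect = 0 row is the number of false positives, and the `passed` count on the is_suspect = 1 row is the number of missed fraud cases.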

Productionizing

Route claims with risk > 0.7 to the SIU queue, and retrain monthly on a rolling 12-month window of labelled claims. Pair with Recipe 9 (narrative semantic fraud) for two-signal detection: structured features plus free-text narrative drift.
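
The routing rule could be expressed as a query along these lines. It is a sketch under assumptions: it scores `ins_claim` only so that it runs against the data in this recipe, the PAY and SIU REVIEW labels reuse the annotations from Step 3, and CASE plus a subquery in FROM are assumed to work in SynapCores SQL.

```sql
-- Route each claim by risk score.
-- In production, replace ins_claim with your own table of incoming claims.
SELECT id,
       risk,
       CASE WHEN risk > 0.7 THEN 'SIU REVIEW' ELSE 'PAY' END AS decision
FROM (
    SELECT id,
           AUTOML.PREDICT('ins_flagger', claim_amt, days_after_policy,
                          prior_claims, doc_count) AS risk
    FROM ins_claim
) AS scored
ORDER BY risk DESC;
```

The 0.7 cut-off trades SIU workload against missed fraud, so it is worth revisiting after each retrain.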

