In-Database ML vs External ML Pipelines: Move the Model, Not the Data

Published on August 5, 2026


There are two ways to get a machine-learning prediction next to your data. The familiar one is to move the data to the model: extract it from the database, ship it to a Python service or a model-serving platform, run inference, and write the results back. The other is to move the model to the data: train and run it inside the database, where the rows already live. For decades the first approach was the only practical option, so it became the default and then the assumption. This post argues that for a large and growing class of workloads the assumption is now backwards — and walks through exactly where each approach wins, because there are real cases for both.

For the conceptual background, see "What Is In-Database Machine Learning?" and the AutoML guide.

The in-database engine here is SynapCores: vector, graph, SQL, and in-database AutoML in a single self-hosted binary, with native MCP and an OpenClaw long-term-memory plugin. The Community Edition is free.

A note on scope. Capabilities marked (Enterprise / roadmap) are part of the Enterprise tier or roadmap and are not in the free Community Edition today. Everything unmarked — including in-database AutoML and inference — is in the free CE.

The external pipeline, and what it really costs

The external pattern is well understood and well tooled. An orchestration layer extracts a training set, a training job produces a model, a registry versions it, a serving layer hosts it, and an inference path moves features out of the database and predictions back in. For large, custom deep-learning models trained on specialized hardware, this is the right architecture and nothing here disputes that.

The cost that gets undercounted is everything between the boxes. Each prediction in production means pulling features out of the database, serializing them, crossing the network to the serving layer, deserializing, inferring, and writing the result back. That is latency on every call and a second system to scale. Worse is the subtler failure: the features computed in the pipeline drift from the features in the live database, and the model that looked great in the training notebook quietly underperforms in production because it is being fed slightly different inputs. Training/serving skew is one of the most common and hardest-to-diagnose ML production bugs, and it is a direct consequence of the data and the model living in different places.
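To make training/serving skew concrete, here is a minimal Python sketch. Everything in it is hypothetical: the feature (`avg_order_value`), the two rules for filling missing values, and the hand-set model weights are illustrative assumptions, not taken from any real pipeline. It shows how the same row can score differently once the training pipeline and the serving path each implement their own feature logic.

```python
import math

# Hypothetical model: a logistic score on one feature, with hand-set weights.
WEIGHT, BIAS = -0.05, 1.0

def predict_churn(avg_order_value: float) -> float:
    """Logistic score in (0, 1); higher means more likely to churn."""
    return 1.0 / (1.0 + math.exp(-(WEIGHT * avg_order_value + BIAS)))

def training_features(row: dict, column_mean: float) -> float:
    # Training pipeline: missing values are imputed with the column mean.
    value = row["avg_order_value"]
    return column_mean if value is None else value

def serving_features(row: dict) -> float:
    # Serving path, written separately: missing values become 0.
    value = row["avg_order_value"]
    return 0.0 if value is None else value

rows = [
    {"customer_id": 1, "avg_order_value": 40.0},
    {"customer_id": 2, "avg_order_value": None},  # the row that triggers skew
]
known = [r["avg_order_value"] for r in rows if r["avg_order_value"] is not None]
column_mean = sum(known) / len(known)

for row in rows:
    offline = predict_churn(training_features(row, column_mean))
    online = predict_churn(serving_features(row))
    print(row["customer_id"], round(offline, 3), round(online, 3))
```

The complete row scores identically on both paths. The row with the missing value does not, and nothing in either code path raises an error, which is why this class of bug tends to surface late.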

In-database ML: the model goes to the data

The in-database approach inverts the data flow. The model trains and runs where the rows are, so inference is a function call inside a query rather than a network round-trip to a service.

-- Train where the data lives
CREATE EXPERIMENT churn_model
  PREDICT churned
  FROM customers
  USING AUTOML;

-- Predict inside an ordinary query — no extraction, no serving layer
SELECT customer_id, plan,
       PREDICT(churned) AS churn_risk
FROM customers
WHERE plan = 'pro'
ORDER BY churn_risk DESC
LIMIT 50;

Three things change. There is no extraction and no serialization, so the latency of moving data disappears. There is no separate serving system to deploy and scale, so the operational surface shrinks. And because training and inference read the same live tables, training/serving skew largely goes away — the model is fed the exact features the database holds, by construction.
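The idea of prediction as a function call inside a query can be tried with nothing but Python's standard library. The sketch below is an analogy, not SynapCores: it uses SQLite's user-defined functions to register a scoring function and call it per row. The table contents, the `churn_risk` function name, and the hand-set weights standing in for a trained model are all hypothetical.

```python
import math
import sqlite3

def churn_risk(tenure_months: int, support_tickets: int) -> float:
    """Stand-in for a trained model: a logistic score with hand-set weights."""
    z = -0.1 * tenure_months + 0.6 * support_tickets - 0.5
    return 1.0 / (1.0 + math.exp(-z))

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE customers ("
    "customer_id INTEGER PRIMARY KEY, plan TEXT, "
    "tenure_months INTEGER, support_tickets INTEGER)"
)
conn.executemany(
    "INSERT INTO customers VALUES (?, ?, ?, ?)",
    [
        (1, "pro", 24, 0),
        (2, "pro", 2, 5),
        (3, "free", 1, 4),
        (4, "pro", 12, 2),
    ],
)

# Register the Python function so SQL can call it once per row.
conn.create_function("churn_risk", 2, churn_risk)

# Filtering, scoring, and ranking happen in one statement; no rows are
# exported to a separate service and no scores are written back.
ranked = conn.execute(
    """
    SELECT customer_id, plan,
           churn_risk(tenure_months, support_tickets) AS risk
    FROM customers
    WHERE plan = 'pro'
    ORDER BY risk DESC
    LIMIT 50
    """
).fetchall()

for customer_id, plan, risk in ranked:
    print(customer_id, plan, round(risk, 3))
```

A real in-database engine would also train and version the model itself; this sketch only illustrates the inference half, where the score is computed from the same columns the query reads.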

Where each approach wins

This is not a case where one approach dominates. The honest division:

| Factor | External pipeline | In-database ML |
| --- | --- | --- |
| Latency per prediction | Network round-trip each call | Function call in the query |
| Training/serving skew | Common, hard to diagnose | Minimized by construction |
| Data movement | Extract + write back | None |
| Operational surface | Orchestration + serving + registry | The database |
| Very large custom deep nets | Strong fit (specialized hardware) | Not the target |
| Tabular models, embeddings, scoring | Workable but heavy | Strong fit |
| Predictions inside SQL/analytics | Awkward, round-trips | Native |
| Team skill alignment | ML engineers | SQL-literate engineers |

Reach for the external pipeline when the model is large and custom — big transformers, computer-vision nets, anything that needs specialized accelerators and a dedicated training regimen — or when a central ML platform team owns models as products across many consumers. That is what those systems are built for.

Reach for in-database ML when the model is the kind that powers application features — churn and conversion scoring, recommendations, classification, anomaly detection, embeddings for search — and especially when predictions need to appear inside ordinary queries at low latency. For these, the external pipeline is a lot of moving infrastructure to deliver a number that an in-database function returns in the same statement that reads the rows.

Inference inside the query is the real unlock

The most underrated benefit is compositional. When prediction is a SQL function, it composes with everything else the engine does in one statement — filter rows, rank by vector similarity, traverse a graph, and score with a model, all in a single plan the optimizer reasons about as a whole:

-- Retrieve, then score, in one optimizer plan
SELECT p.product_name,
       COSINE_SIMILARITY(p.embedding, EMBED(:query)) AS relevance,
       PREDICT(will_purchase, :user_id, p.product_id) AS purchase_likelihood
FROM products p
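WHERE p.in_stock = true
  -- Repeat the expression: standard SQL does not allow a SELECT alias in WHERE
  AND COSINE_SIMILARITY(p.embedding, EMBED(:query)) > 0.7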
ORDER BY purchase_likelihood DESC
LIMIT 10;

Doing this with an external pipeline means retrieving candidates from one system, shipping them to a model service, and re-joining the scores: several round-trips and a pile of glue to produce what the single query above returns. This is the same unification argument that runs through "AI-Native Database Architecture": the value is not any single capability but the fact that they compose in one engine.
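For contrast, the retrieve, ship, and re-join steps of the external path can be sketched in Python. This is an illustration under stated assumptions: SQLite stands in for the database, `model_service` is a hypothetical in-process stand-in for a remote scoring endpoint (a real one would add a network hop), and the product data and scoring rule are invented.

```python
import json
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE products (product_id INTEGER, product_name TEXT, "
    "in_stock INTEGER, price REAL)"
)
conn.executemany(
    "INSERT INTO products VALUES (?, ?, ?, ?)",
    [(1, "kettle", 1, 30.0), (2, "toaster", 0, 45.0), (3, "mug", 1, 8.0)],
)

def model_service(request_body: str) -> str:
    """Hypothetical scoring endpoint: JSON in, JSON out."""
    candidates = json.loads(request_body)  # deserialize the request
    scores = [
        {
            "product_id": c["product_id"],
            # Invented rule: cheaper items score higher.
            "purchase_likelihood": round(1.0 / (1.0 + c["price"] / 10.0), 3),
        }
        for c in candidates
    ]
    return json.dumps(scores)  # serialize the reply

# Step 1: retrieve candidates from the database.
rows = conn.execute(
    "SELECT product_id, product_name, price FROM products WHERE in_stock = 1"
).fetchall()

# Step 2: serialize the features and ship them to the model service.
payload = json.dumps([{"product_id": pid, "price": price} for pid, _, price in rows])
response = json.loads(model_service(payload))

# Step 3: re-join the scores to the rows by key and rank in application code.
score_by_id = {s["product_id"]: s["purchase_likelihood"] for s in response}
ranked = sorted(
    ((name, score_by_id[pid]) for pid, name, _ in rows),
    key=lambda pair: pair[1],
    reverse=True,
)
print(ranked)
```

Each numbered step is code that has to be written, deployed, and kept in sync with the schema. The single-statement form folds all three into the query itself.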

The honest caveat

In-database ML is not a universal replacement. If your problem genuinely needs large custom deep-learning models, GPU-heavy training loops, or a model lifecycle managed by a dedicated platform team, the external pipeline remains the right tool, and an AI-native database happily coexists with it — serving the operational, in-query predictions while the heavy custom models live where they belong. The point is not to eliminate the pipeline; it is to stop building a pipeline for predictions that should simply be a function call next to the data.

Try it where the data lives

The free Community Edition includes in-database AutoML and inference, installs in about 30 seconds, and lets you train a model on your own tables and call PREDICT() inside a query without standing up any serving infrastructure.
