AI-Native Database RAG Functions

Published on December 20, 2025

RAG Functions - SQL Reference

Built-in Retrieval-Augmented Generation (RAG) directly in SQL queries.

Overview

SynapCores provides native RAG functions that allow you to ask natural language questions about your data and receive AI-generated insights—all within SQL.

Syntax

RAG Function

RAG(question TEXT, tables ARRAY, [options JSON]) -> TEXT

RAG_FROM_QUERY Function

RAG_FROM_QUERY(question TEXT, query TEXT, [options JSON]) -> TEXT

Basic Examples

Simple Query

SELECT RAG('What are the top products?', ARRAY['products']);

Multiple Tables

SELECT RAG(
    'Show customer purchase patterns',
    ARRAY['customers', 'orders', 'order_items']
);

Custom Options

SELECT RAG(
    'Analyze sales trends',
    ARRAY['sales'],
    '{"temperature": 0.3, "max_tokens": 500}'::JSON
);

Query Results

SELECT RAG_FROM_QUERY(
    'Summarize these statistics',
    'SELECT region, SUM(revenue) FROM sales GROUP BY region'
);

Options Reference

Option Type Default Description
system_prompt string (default) Custom system prompt
max_tokens integer 500 Max response tokens
temperature float 0.7 LLM creativity (0.0-1.0)
sample_size integer 50 Rows per table to sample
model string (provider default) LLM model name
include_schema boolean true Include table schemas
semantic_search boolean false Use vector embeddings
semantic_threshold float 0.7 Similarity threshold

Industry Use Cases

Finance - Fraud Detection

SELECT RAG(
    'Identify suspicious transaction patterns',
    ARRAY['transactions', 'accounts'],
    '{"temperature": 0.1, "sample_size": 200}'::JSON
);

Healthcare - Patient Risk Assessment

SELECT RAG(
    'What are the key risk factors for this patient?',
    ARRAY['patient_vitals', 'medications', 'lab_results'],
    '{"system_prompt": "You are a clinical decision support system"}'::JSON
);

E-Commerce - Product Recommendations

SELECT RAG(
    'Which products should we recommend to high-value customers?',
    ARRAY['customers', 'purchases', 'products']
);

Logistics - Route Optimization

SELECT RAG(
    'Which routes are experiencing delays?',
    ARRAY['shipments', 'routes', 'carriers'],
    '{"sample_size": 150}'::JSON
);

Common Patterns

Business Analysis

SELECT RAG(
    'Identify high-value customers at risk of churn',
    ARRAY['customers', 'subscriptions', 'support_tickets'],
    '{
        "system_prompt": "You are a customer success analyst",
        "temperature": 0.2
    }'::JSON
);

Data Quality Check

SELECT RAG_FROM_QUERY(
    'Are there any data quality issues?',
    'SELECT * FROM orders WHERE amount < 0 OR customer_id IS NULL'
);

Trend Analysis

SELECT RAG_FROM_QUERY(
    'What trends do you see?',
    'SELECT DATE_TRUNC(''month'', date) as month,
            COUNT(*) as count,
            AVG(value) as avg_value
     FROM metrics
     GROUP BY month
     ORDER BY month'
);

Advanced Integration

With CTEs

WITH top_customers AS (
    SELECT * FROM customers
    WHERE lifetime_value > 10000
)
SELECT RAG_FROM_QUERY(
    'Analyze these high-value customers',
    'SELECT * FROM top_customers'
);

With Joins

SELECT RAG_FROM_QUERY(
    'What are the insights?',
    'SELECT c.name, COUNT(o.order_id) as orders, SUM(o.amount) as revenue
     FROM customers c
     JOIN orders o ON c.customer_id = o.customer_id
     GROUP BY c.name
     ORDER BY revenue DESC
     LIMIT 20'
);

Custom System Prompts

Financial Analyst

{
    "system_prompt": "You are a financial analyst. Consider market conditions, seasonality, and provide specific, actionable insights with numbers."
}

Data Scientist

{
    "system_prompt": "You are a data scientist. Focus on statistical patterns, correlations, and predictive insights."
}

Business Consultant

{
    "system_prompt": "You are a business consultant. Provide strategic recommendations with ROI implications."
}

Performance Tips

  1. Reduce sample size for faster queries

    '{"sample_size": 20}'::JSON
    
  2. Skip schema for simple queries

    '{"include_schema": false}'::JSON
    
  3. Pre-filter with WHERE clauses

    SELECT RAG_FROM_QUERY('...',
        'SELECT * FROM large_table WHERE date > ''2024-01-01'' LIMIT 100'
    );
    
  4. Use lower temperature for faster, deterministic answers

    '{"temperature": 0.1}'::JSON
    

Best Practices

DO

  • Use specific, clear questions
  • Validate table names first
  • Set reasonable sample sizes
  • Use temperature 0.1-0.3 for factual answers
  • Pre-filter data with RAG_FROM_QUERY

DON'T

  • Use empty questions
  • Reference non-existent tables
  • Use DML/DDL in RAG_FROM_QUERY
  • Set sample_size > 500
  • Pass raw user input without validation

Troubleshooting

Error Solution
"Table not found" Verify table exists with SHOW TABLES
"AI service manager not available" Check LLM configuration
"Question cannot be empty" Provide a non-empty question
"Only SELECT queries allowed" Use SELECT, not INSERT/UPDATE/DELETE
Slow performance Reduce sample_size or use RAG_FROM_QUERY

Quick Command Reference

-- List available tables
SHOW TABLES;

-- Check table schema
DESCRIBE table_name;

-- Test simple RAG
SELECT RAG('What data is available?', ARRAY['table_name']);

-- Test with query
SELECT RAG_FROM_QUERY('Summarize', 'SELECT * FROM table_name LIMIT 10');

-- Test with options
SELECT RAG('Question', ARRAY['table'], '{"temperature": 0.5}'::JSON);

Document Version: 1.0 Last Updated: December 2025 Website: https://synapcores.com