AI-Native Database: Complete Feature List

Published on January 20, 2025

SynapCores AIDB - Complete Features

For Marketing and Sales Use

Last Updated: October 2025
Version: Production Ready (v2.0)


Executive Summary

SynapCores AIDB is the world's first AI-Native Database that unifies SQL, Vector Search, and Machine Learning in a single platform. Unlike traditional databases that require external tools and complex integrations, SynapCores embeds AI capabilities directly into the database engine, enabling developers to build intelligent applications with simple SQL queries.

Key Differentiators:

  • ✅ SQL + Vector + ML in one database
  • ✅ Native embedding generation - no external services required
  • ✅ AutoML with SQL syntax - train models with a single query
  • ✅ Natural language queries - ask questions in plain English
  • ✅ Built in Rust - memory-safe and blazingly fast
  • ✅ Multi-tenant by design - enterprise-grade isolation

1. SQL Database Features

1.1 Complete SQL Support

Standard SQL Operations:

  • ✅ CREATE, DROP, ALTER tables with full DDL support
  • ✅ INSERT, UPDATE, DELETE, SELECT with full DML support
  • ✅ Complex JOIN operations (INNER, LEFT, RIGHT, FULL, CROSS)
  • ✅ Subqueries and nested queries
  • ✅ Common Table Expressions (CTEs) with WITH clause
  • ✅ Recursive CTEs for hierarchical data
  • ✅ UNION, INTERSECT, EXCEPT set operations
  • ✅ Window functions (ROW_NUMBER, RANK, LAG, LEAD, etc.)
  • ✅ GROUP BY with HAVING clauses
  • ✅ ORDER BY with multiple columns and directions
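To make the recursive CTE item above concrete, here is a minimal sketch of a hierarchy query written in standard SQL. It runs against Python's built-in sqlite3 module purely as a stand-in engine so it can be tried anywhere; the employees table and its columns are hypothetical, and the exact AIDB dialect may differ.

```python
import sqlite3

# In-memory SQLite database used only as a stand-in for a SQL engine.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE employees (id INTEGER PRIMARY KEY, name TEXT NOT NULL, manager_id INTEGER);
INSERT INTO employees VALUES
    (1, 'Ada', NULL),
    (2, 'Grace', 1),
    (3, 'Linus', 2),
    (4, 'Ken', 1);
""")

# Walk the management hierarchy from the root (no manager) downwards.
org_chart = conn.execute("""
WITH RECURSIVE reports(id, name, depth) AS (
    SELECT id, name, 0 FROM employees WHERE manager_id IS NULL
    UNION ALL
    SELECT e.id, e.name, r.depth + 1
    FROM employees AS e
    JOIN reports AS r ON e.manager_id = r.id
)
SELECT name, depth FROM reports ORDER BY depth, name
""").fetchall()

for name, depth in org_chart:
    print("  " * depth + name)
```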

Data Types:

  • Standard: INTEGER, BIGINT, REAL, DOUBLE, TEXT, VARCHAR, CHAR, BOOLEAN
  • Advanced: JSON, JSONB, UUID, TIMESTAMP, DATE, TIME, DECIMAL, BYTEA
  • AI-Native: VECTOR(dimensions), AUDIO, VIDEO, IMAGE, PDF

Constraints & Indexing:

  • ✅ PRIMARY KEY, FOREIGN KEY, UNIQUE, NOT NULL, CHECK constraints
  • ✅ DEFAULT values and AUTO INCREMENT
  • ✅ B-tree indexes for standard columns
  • ✅ Vector indexes (HNSW) for similarity search
  • ✅ Composite indexes across multiple columns

1.2 Advanced SQL Features

Stored Procedures:

  • ✅ CREATE PROCEDURE with IN/OUT/INOUT parameters
  • ✅ Control flow: IF/THEN/ELSE, WHILE loops, FOR loops
  • ✅ Variable declarations and assignments
  • ✅ Error handling with RAISE statements
  • ✅ HTTP requests to external APIs
  • ✅ Shell command execution (with security controls)
  • ✅ Procedure overloading by parameter types

Triggers:

  • ✅ BEFORE/AFTER INSERT/UPDATE/DELETE triggers
  • ✅ Row-level and statement-level triggers
  • ✅ Access to NEW and OLD row values
  • ✅ Conditional triggers with WHEN clauses
  • ✅ Trigger chaining and execution ordering
  • ✅ Persistent trigger storage with multi-tenant isolation
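The following sketch shows what a row-level AFTER UPDATE trigger with a WHEN clause and OLD/NEW access looks like in practice. It uses SQLite (via Python's sqlite3 module) as a stand-in engine; the products and price_audit tables are hypothetical, and AIDB's trigger syntax is assumed, not confirmed, to be similar.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE products (id INTEGER PRIMARY KEY, price REAL NOT NULL);
CREATE TABLE price_audit (product_id INTEGER, old_price REAL, new_price REAL);

-- Row-level trigger that only fires when the price actually changes.
CREATE TRIGGER log_price_change
AFTER UPDATE OF price ON products
FOR EACH ROW
WHEN NEW.price <> OLD.price
BEGIN
    INSERT INTO price_audit VALUES (OLD.id, OLD.price, NEW.price);
END;

INSERT INTO products VALUES (1, 19.99);
UPDATE products SET price = 24.99 WHERE id = 1;  -- price changes, audit row written
UPDATE products SET price = 24.99 WHERE id = 1;  -- WHEN clause is false, no audit row
""")

audit_rows = conn.execute("SELECT * FROM price_audit").fetchall()
print(audit_rows)
```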

Views:

  • ✅ CREATE VIEW for virtual tables
  • ✅ Updatable views
  • ✅ View dependency tracking
  • ✅ Persistent view definitions

Transactions:

  • ✅ BEGIN, COMMIT, ROLLBACK support
  • ✅ ACID compliance with MVCC
  • ✅ Isolation levels: Read Committed, Repeatable Read, Serializable
  • ✅ Savepoints for partial rollbacks
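A savepoint lets part of a transaction be undone while the rest is kept. The sketch below demonstrates this with standard SQL statements, again using SQLite through Python's sqlite3 module as a stand-in; the accounts table is hypothetical and AIDB's own syntax may differ in detail.

```python
import sqlite3

# isolation_level=None puts the connection in autocommit mode so that
# BEGIN / COMMIT / SAVEPOINT are issued explicitly by this script.
conn = sqlite3.connect(":memory:", isolation_level=None)
conn.execute("CREATE TABLE accounts (id INTEGER PRIMARY KEY, balance INTEGER NOT NULL)")

conn.execute("BEGIN")
conn.execute("INSERT INTO accounts VALUES (1, 100)")
conn.execute("SAVEPOINT before_bonus")
conn.execute("UPDATE accounts SET balance = balance + 50 WHERE id = 1")
conn.execute("ROLLBACK TO SAVEPOINT before_bonus")  # undo only the bonus
conn.execute("COMMIT")                              # the insert is kept

balance = conn.execute("SELECT balance FROM accounts WHERE id = 1").fetchone()[0]
print(balance)
```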

Window Functions (Production Ready):

  • ✅ ROW_NUMBER(), RANK(), DENSE_RANK(), NTILE()
  • ✅ LAG(), LEAD() for accessing adjacent rows
  • ✅ FIRST_VALUE(), LAST_VALUE(), NTH_VALUE()
  • ✅ Running aggregates: SUM, AVG, COUNT, MIN, MAX
  • ✅ PARTITION BY and ORDER BY support
  • ✅ Custom frame specifications (ROWS, RANGE)
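As a small illustration of the window functions listed above, this sketch combines LAG, a running SUM with an explicit ROWS frame, and RANK in one query. It executes on SQLite (3.25 or later) via Python's sqlite3 module as a stand-in engine; the sales table is hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE sales (region TEXT, month INTEGER, amount INTEGER);
INSERT INTO sales VALUES
    ('east', 1, 100), ('east', 2, 150), ('east', 3, 120),
    ('west', 1, 200), ('west', 2, 180);
""")

rows = conn.execute("""
SELECT region, month, amount,
       LAG(amount) OVER (PARTITION BY region ORDER BY month) AS prev_amount,
       SUM(amount) OVER (PARTITION BY region ORDER BY month
                         ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW) AS running_total,
       RANK() OVER (PARTITION BY region ORDER BY amount DESC) AS rank_in_region
FROM sales
ORDER BY region, month
""").fetchall()

for row in rows:
    print(row)
```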

Statistical Functions:

  • ✅ STDDEV(), VARIANCE() for statistical analysis
  • ✅ PERCENTILE_CONT(), PERCENTILE_DISC() for percentiles
  • ✅ CORR(), COVAR_POP(), COVAR_SAMP() for correlations
  • ✅ REGR_* functions for linear regression
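The difference between PERCENTILE_CONT and PERCENTILE_DISC is easy to miss: the first interpolates between rows, the second always returns a value that exists in the data. The Python sketch below follows the usual SQL definitions of both; the helper names and sample data are illustrative only. It also uses statistics.stdev, which computes the sample standard deviation; whether AIDB's STDDEV() is the sample or population variant is not stated here.

```python
import math
import statistics

def percentile_cont(values, fraction):
    """Interpolated percentile, following the usual SQL PERCENTILE_CONT definition."""
    ordered = sorted(values)
    position = fraction * (len(ordered) - 1)       # zero-based fractional row
    lower, upper = math.floor(position), math.ceil(position)
    weight = position - lower
    return ordered[lower] + (ordered[upper] - ordered[lower]) * weight

def percentile_disc(values, fraction):
    """First actual value whose cumulative distribution reaches the fraction."""
    ordered = sorted(values)
    index = max(math.ceil(fraction * len(ordered)) - 1, 0)
    return ordered[index]

latencies_ms = [10, 20, 30, 40]
median_cont = percentile_cont(latencies_ms, 0.5)   # halfway between 20 and 30
median_disc = percentile_disc(latencies_ms, 0.5)   # an existing value
sample_stddev = statistics.stdev(latencies_ms)

print(median_cont, median_disc, round(sample_stddev, 3))
```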

2. Vector Database Capabilities

2.1 Vector Storage & Retrieval

Vector Data Type:

  • ✅ Native VECTOR(dimensions) data type in SQL
  • ✅ Support for dimensions up to 4096
  • ✅ Optimized contiguous memory storage
  • ✅ Automatic normalization options

Distance Metrics:

  • Cosine Similarity - for text and semantic search (recommended)
  • Euclidean Distance (L2) - for spatial data and image embeddings
  • Dot Product / Inner Product - for recommendation systems
  • Manhattan Distance (L1) - for high-dimensional sparse data

Vector Operators in SQL:

-- Cosine distance
embedding <=> query_vector

-- Euclidean distance
embedding <-> query_vector

-- Inner product
embedding <#> query_vector
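For readers who want to see what these metrics compute, here is a plain-Python sketch of the four distance measures named above. The function names are illustrative and are not AIDB functions; in particular, how the inner-product operator signs its result in AIDB is not specified in this document.

```python
import math

def dot(a, b):
    """Inner (dot) product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))

def cosine_distance(a, b):
    """1 - cosine similarity; 0 means the vectors point the same way."""
    return 1.0 - dot(a, b) / (math.sqrt(dot(a, a)) * math.sqrt(dot(b, b)))

def euclidean_distance(a, b):
    """Straight-line (L2) distance."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def manhattan_distance(a, b):
    """Sum of absolute coordinate differences (L1)."""
    return sum(abs(x - y) for x, y in zip(a, b))

embedding = [3.0, 4.0]
query_vector = [4.0, 3.0]

print(cosine_distance(embedding, query_vector))
print(euclidean_distance(embedding, query_vector))
print(dot(embedding, query_vector))
print(manhattan_distance(embedding, query_vector))
```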

2.2 Vector Indexing

HNSW Index (Approximate Nearest Neighbor):

  • ✅ Graph-based index structure for fast similarity search
  • ✅ Configurable parameters: M (connections), ef_construction, ef_search
  • ✅ Approximately O(log n) search complexity
  • ✅ 10-100x faster than brute force for large datasets
  • ✅ Tunable trade-off between speed and accuracy

Flat Index (Exact Nearest Neighbor):

  • ✅ Brute-force exact search
  • ✅ 100% recall accuracy
  • ✅ Ideal for small datasets (< 10K vectors)
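Conceptually, a flat index compares the query against every stored vector, which is why recall is 100% and why cost grows linearly with collection size. The sketch below shows that idea in plain Python, plus a recall@k helper of the kind commonly used to judge an approximate index such as HNSW against the exact answer. All names and data are illustrative; this is not AIDB's implementation.

```python
import heapq
import math

def exact_nearest(vectors, query, k):
    """Return ids of the k vectors closest to query by Euclidean distance."""
    def distance(item):
        _, vector = item
        return math.dist(vector, query)
    return [vec_id for vec_id, _ in heapq.nsmallest(k, vectors.items(), key=distance)]

def recall_at_k(approximate_ids, exact_ids):
    """Fraction of the true neighbours that an approximate search returned."""
    return len(set(approximate_ids) & set(exact_ids)) / len(exact_ids)

vectors = {
    "a": [0.0, 0.0],
    "b": [1.0, 0.0],
    "c": [0.0, 2.0],
    "d": [5.0, 5.0],
}

exact = exact_nearest(vectors, [0.1, 0.1], k=2)
print(exact)

# A hypothetical approximate result that found one of the two true neighbours.
print(recall_at_k(["a", "c"], exact))
```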

Performance:

  • ✅ Search 1M vectors in ~30 microseconds (HNSW)
  • ✅ 128d vectors: ~0.5ms per search
  • ✅ 512d vectors: ~2ms per search
  • ✅ Batch operations: 10,000+ insertions/second

2.3 Embedding Generation

Built-in Models (No External API Required):

  • MiniLM (384d) - Fast, general-purpose (recommended)
  • BERT Base (768d) - High-quality semantic embeddings
  • BERT Large (1024d) - Maximum quality for critical applications

Embedding Functions in SQL:

-- Generate embedding from text
SELECT EMBED('wireless headphones');
SELECT EMBED(product_description);

-- Specify model
SELECT EMBED(text, 'minilm');
SELECT EMBED(text, 'bert-base');

Features:

  • ✅ Local inference - no external API calls
  • ✅ Model caching for fast startup
  • ✅ Batch processing for high throughput
  • ✅ GPU acceleration support (CUDA)
  • ✅ Automatic text tokenization

2.4 Semantic Search

Similarity Functions:

-- Find similar products
SELECT product_name,
       COSINE_SIMILARITY(embedding, EMBED('laptop')) as score
FROM products
WHERE COSINE_SIMILARITY(embedding, EMBED('laptop')) > 0.7
ORDER BY score DESC
LIMIT 10;

Features:

  • ✅ Sub-second response times
  • ✅ Metadata filtering during search
  • ✅ Hybrid search (vector + keyword)
  • ✅ Multi-vector queries
  • ✅ Threshold-based filtering

3. AI & ML Functions

3.1 Natural Language Processing

Text Analysis:

-- Sentiment analysis
SELECT SENTIMENT_ANALYSIS(review_text) FROM reviews;

-- Extract named entities
SELECT EXTRACT_ENTITIES(document_text) FROM documents;

-- Keyword extraction
SELECT EXTRACT_KEYWORDS(article_content, 5) FROM articles;

-- Text summarization
SELECT SUMMARIZE(long_text, 200) FROM articles;

Text Generation:

-- Generate text with AI
SELECT GENERATE('Write a product description for ' || product_name, options)
FROM product_templates;

3.2 Multimedia AI Functions

Audio Processing:

-- Transcribe audio to text
SELECT TRANSCRIBE(audio_file) FROM recordings;

-- Get audio duration
SELECT DURATION(audio_file) FROM media;

Video Processing:

-- Extract frames from video
SELECT EXTRACT_FRAMES(video_file, 1000) FROM videos;

-- Extract audio track
SELECT EXTRACT_AUDIO(video_file) FROM videos;

-- Get video metadata
SELECT DIMENSIONS(video_file) FROM videos;

Image Processing:

-- Extract text from images (OCR)
SELECT EXTRACT_TEXT(image_file) FROM scanned_documents;

-- Get image dimensions
SELECT DIMENSIONS(image_file) FROM photos;

-- Generate image embeddings
SELECT EMBED(image_file) FROM photos;

PDF Processing:

-- Extract text from PDF
SELECT EXTRACT_TEXT(pdf_document) FROM contracts;

3.3 Vector Operations

Vector Algebra:

-- Vector arithmetic
SELECT VECTOR_ADD(vec1, vec2);
SELECT VECTOR_SUBTRACT(vec1, vec2);
SELECT VECTOR_MULTIPLY(vec, 2.5);

-- Vector analysis
SELECT VECTOR_MAGNITUDE(embedding);
SELECT VECTOR_NORMALIZE(embedding);
SELECT VECTOR_DOT(vec1, vec2);

Similarity Metrics:

SELECT COSINE_SIMILARITY(vec1, vec2);
SELECT EUCLIDEAN_DISTANCE(vec1, vec2);
SELECT INNER_PRODUCT(vec1, vec2);
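One useful property ties these functions together: once two vectors are normalized to unit length, their dot product equals their cosine similarity. The Python sketch below mirrors a few of the operations above to show this; the lower-case helper names are illustrative stand-ins, not the AIDB functions themselves.

```python
import math

def vector_add(a, b):
    return [x + y for x, y in zip(a, b)]

def vector_multiply(a, scalar):
    return [x * scalar for x in a]

def vector_magnitude(a):
    return math.sqrt(sum(x * x for x in a))

def vector_normalize(a):
    """Scale a non-zero vector to unit length."""
    magnitude = vector_magnitude(a)
    return [x / magnitude for x in a]

def vector_dot(a, b):
    return sum(x * y for x, y in zip(a, b))

vec1, vec2 = [3.0, 4.0], [4.0, 3.0]

cosine = vector_dot(vec1, vec2) / (vector_magnitude(vec1) * vector_magnitude(vec2))
dot_of_units = vector_dot(vector_normalize(vec1), vector_normalize(vec2))

print(vector_add(vec1, vec2), vector_multiply(vec1, 2.5))
print(cosine, dot_of_units)
```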

4. AutoML Capabilities

4.1 SQL-Based Machine Learning

Create ML Experiments:

-- Train a classification model
CREATE EXPERIMENT churn_prediction AS
SELECT customer_id, age, tenure, monthly_charges, churned
FROM customers
WITH (
    task_type='binary_classification',
    target_column='churned',
    max_trials=50,
    algorithms=['random_forest', 'xgboost', 'neural_network']
);

Supported ML Tasks:

  • Classification (binary and multi-class)
  • Regression (continuous value prediction)
  • Time Series Forecasting
  • Clustering (unsupervised learning)
  • Anomaly Detection

4.2 ML Algorithms

Available Algorithms:

  • ✅ Linear Regression / Logistic Regression
  • ✅ Decision Trees
  • ✅ Random Forest
  • ✅ Gradient Boosting
  • ✅ XGBoost
  • ✅ Neural Networks (multi-layer perceptron)
  • ✅ K-Nearest Neighbors (KNN)
  • ✅ Support Vector Machines (SVM)
  • ✅ Naive Bayes

Algorithm Selection Strategies:

  • All - try all available algorithms
  • Fast - only fast algorithms (linear, decision trees, naive bayes, knn)
  • Accurate - highly accurate algorithms (random forest, gradient boosting, xgboost, neural networks)
  • Interpretable - explainable models (linear regression, logistic regression, decision trees)
  • Custom List - specify exact algorithms to try

4.3 AutoML Features

Automatic Feature Engineering:

  • ✅ Polynomial feature generation
  • ✅ Interaction features between variables
  • ✅ Automatic scaling (standard, minmax, robust)
  • ✅ Missing value imputation (mean, median, mode, forward/backward fill)
  • ✅ Categorical encoding (one-hot, label, target, ordinal)
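To show what a few of these preprocessing steps do to the data, here is a minimal pure-Python sketch of median imputation, min-max and standard scaling, and one-hot encoding. The function names and sample values are illustrative; how AIDB applies these steps internally is not described in this document.

```python
import statistics

def impute_median(values):
    """Replace missing entries (None) with the median of the known values."""
    known = [v for v in values if v is not None]
    fill = statistics.median(known)
    return [fill if v is None else v for v in values]

def minmax_scale(values):
    """Rescale to the range [0, 1]."""
    low, high = min(values), max(values)
    return [(v - low) / (high - low) for v in values]

def standard_scale(values):
    """Rescale to zero mean and unit (population) standard deviation."""
    mean = statistics.fmean(values)
    std = statistics.pstdev(values)
    return [(v - mean) / std for v in values]

def one_hot(values):
    """One 0/1 column per distinct category, in sorted category order."""
    categories = sorted(set(values))
    return [[1 if v == c else 0 for c in categories] for v in values]

tenure = impute_median([2, None, 10, 6])
print(tenure)
print(minmax_scale(tenure))
print(standard_scale(tenure))
print(one_hot(["basic", "pro", "basic"]))
```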

Model Training:

  • ✅ Automatic hyperparameter tuning
  • ✅ Cross-validation (K-Fold, Stratified K-Fold, Time Series Split)
  • ✅ Early stopping to prevent overfitting
  • ✅ Ensemble models for better accuracy
  • ✅ Configurable time budget and trial limits
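K-Fold cross-validation splits the data into k parts and uses each part once for validation while training on the rest. The sketch below generates those index splits in plain Python, without shuffling or stratification; it is an illustration of the technique, not AIDB's implementation.

```python
def k_fold_indices(n_samples, n_folds):
    """Yield (train_indices, validation_indices) for each fold, in order."""
    # Spread any remainder over the first folds so sizes differ by at most one.
    fold_sizes = [
        n_samples // n_folds + (1 if i < n_samples % n_folds else 0)
        for i in range(n_folds)
    ]
    start = 0
    for size in fold_sizes:
        stop = start + size
        validation = list(range(start, stop))
        train = [i for i in range(n_samples) if i < start or i >= stop]
        yield train, validation
        start = stop

folds = list(k_fold_indices(n_samples=7, n_folds=3))
for train, validation in folds:
    print("train:", train, "validate:", validation)
```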

Model Management:

-- View trained models
SHOW MODELS;

-- Start/stop experiments
START EXPERIMENT experiment_name;
STOP EXPERIMENT experiment_name;

-- Describe experiment results
DESCRIBE EXPERIMENT experiment_name;

-- Deploy best model
DEPLOY MODEL best_model FROM EXPERIMENT churn_prediction;

-- Make predictions
PREDICT churn_probability
USING churn_model
AS SELECT * FROM new_customers;

Performance Metrics:

  • Classification: Accuracy, Precision, Recall, F1 Score, AUC
  • Regression: R², MAE, MSE, RMSE, MAPE
  • Clustering: Silhouette Score, Davies-Bouldin Index
  • Anomaly Detection: F1 Score, Precision, Recall
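For reference, the most common of these metrics can be computed by hand as in the sketch below, which uses the standard definitions for binary classification (accuracy, precision, recall, F1) and regression (MAE, MSE, RMSE). The function names and sample labels are illustrative.

```python
import math

def classification_metrics(actual, predicted):
    """Binary classification metrics; labels are 0 or 1."""
    pairs = list(zip(actual, predicted))
    tp = sum(1 for a, p in pairs if a == 1 and p == 1)
    fp = sum(1 for a, p in pairs if a == 0 and p == 1)
    fn = sum(1 for a, p in pairs if a == 1 and p == 0)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    accuracy = sum(1 for a, p in pairs if a == p) / len(pairs)
    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}

def regression_metrics(actual, predicted):
    """Mean absolute error, mean squared error, and its square root."""
    errors = [a - p for a, p in zip(actual, predicted)]
    mae = sum(abs(e) for e in errors) / len(errors)
    mse = sum(e * e for e in errors) / len(errors)
    return {"mae": mae, "mse": mse, "rmse": math.sqrt(mse)}

churn = classification_metrics(actual=[1, 0, 1, 1, 0], predicted=[1, 0, 0, 1, 1])
charges = regression_metrics(actual=[3.0, 5.0, 2.0], predicted=[2.0, 5.0, 4.0])
print(churn)
print(charges)
```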

5. Natural Language to SQL (NL2SQL)

5.1 Conversational Queries

Ask Questions in Plain English:

ASK "Show me all customers who made purchases last month"
ASK "What are the top 5 products by revenue?"
ASK "Find similar products to laptop computers"

Features:

  • ✅ LLM-powered query understanding
  • ✅ Schema-aware query generation
  • ✅ Intent analysis (select, aggregate, join, time-series)
  • ✅ Confidence scoring (0.0-1.0)
  • ✅ Alternative query suggestions
  • ✅ Query explanations in natural language

5.2 Intelligent Query Generation

Schema Context Awareness:

  • ✅ Automatic table and column name inference
  • ✅ Relationship detection (foreign keys, joins)
  • ✅ Domain-specific terminology mapping
  • ✅ Common aliases and synonyms

Query Patterns:

  • Aggregation queries ("total sales by category")
  • Ranking queries ("top 10 customers by spend")
  • Comparison queries ("compare revenue year over year")
  • Trend analysis ("sales trend over time")
  • Semantic search ("find products similar to X")

Dual-Mode Processing:

  • LLM-based - uses AI service for complex queries
  • Pattern-based - fast rule-based generation for simple queries
  • ✅ Automatic fallback if LLM unavailable

6. Query Optimization

6.1 Cost-Based Optimizer

Optimization Levels:

  • Level 0: No optimization
  • Level 1: Rule-based optimization only
  • Level 2: Cost-based optimization (default, recommended)
  • Level 3: Advanced optimizations with verbose logging

Phase 1 Rules (Always Applied):

  • ✅ Constant folding - evaluates expressions at compile time
  • ✅ Predicate simplification - simplifies WHERE conditions
  • ✅ Predicate pushdown - moves filters closer to data source
  • ✅ Projection pushdown - selects only needed columns early
  • ✅ Merge filters - combines multiple WHERE clauses

Phase 2 Rules (Cost-Based):

  • ✅ Join reordering - optimal join order with cost estimation
  • ✅ Index selection - chooses best indexes automatically
  • ✅ Join algorithm selection - picks hash/merge/nested loop joins

Phase 3 Rules (Advanced):

  • ✅ Common subexpression elimination - avoids recomputing expressions
  • ✅ Subquery decorrelation - optimizes correlated subqueries
  • ✅ Window function optimization
  • ✅ Aggregate optimization

6.2 Index Advisor

Automatic Index Recommendations:

  • ✅ Analyzes query workload
  • ✅ Suggests indexes for slow queries
  • ✅ Estimates performance improvement
  • ✅ Considers index maintenance cost

6.3 Query Plan Caching

Execution Plan Caching:

  • ✅ Saves compiled query plans
  • ✅ Cache invalidation on schema changes
  • ✅ Plan reuse for repeated queries
  • ✅ Reduces compilation overhead
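In outline, a plan cache maps query text to a compiled plan and is emptied whenever the schema changes. The following sketch shows that behavior with a dictionary; the class, its method names, and the use of exact SQL text as the cache key are assumptions for illustration only.

```python
class PlanCache:
    """Toy plan cache keyed by exact SQL text."""

    def __init__(self, compile_plan):
        self._compile_plan = compile_plan  # hypothetical: maps SQL text to a plan
        self._plans = {}
        self.hits = 0
        self.misses = 0

    def get_plan(self, sql):
        if sql in self._plans:
            self.hits += 1
            return self._plans[sql]
        self.misses += 1
        plan = self._compile_plan(sql)
        self._plans[sql] = plan
        return plan

    def on_schema_change(self):
        """Cached plans may reference dropped or altered objects, so discard them."""
        self._plans.clear()

cache = PlanCache(compile_plan=lambda sql: ("plan for", sql))
cache.get_plan("SELECT 1")   # miss: compiled and stored
cache.get_plan("SELECT 1")   # hit: reused
cache.on_schema_change()
cache.get_plan("SELECT 1")   # miss again after invalidation
print(cache.hits, cache.misses)
```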

Statistics Collection:

  • ✅ Table size tracking
  • ✅ Index usage statistics
  • ✅ Query performance metrics
  • ✅ Cardinality estimation

7. Enterprise Features

7.1 Multi-Tenancy

Complete Tenant Isolation:

  • ✅ Database-level isolation
  • ✅ Separate encryption keys per tenant
  • ✅ Storage isolation in RocksDB
  • ✅ Query execution isolation
  • ✅ No cross-tenant data access possible

Tenant Management:

  • ✅ Automatic tenant provisioning
  • ✅ Per-tenant resource quotas
  • ✅ Tenant-specific configuration
  • ✅ Usage tracking per tenant

7.2 Security & Authentication

API Security:

  • ✅ JWT-based authentication
  • ✅ Role-based access control (RBAC)
  • ✅ Row-level security policies
  • ✅ Column-level permissions
  • ✅ Encrypted connections (TLS/SSL)

Procedure Security:

  • ✅ Shell command whitelisting
  • ✅ HTTP domain restrictions
  • ✅ Execution timeouts
  • ✅ Configurable security policies

7.3 REST API

Comprehensive HTTP API:

  • /v1/query/* - SQL query execution
  • /v1/vectors/* - Vector operations
  • /v1/ai/* - AI function access
  • /v1/automl/* - AutoML operations
  • /v1/auth/* - Authentication
  • /v1/users/* - User management
  • /v1/templates/* - Query templates
  • /v1/recipes/* - Recipe execution

WebSocket Support:

  • ✅ Real-time query subscriptions
  • ✅ Live telemetry updates
  • ✅ Streaming query results
  • ✅ Long-running query monitoring

7.4 Monitoring & Telemetry

Built-in Metrics:

  • ✅ Query execution times
  • ✅ Cache hit/miss rates
  • ✅ Index usage statistics
  • ✅ Resource utilization (CPU, memory)
  • ✅ Tenant-level analytics

Query Performance:

  • ✅ Slow query logging
  • ✅ Query plan inspection with EXPLAIN
  • ✅ Execution statistics per query
  • ✅ Performance trending over time

8. Developer Experience

8.1 SQL Dialect

AIDB SQL Syntax:

  • ✅ Standard SQL compliance
  • ✅ AI function extensions
  • ✅ Vector operators
  • ✅ Multimedia data types
  • ✅ Natural language query support

Query Templates (Recipes):

  • ✅ Reusable parameterized queries
  • ✅ Template library management
  • ✅ AI-generated templates
  • ✅ Template execution API

8.2 Client SDKs

Connection Options:

  • ✅ Native socket protocol (synapcores://)
  • ✅ REST API (https://api.synapcores.com)
  • ✅ WebSocket for real-time updates

Language Support (Client SDKs Available):

  • ✅ Python
  • ✅ Node.js / TypeScript
  • ✅ Rust (native)
  • ✅ cURL / HTTP

8.3 Error Handling

Comprehensive Error Messages:

  • ✅ Clear error descriptions
  • ✅ Line and column numbers for SQL errors
  • ✅ Suggestions for fixes
  • ✅ Error codes for programmatic handling

9. Performance & Scalability

9.1 Storage Engine

RocksDB Foundation:

  • ✅ LSM-tree based storage
  • ✅ Efficient write throughput
  • ✅ Automatic compaction
  • ✅ Optimized for SSD storage

Memory Management:

  • ✅ Three-tier caching (hot, warm, cold)
  • ✅ Adaptive cache sizing
  • ✅ LRU eviction policies
  • ✅ Compression for cold data
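An LRU (least recently used) policy evicts whichever entry has gone longest without being read or written. The sketch below shows the policy using Python's OrderedDict; it illustrates the eviction rule only and makes no claim about how AIDB's cache tiers are implemented.

```python
from collections import OrderedDict

class LRUCache:
    """Fixed-capacity cache that evicts the least recently used key."""

    def __init__(self, capacity):
        self.capacity = capacity
        self._items = OrderedDict()

    def get(self, key):
        if key not in self._items:
            return None
        self._items.move_to_end(key)      # mark as most recently used
        return self._items[key]

    def put(self, key, value):
        self._items[key] = value
        self._items.move_to_end(key)
        if len(self._items) > self.capacity:
            self._items.popitem(last=False)  # drop the oldest entry

    def keys(self):
        return list(self._items)

pages = LRUCache(capacity=2)
pages.put("a", 1)
pages.put("b", 2)
pages.get("a")        # "a" is now more recent than "b"
pages.put("c", 3)     # capacity exceeded, "b" is evicted
print(pages.keys())
```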

9.2 Query Execution

Execution Engine:

  • ✅ Vectorized execution
  • ✅ Parallel query processing
  • ✅ Streaming results
  • ✅ Memory-efficient aggregation

Batch Operations:

  • ✅ Bulk inserts (10,000+ rows/second)
  • ✅ Batch updates and deletes
  • ✅ Parallel vector insertions
  • ✅ Streaming result sets

9.3 Scalability

Current Capabilities:

  • ✅ Handles millions of rows
  • ✅ 1M+ vectors per collection
  • ✅ Concurrent query execution
  • ✅ Multi-tenant workload isolation

Resource Efficiency:

  • ✅ Built in Rust - minimal memory overhead
  • ✅ Zero-copy optimizations
  • ✅ Efficient binary protocols
  • ✅ Smart caching strategies

10. Unique Advantages vs. Competitors

10.1 vs. PostgreSQL + pgvector

SynapCores Advantages:

  • ✅ Native AI functions (no extensions required)
  • ✅ Built-in embedding generation (no external API)
  • ✅ AutoML with SQL syntax
  • ✅ Natural language queries
  • ✅ Optimized vector indexing (HNSW native)
  • ✅ AI-first architecture (not bolted on)

10.2 vs. Pinecone / Weaviate / Qdrant

SynapCores Advantages:

  • ✅ Full SQL support (not just vector operations)
  • ✅ ACID transactions
  • ✅ Complex joins and aggregations
  • ✅ AutoML for any tabular data
  • ✅ Stored procedures and triggers
  • ✅ Single database for all data types

10.3 vs. MongoDB + Atlas Vector Search

SynapCores Advantages:

  • ✅ Structured SQL (not document-based)
  • ✅ Strong consistency (ACID)
  • ✅ Better join performance
  • ✅ Native AutoML capabilities
  • ✅ Cost-effective vector storage
  • ✅ No vendor lock-in to cloud provider

10.4 vs. Snowflake

SynapCores Advantages:

  • ✅ Native vector search
  • ✅ Real-time embedding generation
  • ✅ Local ML inference (no Snowflake Cortex fees)
  • ✅ Sub-second response times
  • ✅ Built-in AutoML
  • ✅ More cost-effective for AI workloads

11. Use Cases

11.1 AI-Powered Applications

Semantic Search:

  • Product recommendation engines
  • Document similarity search
  • Customer support chatbots
  • Content discovery platforms

Retrieval-Augmented Generation (RAG):

  • Knowledge base Q&A systems
  • Context-aware chatbots
  • Document analysis tools
  • Legal/medical research assistants

11.2 Predictive Analytics

Customer Intelligence:

  • Churn prediction
  • Customer lifetime value forecasting
  • Next best action recommendations
  • Segmentation and clustering

Business Forecasting:

  • Sales predictions
  • Demand forecasting
  • Inventory optimization
  • Revenue projections

11.3 Real-Time AI

Anomaly Detection:

  • Fraud detection in transactions
  • Network intrusion detection
  • Quality control in manufacturing
  • System health monitoring

Image & Video AI:

  • Visual search engines
  • Content moderation
  • Face recognition systems
  • Object detection applications

11.4 Enterprise Data Management

Unified Data Platform:

  • Replace multiple databases with one
  • Eliminate ETL between systems
  • Single source of truth for all data
  • Simplified architecture

12. SaaS Deployment Model

12.1 Service Tiers

Available Plans:

  • Starter: For prototypes and small apps
  • Professional: For production applications
  • Enterprise: For mission-critical workloads

Included:

  • ✅ Fully managed infrastructure
  • ✅ Automatic backups and disaster recovery
  • ✅ 99.9% uptime SLA
  • ✅ 24/7 support
  • ✅ Security and compliance
  • ✅ Automatic updates and maintenance

12.2 API Access

Connection Methods:

# Python SDK
import synapcores
conn = synapcores.connect("synapcores://username:password@your-instance.synapcores.com:5433/your_database")

# REST API
curl -X POST https://api.synapcores.com/api/v1/query \
  -H "Authorization: Bearer $TOKEN" \
  -H "Content-Type: application/json" \
  -d '{"sql": "SELECT * FROM products LIMIT 10"}'
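For teams calling the REST API from Python without the SDK, the same request as the cURL example can be built with the standard library, as sketched below. The URL, headers, and JSON body are taken from the cURL example; the helper name is illustrative, the TOKEN environment variable is assumed to hold a valid bearer token, and the response format is not described in this document, so the request is built but not sent.

```python
import json
import os
import urllib.request

def build_query_request(sql, token):
    """Build the same POST request as the cURL example, without sending it."""
    body = json.dumps({"sql": sql}).encode("utf-8")
    return urllib.request.Request(
        "https://api.synapcores.com/api/v1/query",
        data=body,
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
        method="POST",
    )

request = build_query_request(
    "SELECT * FROM products LIMIT 10",
    os.environ.get("TOKEN", ""),
)
print(request.get_method(), request.full_url)

# To send it, pass the request to urllib.request.urlopen(request)
# and read the response body.
```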

12.3 Support & Resources

Support Channels:


13. Pricing Model

13.1 Transparent Pricing

Based On:

  • ✅ Data storage (per GB)
  • ✅ Query compute (per query-second)
  • ✅ Vector operations (per 1K searches)
  • ✅ ML training (per experiment hour)

No Hidden Costs:

  • ✅ No per-API-call charges
  • ✅ No embedding generation fees
  • ✅ No data transfer fees within region
  • ✅ No vendor lock-in

13.2 Free Trial

14-Day Free Trial:

  • ✅ Full feature access
  • ✅ No credit card required
  • ✅ 10 GB storage included
  • ✅ 1M vector operations
  • ✅ Dedicated support during trial

14. Technical Specifications

14.1 System Requirements (Informational Only - Managed by SynapCores)

Infrastructure:

  • Written in Rust for performance and safety
  • Runs on Linux-based cloud infrastructure
  • Utilizes RocksDB for persistent storage
  • Supports GPU acceleration for ML inference

Capacity Limits (Per Tenant):

  • Tables: Unlimited
  • Rows per table: Billions
  • Vectors per collection: 100M+
  • Concurrent connections: 1,000+
  • Query result size: Configurable

14.2 Data Storage

Storage Features:

  • ✅ Automatic compression
  • ✅ Incremental backups
  • ✅ Point-in-time recovery
  • ✅ Replication across availability zones

15. Compliance & Security

Security Certifications:

  • ✅ SOC 2 Type II (in progress)
  • ✅ GDPR compliant
  • ✅ HIPAA compliant (Enterprise)
  • ✅ Encryption at rest and in transit
  • ✅ Regular security audits

Data Privacy:

  • ✅ Data residency options
  • ✅ Customer-owned encryption keys (Enterprise)
  • ✅ Audit logging
  • ✅ Access control and monitoring

Summary

SynapCores AIDB is a production-ready, AI-native database that combines:

  • Complete SQL database with ACID transactions
  • Vector database with HNSW indexing and semantic search
  • AutoML platform with SQL-based model training
  • Natural language interface for conversational queries
  • Multimedia AI for images, audio, video, and PDFs
  • Enterprise features with multi-tenancy and security
  • Fully managed SaaS - no infrastructure management

Perfect For: Teams building AI applications who want to eliminate the complexity of managing multiple databases, ML platforms, and vector stores.

Get Started: Visit https://synapcores.com or contact sales@synapcores.com


Document Version: 1.0
Last Updated: January 2025
Contact: sales@synapcores.com
Website: https://synapcores.com