Vector Column Basics
Objective
Create a table with VECTOR type columns for storing embeddings. Vectors are fundamental to AI-native features like semantic search and similarity matching in AIDB.
Step 1: Create a Table with Vector Column
Create a table with a vector column for storing embeddings.
CREATE TABLE product_embeddings (
id INTEGER PRIMARY KEY,
product_name VARCHAR(255) NOT NULL,
description TEXT,
embedding VECTOR(384),
created_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP
);
Step 2: Create Tables with Different Vector Dimensions
Different embedding models produce different dimensions.
-- MiniLM embeddings (384 dimensions)
CREATE TABLE minilm_embeddings (
id INTEGER PRIMARY KEY,
content TEXT NOT NULL,
embedding VECTOR(384)
);
-- BERT Base embeddings (768 dimensions)
CREATE TABLE bert_base_embeddings (
id INTEGER PRIMARY KEY,
content TEXT NOT NULL,
embedding VECTOR(768)
);
-- BERT Large embeddings (1024 dimensions)
CREATE TABLE bert_large_embeddings (
id INTEGER PRIMARY KEY,
content TEXT NOT NULL,
embedding VECTOR(1024)
);
-- OpenAI-style embeddings (1536 dimensions)
CREATE TABLE openai_embeddings (
id INTEGER PRIMARY KEY,
content TEXT NOT NULL,
embedding VECTOR(1536)
);
Step 3: Insert Records with Embeddings
Insert sample data and generate embeddings using the EMBED function.
INSERT INTO product_embeddings (id, product_name, description) VALUES
(1, 'Wireless Headphones', 'Premium noise-canceling wireless headphones with 30-hour battery life'),
(2, 'Bluetooth Speaker', 'Portable waterproof speaker with deep bass and 360-degree sound'),
(3, 'USB Microphone', 'Professional condenser microphone for streaming and podcasting'),
(4, 'Webcam HD', 'Full HD webcam with auto-focus and built-in microphone'),
(5, 'Gaming Mouse', 'Ergonomic gaming mouse with customizable RGB lighting');
Step 4: Generate Embeddings
Use the EMBED function to create vector embeddings from text.
UPDATE product_embeddings
SET embedding = EMBED(description)
WHERE embedding IS NULL;
Step 5: Query with Embedding Generation
Select products and generate embeddings inline.
SELECT
id,
product_name,
EMBED(description) as generated_embedding
FROM product_embeddings
LIMIT 3;
Step 6: Create a Document Search Table
Create a practical document search table with vectors.
CREATE TABLE document_search (
id INTEGER PRIMARY KEY,
title VARCHAR(255) NOT NULL,
content TEXT NOT NULL,
content_embedding VECTOR(384),
category VARCHAR(100),
created_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP
);
INSERT INTO document_search (id, title, content, category) VALUES
(1, 'Getting Started Guide', 'Learn how to set up your development environment and create your first project', 'tutorial'),
(2, 'API Reference', 'Complete documentation of all available API endpoints and parameters', 'reference'),
(3, 'Best Practices', 'Recommended patterns and practices for building scalable applications', 'guide'),
(4, 'Troubleshooting', 'Common issues and their solutions when working with the platform', 'support');
UPDATE document_search
SET content_embedding = EMBED(content)
WHERE content_embedding IS NULL;
Step 7: Verify Vector Storage
Check that vectors are stored correctly.
SELECT
id,
title,
category,
CASE
WHEN content_embedding IS NOT NULL THEN 'Has Embedding'
ELSE 'No Embedding'
END as embedding_status
FROM document_search;
Cleanup (Optional)
DROP TABLE IF EXISTS product_embeddings;
DROP TABLE IF EXISTS minilm_embeddings;
DROP TABLE IF EXISTS bert_base_embeddings;
DROP TABLE IF EXISTS bert_large_embeddings;
DROP TABLE IF EXISTS openai_embeddings;
DROP TABLE IF EXISTS document_search;
Expected Outcomes
- Vector columns created with specified dimensions
- EMBED() function generates embeddings from text
- Embeddings stored in vector columns
- Different dimension sizes for different models
Common Vector Dimensions
| Model | Dimensions | Use Case |
|---|---|---|
| MiniLM | 384 | General-purpose, fast |
| BERT Base | 768 | High-quality semantic search |
| BERT Large | 1024 | Maximum quality |
| OpenAI | 1536 | API-compatible embeddings |
Key Concepts Learned
- VECTOR(n) type for storing embeddings
- Different embedding dimensions for different models
- EMBED() function for generating embeddings
- Vectors enable semantic search capabilities