Image Embedding Search
Objective
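Create vector embeddings for images to enable similarity search. In this walkthrough the embeddings are generated from each image's text description, so search finds images whose described content is similar, rather than matching on filenames or category labels.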
Step 1: Create Image Embeddings Table
Create a table to store images with their vector embeddings.
CREATE TABLE image_embeddings (
id INTEGER PRIMARY KEY,
filename VARCHAR(255) NOT NULL,
image IMAGE(JPEG),
description TEXT,
category VARCHAR(50),
embedding VECTOR(384),
created_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP
);
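The `VECTOR(384)` column fixes the embedding length at 384, so whatever produces the embeddings must emit vectors of exactly that size. The following is a minimal Python sketch of a check an application might run before writing a row. `check_embedding` and `EMBEDDING_DIM` are hypothetical names chosen for illustration, not part of any database API.

```python
import math
from typing import List, Sequence

EMBEDDING_DIM = 384  # matches VECTOR(384) in the table definition


def check_embedding(vec: Sequence[float], dim: int = EMBEDDING_DIM) -> List[float]:
    """Return vec as a list of floats, or raise ValueError if it cannot fit the column."""
    if len(vec) != dim:
        raise ValueError(f"expected {dim} dimensions, got {len(vec)}")
    values = [float(x) for x in vec]
    # NaN or infinite components would make similarity scores meaningless.
    if not all(math.isfinite(x) for x in values):
        raise ValueError("embedding contains NaN or infinite values")
    return values
```

Rejecting a wrong-sized vector in the application gives a clearer error than a failed insert.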
Step 2: Insert Sample Images
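Add sample rows with a filename, description, and category. The image and embedding columns are not set here and stay NULL; Step 3 fills in the embeddings.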
INSERT INTO image_embeddings (id, filename, description, category) VALUES
(1, 'golden_retriever.jpg', 'Golden retriever dog playing in a sunny park with green grass', 'pets'),
(2, 'labrador_puppy.jpg', 'Labrador puppy sitting on grass looking at camera', 'pets'),
(3, 'cat_sleeping.jpg', 'Orange tabby cat sleeping on a soft blanket', 'pets'),
(4, 'mountain_landscape.jpg', 'Snow-capped mountain peaks at sunrise with clouds', 'nature'),
(5, 'beach_sunset.jpg', 'Tropical beach with palm trees during golden sunset', 'nature'),
(6, 'forest_trail.jpg', 'Hiking trail through dense green forest', 'nature'),
(7, 'city_skyline.jpg', 'Modern city skyline at night with lights', 'urban'),
(8, 'street_cafe.jpg', 'European street cafe with outdoor seating', 'urban'),
(9, 'red_sports_car.jpg', 'Red sports car on a mountain road', 'vehicles'),
(10, 'vintage_motorcycle.jpg', 'Classic vintage motorcycle in garage', 'vehicles');
Step 3: Generate Image Embeddings
Generate an embedding from the description of every row that does not have one yet.
UPDATE image_embeddings
SET embedding = EMBED(description)
WHERE embedding IS NULL;
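To make the idea of turning text into a vector concrete, here is a toy Python stand-in for `EMBED`. It is an assumption-laden sketch: it hashes words into 384 buckets and normalizes the result, so it only captures word overlap, whereas a real embedding function would normally use a learned model that also captures meaning. `toy_embed` is a hypothetical name and says nothing about how the database implements `EMBED`.

```python
import hashlib
import math
import re
from typing import List

DIM = 384  # same length as the VECTOR(384) column


def toy_embed(text: str, dim: int = DIM) -> List[float]:
    """Toy stand-in for EMBED: hash each word into a bucket, then L2-normalize."""
    vec = [0.0] * dim
    for word in re.findall(r"[a-z0-9]+", text.lower()):
        digest = hashlib.sha256(word.encode("utf-8")).digest()
        vec[int.from_bytes(digest[:4], "big") % dim] += 1.0
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec] if norm > 0.0 else vec


# Mirror of the UPDATE: fill in only the rows whose embedding is still missing.
rows = [
    {"id": 1, "description": "Golden retriever dog playing in a sunny park with green grass", "embedding": None},
    {"id": 3, "description": "Orange tabby cat sleeping on a soft blanket", "embedding": None},
]
for row in rows:
    if row["embedding"] is None:
        row["embedding"] = toy_embed(row["description"])
```

Because the update touches only rows without an embedding, running it again leaves existing vectors unchanged.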
Step 4: Find Similar Images by Description
Search for images similar to a text query.
SELECT
filename,
description,
category,
COSINE_SIMILARITY(embedding, EMBED('dog playing outdoors')) as similarity
FROM image_embeddings
WHERE embedding IS NOT NULL
ORDER BY similarity DESC
LIMIT 5;
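The query above scores every row against the query vector and keeps the five highest scores. A minimal Python sketch of the same logic follows. Cosine similarity is the dot product of two vectors divided by the product of their lengths. The three-dimensional vectors in the usage lines are invented for illustration, and returning 0.0 for a zero vector is a convention chosen here; a database engine may behave differently.

```python
import math
from typing import Dict, List, Optional, Sequence, Tuple


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Dot product divided by the product of the two vector lengths."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # convention for this sketch only
    return dot / (norm_a * norm_b)


def top_k(
    query: Sequence[float],
    rows: Dict[str, Optional[Sequence[float]]],
    k: int = 5,
) -> List[Tuple[str, float]]:
    """Score rows that have an embedding and return the k best, highest first."""
    scored = [
        (name, cosine_similarity(query, vec))
        for name, vec in rows.items()
        if vec is not None  # same role as WHERE embedding IS NOT NULL
    ]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]


# Invented toy vectors; real embeddings would have 384 dimensions.
sample = {
    "golden_retriever.jpg": [0.9, 0.1, 0.0],
    "cat_sleeping.jpg": [0.6, 0.0, 0.4],
    "city_skyline.jpg": [0.0, 0.2, 0.9],
    "not_yet_embedded.jpg": None,
}
results = top_k([1.0, 0.0, 0.0], sample, k=2)
```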
Step 5: Create Image Search Table
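Create a second table that stores a title, tags, and a combined search_text field for each image, so that several text fields can feed a single embedding.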
CREATE TABLE searchable_images (
id INTEGER PRIMARY KEY,
image_id VARCHAR(50) UNIQUE NOT NULL,
title VARCHAR(255) NOT NULL,
image IMAGE(JPEG),
description TEXT,
tags TEXT,
content_embedding VECTOR(384),
search_text TEXT,
created_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP
);
INSERT INTO searchable_images (id, image_id, title, description, tags) VALUES
(1, 'IMG_001', 'Sunny Beach Day', 'People enjoying a sunny day at the beach with blue water and white sand', 'beach, summer, vacation, ocean'),
(2, 'IMG_002', 'Mountain Hiking', 'Hikers on a mountain trail with scenic valley views', 'hiking, mountains, nature, adventure'),
(3, 'IMG_003', 'Coffee Shop Morning', 'Cozy coffee shop interior with warm lighting and wooden furniture', 'coffee, cafe, interior, cozy'),
(4, 'IMG_004', 'City at Night', 'Illuminated city skyline reflected in river water', 'city, night, lights, urban'),
(5, 'IMG_005', 'Garden Flowers', 'Colorful flower garden with roses and tulips in bloom', 'flowers, garden, nature, spring');
Step 6: Generate Combined Search Embeddings
Concatenate the title, description, and tags into search_text, and embed the same combined string.
UPDATE searchable_images
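-- COALESCE keeps a NULL description or tags value from turning the whole string NULL
SET search_text = title || ' ' || COALESCE(description, '') || ' ' || COALESCE(tags, ''),
content_embedding = EMBED(title || ' ' || COALESCE(description, '') || ' ' || COALESCE(tags, ''))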
WHERE content_embedding IS NULL;
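The description and tags columns are nullable, and in many SQL engines the `||` operator yields NULL when any operand is NULL. As a hedged illustration, an application-side version of the same concatenation can simply skip missing fields. `build_search_text` is a hypothetical helper name.

```python
from typing import Optional


def build_search_text(title: str, description: Optional[str], tags: Optional[str]) -> str:
    """Join the fields that are present with single spaces, skipping None and blanks."""
    parts = [title, description, tags]
    return " ".join(p.strip() for p in parts if p and p.strip())


text = build_search_text(
    "Mountain Hiking",
    "Hikers on a mountain trail with scenic valley views",
    "hiking, mountains, nature, adventure",
)
```

Embedding the exact string stored in search_text keeps the stored text and the vector in sync, which makes later debugging of search results easier.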
Step 7: Semantic Image Search
Perform semantic search across images.
-- Find images related to outdoor activities
SELECT
image_id,
title,
description,
COSINE_SIMILARITY(content_embedding, EMBED('outdoor activities in nature')) as relevance
FROM searchable_images
WHERE content_embedding IS NOT NULL
ORDER BY relevance DESC
LIMIT 5;
Step 8: Find Similar Images to an Existing Image
Find images similar to one already stored, here the row with id 1, by joining the table to itself and comparing the stored embeddings.
SELECT
b.filename,
b.description,
b.category,
COSINE_SIMILARITY(a.embedding, b.embedding) as similarity
FROM image_embeddings a
CROSS JOIN image_embeddings b
WHERE a.id = 1 AND b.id <> a.id
AND a.embedding IS NOT NULL
AND b.embedding IS NOT NULL
ORDER BY similarity DESC
LIMIT 5;
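When many stored vectors are compared with each other, a common optimization is to normalize each vector to unit length once; cosine similarity then reduces to a plain dot product. The Python sketch below shows this under that assumption. The two-dimensional vectors in the usage lines are invented, and `similar_to` is a hypothetical helper name.

```python
import math
from typing import Dict, List, Tuple


def normalize(vec: List[float]) -> List[float]:
    """Scale vec to unit length."""
    norm = math.sqrt(sum(x * x for x in vec))
    if norm == 0.0:
        raise ValueError("cannot normalize a zero vector")
    return [x / norm for x in vec]


def similar_to(
    anchor_id: int, unit_vectors: Dict[int, List[float]], k: int = 5
) -> List[Tuple[int, float]]:
    """Rank every other image by dot product with the anchor's unit vector."""
    anchor = unit_vectors[anchor_id]
    scored = [
        (other_id, sum(x * y for x, y in zip(anchor, vec)))
        for other_id, vec in unit_vectors.items()
        if other_id != anchor_id  # exclude the anchor itself, as the WHERE clause does
    ]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]


# Invented toy vectors keyed by row id.
raw = {1: [3.0, 4.0], 2: [4.0, 3.0], 3: [-3.0, -4.0]}
unit = {row_id: normalize(vec) for row_id, vec in raw.items()}
neighbors = similar_to(1, unit)
```

Here row 2 scores about 0.96 against row 1, while row 3 points in the opposite direction and scores about -1.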
Step 9: Category-Based Similarity Search
Search within a specific category.
SELECT
filename,
description,
COSINE_SIMILARITY(embedding, EMBED('domestic pet animal')) as similarity
FROM image_embeddings
WHERE category = 'pets'
AND embedding IS NOT NULL
ORDER BY similarity DESC;
Step 10: Multi-Category Search
Rank every image against the query with no category filter and return the three best matches overall.
SELECT
category,
filename,
description,
COSINE_SIMILARITY(embedding, EMBED('outdoor adventure travel')) as similarity
FROM image_embeddings
WHERE embedding IS NOT NULL
ORDER BY similarity DESC
LIMIT 3;
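Because the query ranks all rows together, the three results may all come from the same category. The Python sketch below contrasts that overall top three with a variant that keeps one best match per category, which is an alternative reading of "multi-category" and not something the query above does. The similarity scores are invented for illustration.

```python
import heapq
from typing import Dict, List, Tuple

Scored = Tuple[str, str, float]  # (category, filename, similarity)


def top_overall(rows: List[Scored], limit: int = 3) -> List[Scored]:
    """Equivalent of ORDER BY similarity DESC LIMIT n, without a full sort."""
    return heapq.nlargest(limit, rows, key=lambda row: row[2])


def best_per_category(rows: List[Scored]) -> Dict[str, Scored]:
    """Keep only the highest-scoring row seen for each category."""
    best: Dict[str, Scored] = {}
    for row in rows:
        category = row[0]
        if category not in best or row[2] > best[category][2]:
            best[category] = row
    return best


# Invented scores for the query 'outdoor adventure travel'.
scored_rows = [
    ("nature", "mountain_landscape.jpg", 0.62),
    ("nature", "forest_trail.jpg", 0.58),
    ("vehicles", "red_sports_car.jpg", 0.41),
    ("nature", "beach_sunset.jpg", 0.55),
    ("urban", "street_cafe.jpg", 0.30),
]
overall = top_overall(scored_rows)
per_category = best_per_category(scored_rows)
```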
Step 11: Image Similarity Threshold
Return only the images whose similarity to the query is greater than 0.5.
SELECT
filename,
description,
category,
COSINE_SIMILARITY(embedding, EMBED('nature photography landscape')) as similarity
FROM image_embeddings
WHERE embedding IS NOT NULL
AND COSINE_SIMILARITY(embedding, EMBED('nature photography landscape')) > 0.5
ORDER BY similarity DESC;
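The query repeats the similarity expression in the WHERE clause because a column alias defined in the SELECT list is generally not visible there. In application code the score can be computed once and reused for both filtering and sorting. A minimal Python sketch with invented scores; `above_threshold` is a hypothetical helper name.

```python
from typing import Iterable, List, Tuple


def above_threshold(
    scored: Iterable[Tuple[str, float]], threshold: float = 0.5
) -> List[Tuple[str, float]]:
    """Keep rows scoring strictly above the threshold, highest first."""
    kept = [(name, score) for name, score in scored if score > threshold]
    kept.sort(key=lambda pair: pair[1], reverse=True)
    return kept


# Invented scores for the query 'nature photography landscape'.
matches = above_threshold(
    [
        ("mountain_landscape.jpg", 0.71),
        ("forest_trail.jpg", 0.50),
        ("beach_sunset.jpg", 0.64),
        ("city_skyline.jpg", 0.22),
    ]
)
```

The comparison is strict, matching the `> 0.5` in the query, so a row scoring exactly 0.5 is excluded.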
Step 12: Embedding Statistics
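Count, per category, how many rows have an embedding and how many are still missing one. COUNT(embedding) counts only non-NULL values, so the difference from COUNT(*) is the number of rows without an embedding.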
SELECT
category,
COUNT(*) as total_images,
COUNT(embedding) as with_embeddings,
COUNT(*) - COUNT(embedding) as missing_embeddings
FROM image_embeddings
GROUP BY category
ORDER BY total_images DESC;
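The same coverage report can be sketched in Python to show what the aggregate computes. The input rows are invented, and `embedding_coverage` is a hypothetical helper name.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Optional, Sequence, Tuple


def embedding_coverage(
    rows: Iterable[Tuple[str, Optional[Sequence[float]]]]
) -> List[Tuple[str, int, int, int]]:
    """Return (category, total, with_embeddings, missing) sorted by total, descending."""
    counts: Dict[str, List[int]] = defaultdict(lambda: [0, 0])
    for category, embedding in rows:
        counts[category][0] += 1  # COUNT(*)
        if embedding is not None:
            counts[category][1] += 1  # COUNT(embedding) skips NULLs
    report = [
        (category, total, have, total - have)
        for category, (total, have) in counts.items()
    ]
    report.sort(key=lambda row: row[1], reverse=True)
    return report


coverage = embedding_coverage(
    [("pets", [0.1]), ("pets", None), ("pets", [0.2]), ("nature", [0.3])]
)
```

A nonzero missing count after Step 3 would indicate rows where embedding generation did not run or did not succeed.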
Cleanup (Optional)
DROP TABLE IF EXISTS searchable_images;
DROP TABLE IF EXISTS image_embeddings;
Expected Outcomes
- Image embeddings generated from descriptions
- Semantic similarity search functional
- Cross-image similarity comparison works
- Category filtering combines with similarity ranking
- Threshold-based filtering returns relevant results
Similarity Score Interpretation
| Score Range | Interpretation |
|---|---|
| 0.9 to 1.0 | Nearly identical |
| 0.7 to < 0.9 | Very similar |
| 0.5 to < 0.7 | Related content |
| 0.3 to < 0.5 | Somewhat related |
| < 0.3 | Unrelated |
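The table can be applied in application code as a simple lookup. A minimal Python sketch follows, treating each band's lower bound as inclusive so that a score of exactly 0.7 counts as "Very similar". `interpret_score` is a hypothetical helper name. The bands are this tutorial's rule of thumb, and useful cutoffs typically vary with the embedding model.

```python
from typing import List, Tuple

# (inclusive lower bound, label), checked from the highest band down.
BANDS: List[Tuple[float, str]] = [
    (0.9, "Nearly identical"),
    (0.7, "Very similar"),
    (0.5, "Related content"),
    (0.3, "Somewhat related"),
]


def interpret_score(score: float) -> str:
    """Map a cosine similarity score to the label of the band it falls in."""
    for lower_bound, label in BANDS:
        if score >= lower_bound:
            return label
    return "Unrelated"  # everything below 0.3, including negative scores
```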
Key Concepts Learned
- EMBED function for text-to-vector
- COSINE_SIMILARITY for comparison
- Semantic search over image descriptions
- Cross-image similarity
- Threshold-based filtering