Video Content Search
Objective
Implement intelligent video search using embeddings, transcripts, and metadata. This enables semantic search, content discovery, and recommendation systems for video platforms.
Step 1: Create Searchable Videos Table
Create a table for videos with search fields.
CREATE TABLE searchable_videos (
id INTEGER PRIMARY KEY,
title VARCHAR(255) NOT NULL,
video_file VIDEO(MP4),
description TEXT,
transcript TEXT,
tags VARCHAR(500),
category VARCHAR(50),
creator VARCHAR(100),
duration_seconds INTEGER,
view_count INTEGER DEFAULT 0,
title_embedding VECTOR(384),
description_embedding VECTOR(384),
transcript_embedding VECTOR(384),
is_searchable BOOLEAN DEFAULT TRUE,
created_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP
);
Step 2: Create Search Index Table
Store pre-computed search data.
CREATE TABLE video_search_index (
id INTEGER PRIMARY KEY,
video_id INTEGER NOT NULL,
search_field VARCHAR(50) NOT NULL,
content TEXT NOT NULL,
embedding VECTOR(384),
weight DECIMAL(3, 2) DEFAULT 1.00,
updated_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP,
FOREIGN KEY (video_id) REFERENCES searchable_videos(id)
);
Step 3: Insert Sample Videos
Add videos with rich metadata.
INSERT INTO searchable_videos (id, title, description, transcript, tags, category, creator, duration_seconds, view_count) VALUES
(1, 'Machine Learning Fundamentals',
'Introduction to machine learning concepts including supervised and unsupervised learning algorithms',
'Welcome to this introduction to machine learning. Today we will cover the basics of supervised learning, where we train models using labeled data...',
'machine learning,AI,algorithms,data science',
'Education', 'AI Academy', 2400, 150000),
(2, 'Python Data Analysis Tutorial',
'Learn how to analyze data using Python pandas library with real-world examples',
'In this tutorial we will explore data analysis using Python and pandas. First lets import our libraries and load a dataset...',
'python,pandas,data analysis,programming',
'Education', 'Code School', 1800, 95000),
(3, 'Deep Neural Networks Explained',
'Understanding deep learning architectures and neural network layers',
'Deep neural networks have revolutionized artificial intelligence. In this video we examine the architecture of neural networks...',
'deep learning,neural networks,AI,architecture',
'Education', 'AI Academy', 3000, 120000),
(4, 'Web Development with React',
'Building modern web applications using React and JavaScript',
'React is a powerful library for building user interfaces. Today we will create a complete web application from scratch...',
'react,javascript,web development,frontend',
'Education', 'Web Masters', 2100, 85000),
(5, 'Database Design Best Practices',
'Learn database normalization and schema design patterns',
'Good database design is crucial for application performance. We will discuss normalization rules and when to denormalize...',
'database,SQL,design patterns,performance',
'Education', 'Data Experts', 1500, 65000),
(6, 'Cloud Computing Basics',
'Introduction to cloud services including AWS, Azure, and GCP',
'Cloud computing has transformed how we build and deploy applications. This video covers the major cloud providers...',
'cloud,AWS,Azure,GCP,infrastructure',
'Technology', 'Cloud Academy', 2700, 110000),
(7, 'API Design and REST Principles',
'Creating well-designed RESTful APIs for modern applications',
'RESTful APIs are the backbone of modern web services. We will explore best practices for designing scalable APIs...',
'API,REST,web services,design',
'Technology', 'API Experts', 1200, 55000),
(8, 'Docker Containerization',
'Understanding Docker containers and container orchestration',
'Containers have changed software deployment. This tutorial covers Docker basics and container management...',
'docker,containers,devops,deployment',
'Technology', 'DevOps Pro', 1800, 75000);
Step 4: Generate Embeddings
Create embeddings for searchable content.
UPDATE searchable_videos
SET title_embedding = EMBED(title),
description_embedding = EMBED(description),
transcript_embedding = EMBED(transcript)
WHERE title_embedding IS NULL;
Step 5: Build Search Index
Populate the search index.
INSERT INTO video_search_index (id, video_id, search_field, content, embedding, weight)
SELECT
ROW_NUMBER() OVER () as id,
id as video_id,
'combined' as search_field,
title || ' ' || description || ' ' || COALESCE(tags, '') as content,
EMBED(title || ' ' || description || ' ' || COALESCE(tags, '')) as embedding,
1.00 as weight
FROM searchable_videos
WHERE is_searchable = TRUE;
Step 6: Semantic Search
Search videos using natural language.
SELECT
sv.title,
sv.category,
sv.creator,
sv.duration_seconds / 60 as minutes,
sv.view_count,
COSINE_SIMILARITY(sv.title_embedding, EMBED('learning artificial intelligence and machine learning')) as relevance
FROM searchable_videos sv
WHERE sv.is_searchable = TRUE
ORDER BY relevance DESC
LIMIT 5;
Step 7: Multi-Field Search
Search across multiple fields with weighting.
SELECT
sv.title,
COSINE_SIMILARITY(sv.title_embedding, EMBED('python programming tutorials')) * 0.4 +
COSINE_SIMILARITY(sv.description_embedding, EMBED('python programming tutorials')) * 0.35 +
COSINE_SIMILARITY(sv.transcript_embedding, EMBED('python programming tutorials')) * 0.25 as combined_score
FROM searchable_videos sv
WHERE sv.is_searchable = TRUE
ORDER BY combined_score DESC
LIMIT 5;
Step 8: Search with Filters
Combine semantic search with filters.
SELECT
sv.title,
sv.category,
sv.creator,
sv.view_count,
COSINE_SIMILARITY(sv.description_embedding, EMBED('data analysis and visualization')) as relevance
FROM searchable_videos sv
WHERE sv.is_searchable = TRUE
AND sv.category = 'Education'
AND sv.duration_seconds <= 2400
ORDER BY relevance DESC
LIMIT 5;
Step 9: Find Similar Videos
Get videos similar to a specific one.
SELECT
b.title,
b.category,
b.creator,
COSINE_SIMILARITY(a.description_embedding, b.description_embedding) as similarity
FROM searchable_videos a
CROSS JOIN searchable_videos b
WHERE a.id = 1
AND b.id != 1
AND a.description_embedding IS NOT NULL
AND b.description_embedding IS NOT NULL
ORDER BY similarity DESC
LIMIT 5;
Step 10: Tag-Based Discovery
Search using tags.
SELECT
title,
tags,
category,
view_count
FROM searchable_videos
WHERE tags LIKE '%machine learning%'
OR tags LIKE '%AI%'
ORDER BY view_count DESC;
Step 11: Search Analytics
Track search result quality.
SELECT
sv.category,
COUNT(*) as video_count,
AVG(COSINE_SIMILARITY(sv.description_embedding, EMBED('programming tutorials'))) as avg_relevance
FROM searchable_videos sv
WHERE sv.is_searchable = TRUE
GROUP BY sv.category
ORDER BY avg_relevance DESC;
Step 12: Personalized Recommendations
Build recommendations based on watched videos.
SELECT
'Based on: Machine Learning Fundamentals' as reason,
recommend.title,
recommend.category,
recommend.view_count,
COSINE_SIMILARITY(source.description_embedding, recommend.description_embedding) as match_score
FROM searchable_videos source
CROSS JOIN searchable_videos recommend
WHERE source.id = 1
AND recommend.id != 1
AND source.description_embedding IS NOT NULL
AND recommend.description_embedding IS NOT NULL
AND COSINE_SIMILARITY(source.description_embedding, recommend.description_embedding) > 0.5
ORDER BY match_score DESC
LIMIT 5;
Cleanup (Optional)
DROP TABLE IF EXISTS video_search_index;
DROP TABLE IF EXISTS searchable_videos;
Expected Outcomes
- Semantic search finds relevant videos
- Multi-field search weights properly
- Filters narrow results
- Similar video discovery works
- Recommendations generated
Search Weights
| Field | Suggested Weight |
|---|---|
| Title | 0.40 |
| Description | 0.35 |
| Transcript | 0.25 |
Key Concepts Learned
- Vector embeddings for video
- Semantic search implementation
- Multi-field search weighting
- Similar content discovery
- Recommendation systems