PDF Document Storage
Objective
Learn how to store PDF documents with metadata in SynapCores. This foundation enables building document management systems, digital archives, and searchable document repositories.
Step 1: Create Documents Table
Create a table for PDF storage.
CREATE TABLE pdf_documents (
id INTEGER PRIMARY KEY,
title VARCHAR(255) NOT NULL,
document PDF,
description TEXT,
category VARCHAR(50),
file_size BIGINT,
page_count INTEGER,
author VARCHAR(100),
created_date DATE,
uploaded_by VARCHAR(100),
is_public BOOLEAN DEFAULT FALSE,
version INTEGER DEFAULT 1,
status VARCHAR(20) DEFAULT 'active',
created_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP
);
Step 2: Insert Sample Documents
Add sample PDF records.
INSERT INTO pdf_documents (id, title, description, category, file_size, page_count, author, created_date, uploaded_by, is_public) VALUES
(1, 'Annual Report 2024', 'Company annual financial report', 'financial', 2457600, 45, 'Finance Team', '2024-01-15', 'finance_admin', TRUE),
(2, 'Employee Handbook', 'Company policies and procedures', 'hr', 1536000, 32, 'HR Department', '2023-06-01', 'hr_admin', FALSE),
(3, 'Product Specification', 'Technical specifications for Product X', 'technical', 768000, 15, 'Engineering', '2024-02-10', 'eng_lead', FALSE),
(4, 'Marketing Plan Q1', 'Quarterly marketing strategy document', 'marketing', 1024000, 22, 'Marketing Team', '2024-01-05', 'marketing_mgr', FALSE),
(5, 'Training Manual', 'New employee onboarding guide', 'training', 2048000, 58, 'Training Dept', '2023-09-15', 'training_admin', FALSE),
(6, 'Legal Contract Template', 'Standard service agreement template', 'legal', 512000, 8, 'Legal Team', '2023-11-20', 'legal_admin', FALSE),
(7, 'Research Summary', 'Industry research findings summary', 'research', 1280000, 28, 'Research Team', '2024-02-01', 'research_lead', TRUE);
Step 3: Query All Documents
Retrieve all documents with formatted information.
SELECT
title,
category,
page_count,
file_size / 1024 as size_kb,
author,
created_date
FROM pdf_documents
ORDER BY created_date DESC;
Step 4: Filter by Category
Find documents by category.
SELECT
title,
description,
page_count,
author
FROM pdf_documents
WHERE category = 'financial'
ORDER BY created_date DESC;
Step 5: Search by Title
Search documents by title keyword.
SELECT
title,
category,
description,
uploaded_by
FROM pdf_documents
WHERE title LIKE '%Report%' OR title LIKE '%Manual%'
ORDER BY title;
Step 6: Document Statistics
Calculate document library statistics.
SELECT
COUNT(*) as total_documents,
SUM(page_count) as total_pages,
SUM(file_size) / 1048576 as total_size_mb,
AVG(page_count) as avg_pages
FROM pdf_documents
WHERE status = 'active';
Step 7: Documents by Category
Group documents by category.
SELECT
category,
COUNT(*) as doc_count,
SUM(page_count) as total_pages,
SUM(file_size) / 1048576 as total_mb
FROM pdf_documents
GROUP BY category
ORDER BY doc_count DESC;
Step 8: Recent Documents
Get recently uploaded documents.
SELECT
title,
category,
uploaded_by,
created_at
FROM pdf_documents
ORDER BY created_at DESC
LIMIT 5;
Step 9: Large Documents
Find documents over a certain size.
SELECT
title,
page_count,
file_size / 1024 as size_kb,
category
FROM pdf_documents
WHERE file_size > 1048576
ORDER BY file_size DESC;
Step 10: Public Documents
Get documents marked as public.
SELECT
title,
description,
author,
created_date
FROM pdf_documents
WHERE is_public = TRUE
ORDER BY title;
Step 11: Update Document Metadata
Update document information.
UPDATE pdf_documents
SET description = 'Updated: Company annual financial report with Q4 results',
version = 2
WHERE id = 1;
SELECT title, description, version FROM pdf_documents WHERE id = 1;
Step 12: Documents by Author
Find documents by specific author.
SELECT
author,
COUNT(*) as document_count,
SUM(page_count) as total_pages
FROM pdf_documents
GROUP BY author
ORDER BY document_count DESC;
Cleanup (Optional)
DROP TABLE IF EXISTS pdf_documents;
Expected Outcomes
- PDFs stored with metadata
- Category filtering works
- Search returns matches
- Statistics calculated
- Size queries function
Document Categories
| Category | Purpose |
|---|---|
| financial | Financial reports |
| hr | Human resources |
| technical | Technical docs |
| marketing | Marketing materials |
| legal | Legal documents |
| training | Training materials |
Key Concepts Learned
- PDF data type usage
- Document metadata storage
- Category organization
- Size and page tracking
- Basic document queries