Name: RAG and Vector Search
Price: 79 USD
Availability: InStock

Question 1

What is a vector embedding and why do RAG systems need them?

Accepted Answer

A vector embedding is a numerical representation of text: a list of numbers that captures semantic meaning, so that words and sentences with similar meanings produce similar vectors. RAG systems use embeddings to find documents that are semantically relevant to a query, even when they share no exact keywords. You embed both the documents and the query, then retrieve the documents whose vectors are closest to the query vector. That is semantic search.
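To make "closest vectors" concrete, here is a minimal sketch that ranks documents by cosine similarity. The three-dimensional vectors and document titles are made up for illustration; a real embedding model would produce much longer vectors, and the course may use a different similarity measure.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors (1.0 = same direction)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

# Hypothetical toy embeddings, hand-written so the example runs offline.
documents = {
    "How to reset your password": [0.9, 0.1, 0.0],
    "Quarterly sales report": [0.0, 0.2, 0.9],
    "Recovering a locked account": [0.7, 0.5, 0.1],
}

# Stand-in for the embedding of a query such as "I forgot my login".
query_vector = [0.85, 0.2, 0.05]

def rank_documents(query, docs):
    """Return document titles sorted from most to least similar to the query."""
    return sorted(
        docs,
        key=lambda title: cosine_similarity(query, docs[title]),
        reverse=True,
    )

ranking = rank_documents(query_vector, documents)
print(ranking)
```

Note that the query shares no keywords with the top-ranked title; the ranking comes entirely from vector similarity.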

Question 2

What vector database does this course use?

Accepted Answer

This course uses pgvector on Supabase. The pgvector extension adds vector storage and similarity search to PostgreSQL. It is a strong production choice because it runs inside your existing PostgreSQL database, requires no additional infrastructure, and supports hybrid search that combines vector similarity with traditional SQL filters. Alternatives such as Pinecone, Weaviate, and Qdrant are also discussed.
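As a rough illustration of combining vector similarity with an SQL filter, the sketch below builds a parameterized pgvector query. The `<=>` operator is pgvector's cosine-distance operator; the table name `chunks`, its columns, and the psycopg-style named placeholders are assumptions made for this example, not the course's actual schema.

```python
def to_vector_literal(embedding):
    """Format a list of numbers as a pgvector text literal, e.g. '[0.1,0.2,0.3]'."""
    return "[" + ",".join(str(float(x)) for x in embedding) + "]"

def build_search_query(limit=5):
    """Return a parameterized SQL string for a filtered similarity search.

    `<=>` is pgvector's cosine-distance operator (smaller means more similar).
    The table `chunks` and the columns `content`, `source`, and `embedding`
    are hypothetical names chosen for this sketch.
    """
    return (
        "SELECT content, source, embedding <=> %(query)s::vector AS distance "
        "FROM chunks "
        "WHERE source = %(source)s "            # ordinary SQL filter
        "ORDER BY embedding <=> %(query)s::vector "  # nearest vectors first
        f"LIMIT {int(limit)}"
    )

# Values are passed separately from the SQL text so the driver can escape them.
params = {
    "query": to_vector_literal([0.1, 0.2, 0.3]),
    "source": "handbook.pdf",
}
sql = build_search_query(limit=3)
print(sql)
print(params)
```

With a driver that accepts named placeholders in this style, the pair would be executed as something like `cursor.execute(sql, params)`.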

Question 3

What types of documents can I use with RAG?

Accepted Answer

RAG works with any text-based content: PDFs, Word documents, Markdown files, web pages, database records, code files, and support documentation. The ingestion pipeline extracts text, chunks it into segments, generates embeddings for each chunk, and stores them with metadata. This course covers PDF and Markdown ingestion, with patterns that extend to other content types.
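The four ingestion steps (extract, chunk, embed, store with metadata) can be sketched as follows. Everything here is a simplified stand-in: `placeholder_embed` is not a real embedding model, the in-memory list takes the place of a vector database, and all function and field names are hypothetical.

```python
from dataclasses import dataclass

@dataclass
class ChunkRecord:
    content: str
    embedding: list
    metadata: dict

def extract_text(markdown):
    """Very rough extraction: drop Markdown heading markers, keep the words."""
    lines = [line.lstrip("#").strip() for line in markdown.splitlines()]
    return "\n".join(lines)

def split_paragraphs(text):
    """Chunk on blank lines; a real pipeline would also enforce a size limit."""
    return [part.strip() for part in text.split("\n\n") if part.strip()]

def placeholder_embed(text):
    """NOT a real embedding: a toy 3-number vector so the sketch runs offline."""
    return [float(len(text)), float(len(text.split())), float(text.count("e"))]

def ingest(markdown, source):
    """Run extract -> chunk -> embed -> store and return the stored records."""
    store = []  # stand-in for a vector database table
    chunks = split_paragraphs(extract_text(markdown))
    for index, chunk in enumerate(chunks):
        metadata = {"source": source, "chunk_index": index}
        store.append(ChunkRecord(chunk, placeholder_embed(chunk), metadata))
    return store

sample = "# Refunds\n\nRefunds take 5 days.\n\nContact support for help."
records = ingest(sample, "faq.md")
for record in records:
    print(record.metadata, record.content)
```

Keeping the source and chunk position in the metadata makes it possible to cite or filter by the originating document at query time.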

Question 4

How is RAG different from just putting all my documents in the prompt?

Accepted Answer

Context windows limit how much text you can include in a single prompt. Even large-context models have limits, and larger prompts cost significantly more to process. RAG instead retrieves only the most relevant chunks for each query, typically 3–5. This tends to make responses both cheaper and more accurate, because a focused excerpt is easier for the model to reason about than a mass of potentially irrelevant text.
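A minimal sketch of that retrieval step: keep only the highest-scoring chunks and place them in the prompt. The similarity scores, chunk texts, and prompt wording below are invented for illustration, and `top_k_chunks` and `build_prompt` are hypothetical helper names.

```python
import heapq

def top_k_chunks(scored_chunks, k=4):
    """Keep the k highest-scoring (score, text) pairs, best first."""
    return heapq.nlargest(k, scored_chunks, key=lambda pair: pair[0])

def build_prompt(question, chunks):
    """Assemble a prompt that contains only the retrieved chunks as context."""
    context = "\n\n".join(
        f"[{i + 1}] {text}" for i, (_, text) in enumerate(chunks)
    )
    return (
        "Answer using only the context below.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {question}"
    )

# Hypothetical (similarity score, chunk text) pairs from a vector search.
scored = [
    (0.91, "Refunds are issued within 5 business days."),
    (0.12, "Our office is closed on public holidays."),
    (0.87, "Refund requests need an order number."),
    (0.05, "The company was founded in 2012."),
]

selected = top_k_chunks(scored, k=2)
prompt = build_prompt("How long do refunds take?", selected)
print(prompt)
```

The low-scoring chunks never reach the model, which is where both the cost saving and the sharper focus come from.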

Question 5

What is chunking and how should I split my documents?

Accepted Answer

Chunking splits documents into smaller segments before embedding. Chunk size is a key parameter: chunks that are too small lose context, while chunks that are too large become unfocused and expensive. Most RAG systems use chunks of 500–1000 tokens, with 100–200 tokens of overlap between adjacent chunks to preserve continuity. This course covers chunking strategies and explains how to tune them for different content types.
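Fixed-size chunking with overlap can be sketched as a sliding window. This example treats whitespace-separated words as tokens, which is only an approximation of how a model tokenizer counts; the function name and the tiny sizes in the usage example are chosen for readability, not taken from the course.

```python
def chunk_tokens(tokens, chunk_size=500, overlap=100):
    """Split a token list into windows of chunk_size that share `overlap` tokens."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must be >= 0 and smaller than chunk_size")

    step = chunk_size - overlap  # how far the window advances each time
    chunks = []
    for start in range(0, len(tokens), step):
        chunks.append(tokens[start:start + chunk_size])
        # Stop once a window has reached the end, to avoid a redundant tail chunk.
        if start + chunk_size >= len(tokens):
            break
    return chunks

# Words stand in for tokens here; a real pipeline would use the model's tokenizer.
words = "one two three four five six seven eight nine ten".split()
chunks = chunk_tokens(words, chunk_size=4, overlap=1)
for chunk in chunks:
    print(" ".join(chunk))
```

Each chunk begins with the last word of the previous one, so a sentence that straddles a boundary is still partly visible in both chunks.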
