AI Instructor · Live Labs Included

OpenAI: Embeddings & Retrieval Systems (RAG)

Build production RAG pipelines with OpenAI embeddings, pgvector, hybrid BM25+vector search, and the native File Search API.

Intermediate
10h 20m
10 Lessons
OPENAI-202
OpenAI RAG Developer Badge


About This Course

Build production-grade RAG pipelines using OpenAI embeddings, pgvector, hybrid search, and the native File Search API. Learn to compute semantic similarity with text-embedding-3-small, store and query vectors at scale with pgvector, combine BM25 keyword search with vector search using Reciprocal Rank Fusion, and build an enterprise RAG assistant with hallucination detection and source attribution.

Course Curriculum

10 Lessons
01
AI Lesson

Text Embeddings & Semantic Similarity

30m

Learn what embeddings are and how they represent meaning as vectors. Covers text-embedding-3-small vs text-embedding-3-large, the dimensions parameter for cost optimization, cosine similarity computation, and when semantic search outperforms keyword search.
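The core computation this lesson teaches can be sketched in a few lines. This is a minimal illustration using toy 3-d vectors in place of the real 1536-d vectors that text-embedding-3-small returns; the similarity math is identical at any dimensionality:

```python
import numpy as np

def cosine_similarity(a, b):
    """Cosine of the angle between two vectors: 1.0 = same direction."""
    a, b = np.asarray(a, dtype=float), np.asarray(b, dtype=float)
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

# Toy 3-d "embeddings" stand in for real embedding vectors.
print(cosine_similarity([1.0, 0.0, 0.0], [1.0, 0.0, 0.0]))  # 1.0 (identical)
print(cosine_similarity([1.0, 0.0, 0.0], [0.0, 1.0, 0.0]))  # 0.0 (orthogonal)
```

Because OpenAI embeddings are near-unit-length, cosine similarity and dot product rank results almost identically; the explicit normalization above keeps the sketch safe for arbitrary vectors.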

02
Lab Exercise

Semantic Similarity Search Engine - Lab Exercises

1h 25m · 2 Exercises

Build a semantic search engine over a 20-product catalog. Implement get_embedding() with text-embedding-3-small, cosine_similarity() with numpy, build_product_index() to embed all products, and semantic_search() to return the top-k most similar products for any natural language query.

Get Embedding Vectors: Implement get_embedding() to call client.embeddings.create() with text-embedding-3-small and return the embedding vector from response.data[0].embedding. (~15 min)
Cosine Similarity & Semantic Search: Implement cosine_similarity() using numpy dot product and norm division, then implement semantic_search() to embed the query, compute similarity against all indexed products, sort descending, and return the top_k results with scores. (~25 min)
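The shape of these two exercises can be sketched as follows. The get_embedding() call requires the openai package and an API key (the deferred import keeps the rest of the sketch runnable without either), and the ranking demo uses toy 2-d vectors in place of real embeddings:

```python
import numpy as np

def get_embedding(text, model="text-embedding-3-small"):
    # Live API call; requires OPENAI_API_KEY to be set.
    from openai import OpenAI
    client = OpenAI()
    resp = client.embeddings.create(model=model, input=text)
    return resp.data[0].embedding

def semantic_search(query_vec, index, top_k=3):
    """Rank (name, vector) pairs in `index` by cosine similarity to query_vec."""
    q = np.asarray(query_vec, dtype=float)
    def sim(v):
        v = np.asarray(v, dtype=float)
        return float(np.dot(q, v) / (np.linalg.norm(q) * np.linalg.norm(v)))
    scored = [(name, sim(vec)) for name, vec in index]
    return sorted(scored, key=lambda p: p[1], reverse=True)[:top_k]

# Toy 2-d index standing in for embedded product descriptions:
index = [("laptop", [0.9, 0.1]), ("coffee mug", [0.1, 0.9]), ("tablet", [0.8, 0.3])]
print(semantic_search([1.0, 0.0], index, top_k=2))  # laptop, then tablet
```

In the lab, the index entries come from embedding each product description with get_embedding() and the query vector from embedding the user's natural-language query.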
03
AI Lesson

Vector Storage with pgvector

30m

Learn to set up pgvector in Python with psycopg2, create embeddings tables with vector columns, index vectors with HNSW and IVFFlat, and run nearest-neighbor queries using the cosine distance operator.
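The schema setup this lesson walks through can be sketched as the DDL below (held in Python strings as you would pass them to psycopg2's cursor.execute(); the table and index names here are assumptions, and vector(1536) matches text-embedding-3-small's default dimensionality):

```python
# Enable the extension once per database.
CREATE_EXTENSION = "CREATE EXTENSION IF NOT EXISTS vector;"

CREATE_TABLE = """
CREATE TABLE IF NOT EXISTS embeddings (
    id         BIGSERIAL PRIMARY KEY,
    title      TEXT NOT NULL,
    chunk_text TEXT NOT NULL,
    embedding  vector(1536)
);
"""

# HNSW index using cosine-distance ops (pairs with the <=> operator).
CREATE_INDEX = """
CREATE INDEX IF NOT EXISTS embeddings_hnsw_idx
ON embeddings USING hnsw (embedding vector_cosine_ops);
"""

for stmt in (CREATE_EXTENSION, CREATE_TABLE, CREATE_INDEX):
    print(stmt.strip())
```

The operator class matters: vector_cosine_ops makes the HNSW index serve `ORDER BY embedding <=> query` efficiently, whereas an L2 index (vector_l2_ops) would not be used for cosine-distance queries.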

04
Lab Exercise

Persistent Embedding Store - Lab Exercises

1h 45m · 3 Exercises

Build a persistent embedding store using pgvector. Set up the database schema with HNSW index, chunk and embed a documentation corpus, store embeddings with psycopg2, and query with the cosine distance operator to retrieve semantically similar chunks.

Set Up the Database Schema: Implement setup_database() to enable the pgvector extension, create the embeddings table with a vector(1536) column, and create an HNSW index on the embedding column using vector_cosine_ops. (~20 min)
Chunk, Embed & Index Documents: Implement index_documents() to chunk each document's content using chunk_text(), embed each chunk with embed_text(), and INSERT the title, chunk text, and embedding vector into the pgvector embeddings table. (~20 min)
Nearest-Neighbor Query: Implement semantic_search() to embed the query, execute a nearest-neighbor SELECT using the <=> cosine distance operator, and return the top_k results as dicts with title, chunk_text, and similarity score (1 - cosine_distance). (~20 min)
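The query step can be sketched as below. The SQL shape and helper are illustrative (table and column names are assumptions); note that `<=>` returns cosine distance, so similarity is 1 minus that value:

```python
# Parameterized query as you would pass it to psycopg2; the same ORDER BY
# expression the HNSW index was built for.
NN_QUERY = """
SELECT title, chunk_text, 1 - (embedding <=> %s::vector) AS similarity
FROM embeddings
ORDER BY embedding <=> %s::vector
LIMIT %s;
"""

def to_pgvector_literal(vec):
    """Format a Python list as a pgvector text literal, e.g. '[0.1,0.2]'."""
    return "[" + ",".join(f"{x:g}" for x in vec) + "]"

print(to_pgvector_literal([0.1, 0.2, 0.3]))  # [0.1,0.2,0.3]
```

In the lab you would embed the query text first, then execute NN_QUERY with the vector literal bound twice (once for the similarity column, once for the ORDER BY) plus the top_k limit.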
05
AI Lesson

Hybrid Search & Re-Ranking

30m

Learn why vector search alone misses exact keyword matches, how to combine BM25 with vector search using Reciprocal Rank Fusion, cross-encoder re-ranking, retrieval compression, and how to measure retrieval quality with precision@k, recall@k, and MRR.
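The three retrieval-quality metrics named above are simple to compute; a minimal sketch over ranked ID lists (the IDs and relevance sets below are toy data):

```python
def precision_at_k(retrieved, relevant, k):
    """Fraction of the top-k retrieved IDs that are relevant."""
    return sum(1 for d in retrieved[:k] if d in relevant) / k

def recall_at_k(retrieved, relevant, k):
    """Fraction of all relevant IDs that appear in the top-k."""
    return sum(1 for d in retrieved[:k] if d in relevant) / len(relevant)

def mrr(ranked_lists, relevant_sets):
    """Mean reciprocal rank of the first relevant hit per query."""
    total = 0.0
    for retrieved, relevant in zip(ranked_lists, relevant_sets):
        for rank, d in enumerate(retrieved, start=1):
            if d in relevant:
                total += 1.0 / rank
                break
    return total / len(ranked_lists)

print(precision_at_k(["a", "b", "c"], {"a", "c"}, k=3))  # 2/3
print(recall_at_k(["a", "b", "c"], {"a", "c"}, k=3))     # 1.0
print(mrr([["x", "a"]], [{"a"}]))                        # 0.5 (first hit at rank 2)
```

Precision@k rewards clean top results, recall@k rewards coverage, and MRR rewards putting a relevant chunk as high as possible; hybrid search typically improves all three over either retriever alone.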

06
Lab Exercise

Hybrid Search Pipeline - Lab Exercises

1h 30m · 2 Exercises

Build a hybrid search pipeline combining BM25 keyword search with vector similarity search, fused using Reciprocal Rank Fusion. Implement bm25_search(), vector_search(), reciprocal_rank_fusion(), and the full hybrid_search() pipeline.

BM25 Keyword Search: Implement bm25_search() to tokenize the query, get BM25 scores from the pre-built index, sort by score descending with numpy argsort, and return the top_k results as (doc_id, score) tuples. (~15 min)
Vector Search & RRF Fusion: Implement vector_search() for cosine similarity ranking, then reciprocal_rank_fusion() to combine BM25 and vector results using 1/(60+rank) scoring, and finally hybrid_search() to run the complete pipeline and return the top_k fused results. (~30 min)
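The fusion step is the heart of the pipeline and is purely rank-based, so it can be sketched without any retriever at all (the doc IDs below are toy data):

```python
def reciprocal_rank_fusion(result_lists, k=60, top_k=5):
    """Fuse ranked ID lists: each list contributes 1/(k + rank) per ID.

    Documents ranked highly by multiple retrievers accumulate the most
    score; k=60 is the conventional damping constant from the RRF paper.
    """
    scores = {}
    for results in result_lists:
        for rank, doc_id in enumerate(results, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    fused = sorted(scores.items(), key=lambda p: p[1], reverse=True)
    return fused[:top_k]

bm25_hits   = ["d3", "d1", "d7"]  # keyword ranking
vector_hits = ["d1", "d5", "d3"]  # semantic ranking
print(reciprocal_rank_fusion([bm25_hits, vector_hits], top_k=3))
```

Because RRF uses only ranks, it sidesteps the problem that BM25 scores and cosine similarities live on incompatible scales; d1 wins here because both retrievers rank it near the top.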
07
AI Lesson

OpenAI File Search & Vector Stores API

30m

Learn OpenAI's native Vector Stores API — when to use it vs self-hosted pgvector, how to create vector stores, upload files with automatic chunking, query using the Responses API file_search tool, manage vector store lifecycle, and understand the cost model.

08
Lab Exercise

File Search Integration - Lab Exercises

1h 25m · 2 Exercises

Build a document Q&A assistant using the OpenAI Responses API file_search tool. Create a vector store, upload technical documents, and query them using the built-in file_search tool — no external vector database required.

Create Vector Store & Upload Documents: Implement create_vector_store() to create an OpenAI vector store and return its ID, then implement upload_documents() to upload each document as a file using io.BytesIO and client.vector_stores.files.upload_and_poll(). (~20 min)
File Search Q&A Assistant: Implement ask_with_file_search() to call client.responses.create() with the file_search built-in tool pointing at the vector store, and return the response text. The model automatically retrieves relevant document chunks without manual retrieval code. (~20 min)
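The Q&A call can be sketched as below. This is a rough sketch: the model name and vector-store ID are placeholders, the live call needs the openai package plus OPENAI_API_KEY, and the deferred import keeps the tool-spec helper runnable without either:

```python
def file_search_tool(vector_store_id):
    """Tool spec for the Responses API built-in file_search tool."""
    return {"type": "file_search", "vector_store_ids": [vector_store_id]}

def ask_with_file_search(question, vector_store_id, model="gpt-4o-mini"):
    # Live API call; retrieval over the vector store happens server-side,
    # so no manual chunking, embedding, or ranking code is needed here.
    from openai import OpenAI
    client = OpenAI()
    resp = client.responses.create(
        model=model,
        input=question,
        tools=[file_search_tool(vector_store_id)],
    )
    return resp.output_text

print(file_search_tool("vs_hypothetical_id"))
```

The trade-off versus the pgvector pipeline from earlier lessons: far less code and no database to run, in exchange for less control over chunking, ranking, and storage costs.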
09
AI Lesson

Capstone Briefing: Enterprise RAG Assistant

20m

Reviews all Course 202 concepts: embeddings, pgvector, hybrid search, re-ranking, and file search. Previews the capstone architecture — a hybrid RAG assistant with source attribution, hallucination detection, and retrieval quality evaluation.

10
Lab Exercise

Capstone Project: Enterprise RAG Assistant - Lab Exercises

1h 55m · 3 Exercises

Build a complete enterprise RAG assistant with hybrid BM25+vector retrieval using RRF, grounded answer generation with source attribution, a hallucination detection guard using a second LLM call, and precision@3 evaluation across test queries.

Hybrid Retrieval with RRF: Implement hybrid_retrieve() to embed the query, run BM25 and vector searches in parallel, fuse with Reciprocal Rank Fusion (k=60), and return the top_k knowledge base chunks with their rrf_score. (~25 min)
Grounded Answer Generation: Implement generate_answer() to build a context string from retrieved chunks (each prefixed with its source name in brackets), then call client.responses.create() with a system prompt instructing the model to answer only from context and cite sources. (~20 min)
Hallucination Guard & Retrieval Evaluation: Implement check_groundedness() using client.responses.parse() with a GroundednessCheck Pydantic model to verify answers are grounded in the retrieved context, then implement evaluate_retrieval() to compute precision@3 across TEST_QUERIES by comparing retrieved chunk IDs against relevant_ids. (~25 min)
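The evaluation harness in the final exercise can be sketched independently of the retriever; here a stub retriever and a single hypothetical test query stand in for the full hybrid_retrieve() pipeline and TEST_QUERIES set:

```python
def evaluate_retrieval(retrieve, test_queries, k=3):
    """Mean precision@k: for each query, the fraction of the top-k
    retrieved chunk IDs that appear in that query's relevant_ids."""
    scores = []
    for q in test_queries:
        retrieved = retrieve(q["query"])[:k]
        hits = sum(1 for cid in retrieved if cid in q["relevant_ids"])
        scores.append(hits / k)
    return sum(scores) / len(scores)

# Stub retriever and toy test set standing in for the real pipeline:
fake_index = {"reset password": ["c1", "c9", "c2"]}
retrieve = lambda q: fake_index.get(q, [])
tests = [{"query": "reset password", "relevant_ids": {"c1", "c2"}}]
print(evaluate_retrieval(retrieve, tests))  # 2 of top-3 relevant -> 2/3
```

Passing the retriever in as a function lets the same harness score BM25-only, vector-only, and hybrid retrieval side by side, which is how you demonstrate that RRF fusion actually improves precision@3.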

This course includes:

  • 24/7 AI Instructor Support
  • Live Lab Environments
  • 5 Hands-on Labs
  • 6 Months Access
  • Completion Badge
  • Certificate of Completion
OpenAI RAG Developer Badge

Earn Your Badge

Complete all lessons to unlock the OpenAI RAG Developer achievement badge.

Category
Skill Level: Intermediate
Total Duration: 10h 20m
Achievement Badge

OpenAI RAG Developer

Awarded for completing Embeddings and Retrieval Systems. Demonstrates ability to build semantic search with text embeddings, store vectors in pgvector, implement hybrid BM25+vector search with RRF, use the File Search API, and build hallucination-guarded RAG pipelines.

Course: OpenAI: Embeddings & Retrieval Systems (RAG)
Criteria: Complete all lessons and exercises in OPENAI-202: Embeddings and Retrieval Systems
Valid For: 730 days

Skills You'll Earn

Text Embeddings · Semantic Search · pgvector · BM25 · Hybrid Search · RAG · Hallucination Detection

Complete all lessons in this course to earn this badge