Use the Embeddings API when you want retrieval, semantic search, ranking, or a RAG pipeline.

Typical retrieval flow

  1. Chunk the source material. Split documents into chunks that are small enough for retrieval but still semantically coherent.
  2. Embed the chunks. Generate vectors for each chunk using one embedding model.
  3. Store vectors and metadata. Keep the embeddings together with metadata such as document ID, title, or section so you can filter and cite later.
  4. Embed the user query. Use the same embedding model for the query that you used for the indexed chunks.
  5. Retrieve nearest chunks. Search for the closest matches in vector space.
  6. Filter or rerank if needed. Narrow the candidate set before generation when your pipeline needs better precision.
  7. Pass the final context into generation. Send the selected context to your generation step, usually through the Responses API.
Use the Responses API for the generation step and the Embeddings API for the retrieval step.
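The flow above can be sketched end to end. This is a minimal illustration, not a real pipeline: `embed` here is a stand-in bag-of-words embedder standing in for calls to an embedding model, and the chunk texts, IDs, and query are invented. The point it demonstrates is structural: one embedder for both documents and queries, metadata stored alongside each vector, and a small candidate set handed to generation.

```python
import math

def embed(text: str, vocab: list[str]) -> list[float]:
    # Stand-in embedder: bag-of-words over a fixed vocabulary, unit-normalized.
    # A real pipeline would call its embedding model here, using the same
    # model for documents and queries (steps 2 and 4).
    tokens = text.lower().split()
    vec = [float(tokens.count(term)) for term in vocab]
    norm = math.sqrt(sum(v * v for v in vec)) or 1.0
    return [v / norm for v in vec]

def cosine(a: list[float], b: list[float]) -> float:
    # Vectors are already unit length, so the dot product is the cosine.
    return sum(x * y for x, y in zip(a, b))

# Steps 1 and 3: chunked source material stored with metadata for filtering and citing.
chunks = [
    {"doc_id": "guide", "section": "setup",  "text": "install the client and set your api key"},
    {"doc_id": "guide", "section": "usage",  "text": "call the embeddings endpoint to get vectors"},
    {"doc_id": "faq",   "section": "limits", "text": "requests are rate limited per minute"},
]
vocab = sorted({token for chunk in chunks for token in chunk["text"].split()})

# Step 2: embed every chunk with the one embedding model.
index = [{**chunk, "vector": embed(chunk["text"], vocab)} for chunk in chunks]

# Steps 4 and 5: embed the query the same way, then rank by similarity.
query_vector = embed("how do i get vectors for my text", vocab)
ranked = sorted(index, key=lambda row: cosine(query_vector, row["vector"]), reverse=True)

# Steps 6 and 7: keep a small candidate set and hand it to the generation step.
candidates = ranked[:2]
context = "\n".join(f"[{row['doc_id']}/{row['section']}] {row['text']}" for row in candidates)
```

The `[doc_id/section]` prefixes show why step 3 matters: because metadata travels with each vector, the generation step can cite its sources.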

Good defaults

  • Keep chunks semantically coherent instead of embedding entire long documents.
  • Store metadata with each vector so you can filter, cite, or deduplicate later.
  • Keep a stable embedding model per index version.
  • Retrieve a small candidate set before you generate the final answer.
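As an illustration of the first default, one simple way to keep chunks semantically coherent is to split on paragraph boundaries and pack consecutive paragraphs up to a size cap, rather than cutting mid-thought. The character threshold here is an arbitrary placeholder; tune it to your retrieval goals.

```python
def chunk_by_paragraph(text: str, max_chars: int = 500) -> list[str]:
    # Split on blank lines so each chunk stays a coherent unit, then pack
    # consecutive paragraphs together until the size cap is reached.
    paragraphs = [p.strip() for p in text.split("\n\n") if p.strip()]
    chunks: list[str] = []
    current = ""
    for para in paragraphs:
        if current and len(current) + len(para) + 2 > max_chars:
            chunks.append(current)
            current = para
        else:
            current = f"{current}\n\n{para}" if current else para
    if current:
        chunks.append(current)
    return chunks

doc = "First paragraph.\n\nSecond paragraph.\n\nThird paragraph."
pieces = chunk_by_paragraph(doc, max_chars=40)
```

With a 40-character cap, the first two paragraphs fit in one chunk and the third starts a new one; no paragraph is ever split in half.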

Common pitfalls

  • Mixing embeddings from different models in one index.
  • Not versioning your embedding model choice.
  • Embedding documents with one model and queries with another, incompatible one.
  • Indexing chunks that are too large or too small for your retrieval goals.
  • Skipping evaluation, so retrieval quality degrades unnoticed.
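One way to guard against the first three pitfalls is to record the embedding model as part of the index and reject writes from any other model. The class and field names below are illustrative, not from any particular vector store:

```python
class VectorIndex:
    """Toy index that pins one embedding model per index version."""

    def __init__(self, embedding_model: str, version: int = 1):
        self.embedding_model = embedding_model
        self.version = version
        self.rows: list[dict] = []

    def add(self, vector: list[float], metadata: dict, embedding_model: str) -> None:
        # Guard: never mix embeddings from different models in one index.
        if embedding_model != self.embedding_model:
            raise ValueError(
                f"index v{self.version} expects {self.embedding_model!r}, "
                f"got {embedding_model!r}; re-embed the corpus instead"
            )
        self.rows.append({"vector": vector, **metadata})

index = VectorIndex(embedding_model="embed-model-a", version=1)
index.add([0.1, 0.9], {"doc_id": "guide"}, embedding_model="embed-model-a")

# A write from a different model is refused rather than silently mixed in.
try:
    index.add([0.5, 0.5], {"doc_id": "faq"}, embedding_model="embed-model-b")
except ValueError as exc:
    rejection = str(exc)
```

The same check can run on the query path, so a query embedded with the wrong model fails loudly instead of returning quietly bad matches.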

Retrieval checklist

  • Choose one embedding model for both documents and queries.
  • Re-embed the corpus when you change embedding models.
  • Measure retrieval quality on a small set of known-good queries.
  • Keep generation and retrieval concerns separate.
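The "measure retrieval quality" item can start as simply as recall@k over a handful of known-good query-to-chunk pairs. The evaluation data below is invented for illustration; the metric itself is standard:

```python
def recall_at_k(results: dict[str, list[str]], expected: dict[str, str], k: int = 3) -> float:
    # Fraction of queries whose known-good chunk appears in the top-k results.
    hits = sum(1 for query, gold in expected.items() if gold in results[query][:k])
    return hits / len(expected)

# Known-good queries mapped to the chunk ID each one should retrieve.
expected = {
    "how do I authenticate": "setup-2",
    "what are the rate limits": "limits-1",
}

# Top-k chunk IDs the retriever actually returned for each query.
results = {
    "how do I authenticate": ["setup-2", "setup-1", "faq-3"],
    "what are the rate limits": ["usage-4", "faq-1", "intro-1"],
}

score = recall_at_k(results, expected, k=3)  # one of the two queries found its chunk
```

Rerunning this small suite after any change to chunking or embedding model catches silent degradation, the last pitfall listed above.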