Embeddings API - NagaAI Documentation

Use the Embeddings API when you need numeric vector representations of text for semantic search, retrieval, ranking, recommendation, or clustering. If you need generated text, tools, or multimodal conversation flows, use the Responses API instead.

Best Fit

RAG indexing pipelines
semantic search over documents or tickets
clustering or classification workflows

Request Model

Most requests need:

model
input
optionally dimensions
optionally encoding_format

The response contains one or more embedding vectors plus token usage metadata. NagaAI supports both JSON float vectors and base64-encoded vectors for transport efficiency.
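As a rough illustration of handling the base64 transport format, the sketch below decodes a base64 string into a list of floats. It assumes the payload is a packed array of little-endian 32-bit floats, which is the common layout for OpenAI-compatible APIs; the exact byte layout NagaAI uses is not stated on this page, so verify it against a float response before relying on it. The helper names `decode_embedding` and `encode_embedding` are hypothetical.

```python
import base64
import struct


def decode_embedding(b64_vector):
    """Decode a base64 string into a list of floats.

    Assumption: the payload is a packed array of little-endian float32 values.
    """
    raw = base64.b64decode(b64_vector)
    if len(raw) % 4 != 0:
        raise ValueError("payload length is not a multiple of 4 bytes")
    count = len(raw) // 4
    return list(struct.unpack(f"<{count}f", raw))


def encode_embedding(vector):
    """Inverse of decode_embedding, useful for local round-trip checks."""
    packed = struct.pack(f"<{len(vector)}f", *vector)
    return base64.b64encode(packed).decode("ascii")
```

Each float32 value takes 4 bytes before base64 expansion, which is typically more compact than the same number written out as JSON text.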

Quick Example

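from openai import OpenAI

# Point the OpenAI SDK at the NagaAI endpoint.
client = OpenAI(base_url="https://api.naga.ac/v1", api_key="YOUR_API_KEY")

# Embed two strings in one request; the response holds one vector per input.
response = client.embeddings.create(
    model="text-embedding-3-small",
    input=["vector search", "semantic retrieval"],
)

# Print the dimensionality of the first vector.
print(len(response.data[0].embedding))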

Response Anatomy

{
  "object": "list",
  "model": "text-embedding-3-small",
  "data": [
    {
      "object": "embedding",
      "index": 0,
      "embedding": [0.12, -0.03, 0.41]
    }
  ],
  "usage": {
    "prompt_tokens": 12,
    "total_tokens": 12
  }
}

The usage object contains prompt_tokens and total_tokens. Because embeddings do not generate new text, there are no completion tokens to bill for.
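To show how the response fields fit together, here is a minimal sketch that pairs each input text with its vector through the `index` field and reads the token count. It works on a plain dict shaped like the JSON above. The helper name `pair_inputs_with_vectors` is hypothetical, the sample values are made up, and the code assumes that `index` is the position of the text in the request's `input` list.

```python
def pair_inputs_with_vectors(inputs, response):
    """Pair each input text with its vector, using the `index` field.

    Assumption: `index` is the position of the text in the request's input list.
    """
    items = sorted(response["data"], key=lambda item: item["index"])
    if len(items) != len(inputs):
        raise ValueError("expected one embedding per input")
    return [(inputs[item["index"]], item["embedding"]) for item in items]


# A response shaped like the JSON above, with made-up values for two inputs.
# The entries are deliberately out of order to show why sorting by index helps.
sample_response = {
    "object": "list",
    "model": "text-embedding-3-small",
    "data": [
        {"object": "embedding", "index": 1, "embedding": [0.05, 0.22, -0.10]},
        {"object": "embedding", "index": 0, "embedding": [0.12, -0.03, 0.41]},
    ],
    "usage": {"prompt_tokens": 12, "total_tokens": 12},
}

pairs = pair_inputs_with_vectors(
    ["vector search", "semantic retrieval"], sample_response
)
tokens_used = sample_response["usage"]["total_tokens"]
```

With the Python SDK the same fields are available as attributes, as in `response.data[0].embedding` from the Quick Example.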

Common Patterns

embed one query string for retrieval at request time
embed batches of document chunks during indexing
reduce transport size with encoding_format="base64" when needed
set dimensions only if your selected model supports the parameter and your vector store expects that vector length
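The first two patterns can be sketched as follows: embed document chunks in batches at indexing time, then rank the stored vectors against a single query vector at request time. This is only an outline under stated assumptions. `embed_fn` is a hypothetical callable that takes a list of strings and returns one vector per string in the same order, the default batch size of 64 is arbitrary, and cosine similarity is one common scoring choice rather than something this page prescribes.

```python
import math


def batched(items, size):
    """Yield consecutive slices of `items` with at most `size` elements."""
    if size < 1:
        raise ValueError("size must be at least 1")
    for start in range(0, len(items), size):
        yield items[start:start + size]


def embed_in_batches(embed_fn, chunks, batch_size=64):
    """Embed `chunks` batch by batch and return the vectors in input order.

    `embed_fn` is a hypothetical callable: list of str -> list of vectors.
    """
    vectors = []
    for batch in batched(chunks, batch_size):
        vectors.extend(embed_fn(batch))
    return vectors


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors (0.0 if either is zero)."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def rank_chunks(query_vector, chunk_vectors, top_k=3):
    """Return up to `top_k` (chunk_index, score) pairs, best match first."""
    scored = [
        (i, cosine_similarity(query_vector, vec))
        for i, vec in enumerate(chunk_vectors)
    ]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:top_k]


# One possible embed_fn, reusing `client` from the Quick Example (not run here):
# embed_fn = lambda batch: [
#     item.embedding
#     for item in client.embeddings.create(
#         model="text-embedding-3-small", input=batch
#     ).data
# ]
```

Keeping the network call behind `embed_fn` makes the batching and ranking logic easy to exercise locally with a stub before spending tokens on real requests.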

What To Learn Next

Reference

Create embeddings