Use the Embeddings API when you need numeric vector representations of text for semantic search, retrieval, ranking, recommendation, or clustering. If you need generated text, tool calls, or multimodal conversation flows, use the Responses API instead.

Best Fit

  • RAG indexing pipelines
  • semantic search over documents or tickets
  • clustering or classification workflows

Request Model

Most requests need:
  • model — the ID of the embedding model to use
  • input — a string, or a list of strings, to embed
  • dimensions (optional) — a smaller output vector size, if the model supports it
  • encoding_format (optional) — "float" (default) or "base64"
The response contains one or more embedding vectors plus token usage metadata. NagaAI supports both JSON float vectors and base64-encoded vectors for transport efficiency.

Quick Example

from openai import OpenAI

client = OpenAI(base_url="https://api.naga.ac/v1", api_key="YOUR_API_KEY")

response = client.embeddings.create(
    model="text-embedding-3-small",
    input=["vector search", "semantic retrieval"]
)

print(len(response.data[0].embedding))
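Once you have vectors, semantic search reduces to comparing them, most commonly with cosine similarity. A minimal sketch in plain Python (the hardcoded vectors stand in for real `embeddings.create()` output):

```python
from math import sqrt

def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = sqrt(sum(x * x for x in a))
    norm_b = sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

# In practice these come from embeddings.create(); hardcoded here for illustration.
query_vec = [0.12, -0.03, 0.41]
doc_vecs = {
    "doc-1": [0.10, -0.01, 0.40],
    "doc-2": [-0.50, 0.22, 0.05],
}

# Rank documents by similarity to the query, highest first.
ranked = sorted(
    doc_vecs,
    key=lambda k: cosine_similarity(query_vec, doc_vecs[k]),
    reverse=True,
)
```

In production you would delegate this comparison to a vector store, but the ranking logic is the same.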

Response Anatomy

{
  "object": "list",
  "model": "text-embedding-3-small",
  "data": [
    {
      "object": "embedding",
      "index": 0,
      "embedding": [0.12, -0.03, 0.41]
    }
  ],
  "usage": {
    "prompt_tokens": 12,
    "total_tokens": 12
  }
}
The usage object contains prompt_tokens and total_tokens. Because embeddings do not generate new text, there are no completion tokens to bill for.
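Because each item in `data` carries an `index`, you can match vectors back to the inputs that produced them even if you process items out of order. A defensive sketch, assuming the response shape shown above (the `data` list here is simulated):

```python
inputs = ["vector search", "semantic retrieval"]

# Simulated response items in the shape shown above, deliberately out of order.
data = [
    {"object": "embedding", "index": 1, "embedding": [0.2, 0.1]},
    {"object": "embedding", "index": 0, "embedding": [0.5, 0.3]},
]

# Sort by index so position i corresponds to inputs[i].
vectors = [item["embedding"] for item in sorted(data, key=lambda d: d["index"])]
by_text = dict(zip(inputs, vectors))
```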

Common patterns

  • embed one query string for retrieval at request time
  • embed batches of document chunks during indexing
  • lower transport size with encoding_format="base64" when needed
  • request dimensions only if your selected model supports it and your vector store expects it
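When you request `encoding_format="base64"`, each `embedding` field arrives as a base64 string instead of a JSON float array. A decoding sketch using only the standard library, assuming little-endian float32 packing (the usual convention for OpenAI-compatible APIs; verify against your model's output):

```python
import base64
import struct

def decode_base64_embedding(b64: str) -> list[float]:
    """Decode a base64-encoded embedding into a list of floats.

    Assumes the vector is packed as little-endian float32, which is
    the common convention for OpenAI-compatible embedding APIs.
    """
    raw = base64.b64decode(b64)
    count = len(raw) // 4  # 4 bytes per float32
    return list(struct.unpack(f"<{count}f", raw))
```

Base64 transport roughly halves payload size versus JSON float arrays, at the cost of this extra decode step.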

What To Learn Next

See the API Reference for the full list of request parameters and response fields.