Chat Completions API

Build agentic applications with Meta Llama and Anthropic models, with full tool calling support.

Endpoint: POST /api/v1/public/chat/completions

Basic Request

curl -X POST https://api.ainative.studio/api/v1/public/chat/completions \
  -H "Authorization: Bearer $TOKEN" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "meta-llama/llama-3.3-70b-instruct",
    "messages": [
      {"role": "system", "content": "You are a helpful assistant."},
      {"role": "user", "content": "What is ZeroDB?"}
    ],
    "temperature": 0.7,
    "max_tokens": 1024
  }'

Python

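import os

import requests

# Read the API token from the environment instead of hard-coding it.
TOKEN = os.environ["TOKEN"]

response = requests.post(
    "https://api.ainative.studio/api/v1/public/chat/completions",
    headers={"Authorization": f"Bearer {TOKEN}"},
    json={
        "model": "meta-llama/llama-3.3-70b-instruct",
        "messages": [{"role": "user", "content": "Hello!"}],
        "stream": False,
    },
    timeout=60,
)
response.raise_for_status()  # Fail loudly on HTTP errors (e.g. a bad token)
# Print the assistant's reply from the first choice.
print(response.json()["choices"][0]["message"]["content"])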

Streaming

curl -X POST https://api.ainative.studio/api/v1/public/chat/completions \
  -H "Authorization: Bearer $TOKEN" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "meta-llama/llama-3.3-70b-instruct",
    "messages": [{"role": "user", "content": "Write a poem"}],
    "stream": true
  }'

With "stream": true, the response is delivered as Server-Sent Events (SSE). Each chunk arrives on its own line, prefixed with "data: ".
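As a rough illustration of consuming the stream, the sketch below pulls text out of SSE lines. The source only states that chunks carry a data: prefix; the chunk layout (choices[0].delta.content) and the "[DONE]" end marker are assumptions borrowed from OpenAI-compatible APIs, and iter_stream_text is a hypothetical helper name, so check both against a real response before relying on it.

```python
import json
from typing import Iterable, Iterator


def iter_stream_text(lines: Iterable[str]) -> Iterator[str]:
    """Yield text fragments from the SSE lines of a streaming response."""
    for raw in lines:
        line = raw.strip()
        # Skip blank separator lines and anything that is not a data field.
        if not line.startswith("data:"):
            continue
        payload = line[len("data:"):].strip()
        # Assumption: the stream ends with an OpenAI-style "[DONE]" sentinel.
        if payload == "[DONE]":
            break
        chunk = json.loads(payload)
        # Assumption: each chunk carries text in choices[0].delta.content.
        choices = chunk.get("choices") or []
        if not choices:
            continue
        text = (choices[0].get("delta") or {}).get("content")
        if text:
            yield text


# Hand-written sample lines in the assumed format, not real API output.
sample_lines = [
    'data: {"choices": [{"delta": {"content": "Hel"}}]}',
    "",
    'data: {"choices": [{"delta": {"content": "lo"}}]}',
    "data: [DONE]",
]
streamed_text = "".join(iter_stream_text(sample_lines))
```

With the requests library, the lines could come from a call made with stream=True, iterating response.iter_lines() and decoding each line as UTF-8 before passing it in.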

Available Models

| Model | ID | Context | Best For |
|---|---|---|---|
| Llama 3.3 70B | meta-llama/llama-3.3-70b-instruct | 128K | General, tool calling |
| Llama 3.1 8B | meta-llama/llama-3.1-8b-instruct | 128K | Fast, lightweight |

Tool Calling

{
  "model": "meta-llama/llama-3.3-70b-instruct",
  "messages": [{"role": "user", "content": "What's the weather in SF?"}],
  "tools": [
    {
      "type": "function",
      "function": {
        "name": "get_weather",
        "description": "Get weather for a city",
        "parameters": {
          "type": "object",
          "properties": {
            "city": {"type": "string"}
          },
          "required": ["city"]
        }
      }
    }
  ]
}
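The request above only declares the tool; the page does not show what comes back. The sketch below illustrates one way to act on the reply, assuming an OpenAI-compatible response in which the assistant message carries a tool_calls list and each call has an id, a function name, and JSON-encoded arguments. That response shape, the "tool" role for results, and the local get_weather, TOOL_REGISTRY, and run_tool_calls names are all assumptions for illustration.

```python
import json


def get_weather(city: str) -> str:
    """Hypothetical local implementation of the get_weather tool."""
    return f"Sunny in {city}"


# Maps tool names declared in the request to local Python callables.
TOOL_REGISTRY = {"get_weather": get_weather}


def run_tool_calls(message: dict) -> list:
    """Execute each requested tool and build follow-up 'tool' messages."""
    results = []
    # Assumption: tool calls arrive in message["tool_calls"], OpenAI-style.
    for call in message.get("tool_calls") or []:
        function = call["function"]
        handler = TOOL_REGISTRY[function["name"]]
        # Assumption: arguments are a JSON-encoded string, not a dict.
        arguments = json.loads(function["arguments"])
        results.append(
            {
                "role": "tool",
                "tool_call_id": call["id"],
                "content": handler(**arguments),
            }
        )
    return results


# Hand-written example of the assumed response shape, not real API output.
assistant_message = {
    "role": "assistant",
    "content": None,
    "tool_calls": [
        {
            "id": "call_1",
            "type": "function",
            "function": {"name": "get_weather", "arguments": '{"city": "SF"}'},
        }
    ],
}
tool_messages = run_tool_calls(assistant_message)
```

Under these assumptions, the next request would append the assistant message and the returned tool messages to the conversation so the model can produce a final answer.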

Request Parameters

| Parameter | Type | Default | Description |
|---|---|---|---|
| model | string | required | Model ID |
| messages | array | required | Conversation messages |
| temperature | number | 0.7 | Sampling randomness, 0.0-2.0 |
| max_tokens | number | 4096 | Max response tokens |
| stream | boolean | false | Enable streaming |
| tools | array | null | Tool definitions |
| top_p | number | 1.0 | Nucleus sampling |
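To make the defaults concrete, here is a minimal sketch of a request-body builder that mirrors the parameter table and checks the documented temperature range on the client. build_chat_payload is a hypothetical helper, not part of any official SDK, and the server may validate differently.

```python
from typing import Optional


def build_chat_payload(
    model: str,
    messages: list,
    temperature: float = 0.7,
    max_tokens: int = 4096,
    stream: bool = False,
    tools: Optional[list] = None,
    top_p: float = 1.0,
) -> dict:
    """Build a request body using the defaults from the parameter table."""
    if not model:
        raise ValueError("model is required")
    if not messages:
        raise ValueError("messages is required")
    if not 0.0 <= temperature <= 2.0:
        raise ValueError("temperature must be between 0.0 and 2.0")
    payload = {
        "model": model,
        "messages": messages,
        "temperature": temperature,
        "max_tokens": max_tokens,
        "stream": stream,
        "top_p": top_p,
    }
    # Only send tools when provided; the table lists null as the default.
    if tools is not None:
        payload["tools"] = tools
    return payload


payload = build_chat_payload(
    model="meta-llama/llama-3.1-8b-instruct",
    messages=[{"role": "user", "content": "Hello!"}],
    temperature=0.2,
)
```

The resulting dict can be passed as the json argument of requests.post, as in the Python example earlier on this page.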

Rate Limits

| Tier | Requests/min | Tokens/month |
|---|---|---|
| Free | 60 | 10K |
| Pro | 300 | 1M |
| Business | 1000 | 5M |
| Enterprise | 5000 | 10M |
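One way to stay under a requests-per-minute cap is to throttle on the client. The sketch below assumes the limit is counted over a rolling 60-second window, which the page does not specify; the server may use fixed windows or another scheme. RequestThrottle is a hypothetical helper, and the clock is injectable so the behaviour can be checked without sleeping.

```python
import time
from collections import deque
from typing import Callable


class RequestThrottle:
    """Client-side sliding-window limiter for a requests-per-minute cap."""

    def __init__(self, max_per_minute: int, clock: Callable[[], float] = time.monotonic):
        self.max_per_minute = max_per_minute
        self.clock = clock
        self.sent = deque()  # timestamps of requests in the last 60 seconds

    def wait_time(self) -> float:
        """Return how many seconds to wait before sending the next request."""
        now = self.clock()
        # Drop timestamps that have left the 60-second window.
        while self.sent and now - self.sent[0] >= 60.0:
            self.sent.popleft()
        if len(self.sent) < self.max_per_minute:
            return 0.0
        # Window is full: wait until the oldest request ages out.
        return 60.0 - (now - self.sent[0])

    def record(self) -> None:
        """Record that a request was just sent."""
        self.sent.append(self.clock())


# Demonstration with a fake clock and the Free tier's 60 requests/min.
fake_now = [0.0]
throttle = RequestThrottle(max_per_minute=60, clock=lambda: fake_now[0])
for _ in range(60):
    throttle.record()
delay_when_full = throttle.wait_time()
fake_now[0] = 60.0
delay_after_window = throttle.wait_time()
```

In real use, a caller would sleep for wait_time() seconds when it is non-zero, send the request, then call record().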

Next Steps

  • Authentication — API keys and JWT tokens
  • SDKs — Client libraries for React, Next.js, Python