Chat Completions API
Build agentic applications with Meta Llama and Anthropic models, with full tool-calling support.
Endpoint: `POST /api/v1/public/chat/completions`
Basic Request
```bash
curl -X POST https://api.ainative.studio/api/v1/public/chat/completions \
  -H "Authorization: Bearer $TOKEN" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "meta-llama/llama-3.3-70b-instruct",
    "messages": [
      {"role": "system", "content": "You are a helpful assistant."},
      {"role": "user", "content": "What is ZeroDB?"}
    ],
    "temperature": 0.7,
    "max_tokens": 1024
  }'
```
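A successful response follows an OpenAI-style chat completion shape. The sketch below is illustrative: the `choices[0].message.content` path matches the Python example later in this page, but the other fields (`id`, `object`, `usage`, `finish_reason`) and all values are assumptions, not confirmed by this reference.

```json
{
  "id": "chatcmpl-abc123",
  "object": "chat.completion",
  "model": "meta-llama/llama-3.3-70b-instruct",
  "choices": [
    {
      "index": 0,
      "message": {
        "role": "assistant",
        "content": "ZeroDB is ..."
      },
      "finish_reason": "stop"
    }
  ],
  "usage": {"prompt_tokens": 24, "completion_tokens": 87, "total_tokens": 111}
}
```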
Python
```python
import requests

response = requests.post(
    "https://api.ainative.studio/api/v1/public/chat/completions",
    headers={"Authorization": f"Bearer {TOKEN}"},
    json={
        "model": "meta-llama/llama-3.3-70b-instruct",
        "messages": [{"role": "user", "content": "Hello!"}],
        "stream": False,
    },
)
print(response.json()["choices"][0]["message"]["content"])
```
Streaming
```bash
curl -X POST https://api.ainative.studio/api/v1/public/chat/completions \
  -H "Authorization: Bearer $TOKEN" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "meta-llama/llama-3.3-70b-instruct",
    "messages": [{"role": "user", "content": "Write a poem"}],
    "stream": true
  }'
```
Streaming responses are delivered as Server-Sent Events (SSE); each chunk is prefixed with `data: `.
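A minimal Python sketch for consuming the stream with `requests`. It assumes OpenAI-style chunk payloads (a `choices[0].delta.content` field) and a `data: [DONE]` end-of-stream sentinel; neither detail is documented above, so verify against real responses.

```python
import json
import requests

def parse_sse_chunk(line: str):
    """Extract the text delta from one SSE line, or None if it carries none.

    Assumes OpenAI-style chunks: {"choices": [{"delta": {"content": "..."}}]}.
    """
    if not line.startswith("data: "):
        return None
    payload = line[len("data: "):]
    if payload == "[DONE]":  # assumed end-of-stream sentinel
        return None
    chunk = json.loads(payload)
    return chunk["choices"][0]["delta"].get("content")

def stream_completion(token: str, prompt: str) -> str:
    resp = requests.post(
        "https://api.ainative.studio/api/v1/public/chat/completions",
        headers={"Authorization": f"Bearer {token}"},
        json={
            "model": "meta-llama/llama-3.3-70b-instruct",
            "messages": [{"role": "user", "content": prompt}],
            "stream": True,
        },
        stream=True,  # tell requests not to buffer the whole body
    )
    resp.raise_for_status()
    pieces = []
    for raw in resp.iter_lines(decode_unicode=True):
        piece = parse_sse_chunk(raw) if raw else None
        if piece:
            pieces.append(piece)
    return "".join(pieces)
```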
Available Models
| Model | ID | Context | Best For |
|---|---|---|---|
| Llama 3.3 70B | `meta-llama/llama-3.3-70b-instruct` | 128K | General, tool calling |
| Llama 3.1 8B | `meta-llama/llama-3.1-8b-instruct` | 128K | Fast, lightweight |
Tool Calling
```json
{
  "model": "meta-llama/llama-3.3-70b-instruct",
  "messages": [{"role": "user", "content": "What's the weather in SF?"}],
  "tools": [
    {
      "type": "function",
      "function": {
        "name": "get_weather",
        "description": "Get weather for a city",
        "parameters": {
          "type": "object",
          "properties": {
            "city": {"type": "string"}
          },
          "required": ["city"]
        }
      }
    }
  ]
}
```
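When the model decides to call a tool, the assistant message presumably carries a `tool_calls` array in the OpenAI-compatible shape; that shape, the `role: "tool"` result message, and the `get_weather` implementation below are all assumptions for illustration, not documented behavior.

```python
import json

# Hypothetical local implementation of the get_weather tool defined above.
def get_weather(city: str) -> str:
    return f"Sunny in {city}"

TOOLS = {"get_weather": get_weather}

def run_tool_calls(message: dict) -> list:
    """Turn an assistant message's tool_calls into role="tool" result messages.

    Assumes the OpenAI-compatible shape:
    {"tool_calls": [{"id": ..., "function": {"name": ..., "arguments": "<json>"}}]}
    """
    results = []
    for call in message.get("tool_calls", []):
        fn = call["function"]
        args = json.loads(fn["arguments"])  # arguments arrive as a JSON string
        output = TOOLS[fn["name"]](**args)
        results.append({
            "role": "tool",
            "tool_call_id": call["id"],
            "content": output,
        })
    return results
```

Append the assistant message and these tool results back onto `messages`, then call the endpoint again so the model can produce its final answer.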
Request Parameters
| Parameter | Type | Default | Description |
|---|---|---|---|
| `model` | string | required | Model ID |
| `messages` | array | required | Conversation messages |
| `temperature` | number | 0.7 | Randomness, 0.0–2.0 |
| `max_tokens` | number | 4096 | Max response tokens |
| `stream` | boolean | false | Enable streaming |
| `tools` | array | null | Tool definitions |
| `top_p` | number | 1.0 | Nucleus sampling |
Rate Limits
| Tier | Requests/min | Tokens/month |
|---|---|---|
| Free | 60 | 10K |
| Pro | 300 | 1M |
| Business | 1000 | 5M |
| Enterprise | 5000 | 10M |
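When a tier's requests-per-minute budget is exceeded, the API will presumably return HTTP 429; a simple client-side guard is retry with exponential backoff. The 429 status and the `Retry-After` header are assumptions here, not behavior documented above.

```python
import time
import requests

def backoff_delay(attempt: int, base: float = 1.0, cap: float = 30.0) -> float:
    """Exponential backoff: 1s, 2s, 4s, ... capped at `cap` seconds."""
    return min(cap, base * (2 ** attempt))

def post_with_retry(url: str, token: str, body: dict, max_retries: int = 5):
    for attempt in range(max_retries):
        resp = requests.post(
            url, headers={"Authorization": f"Bearer {token}"}, json=body
        )
        if resp.status_code != 429:
            return resp
        # Honor Retry-After if the server sends it (assumed); else back off.
        delay = float(resp.headers.get("Retry-After", backoff_delay(attempt)))
        time.sleep(delay)
    return resp
```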
Next Steps
- Authentication — API keys and JWT tokens
- SDKs — Client libraries for React, Next.js, Python