Agent Deployments
Deploy agent containers on AINative's managed Kubernetes cluster. Includes auto-scaling, health monitoring, and integrated ZeroDB storage.
Base path: /api/v1/cloud/deployments
POST /
Deploy an agent container. Requires a registered agent and a container image URI.
import requests
response = requests.post(
"https://api.ainative.studio/api/v1/cloud/deployments",
headers=HEADERS,
json={
"agent_registration_id": "550e8400-e29b-41d4-a716-446655440000",
"image_uri": "ghcr.io/my-org/my-agent:latest",
"resource_plan": "standard",
"runtime_config": {
"env": {
"LOG_LEVEL": "info",
"ZERODB_PROJECT": "my-project",
}
},
},
)
deployment = response.json()
print(f"Deployed: {deployment['endpoint_url']}")
Response (201):
{
"id": "dep-uuid",
"agent_registration_id": "550e8400-...",
"namespace": "agents-user-123",
"endpoint_url": "https://agents.ainative.studio/dep-uuid",
"image_uri": "ghcr.io/my-org/my-agent:latest",
"resource_plan": "standard",
"status": "deploying",
"min_instances": 1,
"max_instances": 3,
"auto_scale_enabled": true,
"health_status": "unknown"
}
GET /
List your agent deployments.
curl "https://api.ainative.studio/api/v1/cloud/deployments?status=running" \
-H "Authorization: Bearer $TOKEN"
GET /{deployment_id}
Get deployment details including status, endpoint URL, and scaling configuration.
curl https://api.ainative.studio/api/v1/cloud/deployments/dep-uuid \
-H "Authorization: Bearer $TOKEN"
POST /{deployment_id}/scale
Scale deployment replicas via Horizontal Pod Autoscaler.
requests.post(
"https://api.ainative.studio/api/v1/cloud/deployments/dep-uuid/scale",
headers=HEADERS,
json={
"min_instances": 2,
"max_instances": 10,
},
)
Constraints: min_instances 0–10, max_instances 1–50.
DELETE /{deployment_id}
Teardown and terminate a deployment.
curl -X DELETE https://api.ainative.studio/api/v1/cloud/deployments/dep-uuid \
-H "Authorization: Bearer $TOKEN"
GET /{deployment_id}/logs
Retrieve recent logs from a running agent.
curl "https://api.ainative.studio/api/v1/cloud/deployments/dep-uuid/logs?lines=100" \
-H "Authorization: Bearer $TOKEN"
| Parameter | Type | Default | Description |
|---|---|---|---|
lines | int | 50 | Number of log lines to return |
since | datetime | — | Return logs after this timestamp |
Resource Plans
| Plan | vCPU | Memory | GPU | Scaling | Use Case |
|---|---|---|---|---|---|
basic | 0.5 | 512 MB | — | 1 instance | Development, testing |
standard | 1 | 1 GB | — | 1–10 | Production workloads |
performance | 2 | 2 GB | — | 1–20 | High-throughput agents |
gpu | 2 | 4 GB | T4 | 1–5 | ML inference agents |
Deployment Lifecycle
deploying → running → scaling → running
↓
stopping → stopped
↓
failed