Base URL
Authentication
Every request takes a bearer token in the Authorization header. Two token types are accepted:
| Token | Format | Lifetime | When to use |
|---|---|---|---|
| API key | lk_... | Long-lived, until revoked or expired | CLI, CI, scripts, integrations |
| JWT | Standard JWT | Short-lived, refreshable via /auth/refresh | Interactive sessions, dashboard, browser-based testing |
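As a minimal sketch of the header described above, the helper below builds the Authorization header for either token type. The function names and the sample token are hypothetical, and treating every token without the lk_ prefix as a JWT is an assumption made only for illustration.

```python
from typing import Dict


def bearer_headers(token: str) -> Dict[str, str]:
    """Return the Authorization header for an API key or a JWT."""
    return {"Authorization": f"Bearer {token}"}


def token_kind(token: str) -> str:
    """Guess the token type from its prefix.

    Assumption: API keys start with "lk_" (per the table above);
    anything else is treated as a JWT.
    """
    return "api_key" if token.startswith("lk_") else "jwt"
```

Both token types go in the same header, so a client only needs to know which one it holds when deciding whether a refresh is possible.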
Organization context
Every billable resource (runs, VMs, deployments, balance, history) belongs to an organization. The server resolves the org for each request in this order:
- API key org: Keys created under /orgs/{slug}/api-keys carry the org. No header needed.
- X-Org-Slug header: On JWT requests, pin the org by slug. The server verifies your membership.
- Default org: If neither is present, the request falls back to the membership flagged as your default (auto-created at signup).
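The resolution order above can be modeled roughly as follows. This is an illustrative client-side model, not the server's implementation: the function name and parameters are hypothetical, and raising PermissionError on a failed membership check is an assumption, since the source does not say how the server reports that failure.

```python
from typing import Collection, Optional


def resolve_org(
    api_key_org: Optional[str],
    x_org_slug: Optional[str],
    memberships: Collection[str],
    default_org: str,
) -> str:
    """Pick the org for a request, following the documented order."""
    # 1. An API key carries its org; no header is consulted.
    if api_key_org is not None:
        return api_key_org
    # 2. On JWT requests, X-Org-Slug pins the org if the caller is a member.
    if x_org_slug is not None:
        if x_org_slug not in memberships:
            # Assumption: the real error type and status code are not documented here.
            raise PermissionError(f"not a member of org {x_org_slug!r}")
        return x_org_slug
    # 3. Otherwise fall back to the membership flagged as default.
    return default_org
```

For example, a JWT request with no X-Org-Slug header resolves to the default org, while the same request with the header set resolves to the named org if the caller belongs to it.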
Login (JWT flow)
The login response includes access_token and refresh_token. Pass the access token as Authorization: Bearer <access_token>. When it expires, call POST /auth/refresh with the refresh token to get a new pair.
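The refresh step could be prepared along these lines. This sketch only builds the request and does not send it. The base URL is a placeholder, and the JSON body shape ({"refresh_token": ...}) is an assumption, since the source does not specify how the refresh token is passed; check the /auth/refresh endpoint page for the actual schema.

```python
import json
import urllib.request

# Hypothetical placeholder; substitute the real base URL.
BASE_URL = "https://api.example.com"


def build_refresh_request(
    refresh_token: str, base_url: str = BASE_URL
) -> urllib.request.Request:
    """Build (but do not send) a POST /auth/refresh request."""
    # Assumption: the endpoint accepts a JSON body with a "refresh_token" field.
    payload = json.dumps({"refresh_token": refresh_token}).encode("utf-8")
    return urllib.request.Request(
        url=f"{base_url}/auth/refresh",
        data=payload,
        headers={"Content-Type": "application/json"},
        method="POST",
    )
```

Passing the result to urllib.request.urlopen would send it; the response is expected to carry a new access_token and refresh_token pair, which replace the old ones.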
Validation errors
Endpoints return HTTP 422 with a structured HTTPValidationError body when the request payload is malformed or missing required fields. Other failures return standard HTTP status codes (400, 401, 403, 404, 5xx) with a detail field describing the error.
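A client might flatten both error shapes into readable messages as sketched below. The function name is hypothetical, and the 422 layout (a detail list whose items have loc and msg fields) is an assumption based on the shape FastAPI uses for its schema of the same name; the source does not spell out the fields.

```python
from typing import Any, Dict, List


def describe_error(status: int, body: Dict[str, Any]) -> List[str]:
    """Turn an error response body into a list of human-readable messages."""
    detail = body.get("detail")
    # Assumed 422 shape: {"detail": [{"loc": [...], "msg": "...", "type": "..."}]}
    if status == 422 and isinstance(detail, list):
        messages = []
        for item in detail:
            location = ".".join(str(part) for part in item.get("loc", []))
            messages.append(f"{location}: {item.get('msg', '')}")
        return messages
    # Other failures: a single detail field describing the error.
    return [f"HTTP {status}: {detail}"]
```

Called with status 404 and a body of {"detail": "Not found"}, this returns a single message; with a 422 body it returns one message per invalid field.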
Endpoint groups
The full endpoint list is in the sidebar under Endpoints, grouped by tag. Highlights:
| Group | Purpose | Doc page |
|---|---|---|
| Authentication | /auth/login, /auth/refresh | Quickstart |
| Organizations | /orgs, /orgs/{slug}, members, balance, transactions | Organizations |
| Org API Keys | /orgs/{slug}/api-keys/... | API Keys |
| Org Invites | /orgs/{slug}/invites/..., /invites/{token}, /invites/mine | Organizations |
| MFA | TOTP enrolment, verification, backup codes | Settings |
| Streaming Execution | Submit Python runs, fetch status, abort | Launch a Run |
| Docker Execution | Submit Docker image runs | Launch a Run |
| Docker Compose Execution | Submit Compose stacks | Launch a Run |
| GPU Selection Execution | Fan out across GPU types | Runs |
| Workload Management | List, abort, stop runs | Runs |
| Execution Management | Get/delete a run, fetch timing | Runs |
| Observability - Logs | Loki-backed log queries | Logs |
| Observability - GPU Metrics | DCGM and system metrics per execution | GPU & System Metrics |
| Machine Types | Hardware catalogue and pricing | Launch a Run |
| Pricing | Full price book across all meters (/pricing) | Launch a Run |
| User Quotas | Hardware profiles your account can use | Settings |
| Storage Files / Storage Credentials | Per-user S3 bucket | Storage |
| Environment Variables | Secrets injected into runs | Secrets |
| Dedicated Deployment External | Create, get, list, stop dedicated deployments | Dedicated Inference |
| Streaming Inference | SSE streaming for inference results | Streaming Inference |
| Batch API | OpenAI-compatible files and batches | — |
| Billing | Credits, history, invoices, vouchers | Billing |
| VMs | Provision and manage GPU virtual machines | Your VMs |

