GET /api/v2/external/inference/{deployment_id}/metrics
Get deployment metrics
curl --request GET \
  --url https://api.example.com/api/v2/external/inference/{deployment_id}/metrics
{
  "deploymentId": "<string>",
  "timeRange": {
    "start": "<string>",
    "end": "<string>"
  },
  "replicaHealth": [
    {
      "replicaId": "<string>",
      "values": [
        {
          "timestamp": "<string>",
          "value": 123
        }
      ]
    }
  ],
  "requestsPerSecond": [
    {
      "timestamp": "<string>",
      "value": 123
    }
  ],
  "p50LatencyMs": [
    {
      "timestamp": "<string>",
      "value": 123
    }
  ],
  "p95LatencyMs": [
    {
      "timestamp": "<string>",
      "value": 123
    }
  ]
}

Path Parameters

deployment_id · string · required

Query Parameters

start · string | null

ISO 8601 start timestamp. Defaults to one hour ago.

end · string | null

ISO 8601 end timestamp. Defaults to the current time.

step · string · default: 15s

Query resolution, for example '15s' or '1m'.
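
As a rough illustration of how the path and query parameters combine, the sketch below builds the request URL in Python using only the standard library. The helper name build_metrics_url and the deployment ID "my-deployment" are hypothetical, and the host is the placeholder from the curl sample. Authentication is not described on this page, so none is shown. Parameters left unset are omitted so the documented defaults apply.

```python
from datetime import datetime, timedelta, timezone
from typing import Optional
from urllib.parse import quote, urlencode

# Placeholder host taken from the curl sample; replace with the real API host.
BASE_URL = "https://api.example.com"


def build_metrics_url(deployment_id: str,
                      start: Optional[datetime] = None,
                      end: Optional[datetime] = None,
                      step: Optional[str] = None) -> str:
    """Build the metrics URL (hypothetical helper, not part of any SDK).

    Parameters left as None are omitted, so the server-side defaults
    described above (1h ago, now, 15s) are used.
    """
    params = {}
    if start is not None:
        params["start"] = start.isoformat()  # ISO 8601 timestamp
    if end is not None:
        params["end"] = end.isoformat()
    if step is not None:
        params["step"] = step  # e.g. "15s" or "1m"

    # Percent-encode the path parameter so unusual IDs cannot alter the path.
    path = f"/api/v2/external/inference/{quote(deployment_id, safe='')}/metrics"
    query = urlencode(params)
    return f"{BASE_URL}{path}?{query}" if query else f"{BASE_URL}{path}"


if __name__ == "__main__":
    # Last six hours at one-minute resolution for a hypothetical deployment.
    now = datetime.now(timezone.utc)
    print(build_metrics_url("my-deployment",
                            start=now - timedelta(hours=6),
                            end=now,
                            step="1m"))
```

Timezone-aware datetimes are used here so the serialized timestamps carry an explicit UTC offset; whether the API also accepts naive timestamps is not stated in this reference.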

Response

Successful Response

deploymentId · string · required

timeRange · TimeRange · object · required

Time range for the metrics query.

replicaHealth · ReplicaHealthSeries · object[] · required

requestsPerSecond · MetricDataPoint · object[] · required

p50LatencyMs · MetricDataPoint · object[] · required

p95LatencyMs · MetricDataPoint · object[] · required
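
To show how the response fields fit together, here is a minimal Python sketch that reduces a decoded response body to a few summary numbers. The function name summarize_metrics and the sample values are illustrative assumptions, not real API output. Only the field names come from the schema above. The meaning of the replicaHealth values is not documented here, so the sketch only counts the series.

```python
from statistics import mean
from typing import Any, Dict, List, Optional


def series_values(points: List[Dict[str, Any]]) -> List[float]:
    """Extract the numeric values from a list of MetricDataPoint objects."""
    return [point["value"] for point in points]


def summarize_metrics(payload: Dict[str, Any]) -> Dict[str, Optional[Any]]:
    """Summarize a decoded metrics response (hypothetical helper)."""
    rps = series_values(payload["requestsPerSecond"])
    p95 = series_values(payload["p95LatencyMs"])
    return {
        "deploymentId": payload["deploymentId"],
        # One ReplicaHealthSeries entry per replica, per the schema above.
        "replicaCount": len(payload["replicaHealth"]),
        # An empty series yields None rather than raising an error.
        "meanRps": mean(rps) if rps else None,
        "maxP95LatencyMs": max(p95) if p95 else None,
    }


# Made-up sample shaped like the documented response, for illustration only.
SAMPLE = {
    "deploymentId": "my-deployment",
    "timeRange": {"start": "2024-01-01T00:00:00Z", "end": "2024-01-01T00:00:30Z"},
    "replicaHealth": [
        {"replicaId": "replica-0",
         "values": [{"timestamp": "2024-01-01T00:00:00Z", "value": 1}]},
    ],
    "requestsPerSecond": [
        {"timestamp": "2024-01-01T00:00:00Z", "value": 10},
        {"timestamp": "2024-01-01T00:00:15Z", "value": 20},
    ],
    "p50LatencyMs": [{"timestamp": "2024-01-01T00:00:00Z", "value": 40}],
    "p95LatencyMs": [
        {"timestamp": "2024-01-01T00:00:00Z", "value": 120},
        {"timestamp": "2024-01-01T00:00:15Z", "value": 95},
    ],
}

if __name__ == "__main__":
    print(summarize_metrics(SAMPLE))
```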