Deploy
CLI
REST API
hf_token in the request body or pass --hf-token to the CLI.
Wait for it to be healthy
Call the deployment
id, choices, usage, model, created.
From the OpenAI Python SDK
Because the endpoint is OpenAI-compatible, you can point the officialopenai client at it:

