Smart Routing selects the best model for each prompt automatically. Simple questions go to a fast, cost-efficient model; complex or multi-step tasks go to a more capable one. You use the same OpenAI-compatible API — just change the model name.

Models

| Model | What it does |
| --- | --- |
| lyceum/router | Classifies each prompt and routes to the optimal model automatically |
| lyceum/simple | Always routes to a fast, cost-efficient model |
| lyceum/complex | Always routes to a high-capability model |
| lyceum/reasoning | Always routes to a reasoning model |

Use lyceum/router when you want automatic cost/quality optimisation. Use the others when you want explicit control over which tier handles your requests.
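
The choice between the router and a fixed tier can be captured in a small helper. This is a minimal sketch: the function name `pick_model` and the short tier keys (`simple`, `complex`, `reasoning`) are illustrative assumptions, and only the four model names come from the table above.

```python
from typing import Optional

# Model names come from the Models table; the short tier keys are hypothetical labels.
ROUTER_MODEL = "lyceum/router"
FIXED_TIER_MODELS = {
    "simple": "lyceum/simple",
    "complex": "lyceum/complex",
    "reasoning": "lyceum/reasoning",
}


def pick_model(tier: Optional[str] = None) -> str:
    """Return the model name for an explicit tier, or the router if no tier is given."""
    if tier is None:
        return ROUTER_MODEL
    if tier not in FIXED_TIER_MODELS:
        raise ValueError(
            f"unknown tier {tier!r}; expected one of {sorted(FIXED_TIER_MODELS)}"
        )
    return FIXED_TIER_MODELS[tier]


print(pick_model())             # lyceum/router
print(pick_model("reasoning"))  # lyceum/reasoning
```

The returned string can be passed directly as the `model` argument of a chat completion request.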

Usage

The API is identical to any other serverless inference call — only the model name changes.
from openai import OpenAI

client = OpenAI(
    api_key="lk_...",
    base_url="https://api.lyceum.technology/api/v2/external",
)

resp = client.chat.completions.create(
    model="lyceum/router",
    messages=[{"role": "user", "content": "What is the capital of France?"}],
)
print(resp.choices[0].message.content)
Streaming works the same way:
stream = client.chat.completions.create(
    model="lyceum/router",
    messages=[{"role": "user", "content": "Explain gradient descent step by step."}],
    stream=True,
)
for chunk in stream:
    print(chunk.choices[0].delta.content or "", end="", flush=True)
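
When the full reply is needed as a single string rather than printed incrementally, the streamed chunks can be accumulated. This is a minimal sketch, assuming the chunks follow the OpenAI chat-completions streaming shape used above. `collect_stream` is a hypothetical helper name, and the guard for chunks without choices is a defensive assumption rather than documented Lyceum behaviour.

```python
from typing import Any, Iterable


def collect_stream(stream: Iterable[Any]) -> str:
    """Join the text deltas of a chat-completions stream into one string."""
    parts = []
    for chunk in stream:
        # Defensive (assumption): skip chunks that carry no choices.
        if not chunk.choices:
            continue
        delta = chunk.choices[0].delta.content
        if delta:  # content can be None, e.g. on role-only or final chunks
            parts.append(delta)
    return "".join(parts)
```

Passing the `stream` object from the example above to `collect_stream(stream)` would return the complete answer as one string.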

Billing

Requests are billed at the rate of whichever model the router selects, so the cost of a lyceum/router request depends on how the prompt is classified. lyceum/simple, lyceum/complex, and lyceum/reasoning each always route to their own fixed tier, so their cost is predictable.
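
Because the router's choice determines the rate, it can be useful to record which model served each request alongside its token counts. This is a minimal sketch, assuming the response's `model` field reports the model the router selected (not confirmed here) and that `usage` is populated as in the OpenAI chat-completions response format. `tally_usage` is a hypothetical helper name.

```python
from typing import Any, Dict, Iterable


def tally_usage(responses: Iterable[Any]) -> Dict[str, Dict[str, int]]:
    """Sum request counts and token usage per reported model name."""
    totals: Dict[str, Dict[str, int]] = {}
    for resp in responses:
        # Assumption: resp.model names the model that actually handled the request.
        entry = totals.setdefault(
            resp.model,
            {"requests": 0, "prompt_tokens": 0, "completion_tokens": 0},
        )
        entry["requests"] += 1
        if resp.usage is not None:  # usage may be absent on some responses
            entry["prompt_tokens"] += resp.usage.prompt_tokens
            entry["completion_tokens"] += resp.usage.completion_tokens
    return totals
```

Given a list of responses returned by `client.chat.completions.create(...)`, the per-model totals can then be compared against the published rates for each model.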

Serverless Inference

Learn more about pay-per-request inference.