Documentation
One endpoint. Every model.
RouterPlex is a drop-in replacement for the OpenAI API. If your code works with OpenAI, it works with RouterPlex — change the base URL and the key, then use any model we serve.
https://api.routerplex.com/v1
- Create an account. You get $5 in free credits, no card required.
- In the dashboard, open API Keys and create a key. Copy it immediately — it's shown only once.
- Make your first request:
curl https://api.routerplex.com/v1/chat/completions \
  -H "Authorization: Bearer $ROUTERPLEX_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "gpt-5.5",
    "messages": [{"role": "user", "content": "Hello!"}]
  }'
Swap "gpt-5.5" for any model in the catalog — same endpoint, same request shape.
Every request needs your API key in the Authorization header:
Authorization: Bearer sk-...
Keys are created and managed in the dashboard. Treat them like passwords: server-side only, never in browser code or public repos. If a key leaks, delete it in the dashboard — revocation is immediate.
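One way to keep the key out of source code is to read it from the environment at startup. This is a minimal sketch, not an official helper: the function name auth_headers is hypothetical, and the variable name ROUTERPLEX_KEY is an assumption borrowed from the curl example above.

```python
import os


def auth_headers(env_var: str = "ROUTERPLEX_KEY") -> dict[str, str]:
    """Build request headers from a key stored in an environment variable.

    The default variable name is an assumption taken from the curl example;
    use whatever name your deployment sets.
    """
    key = os.environ.get(env_var, "").strip()
    if not key:
        # Fail early instead of sending a request that will come back as 401.
        raise RuntimeError(f"{env_var} is not set")
    return {
        "Authorization": f"Bearer {key}",
        "Content-Type": "application/json",
    }
```

Failing fast when the variable is missing makes a misconfigured deployment visible at startup rather than on the first request.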
https://api.routerplex.com/v1/chat/completions
Fully OpenAI-compatible, so the official SDKs work as-is.
from openai import OpenAI

client = OpenAI(
    base_url="https://api.routerplex.com/v1",
    api_key="sk-...",  # your RouterPlex key
)

response = client.chat.completions.create(
    model="gemini-3.5-flash",
    messages=[{"role": "user", "content": "Explain HTTP in one line"}],
)
print(response.choices[0].message.content)
import OpenAI from "openai";

const client = new OpenAI({
  baseURL: "https://api.routerplex.com/v1",
  apiKey: process.env.ROUTERPLEX_KEY,
});

const response = await client.chat.completions.create({
  model: "deepseek-v4-pro",
  messages: [{ role: "user", content: "Explain HTTP in one line" }],
});
console.log(response.choices[0].message.content);
Function calling, tool use, JSON mode, and vision inputs work the same way as with OpenAI — pass tools, response_format, or image content parts as usual.
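As a rough illustration of what "as usual" means, the sketch below builds the three request shapes using the standard OpenAI chat-completions field names. The get_weather tool and the helper function names are hypothetical and exist only for this example. Whether a given model supports tools or images depends on the model.

```python
import base64

# Hypothetical tool definition, for illustration only.
WEATHER_TOOL = {
    "type": "function",
    "function": {
        "name": "get_weather",
        "description": "Look up the current weather for a city.",
        "parameters": {
            "type": "object",
            "properties": {"city": {"type": "string"}},
            "required": ["city"],
        },
    },
}


def tool_request(model: str, question: str) -> dict:
    """Chat request that lets the model call the example tool."""
    return {
        "model": model,
        "messages": [{"role": "user", "content": question}],
        "tools": [WEATHER_TOOL],
    }


def json_mode_request(model: str, prompt: str) -> dict:
    """Chat request that asks for a JSON object via response_format."""
    return {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
        "response_format": {"type": "json_object"},
    }


def vision_request(model: str, question: str, image_bytes: bytes,
                   mime: str = "image/png") -> dict:
    """Chat request with one text part and one inline base64 image part."""
    encoded = base64.b64encode(image_bytes).decode("ascii")
    return {
        "model": model,
        "messages": [{
            "role": "user",
            "content": [
                {"type": "text", "text": question},
                {"type": "image_url",
                 "image_url": {"url": f"data:{mime};base64,{encoded}"}},
            ],
        }],
    }
```

Each dict can be passed straight to the SDK as keyword arguments, for example client.chat.completions.create(**tool_request("gpt-5.5", "Weather in Oslo?")).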
Set stream: true to receive tokens as server-sent events, exactly like the OpenAI API:
stream = client.chat.completions.create(
    model="gpt-5.5",
    messages=[{"role": "user", "content": "Write a haiku"}],
    stream=True,
)
for chunk in stream:
    delta = chunk.choices[0].delta.content
    if delta:
        print(delta, end="", flush=True)
Streams stay open for up to 10 minutes — enough for long reasoning-model outputs.
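If you are not using an SDK, the stream can be read directly. The sketch below assumes the wire format matches OpenAI's: each event is a line starting with "data:" that carries a JSON chunk, and the stream ends with "data: [DONE]". The function name iter_deltas is hypothetical.

```python
import json
from typing import Iterable, Iterator


def iter_deltas(lines: Iterable[str]) -> Iterator[str]:
    """Yield text fragments from OpenAI-style server-sent event lines."""
    for raw in lines:
        line = raw.strip()
        if not line.startswith("data:"):
            continue  # blank separators, SSE comments, other fields
        data = line[len("data:"):].strip()
        if data == "[DONE]":
            break  # end-of-stream sentinel
        chunk = json.loads(data)
        choices = chunk.get("choices") or []
        if not choices:
            continue  # some chunks carry no choices
        content = (choices[0].get("delta") or {}).get("content")
        if content:
            yield content
```

Joining the fragments with "".join(iter_deltas(lines)) reconstructs the full reply once the stream has ended.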
https://api.routerplex.com/v1/images/generations
OpenAI-compatible image generation. Images are returned base64-encoded and billed by token (prompt text in, image tokens out).
from openai import OpenAI
import base64

client = OpenAI(base_url="https://api.routerplex.com/v1", api_key="sk-...")

result = client.images.generate(
    model="gpt-image-2",
    prompt="A lighthouse on a cliff at dusk, watercolor",
    size="1024x1024",
    quality="medium",  # low | medium | high
)

with open("lighthouse.png", "wb") as f:
    f.write(base64.b64decode(result.data[0].b64_json))
Sizes: 1024x1024, 1536x1024, 1024x1536. Higher quality uses more output tokens. You can try it without code in the playground.
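A small client-side check can catch an unsupported size or quality before a request is sent. This is only a convenience sketch built from the values listed above. The helper name image_request is hypothetical, and the server remains the authority on what it accepts.

```python
# Values copied from the list above; update them if the catalog changes.
ALLOWED_SIZES = {"1024x1024", "1536x1024", "1024x1536"}
ALLOWED_QUALITIES = {"low", "medium", "high"}


def image_request(prompt: str, size: str = "1024x1024",
                  quality: str = "medium", model: str = "gpt-image-2") -> dict:
    """Build keyword arguments for an image generation call, validating locally."""
    if size not in ALLOWED_SIZES:
        raise ValueError(f"unsupported size {size!r}; expected one of {sorted(ALLOWED_SIZES)}")
    if quality not in ALLOWED_QUALITIES:
        raise ValueError(f"unsupported quality {quality!r}; expected one of {sorted(ALLOWED_QUALITIES)}")
    return {"model": model, "prompt": prompt, "size": size, "quality": quality}
```

The result can be passed as client.images.generate(**image_request("A lighthouse at dusk", size="1536x1024")).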
https://api.routerplex.com/v1/models
List the models your key can use (also OpenAI-compatible):
curl https://api.routerplex.com/v1/models \
  -H "Authorization: Bearer $ROUTERPLEX_KEY"
The live catalog shows current per-token pricing and context windows. Model IDs are used verbatim in the model field — e.g. gpt-5.5, gemini-3.1-pro, kimi-k2.7.
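To pull the model IDs out programmatically, something like the following should work, assuming the response follows OpenAI's list shape (an object with a "data" array whose items each have an "id"). The function names are hypothetical, and only the standard library is used.

```python
import json
import urllib.request

BASE_URL = "https://api.routerplex.com/v1"


def model_ids(payload: dict) -> list[str]:
    """Extract sorted model IDs from an OpenAI-style model list response."""
    return sorted(item["id"] for item in payload.get("data", []))


def fetch_models(api_key: str) -> dict:
    """GET /models with the key in the Authorization header."""
    request = urllib.request.Request(
        f"{BASE_URL}/models",
        headers={"Authorization": f"Bearer {api_key}"},
    )
    with urllib.request.urlopen(request, timeout=30) as response:
        return json.load(response)
```

Calling model_ids(fetch_models(key)) gives the exact strings to put in the model field.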
Anything that speaks the OpenAI API can use RouterPlex: point it at https://api.routerplex.com/v1 with your key, and use any model ID from the catalog. Setup for the major coding tools:
Cursor (editor)
Cursor Settings → Models → API Keys: paste your RouterPlex key into the OpenAI API Key field, enable Override OpenAI Base URL, and set it to:
https://api.routerplex.com/v1
Then click Add model and enter a model ID from the catalog verbatim (e.g. claude-opus-4-8, gpt-5.5). Enable only your custom models, and use them from the model picker in chat.
VS Code: Cline (extension)
In Cline's settings choose the OpenAI Compatible API provider:
Base URL: https://api.routerplex.com/v1
API Key: sk-...  # your RouterPlex key
Model ID: claude-opus-4-8
VS Code: Roo Code (extension)
In Roo Code's settings pick the OpenAI Compatible API provider, with the same shape as Cline:
Base URL: https://api.routerplex.com/v1
API Key: sk-...  # your RouterPlex key
Model ID: claude-opus-4-8
You can set a different model per mode (Code, Architect, Ask) — e.g. a cheap model like claude-haiku-4-5 for Ask.
Zed (editor)
In settings.json point the OpenAI provider at RouterPlex and declare the models you want in the picker:
{
  "language_models": {
    "openai": {
      "api_url": "https://api.routerplex.com/v1",
      "available_models": [
        { "name": "claude-opus-4-8", "display_name": "Claude Opus 4.8 (RouterPlex)", "max_tokens": 1000000 },
        { "name": "deepseek-v4-pro", "display_name": "DeepSeek V4 Pro (RouterPlex)", "max_tokens": 1000000 }
      ]
    }
  }
}
Then open the Agent Panel settings and paste your RouterPlex key as the OpenAI API key.
OpenCode (CLI)
Add RouterPlex as a custom provider in ~/.config/opencode/opencode.json (or a per-project opencode.json):
{
  "$schema": "https://opencode.ai/config.json",
  "provider": {
    "routerplex": {
      "npm": "@ai-sdk/openai-compatible",
      "name": "RouterPlex",
      "options": {
        "baseURL": "https://api.routerplex.com/v1",
        "apiKey": "{env:ROUTERPLEX_API_KEY}"
      },
      "models": {
        "claude-opus-4-8": { "name": "Claude Opus 4.8" },
        "gpt-5.5": { "name": "GPT-5.5" }
      }
    }
  }
}
Then pick the model inside OpenCode with /models.
Codex CLI
Add RouterPlex as a provider in ~/.codex/config.toml:
model = "gpt-5.5"
model_provider = "routerplex"

[model_providers.routerplex]
name = "RouterPlex"
base_url = "https://api.routerplex.com/v1"
env_key = "ROUTERPLEX_API_KEY"  # export ROUTERPLEX_API_KEY=sk-...
Claude Code talks to Anthropic's native Messages API rather than the OpenAI format. RouterPlex doesn't expose the Messages API yet, so Claude Code can't point at us today. Every tool above uses the OpenAI-compatible endpoint and works with all 25+ models, Claude included.
A key used inside an editor is still just a RouterPlex key — give it its own budget and model allowlist so an agent gone wild can't drain your balance.
Each key can have its own guardrails, set at creation or later from the dashboard:
- Budget — a hard spend cap for the key. Requests fail once it's reached.
- Allowed models — restrict a key to specific models (e.g. only cheap ones for a side project).
- Rate limits — optional requests-per-minute and tokens-per-minute caps.
Per-request logs (time, model, tokens, cost) are available per key in the dashboard under API Keys → Logs.
Standard OpenAI-style error responses:
Status  Meaning                                                      Retry?
401     Missing or invalid API key                                   no (fix the key)
400     Malformed request, or budget exceeded (top up to continue)   no
429     Rate limit hit                                               yes, with backoff
5xx     Upstream provider issue                                      yes

Requests are limited to 60/s per IP at the edge, and request bodies are capped at 50 MB (plenty for base64 vision payloads).
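The retry guidance for the status codes above can be sketched as a small wrapper: retry 429 and 5xx with exponential backoff plus jitter, and return 400 and 401 immediately. The names should_retry and call_with_backoff, the attempt count, and the delay values are illustrative assumptions, not documented RouterPlex behavior.

```python
import random
import time
from typing import Callable, Tuple


def should_retry(status: int) -> bool:
    """429 and 5xx are retryable; 400 and 401 need a fix, not a retry."""
    return status == 429 or 500 <= status <= 599


def call_with_backoff(send: Callable[[], Tuple[int, str]],
                      max_attempts: int = 5,
                      base_delay: float = 1.0,
                      sleep: Callable[[float], None] = time.sleep):
    """Call send() until it returns a non-retryable status or attempts run out.

    send() is any zero-argument callable returning (status_code, body).
    The delay doubles after each failed attempt, with up to 50% random jitter
    added so that many clients do not retry in lockstep.
    """
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    for attempt in range(1, max_attempts + 1):
        status, body = send()
        if not should_retry(status) or attempt == max_attempts:
            return status, body
        delay = base_delay * 2 ** (attempt - 1)
        sleep(delay + random.uniform(0, delay / 2))
```

Passing sleep as a parameter keeps the wrapper easy to test without real waiting. In production the default time.sleep is used.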
Pure pay-as-you-go: you top up a balance (card or crypto), and every request deducts its exact token cost — the same prices shown in the catalog, no subscriptions, no minimum spend. When your balance runs out, requests stop; they resume the moment you top up.
Your live balance, total spend, and per-model breakdown are on the dashboard.
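To estimate what a single request deducts, the token counts in the response's usage object can be multiplied by the catalog prices. The sketch below assumes prices are quoted in USD per million tokens with separate input and output rates, and that responses carry OpenAI-style prompt_tokens and completion_tokens fields. The prices shown are made up for the arithmetic and are not real catalog prices.

```python
from decimal import Decimal

MILLION = Decimal(1_000_000)

# Made-up prices in USD per 1M tokens; look up real ones in the catalog.
EXAMPLE_PRICES = {"input": Decimal("2.00"), "output": Decimal("8.00")}


def request_cost(usage: dict, prices: dict) -> Decimal:
    """Cost of one request from an OpenAI-style usage object."""
    input_cost = Decimal(usage["prompt_tokens"]) * prices["input"]
    output_cost = Decimal(usage["completion_tokens"]) * prices["output"]
    return (input_cost + output_cost) / MILLION


usage = {"prompt_tokens": 1200, "completion_tokens": 300}
cost = request_cost(usage, EXAMPLE_PRICES)  # 0.0048 USD at the made-up prices
```

Decimal avoids the rounding drift that floats accumulate when many small per-request costs are summed.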