OpenRouter is a hosted multiplexer over hundreds of LLMs from dozens of providers. One API key, one billing relationship, every frontier model — Claude, GPT, Gemini, Llama, Mistral, Qwen, Gemma, DeepSeek, Cohere, Grok, and many more — behind a single OpenAI-compatible API surface. Competitive per-token pricing, transparent routing around upstream outages, and real-time price and quality tracking. The pragmatic choice for teams that want frontier-model flexibility without managing N billing relationships, and the daily-driver tool for prototyping work where "try this prompt against five different models" is the workflow.
OpenRouter is a hosted SaaS that aggregates LLM access across hundreds of models from dozens of upstream providers into a single API. Founded in 2023, OpenRouter has become one of the most-used aggregation platforms in the LLM ecosystem — particularly in dev / prototyping / hobbyist contexts where the cost of opening accounts at every individual provider would dominate the actual model spend.
The thesis: most teams want to compare or switch between models more often than they want to manage individual provider relationships. Opening accounts at OpenAI, Anthropic, Google, Cohere, Together AI, Groq, Fireworks, Perplexity, xAI, DeepSeek, Mistral, and a dozen smaller providers is more friction than the work it enables. OpenRouter abstracts the friction away: one signup, one payment method, one API key, one usage dashboard, and you're talking to all of them through a single OpenAI-shaped endpoint.
The economics work because OpenRouter takes a margin on the upstream cost. For most users this is invisible: OpenRouter's pricing is competitive with going direct, and for some models its volume aggregation actually unlocks lower prices than an individual user would get. A free tier offers rate-limited access to many models, including some frontier-tier ones (useful for prototyping). The paid tier is pay-per-token with no monthly minimum. BYOK (Bring Your Own Key) lets you use your own provider account for specific models while still routing through OpenRouter's API surface.
OpenRouter is one layer up from individual hosted providers. The stack: model weights live with model authors (Meta, Anthropic, Google, etc.); inference happens on hosted providers (OpenAI, Anthropic, Bedrock, Together AI, Groq, etc.); aggregators (OpenRouter) sit on top of providers; agent frameworks (LangGraph, ADK, CrewAI) sit on top of either aggregators or direct provider APIs. OpenRouter exists because the layer between "dozens of providers" and "your app" was structurally underserved.
Using OpenRouter from agent code is identical to using OpenAI — the only changes are the base URL and the model string. The model string format is `provider/model-name`, e.g. `anthropic/claude-opus-4.7` or `meta-llama/llama-3.3-70b-instruct`:

```python
# Use OpenRouter from any OpenAI-SDK-speaking code
from openai import OpenAI

client = OpenAI(
    base_url="https://openrouter.ai/api/v1",
    api_key="sk-or-...",
)

# Same code, different upstream — only the model string changes
resp = client.chat.completions.create(
    model="anthropic/claude-sonnet-4.6",
    messages=[{"role": "user", "content": "Hi"}],
)

# Or:
#   "openai/gpt-5"
#   "google/gemini-2.5-pro"
#   "meta-llama/llama-3.3-70b-instruct"
#   "qwen/qwen-2.5-72b-instruct"
#   "deepseek/deepseek-chat"
```
Provider routing for open-weights models. For a model that multiple upstream providers host (e.g. Llama 70B on Together / Fireworks / DeepInfra / Groq), OpenRouter automatically routes to the cheapest available upstream. You can override with explicit provider preference if you care about specific characteristics (Groq for low latency, Together for throughput):
```python
# Force a specific upstream provider order
resp = client.chat.completions.create(
    model="meta-llama/llama-3.3-70b-instruct",
    messages=msgs,
    extra_body={
        "provider": {"order": ["Groq", "Together"]},
    },
)
```
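OpenRouter also documents a `models` routing parameter for model-level fallbacks: if the primary model's upstreams are down or rate-limited, the request is retried against the next entry. A minimal sketch, assuming that documented parameter (check current OpenRouter docs for exact fallback semantics):

```python
# Fallback routing: if the primary model can't be served,
# OpenRouter retries against the next model in the list.
resp = client.chat.completions.create(
    model="meta-llama/llama-3.3-70b-instruct",  # primary
    messages=msgs,
    extra_body={
        # assumption: OpenRouter's documented `models` fallback list
        "models": ["qwen/qwen-2.5-72b-instruct", "deepseek/deepseek-chat"],
    },
)
```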
OpenRouter's site (openrouter.ai/models) shows real-time per-model leaderboards: price ranking, throughput ranking, latency ranking, and most-used-this-week. This is genuinely useful for "which model should I evaluate next" decisions, particularly for teams trying to understand what the broader ecosystem is shipping. The free tier and unified billing make it cheap to actually try the recommendations.
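The same catalogue is available programmatically. A minimal sketch, assuming the public `GET /api/v1/models` endpoint and its response shape (a `data` array with per-model `id` and string-valued `pricing` fields, USD per token); verify against current docs before relying on it:

```python
import requests

# Fetch OpenRouter's public model catalogue
# (no API key required for the listing at the time of writing).
resp = requests.get("https://openrouter.ai/api/v1/models", timeout=30)
resp.raise_for_status()
models = resp.json()["data"]

# Rank by prompt-token price and print the ten cheapest models.
by_price = sorted(models, key=lambda m: float(m["pricing"]["prompt"]))
for m in by_price[:10]:
    print(m["id"], m["pricing"]["prompt"])
```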
OpenRouter and LiteLLM both abstract over multiple LLM providers, but they make different bets. OpenRouter is hosted SaaS — you sign up, pay per token, get one API. LiteLLM is open-source library / proxy — you embed it or self-host. They're often used together in the same stack.
| Dimension | OpenRouter | LiteLLM |
|---|---|---|
| Form factor | Hosted SaaS | Open-source library or self-hosted proxy |
| Setup cost | Sign up, paste API key | Install package or run proxy |
| Billing model | One bill from OpenRouter (BYOK optional) | You manage all upstream billing |
| Provider coverage | 300+ models, 60+ providers | 100+ providers |
| Cost-routing | Automatic across upstreams | Configure manually with fallback chains |
| Self-hosting | No | Yes (proxy mode) |
| Data residency | Routes through OpenRouter (US-hosted) | Self-host wherever you want |
| Best for | Prototyping, dev, single-bill simplicity, model exploration | Internal AI platforms, central gateway, residency-controlled production |
Many production setups use both: OpenRouter as one upstream behind a self-hosted LiteLLM proxy. The proxy gives you cost tracking, virtual keys, and residency control; OpenRouter gives you broad model coverage as a single upstream. Internal teams hit the proxy, which routes high-volume traffic direct-to-vendor (cheapest per-token), long-tail / one-off requests through OpenRouter (lowest setup friction), and residency-sensitive traffic to specific regional providers.
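A minimal sketch of that split using LiteLLM's Python `Router` (alias names and keys are illustrative; the proxy's YAML config expresses the same idea):

```python
from litellm import Router

# Illustrative split: one alias per traffic class.
# High-volume production traffic goes direct to the vendor;
# long-tail / one-off requests go through OpenRouter.
router = Router(
    model_list=[
        {
            "model_name": "claude-prod",  # high-volume: direct to Anthropic
            "litellm_params": {
                "model": "anthropic/claude-sonnet-4.6",
                "api_key": "sk-ant-...",
            },
        },
        {
            "model_name": "longtail",  # one-off evals: via OpenRouter
            "litellm_params": {
                # LiteLLM's openrouter/ prefix targets OpenRouter as the provider
                "model": "openrouter/deepseek/deepseek-chat",
                "api_key": "sk-or-...",
            },
        },
    ],
)

resp = router.completion(
    model="longtail",
    messages=[{"role": "user", "content": "Hi"}],
)
```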
For SA studios, OpenRouter is genuinely valuable for prototyping and client demos. A single US$20 top-up unlocks dozens of frontier models for evaluation work; the free tier covers low-volume hobbyist use entirely. When pitching agent projects to clients, being able to demo the same agent answer on Claude vs GPT vs Gemini in 30 seconds, without standing up three separate provider accounts, meaningfully shortens the sales cycle.
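The demo loop itself is a few lines, since only the model string changes between vendors. A sketch reusing the `client` configured above (model IDs are the examples from earlier on this page):

```python
# Same prompt, three frontier models, one API key.
question = [{"role": "user", "content": "Summarise POPIA in two sentences."}]

for model in [
    "anthropic/claude-sonnet-4.6",
    "openai/gpt-5",
    "google/gemini-2.5-pro",
]:
    resp = client.chat.completions.create(model=model, messages=question)
    print(f"--- {model} ---")
    print(resp.choices[0].message.content)
```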
OpenRouter routes traffic through its US-hosted infrastructure. For POPIA-sensitive workloads carrying personal information, OpenRouter introduces a Section 72 cross-border transfer that wouldn't exist if you went direct to a regional provider. For non-PII workloads (research, prototyping, internal tooling without customer data), this is a non-issue. For customer-data-bearing production, prefer regional residency paths: AWS Bedrock af-south-1, Vertex AI africa-south1, Azure OpenAI South Africa North, or self-hosted vLLM.
OpenRouter is USD-billed like everything else. The aggregator margin is small enough that going through OpenRouter doesn't materially change FX cost vs going direct, and the operational simplicity of one bill vs N bills is genuinely worth the small markup for most studios. The pattern that wins for SA studios watching FX: OpenRouter for evaluation and low-volume work, direct vendor accounts for high-volume production traffic on specific models, regional cloud (Bedrock / Vertex) for residency-sensitive workloads.
- `anthropic/claude-...` model strings on OpenRouter are the simplest path to test Claude Opus / Sonnet / Haiku without an Anthropic account.
- `openai/gpt-...` strings are useful when comparing GPT to other models without juggling OpenAI organisation accounts.
- `google/gemini-...` strings are easier than setting up Vertex AI for prototyping work.
- Point LangChain's `ChatOpenAI` at OpenRouter and your graph runs against any of 300+ models.
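A minimal sketch of the LangChain wiring mentioned in the last point, assuming `langchain-openai` is installed; `ChatOpenAI` accepts `base_url` and `api_key`, so pointing it at OpenRouter is the same two-line change as with the raw SDK:

```python
from langchain_openai import ChatOpenAI

# Any OpenRouter model string works here; swap it to re-run the
# same chain / graph against a different vendor.
llm = ChatOpenAI(
    model="anthropic/claude-sonnet-4.6",
    base_url="https://openrouter.ai/api/v1",
    api_key="sk-or-...",
)

print(llm.invoke("Hi").content)
```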