Routerly v0.1.5: First Public Release
Today we publish the first tagged release of Routerly: v0.1.5. This is the baseline from which we will version the project going forward.
What ships in v0.1.5
Wire-compatible proxy for OpenAI and Anthropic
Routerly is a drop-in replacement for both the OpenAI and Anthropic APIs. Every SDK, tool, and framework that speaks either protocol works without modification: Python, Node.js, .NET, Go, Cursor, Continue.dev, LibreChat, LangChain, LlamaIndex, Open WebUI, and anything that can make an HTTP request.
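Switching an existing integration over is just a base-URL change. As an illustrative sketch (the localhost port and project token below are placeholders, not documented defaults), an OpenAI-style chat completion request aimed at Routerly looks like this:

```python
import json
import urllib.request

# Placeholder endpoint and token -- substitute your Routerly host and
# the Bearer token of the project the traffic should be billed to.
ROUTERLY_BASE = "http://localhost:3000/v1"
PROJECT_TOKEN = "rly_example_project_token"

payload = {
    "model": "gpt-4o-mini",  # any model registered in the project's pool
    "messages": [{"role": "user", "content": "Hello from Routerly"}],
}

req = urllib.request.Request(
    f"{ROUTERLY_BASE}/chat/completions",
    data=json.dumps(payload).encode("utf-8"),
    headers={
        "Authorization": f"Bearer {PROJECT_TOKEN}",
        "Content-Type": "application/json",
    },
    method="POST",
)
# urllib.request.urlopen(req) would send it; the wire format is the same
# one the OpenAI SDKs produce, so no client-side code changes are needed.
```

The same applies to Anthropic-protocol clients: only the base URL and token change.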
9 routing policies
Each request is scored against up to 9 pluggable policies, all applied simultaneously:
| Policy | Behaviour |
|---|---|
| llm | Delegates the routing decision to a language model |
| cheapest | Minimises cost per token |
| health | Deprioritises models with recent errors |
| performance | Favours models with lower average latency |
| capability | Matches models to task requirements |
| context | Filters by context window relative to the prompt |
| budget-remaining | Excludes models that would breach a project budget |
| rate-limit | Steers away from rate-limited endpoints |
| fairness | Balances load evenly across candidates |
Routerly picks the top-scoring candidate and falls back automatically if that provider fails.
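The selection loop can be pictured as a weighted score-and-sort with fallback. This is an illustrative sketch, not Routerly's actual internals: the policy weights, score values, and model names are all invented here.

```python
# Invented per-policy scores for two candidate models; policy names
# match the table above, but the numbers are fabricated.
candidates = {
    "gpt-4o-mini":  {"cheapest": 0.9, "health": 1.0, "performance": 0.7},
    "claude-haiku": {"cheapest": 0.8, "health": 0.4, "performance": 0.9},
}
weights = {"cheapest": 1.0, "health": 2.0, "performance": 1.0}

def total(scores):
    # Combine all policy scores at once, weighted per policy.
    return sum(weights[p] * s for p, s in scores.items())

# Rank all candidates by combined score, best first.
ranked = sorted(candidates, key=lambda m: total(candidates[m]), reverse=True)

def route(call_provider):
    # Walk the ranked list so a failed provider automatically
    # falls through to the next-best candidate.
    for model in ranked:
        try:
            return call_provider(model)
        except ConnectionError:
            continue  # provider failed; try the next one
    raise RuntimeError("all candidates failed")
```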
Real-time cost tracking and budgets
Every request is priced at the token level using per-model pricing. Costs accumulate per project, and hard spending limits (hourly, daily, weekly, monthly, or per-request) block overspending before it happens.
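Token-level pricing and a hard budget gate reduce to simple arithmetic. A sketch, assuming made-up per-million-token rates rather than Routerly's actual pricing table:

```python
# Hypothetical USD rates per 1M tokens; real values come from the
# per-model pricing data Routerly stores for each registered model.
PRICING = {"gpt-4o-mini": {"input": 0.15, "output": 0.60}}

def request_cost(model, input_tokens, output_tokens):
    rates = PRICING[model]
    return (input_tokens * rates["input"]
            + output_tokens * rates["output"]) / 1_000_000

def within_budget(spent_so_far, limit, model, in_toks, out_toks):
    """Hard limit: refuse the call if it could push spend past the cap."""
    return spent_so_far + request_cost(model, in_toks, out_toks) <= limit

cost = request_cost("gpt-4o-mini", 1_000, 500)  # 0.00045 USD
```

The same check applies at each budget window (hourly, daily, weekly, monthly, or per-request), so a request is rejected before it is forwarded upstream, not after.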
Multi-tenant project isolation
Each project gets its own Bearer token, its own model pool, its own routing configuration, and its own budget envelope. Ideal for multi-tenant SaaS products, or for keeping dev, staging, and production traffic cleanly separated.
Streaming and tool calling
Full streaming SSE support and OpenAI-compatible function/tool calling pass-through, including streaming tool call deltas.
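On the wire these are standard OpenAI-style SSE events. A minimal consumer sketch over a canned stream (the chunk contents below are fabricated for illustration):

```python
import json

# Fabricated sample of an OpenAI-style SSE stream,
# ending with the [DONE] sentinel.
raw_stream = (
    'data: {"choices":[{"delta":{"content":"Hel"}}]}\n\n'
    'data: {"choices":[{"delta":{"content":"lo"}}]}\n\n'
    "data: [DONE]\n\n"
)

def collect_text(stream):
    text = []
    for line in stream.splitlines():
        if not line.startswith("data: "):
            continue  # skip blank separator lines
        body = line[len("data: "):]
        if body == "[DONE]":
            break
        delta = json.loads(body)["choices"][0]["delta"]
        # Tool-call deltas carry no "content" key, so default to "".
        text.append(delta.get("content", ""))
    return "".join(text)
```

Because the events are pass-through, any client that already parses OpenAI streams (including streaming tool-call deltas) keeps working unchanged.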
Supported providers
OpenAI, Anthropic, Google Gemini, Ollama (local), Mistral, Cohere, xAI (Grok), and any custom HTTP endpoint. Cloud and local models can be mixed in the same project.
Built-in web dashboard
A React dashboard runs as part of the service:
- Overview: live spend, call volume, success rate, daily cost trend
- Models: manage registered models with provider badges and pricing data
- Projects: configure model pools, routing policies, and budget envelopes
- Usage analytics: per-model breakdown of calls, tokens, errors, and cost; filterable by period, project, and status
- Users and Roles: RBAC with configurable roles and permissions
- Light and dark themes
Admin CLI
The routerly CLI covers the full admin surface:
routerly status # server health and uptime
routerly model add # register a model
routerly project add # create a project
routerly user add # add a dashboard user
routerly role add # define custom roles
routerly report usage # pull usage reports
routerly service start # start the local service
One-line installer
# macOS / Linux
curl -fsSL https://github.com/Inebrio/Routerly/releases/latest/download/install.sh | bash
# Windows (PowerShell)
powershell -c "irm https://www.routerly.ai/install.ps1 | iex"
The installer detects the platform, optionally installs Node.js 20+, builds the packages, generates an encryption key, sets up an auto-start daemon, and walks through first-run setup. On existing installs it presents an Update / Reinstall / Uninstall menu.
A Docker image with a Compose file is also available for zero-dependency deployment (see the Docker Hub post).
Design principles
- No external database: config and usage data live in plain JSON files
- Fully decoupled: service, dashboard, and CLI can run on different machines
- No telemetry: prompts and API keys never leave your infrastructure
- Zero migration cost: change one environment variable, nothing else
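As an example of the one-variable migration: the official OpenAI Python SDK reads OPENAI_BASE_URL when constructing its client, so redirecting an app needs no code changes. The port and token below are placeholders for your own deployment:

```python
import os

# Point any OpenAI-SDK-based app at Routerly instead of api.openai.com.
# Placeholder host/port and project token -- substitute your own.
os.environ["OPENAI_BASE_URL"] = "http://localhost:3000/v1"
os.environ["OPENAI_API_KEY"] = "rly_example_project_token"
```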
Getting started
Full documentation is at docs.routerly.ai. Source code and issue tracker at github.com/Inebrio/Routerly.
Sources
- Release notes: github.com/Inebrio/Routerly/releases/tag/v0.1.5