Routerly v0.1.5: First Public Release
Today we publish the first tagged release of Routerly: v0.1.5. This is the baseline from which we will version the project going forward.
What ships in v0.1.5
Wire-compatible proxy for OpenAI and Anthropic
Routerly is a drop-in replacement for both the OpenAI and Anthropic APIs. Every SDK, tool, and framework that speaks either protocol works without modification: Python, Node.js, .NET, Go, Cursor, Continue.dev, LibreChat, LangChain, LlamaIndex, Open WebUI, and anything that can make an HTTP request.
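Switching an existing integration over is just a base-URL change. As an illustrative sketch (the localhost port and project token below are placeholders, not documented defaults), an OpenAI-style chat completion request aimed at Routerly looks like this:

```python
import json
import urllib.request

# Placeholder endpoint and token -- substitute your Routerly host and
# the Bearer token of the project the traffic should be billed to.
ROUTERLY_BASE = "http://localhost:3000/v1"
PROJECT_TOKEN = "rly_example_project_token"

payload = {
    "model": "gpt-4o-mini",  # any model registered in the project's pool
    "messages": [{"role": "user", "content": "Hello from Routerly"}],
}

req = urllib.request.Request(
    f"{ROUTERLY_BASE}/chat/completions",
    data=json.dumps(payload).encode("utf-8"),
    headers={
        "Authorization": f"Bearer {PROJECT_TOKEN}",
        "Content-Type": "application/json",
    },
    method="POST",
)
# urllib.request.urlopen(req) would send it; the wire format is the same
# one the OpenAI SDKs produce, so no client-side code changes are needed.
```

The same applies to Anthropic-protocol clients: only the base URL and token change.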
9 routing policies
Each request is scored against up to 9 pluggable policies, all applied simultaneously:
| Policy | Behaviour |
|---|---|
| llm | Delegates the routing decision to a language model |
| cheapest | Minimises cost per token |
| health | Deprioritises models with recent errors |
| performance | Favours models with lower average latency |
| capability | Matches models to task requirements |
| context | Filters by context window relative to the prompt |
| budget-remaining | Excludes models that would breach a project budget |
| rate-limit | Steers away from rate-limited endpoints |
| fairness | Balances load evenly across candidates |
Routerly picks the top-scoring candidate and falls back automatically if that provider fails.
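The selection loop can be pictured as a weighted score-and-sort with fallback. This is an illustrative sketch, not Routerly's actual internals: the policy weights, score values, and model names are all invented here.

```python
# Invented per-policy scores for two candidate models; policy names
# match the table above, but the numbers are fabricated.
candidates = {
    "gpt-4o-mini":  {"cheapest": 0.9, "health": 1.0, "performance": 0.7},
    "claude-haiku": {"cheapest": 0.8, "health": 0.4, "performance": 0.9},
}
weights = {"cheapest": 1.0, "health": 2.0, "performance": 1.0}

def total(scores):
    # Combine all policy scores at once, weighted per policy.
    return sum(weights[p] * s for p, s in scores.items())

# Rank all candidates by combined score, best first.
ranked = sorted(candidates, key=lambda m: total(candidates[m]), reverse=True)

def route(call_provider):
    # Walk the ranked list so a failed provider automatically
    # falls through to the next-best candidate.
    for model in ranked:
        try:
            return call_provider(model)
        except ConnectionError:
            continue  # provider failed; try the next one
    raise RuntimeError("all candidates failed")
```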
Real-time cost tracking and budgets
Every request is priced at the token level using per-model pricing. Costs accumulate per project, and hard spending limits (hourly, daily, weekly, monthly, or per-request) block overspending before it happens.
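Token-level pricing and a hard budget gate reduce to simple arithmetic. A sketch, assuming made-up per-million-token rates rather than Routerly's actual pricing table:

```python
# Hypothetical USD rates per 1M tokens; real values come from the
# per-model pricing data Routerly stores for each registered model.
PRICING = {"gpt-4o-mini": {"input": 0.15, "output": 0.60}}

def request_cost(model, input_tokens, output_tokens):
    rates = PRICING[model]
    return (input_tokens * rates["input"]
            + output_tokens * rates["output"]) / 1_000_000

def within_budget(spent_so_far, limit, model, in_toks, out_toks):
    """Hard limit: refuse the call if it could push spend past the cap."""
    return spent_so_far + request_cost(model, in_toks, out_toks) <= limit

cost = request_cost("gpt-4o-mini", 1_000, 500)  # 0.00045 USD
```

The same check applies at each budget window (hourly, daily, weekly, monthly, or per-request), so a request is rejected before it is forwarded upstream, not after.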
Multi-tenant project isolation
Each project gets its own Bearer token, its own model pool, its own routing configuration, and its own budget envelope. Ideal for multi-tenant SaaS products, or for keeping dev, staging, and production traffic cleanly separated.
Streaming and tool calling
Full streaming SSE support and OpenAI-compatible function/tool calling pass-through, including streaming tool call deltas.
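On the wire these are standard OpenAI-style SSE events. A minimal consumer sketch over a canned stream (the chunk contents below are fabricated for illustration):

```python
import json

# Fabricated sample of an OpenAI-style SSE stream,
# ending with the [DONE] sentinel.
raw_stream = (
    'data: {"choices":[{"delta":{"content":"Hel"}}]}\n\n'
    'data: {"choices":[{"delta":{"content":"lo"}}]}\n\n'
    "data: [DONE]\n\n"
)

def collect_text(stream):
    text = []
    for line in stream.splitlines():
        if not line.startswith("data: "):
            continue  # skip blank separator lines
        body = line[len("data: "):]
        if body == "[DONE]":
            break
        delta = json.loads(body)["choices"][0]["delta"]
        # Tool-call deltas carry no "content" key, so default to "".
        text.append(delta.get("content", ""))
    return "".join(text)
```

Because the events are pass-through, any client that already parses OpenAI streams (including streaming tool-call deltas) keeps working unchanged.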
Supported providers
OpenAI, Anthropic, Google Gemini, Ollama (local), Mistral, Cohere, xAI (Grok), and any custom HTTP endpoint. Cloud and local models can be mixed in the same project.
Built-in web dashboard
A React dashboard runs as part of the service:
- Overview: live spend, call volume, success rate, daily cost trend
- Models: manage registered models with provider badges and pricing data
- Projects: configure model pools, routing policies, and budget envelopes
- Usage analytics: per-model breakdown of calls, tokens, errors, and cost; filterable by period, project, and status
- Users and Roles: RBAC with configurable roles and permissions
- Light and dark themes
Admin CLI
The routerly CLI covers the full admin surface:
routerly status # server health and uptime
routerly model add # register a model
routerly project add # create a project
routerly user add # add a dashboard user
routerly role add # define custom roles
routerly report usage # pull usage reports
routerly service start # start the local service
One-line installer
# macOS / Linux
curl -fsSL https://github.com/Inebrio/Routerly/releases/latest/download/install.sh | bash
# Windows (PowerShell)
powershell -c "irm https://www.routerly.ai/install.ps1 | iex"
The installer detects the platform, optionally installs Node.js 20+, builds the packages, generates an encryption key, sets up an auto-start daemon, and walks through first-run setup. On existing installs it presents an Update / Reinstall / Uninstall menu.
A Docker image with a Compose file is also available for zero-dependency deployment (see the Docker Hub post).
Design principles
- No external database: config and usage data live in plain JSON files
- Fully decoupled: service, dashboard, and CLI can run on different machines
- No telemetry: prompts and API keys never leave your infrastructure
- Zero migration cost: change one environment variable, nothing else
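As an example of the one-variable migration: the official OpenAI Python SDK reads OPENAI_BASE_URL when constructing its client, so redirecting an app needs no code changes. The port and token below are placeholders for your own deployment:

```python
import os

# Point any OpenAI-SDK-based app at Routerly instead of api.openai.com.
# Placeholder host/port and project token -- substitute your own.
os.environ["OPENAI_BASE_URL"] = "http://localhost:3000/v1"
os.environ["OPENAI_API_KEY"] = "rly_example_project_token"
```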
Getting started
Full documentation is at docs.routerly.ai. Source code and issue tracker at github.com/Inebrio/Routerly.
Sources
- Release notes: github.com/Inebrio/Routerly/releases/tag/v0.1.5