Routerly v0.1.5: First Public Release

Carlo Satta · 4 min read

Today we publish the first tagged release of Routerly: v0.1.5. This is the baseline from which we will version the project going forward.

What ships in v0.1.5

Wire-compatible proxy for OpenAI and Anthropic

Routerly is a drop-in replacement for both the OpenAI and Anthropic APIs. Every SDK, tool, and framework that speaks either protocol works without modification: Python, Node.js, .NET, Go, Cursor, Continue.dev, LibreChat, LangChain, LlamaIndex, Open WebUI, and anything that can make an HTTP request.
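Because the proxy is wire-compatible, switching a client usually means changing only the base URL. A minimal Python sketch, assuming a local Routerly instance; the port and token below are placeholders, not Routerly defaults:

```python
import os

# Hypothetical values: adjust the host/port and use your project's
# Bearer token as the API key.
os.environ["OPENAI_BASE_URL"] = "http://localhost:8080/v1"
os.environ["OPENAI_API_KEY"] = "<routerly-project-token>"

# The official OpenAI SDKs read these variables at client construction,
# so existing application code runs unchanged:
#
#   from openai import OpenAI
#   client = OpenAI()  # now talks to Routerly instead of api.openai.com
```

The same applies to Anthropic-protocol clients: point the SDK's base URL at Routerly and keep everything else as-is.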

9 routing policies

Each request is scored against up to 9 pluggable policies, all applied simultaneously:

  • llm: Delegates the routing decision to a language model
  • cheapest: Minimises cost per token
  • health: Deprioritises models with recent errors
  • performance: Favours models with lower average latency
  • capability: Matches models to task requirements
  • context: Filters by context window relative to the prompt
  • budget-remaining: Excludes models that would breach a project budget
  • rate-limit: Steers away from rate-limited endpoints
  • fairness: Balances load evenly across candidates

Routerly picks the top-scoring candidate and falls back automatically if that provider fails.
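The score-then-fallback flow can be illustrated with a toy sketch; the candidate data, weights, and policy functions here are invented for illustration and are not Routerly's actual implementation:

```python
# Toy multi-policy scoring with fallback. Each policy returns a score
# per model, scores are summed, and candidates are tried best-first.

CANDIDATES = [
    {"name": "gpt-4o-mini", "cost_per_1k": 0.15, "avg_latency_ms": 400, "recent_errors": 0},
    {"name": "claude-haiku", "cost_per_1k": 0.25, "avg_latency_ms": 300, "recent_errors": 2},
]

POLICIES = [
    lambda m: 1.0 / m["cost_per_1k"],        # cheapest: favour low cost
    lambda m: 1000.0 / m["avg_latency_ms"],  # performance: favour low latency
    lambda m: -5.0 * m["recent_errors"],     # health: penalise recent errors
]

def rank(candidates):
    # Higher combined score means a better candidate.
    return sorted(candidates, key=lambda m: sum(p(m) for p in POLICIES), reverse=True)

def route(candidates, send):
    # Try candidates in score order, falling back if a provider fails.
    for model in rank(candidates):
        try:
            return send(model["name"])
        except ConnectionError:
            continue
    raise RuntimeError("all candidates failed")
```

With these weights the cheap, error-free model ranks first; if its provider raises, the router transparently retries the next candidate.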

Real-time cost tracking and budgets

Every request is priced at the token level using per-model pricing. Costs accumulate per project, and hard spending limits (hourly, daily, weekly, monthly, or per-request) block overspending before it happens.
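The mechanics look roughly like the following sketch; the prices, field names, and function names are illustrative, not Routerly's actual schema:

```python
# Token-level pricing plus a hard pre-flight budget check.

PRICING = {  # USD per 1M tokens (input, output); example figures only
    "gpt-4o-mini": (0.15, 0.60),
}

def price_request(model, prompt_tokens, completion_tokens):
    p_in, p_out = PRICING[model]
    return (prompt_tokens * p_in + completion_tokens * p_out) / 1_000_000

def check_budget(spent, request_cost, limit):
    # Hard limit: the request is rejected *before* it is forwarded
    # if it would push the project past its cap.
    if spent + request_cost > limit:
        raise PermissionError("budget exceeded")

cost = price_request("gpt-4o-mini", 1_000, 500)  # 0.00045 USD
```

The same check generalises to hourly, daily, weekly, and monthly windows by tracking `spent` per window.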

Multi-tenant project isolation

Each project gets its own Bearer token, its own model pool, its own routing configuration, and its own budget envelope. Ideal for multi-tenant SaaS products, or for keeping dev, staging, and production traffic cleanly separated.
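Conceptually, every Bearer token resolves to exactly one project with its own pool and budget. A minimal sketch with invented tokens and fields:

```python
# Hypothetical project records; not Routerly's actual data layout.
PROJECTS = {
    "rt-prod-token": {"name": "production", "models": ["gpt-4o", "claude-sonnet"], "budget_usd": 500},
    "rt-dev-token":  {"name": "dev",        "models": ["ollama/llama3"],          "budget_usd": 10},
}

def resolve_project(auth_header):
    # Expects "Authorization: Bearer <token>"; unknown tokens are rejected,
    # so one tenant can never see another tenant's models or spend.
    token = auth_header.removeprefix("Bearer ").strip()
    project = PROJECTS.get(token)
    if project is None:
        raise PermissionError("unknown project token")
    return project
```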

Streaming and tool calling

Full streaming SSE support and OpenAI-compatible function/tool calling pass-through, including streaming tool call deltas.
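An OpenAI-style SSE stream is a sequence of `data:` lines carrying JSON chunks, terminated by `data: [DONE]`. The sketch below shows how a client typically reassembles the content deltas; the sample events are hand-written, not captured Routerly output:

```python
import json

def iter_content(sse_lines):
    # Each event is "data: <json>"; "data: [DONE]" ends the stream.
    for line in sse_lines:
        if not line.startswith("data: "):
            continue
        payload = line[len("data: "):]
        if payload == "[DONE]":
            return
        delta = json.loads(payload)["choices"][0]["delta"]
        if "content" in delta:
            yield delta["content"]

sample = [
    'data: {"choices": [{"delta": {"role": "assistant"}}]}',
    'data: {"choices": [{"delta": {"content": "Hel"}}]}',
    'data: {"choices": [{"delta": {"content": "lo"}}]}',
    "data: [DONE]",
]
text = "".join(iter_content(sample))  # "Hello"
```

Streaming tool calls work the same way, except the deltas carry `tool_calls` fragments instead of `content`.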

Supported providers

OpenAI, Anthropic, Google Gemini, Ollama (local), Mistral, Cohere, xAI (Grok), and any custom HTTP endpoint. Cloud and local models can be mixed in the same project.

Built-in web dashboard

A React dashboard runs as part of the service:

  • Overview: live spend, call volume, success rate, daily cost trend
  • Models: manage registered models with provider badges and pricing data
  • Projects: configure model pools, routing policies, and budget envelopes
  • Usage analytics: per-model breakdown of calls, tokens, errors, and cost; filterable by period, project, and status
  • Users and Roles: RBAC with configurable roles and permissions
  • Light and dark theme

Admin CLI

The routerly CLI covers the full admin surface:

routerly status          # server health and uptime
routerly model add       # register a model
routerly project add     # create a project
routerly user add        # add a dashboard user
routerly role add        # define custom roles
routerly report usage    # pull usage reports
routerly service start   # start the local service

One-line installer

# macOS / Linux
curl -fsSL https://github.com/Inebrio/Routerly/releases/latest/download/install.sh | bash

# Windows (PowerShell)
powershell -c "irm https://www.routerly.ai/install.ps1 | iex"

The installer detects the platform, optionally installs Node.js 20+, builds the packages, generates an encryption key, sets up an auto-start daemon, and walks through first-run setup. On existing installs it presents an Update / Reinstall / Uninstall menu.

A Docker Compose image is also available for zero-dependency deployment (see the Docker Hub post).

Design principles

  • No external database: config and usage data live in plain JSON files
  • Fully decoupled: service, dashboard, and CLI can run on different machines
  • No telemetry: prompts and API keys never leave your infrastructure
  • Zero migration cost: change one environment variable, nothing else
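The "plain JSON files" principle can be sketched as a read-append-rewrite cycle; the file name and record shape below are illustrative, not Routerly's actual layout:

```python
import json
import os
import tempfile

def append_usage(path, record):
    # Load existing records (an empty list if the file does not exist yet).
    records = []
    if os.path.exists(path):
        with open(path) as f:
            records = json.load(f)
    records.append(record)
    # Write to a temp file, then rename: readers never see a half-written file.
    fd, tmp = tempfile.mkstemp(dir=os.path.dirname(path) or ".")
    with os.fdopen(fd, "w") as f:
        json.dump(records, f)
    os.replace(tmp, path)
```

Because everything is a flat file, backups, inspection, and migration are ordinary file operations, and no database server needs to run alongside the service.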

Getting started

Full documentation is at docs.routerly.ai. Source code and issue tracker at github.com/Inebrio/Routerly.

