Helix is the unified gateway for OpenAI, Anthropic, Google, Mistral, and 20+ other LLM providers. Smart routing, semantic caching, automatic fallback — through a single endpoint your team already knows.
// Drop-in for any OpenAI-compatible client.
import { Helix } from "@helix/sdk";

const helix = new Helix({ apiKey: process.env.HELIX_KEY });

const reply = await helix.chat.complete({
  model: "auto", // or "gpt-4o", "claude-opus-4", "gemini-2.5"
  messages: [{ role: "user", content: "Summarize Q3." }],
  fallback: ["claude-opus-4", "gpt-4o"],
  cache: true,
});
// → routed via Anthropic. cached. logged. billed.
Every team using LLMs in production hits the same wall. New models ship weekly. Pricing changes monthly. One provider goes down and your app dies. Helix is the abstraction layer that should have shipped two years ago.
Helix isn't a wrapper. It's the production-grade infrastructure layer between your code and every model on the market.
Change one base URL. Keep your code. Helix speaks the OpenAI API spec natively, so existing clients in any language work unchanged. No rewrite required.
Let Helix pick the right model per request. Optimize for cheapest, fastest, or highest-rated — configurable per endpoint, per customer, per call. Auto mode available.
When OpenAI 503s, Claude takes the request. When Claude rate-limits, Gemini does. Your users never know the difference. Sub-second failover.
Helix caches by meaning, not just exact match. Two prompts that mean the same thing? Same answer, no second bill (see the sketch after these cards). Cuts inference cost up to 60%. Avg 38% cost reduction.
Per-model cost, latency, error rate, token usage, customer attribution. Everything your finance and SRE teams have been asking for. Datadog + Grafana exports.
Deployed across 280+ Cloudflare PoPs. Routing decisions happen closer to your user than the model itself. Less than 50ms overhead, globally. Runs on Cloudflare Workers.
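Here is a concrete sketch of the semantic caching card above: two differently phrased prompts with the same meaning, one bill. It reuses the @helix/sdk client from the opening snippet; the cached field on the response is an assumption about the SDK's shape, not confirmed surface.

// Sketch: semantic caching, using the client from the opening snippet.
// The `cached` response field is an assumption, not confirmed SDK surface.
import { Helix } from "@helix/sdk";

const helix = new Helix({ apiKey: process.env.HELIX_KEY });

const first = await helix.chat.complete({
  model: "auto",
  cache: true,
  messages: [{ role: "user", content: "Summarize our Q3 revenue numbers." }],
});

// Same meaning, different words: served from cache, no second bill.
const second = await helix.chat.complete({
  model: "auto",
  cache: true,
  messages: [{ role: "user", content: "Give me a quick summary of Q3 revenue." }],
});

console.log(second.cached); // → true (assumed field)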
Helix is designed to disappear into your existing codebase. If your LLM client supports a base URL, you're done.
Replace one URL. Done. Helix is wire-compatible with the OpenAI Chat Completions API.
Choose models. Set fallback chains. Define caching strategy. All optional — defaults just work. A sketch of this step follows the quickstart code below.
Cost, latency, errors, savings — all live. Export to Datadog, Grafana, or your warehouse.
// Step 1 — Point your existing OpenAI client at Helix.
import OpenAI from "openai";

const client = new OpenAI({
  baseURL: "https://api.helix.dev/v1",
  apiKey: process.env.HELIX_KEY,
});

// That's it. Your existing code now routes through Helix.
const r = await client.chat.completions.create({
  model: "gpt-4o",
  messages: [{ role: "user", content: "Hello" }],
});
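Step 2 might look like the following, sketched with the @helix/sdk client from the top of the page. The routing option and the object form of cache are illustrative assumptions about what per-call configuration could look like, not final API.

// Step 2 (sketch): routing, fallback chain, and cache strategy per call.
// `routing` and the object form of `cache` are hypothetical option shapes.
import { Helix } from "@helix/sdk";

const helix = new Helix({ apiKey: process.env.HELIX_KEY });

const reply = await helix.chat.complete({
  model: "auto",
  routing: "cheapest", // hypothetical: "cheapest" | "fastest" | "highest-rated"
  fallback: ["claude-opus-4", "gemini-2.5"], // tried in order on errors or rate limits
  cache: { semantic: true, ttl: 3600 }, // hypothetical: semantic matching, 1h TTL
  messages: [{ role: "user", content: "Summarize Q3." }],
});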
The observability layer your AI bill has been begging for. Drill into cost-per-customer, model performance, cache hit rates — in real time.
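For warehouse exports, a pull could be as simple as one authenticated GET. The /v1/analytics path, query parameters, and response fields below are hypothetical, shown only to illustrate the shape of the data.

// Hypothetical analytics pull: endpoint, params, and fields are assumptions.
const res = await fetch(
  "https://api.helix.dev/v1/analytics?groupBy=customer&window=24h",
  { headers: { Authorization: `Bearer ${process.env.HELIX_KEY}` } },
);
const rows = await res.json();
// e.g. [{ customer: "acme", cost_usd: 12.4, p95_ms: 820, cache_hit_rate: 0.41 }]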
Free to start. Subscription tiers for serious teams. A 1.5% routing fee on enterprise volume. Three independent revenue streams baked into the product from day one.
Point the baseURL of your existing client at Helix and everything keeps working. No new SDK to learn unless you want the advanced routing features.
Define a fallback chain (e.g., [claude-opus-4, gemini-2.5]). The whole failover happens in under 800ms. Your end users never see a 500.
Spin up Helix in five minutes — or, if you're an acquirer, take the whole thing.