Your AI dev server. Always on. Always with you.
A hosted workspace with persistent memory, RAG, files, vault, and an AI panel — on Cloudflare's edge. Use it from Claude Code, Codex, or any MCP-compatible CLI. Code from any browser. Pick up exactly where you left off.
> add a /healthcheck route that returns
build sha and uptime
✓ workspace_files_list → src/{index,router,health}.ts
✓ workspace_files_read src/router.ts
✓ workspace_files_write src/health.ts (14 lines)
✓ workspace_files_write src/router.ts (added route)
Added GET /healthcheck → { sha, uptime_ms }.
Wired it into router.ts at line 23.
Tested locally: 200 OK in 0.6ms.
Three reasons your terminal isn't enough anymore.
You can run an agent on your laptop. You shouldn't have to.
Always on. Never reset.
Your files, memory, and open conversations live on the edge — not on the laptop you forgot to plug in. Sign in from a borrowed Chromebook and pick up exactly where you left off.
Agent co-located with your code.
The Claude agent runs in the same worker as your file tree, your RAG, and your memory. Tool calls execute as direct binding hits — no API round-trip, no context window spent re-explaining your project.
Production-adjacent.
Every workspace is already on Cloudflare's edge. The same network that serves your dev shell will serve your users. "Works on my machine" stops being a category of bug.
Eight primitives. Use them via the agent, or call each one directly.
Every primitive is a service in its own right — RAG, memory, vault, and the AI panel each have their own API. The agent loop is what makes them feel like one product. Pick what you need.
Workspace
A persistent file tree per (operator, end_user). Read, write, list, delete — scoped to the JWT.
POST /v1/workspace/files/write · agent: workspace_files_write
Agent loop
Claude with workspace, RAG, and memory tools wired in. One call, autonomous loop, full transcript back.
POST /v1/workspace/agent/chat · iox chat "ship the changes"
Memory
Per-user durable memory with full-text recall, hybrid scoring, and importance decay.
POST /v1/memory/{remember,recall} · agent: memory_remember
RAG
Hybrid retrieval: vector + BM25 + cross-encoder rerank. Per-operator indexes with hard tenant isolation.
POST /v1/rag/query?hybrid&rerank · agent: rag_query
Vault
Per-operator encrypted secrets. KEK derived per tenant, never the same key twice.
POST /v1/vault/put · iox vault put oracle-prod-pwd
AI Panel
Pick your panel: 1–8 participants, optional role per model. Mix Claude, GPT-5, Gemini, Grok, Workers AI (Llama, Kimi, DeepSeek, Qwen), or your own custom endpoint.
POST /v1/panel/consult { participants: [...] } · iox panel --role 'security reviewer' opus
Git
v0.2 · Workspace history, commits, branches. Push to GitHub or stay private inside IOX.
POST /v1/workspace/git/{commit,push} · agent: git_commit
Deploys
v0.2 · One-command deploy from a workspace to Cloudflare Pages or Workers. Preview URLs per branch.
POST /v1/workspace/deploy · agent: deploy_to_edge
Solid border = direct API · Dashed = same primitive used by the agent
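To make the RAG card concrete: "vector + BM25" hybrid retrieval means merging two differently-ranked result lists before the rerank step. The sketch below assumes reciprocal rank fusion (RRF) as the merge strategy — a common choice, but the source doesn't specify which fusion IOX actually uses, and the chunk IDs are made up.

```python
# Hypothetical sketch of hybrid-retrieval score fusion (RRF).
# Assumption: each retriever returns doc IDs best-first.

def rrf_fuse(vector_hits: list[str], bm25_hits: list[str], k: int = 60) -> list[str]:
    """Merge two ranked lists; docs ranked high in either list win."""
    scores: dict[str, float] = {}
    for ranked in (vector_hits, bm25_hits):
        for rank, doc_id in enumerate(ranked):
            # Each appearance contributes 1/(k + rank); k damps rank-1 dominance.
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank + 1)
    return sorted(scores, key=scores.get, reverse=True)

fused = rrf_fuse(
    vector_hits=["chunk-7", "chunk-2", "chunk-9"],
    bm25_hits=["chunk-2", "chunk-4", "chunk-7"],
)
# chunk-2 and chunk-7 appear in both lists, so they outrank single-list hits
```

The fused list is what a cross-encoder reranker would then reorder for final precision.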
Claude Code. Codex. Any MCP-compatible client.
IOX ships an MCP server. Your local CLI stays your local CLI — we just hand it persistent files, RAG, memory, and the AI panel to operate on. One install, no lock-in to either model vendor.
Claude Code
by Anthropic
The CLI you already use. IOX exposes workspace files, RAG, memory, vault, and panel as MCP tools — install once and Claude Code operates on your hosted state.
# install Claude Code (if you haven't)
npm i -g @anthropic-ai/claude-code
# wire it to IOX
claude mcp add iox npx -- @iox/mcp-server \
--token $IOX_TOKEN
# work
claude
Codex CLI
by OpenAI
OpenAI's open-source coding agent. Same MCP transport, same IOX tools — pick the model you prefer and your workspace stays the same on either side.
# install Codex CLI
npm i -g @openai/codex
# add IOX as an MCP server
codex mcp add iox npx @iox/mcp-server \
--token $IOX_TOKEN
# work
codex
@iox/mcp-server is built and proven (10 tools, stdio MCP, smoke-tested) — publishing to npm this week with the v0.1 invite wave. Beta testers get a direct tarball link in their welcome email.
Prefer a hosted shell with the CLI pre-installed? That's the v0.2 Cloudflare Containers path — same workspace, same files, accessed via a web terminal instead of your local one.
IOX runs entirely on Cloudflare. That isn't a slide — it's the design.
Per-operator state lives in a Durable Object, which is itself a tenant boundary. Vectors share an index but are filtered by metadata. Files share a bucket but are prefixed by JWT scope. We didn't bolt isolation on; we picked the primitives that already had it.
- <5ms cold start (DO)
- 330+ edge POPs
- $0 R2 egress
- Structural tenant boundary
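The "structural tenant boundary" claim can be sketched in a few lines: when the Durable Object name, the R2 prefix, and the encryption key are all *derived from* the tenant identity, cross-tenant access has no code path to guard. Everything below is illustrative — the identifier formats and the HMAC-based KEK derivation are assumptions, not IOX's actual scheme.

```python
# Hypothetical sketch of per-tenant scoping by construction.
import hashlib
import hmac

MASTER_KEY = b"operator-master-key"  # placeholder; a real root key lives in a KMS

def tenant_scope(operator: str, end_user: str) -> dict[str, str]:
    tenant_id = f"{operator}:{end_user}"
    do_name = f"do:{tenant_id}"            # one Durable Object instance per pair
    r2_prefix = f"{operator}/{end_user}/"  # file keys can only collide inside a scope
    # Per-tenant KEK: keyed hash of the tenant id (HKDF-extract shape),
    # so "never the same key twice" holds by construction.
    kek = hmac.new(MASTER_KEY, tenant_id.encode(), hashlib.sha256).hexdigest()
    return {"do": do_name, "prefix": r2_prefix, "kek": kek}

a = tenant_scope("acme", "user-1")
b = tenant_scope("acme", "user-2")
assert a["kek"] != b["kek"]  # different end-user → different key, no if-statement needed
```

Reading another tenant's data would require deriving their scope, which the JWT never yields.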
Same workspace. White-labeled. Resold.
The same primitives that power your personal dev server can be the foundation of an offering you sell. Each end-client gets their own workspace, their own vault, their own RAG. You set the price, brand, and terms.
Real example. An Oracle ERP consultancy ships engagements for half a dozen Fusion customers. With IOX, every customer gets a workspace pre-loaded with their credentials, their BIP / Fusion docs in a private RAG index, and a Claude agent that knows their domain. The consultancy ships under erp.theirbrand.com; we never appear.
Per-tenant isolation isn't a config flag — it's structural. Different operator = different Durable Object. Different end-user = different scope on every binding.
POST /v1/workspace/create
{
"name": "acme-prod",
"rag_index": "acme-fusion"
}
iox rag ingest acme-fusion \
  --from ./acme-bipdocs/*.md --hybrid
iox chat --workspace acme-prod \
  "explain why InvoicePayloadProvider returns
  null for fiscal_year 2026"
Honest about token costs. No surprise invoices.
The free tier exists so you can actually try the product. We cap it on small models because we'd rather throttle you than send you a bill in month two. BYOK is a first-class option at every tier.
Free
For trying it out and small personal projects.
- 1 workspace
- 100MB R2 + 10K vectors
- 50 agent calls / month on Haiku & Sonnet
- Or BYOK — point at your own Anthropic / OpenAI key
- RAG, memory, vault, AI panel — all included
- Community Discord
Opus + GPT-5-class models are paid-tier only. Embeddings + rerank capped daily.
Join the waitlist
Pro
Recommended
For developers using IOX day-to-day.
- Unlimited workspaces
- 10GB R2 + 1M vectors
- $15 of LLM credit / month at wholesale
- All models including Opus & GPT-5
- Hybrid retrieval + BGE rerank, no daily cap
- Email support, 1-day SLO
Token overage metered transparently at ~30% markup. Status page shows your spend live.
Join the waitlist
Team & Operator
For teams of 3+ or anyone reselling to clients.
- 5 seats included, $20/seat after
- Operator features: white-label, Stripe Connect, per-tenant quotas
- Per-tenant Vectorize + KV isolation
- Custom domain, your branding
- Slack channel, 1-hour SLO
- Self-host guide on request
Operator pricing scales with end-user count — talk to us for unit economics that fit.
Talk to us
Why the small free tier? An 8-step agent loop on Claude Opus burns ~$0.05. A truly free 1K-call quota would cost us $50/user/month — which means we'd disappear before you could rely on us. Small free + BYOK is the honest version of "free forever".
The questions devs actually ask.
We don't soften answers. If something's not built yet, it says "in progress" and ships when it ships.
How is this different from the Vercel AI SDK?
Vercel AI SDK is a client library for talking to LLMs. IOX is a multi-tenant backend for hosting end-user state — workspaces, files, memory, RAG, agents — that LLMs operate on. You'd use them together: the AI SDK in your frontend, IOX as the persistence and tool-execution layer behind it.
What's the data isolation guarantee?
End-user data lives behind two structural barriers: a per-(operator, end_user) Durable Object for memory and a per-(operator) prefix on R2 + Vectorize metadata filter for files and vectors. Cross-tenant reads aren't blocked by an `if` statement — they're blocked because the data is in a different DO instance or behind a different filter. The JWT carries the scope; the storage layer enforces it.
Can I bring my own model?
Yes — three ways. (1) BYOK: point IOX at your own Anthropic / OpenAI / Gemini / xAI key and we just route. (2) Custom endpoint: register any OpenAI-compatible URL (your fine-tuned Llama on Together, your on-prem GPU, Replicate, RunPod, Modal) and IOX treats it like any other model — same panel, same agent loop, same cost ceiling. (3) Workers AI native: 7+ models (Llama 3.3 70B, Kimi K2.6, DeepSeek R1, Qwen Coder, GLM Flash, Granite Micro, GPT-OSS 120B) bundled into every workspace at CF wholesale prices.
Can I configure who's on the AI Panel?
Yes. The panel takes a `participants: [{ model, role? }]` array — pick 1 to 8 models, assign each one a role like 'security reviewer' or 'cost analyst', and the synthesizer reconciles. Defaults to a 4-vendor cross-lab consensus (Claude / GPT-5 / Gemini / Grok) when you don't specify. Shipped today.
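The documented `participants: [{ model, role? }]` shape is simple enough to build by hand. The sketch below assembles a consult body and falls back to the 4-vendor default roster the FAQ describes; the model IDs and the `prompt` field name are illustrative, not a published schema.

```python
# Hypothetical sketch of building a /v1/panel/consult request body.
DEFAULT_PANEL = [{"model": m} for m in ("claude", "gpt-5", "gemini", "grok")]

def panel_body(prompt: str, participants=None) -> dict:
    """Return a consult payload; roles are optional per participant."""
    roster = participants if participants else DEFAULT_PANEL
    if not 1 <= len(roster) <= 8:
        raise ValueError("panel takes 1-8 participants")
    return {"prompt": prompt, "participants": roster}

body = panel_body(
    "review this auth change",
    [{"model": "opus", "role": "security reviewer"},
     {"model": "gpt-5", "role": "cost analyst"}],
)
# omitting participants falls back to the cross-lab consensus roster
```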
Does Claude CLI work with IOX?
Yes — from day one, via MCP. Install the `@iox/mcp-server` package, run `claude mcp add iox npx @iox/mcp-server`, and your local Claude Code gets a set of IOX tools: read/write your workspace files, query your RAG indexes, recall and remember per-user memory, consult the AI panel. Claude Code stays on your laptop where you already use it; IOX provides the persistent hosted state it operates on. (A second path — a hosted Ubuntu shell with Claude Code pre-installed — ships in v0.2 once Cloudflare Containers proves out for our workload.)
Can I self-host?
Operator tier customers get a deployment guide for running IOX on their own Cloudflare account. The dispatch worker, DOs, R2, KV, D1, and Vectorize bindings all transfer cleanly — wrangler does the lift. You bring your own API keys for the AI Gateway.
Where does the agent run?
Inside the dispatch worker. Tool calls hit our R2 / Vectorize / DO bindings directly — no extra network hop. Latency for an 8-step agent loop runs ~1.2s on warm cache; the cold path adds ~30ms for the JWT verify. We don't spin up containers per session, so there's no provisioning latency to amortize.
What happens if I exceed my quota?
Hard caps return 429s with a JSON body explaining which limit hit. We don't auto-overage on Personal or Pro — you have to opt into Operator tier for usage-based pricing. We'd rather explain a 429 than send you a surprise invoice.
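The hard-cap behavior described above is easy to picture: a quota check either passes or returns a 429 whose body names the limit that was hit. The field names below are illustrative, not IOX's actual response schema.

```python
# Hypothetical sketch of the hard-cap response described in the FAQ.
def check_quota(used: int, limit: int, name: str) -> tuple[int, dict]:
    """Return (status, body); no auto-overage, just an explanatory 429."""
    if used < limit:
        return 200, {"ok": True}
    return 429, {
        "error": "quota_exceeded",
        "limit": name,          # which cap was hit
        "used": used,
        "max": limit,
        "hint": "upgrade tier or wait for the monthly reset",
    }

status, body = check_quota(used=50, limit=50, name="agent_calls_per_month")
# status == 429; body["limit"] tells you exactly which cap you hit
```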
Is this SOC2 / GDPR ready?
GDPR: yes — Cloudflare data residency controls apply, you can pin DO + R2 + Vectorize to EU. SOC2: in progress (Type 1 report Q3 2026). Until then, Operator tier customers can request our security review packet.
The dev server you'll actually leave running.
We're onboarding in small batches while we tune the free-tier economics. Drop your email — invites go out within the week.