Encrypted Inference

All models. All agents.
End-to-end encrypted.

One inference path for every coding agent: Claude, Codex, Blackbox, Grok, and every model on the platform. Encrypted from your device to the model and back, with customer-managed keys and zero data retention. Not even Blackbox can read your prompts or completions.

[Diagram: inference path, end-to-end]
Your device (client) → encrypted → Blackbox (zero-knowledge proxy) → encrypted → every model (+24)
Prompts · completions: never readable by Blackbox

Teams at Fortune 500 companies that depend on BLACKBOX.AI

Deloitte
Microsoft
Intel
Accenture
Apple
Amazon
Salesforce
Google
Infosys
Oracle
Capgemini
GitHub
ByteDance
Cisco
PwC
SAP
Cognizant

Benchmark · Same model · 7 providers

Same model.
Faster. Cheaper.

Artificial Analysis independently benchmarked seven providers serving the same open-weights model, NVIDIA's Nemotron 3 Ultra 550B. BLACKBOX ranked #1 on all three axes at once: output speed, time-to-first-token, and blended price. Same weights as everyone else. We just serve them faster, for less.

Read the benchmark · Artificial Analysis

[ FIG.01 / SCATTER ] Speed × price · 7 providers
Best = top-left (most attractive)
x-axis: price, USD / 1M tokens (0.0 to 1.4), blended 7:2:1 · cache-input-output
y-axis: output speed, t/s (0 to 500)
Providers plotted: BLACKBOX AI, CoreWeave, DeepInfra BF16, DeepInfra, Together.ai, Lightning AI, Nebius
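
The chart's price axis is a blend weighted 7:2:1 across cached-input, input, and output tokens. A minimal sketch of that arithmetic follows, assuming the ratio is a plain weighted average. The `TokenPrices` shape and the sample prices are hypothetical placeholders, not figures from the benchmark.

```typescript
// Per-million-token prices in USD for one provider.
interface TokenPrices {
  cachedInput: number;
  input: number;
  output: number;
}

// Assumption: "7:2:1" means 7 parts cached input, 2 parts input, 1 part output.
const WEIGHTS = { cachedInput: 7, input: 2, output: 1 };

function blendedPrice(p: TokenPrices): number {
  const totalWeight = WEIGHTS.cachedInput + WEIGHTS.input + WEIGHTS.output;
  const weighted =
    WEIGHTS.cachedInput * p.cachedInput +
    WEIGHTS.input * p.input +
    WEIGHTS.output * p.output;
  return weighted / totalWeight;
}

// Hypothetical prices, for illustration only.
const samplePrices: TokenPrices = { cachedInput: 0.1, input: 0.5, output: 2.0 };
console.log(blendedPrice(samplePrices).toFixed(2)); // prints 0.37
```

Because cached input carries most of the weight, a provider's cache discount moves the blended figure more than its output price does.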

[ FIG.02 / TOKENS ] Output throughput · 7 providers
t/s · higher is better

BLACKBOX AI       453.3
CoreWeave         257.2
DeepInfra BF16    217.3
DeepInfra         189.9
Together.ai       157.8
Nebius            142.0
Lightning AI       82.0

#1 Fastest: 453 t/s
#1 Lowest latency: 6.03 s
#1 Lowest price: $0.44
Latency ranked among 5 providers reporting TTFT
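
To see how throughput and time-to-first-token combine, here is a rough sketch that estimates the wall-clock time of one streamed response. It assumes a simplified model (TTFT plus output tokens divided by output speed), and the 500-token response length is a hypothetical workload, not part of the benchmark.

```typescript
// Estimated wall-clock seconds for a streamed response:
// time to first token plus the time to generate the output tokens.
function estimateResponseSeconds(
  ttftSeconds: number,
  tokensPerSecond: number,
  outputTokens: number
): number {
  if (tokensPerSecond <= 0) {
    throw new RangeError("tokensPerSecond must be positive");
  }
  return ttftSeconds + outputTokens / tokensPerSecond;
}

// Figures quoted above for BLACKBOX AI: 6.03 s TTFT and 453.3 t/s.
// The 500-token output length is a hypothetical workload.
const estimatedSeconds = estimateResponseSeconds(6.03, 453.3, 500);
console.log(estimatedSeconds.toFixed(2)); // prints 7.13
```

For short responses TTFT dominates the total. Output speed only becomes the larger term once responses run to several thousand tokens.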
Multi-Harness

Any agent.
Any model.
One encrypted layer.

Claude Code · Codex · Blackbox · Gemini · Goose · +7 more

Run the coding agents your team already trusts on one encrypted inference layer. One bill. One audit trail. Zero data retention.

12+ agents · 24+ models · 1 API · 0 data retention
Your Agent Platform

Dispatch from anywhere,
anytime, autonomously.

One platform, every surface. Dispatch autonomous coding agents from your terminal, IDE, or API. They compete, collaborate, and ship code while you focus on what matters.

CLI

Your terminal, supercharged

Dispatch competing agents from a single command. They analyze your codebase, generate solutions in parallel, and open PRs, no browser needed.

  • Multi-agent parallel execution
  • Automatic PR creation
  • CI/CD pipeline integration

API

Programmable agent execution

Integrate agent execution into any workflow with OpenAI-compatible endpoints. Chat completions, multi-agent orchestration, and real-time streaming.

  • OpenAI-compatible endpoints
  • Multi-agent orchestration
  • WebSocket streaming
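
Since the endpoints are described as OpenAI-compatible, a request should look like a standard chat completions call. The sketch below only builds the request, under stated assumptions: `BASE_URL` is a hypothetical placeholder, `buildChatRequest` is an illustrative helper, and using `claude-3-7-sonnet` as a model id is a guess based on the status list on this page. The `stream` flag follows the OpenAI convention. The WebSocket streaming interface is not documented here, so it is not shown.

```typescript
// Message and request shapes for an OpenAI-style chat completions call.
interface ChatMessage {
  role: "system" | "user" | "assistant";
  content: string;
}

interface ChatRequest {
  url: string;
  init: { method: "POST"; headers: Record<string, string>; body: string };
}

// Hypothetical base URL: replace with the endpoint from your account settings.
const BASE_URL = "https://api.example.invalid/v1";

function buildChatRequest(
  apiKey: string,
  model: string,
  messages: ChatMessage[],
  stream = false
): ChatRequest {
  return {
    url: `${BASE_URL}/chat/completions`,
    init: {
      method: "POST",
      headers: {
        "Content-Type": "application/json",
        Authorization: `Bearer ${apiKey}`,
      },
      body: JSON.stringify({ model, messages, stream }),
    },
  };
}

// Usage (not executed here, because the URL is a placeholder):
//   const { url, init } = buildChatRequest(key, "claude-3-7-sonnet", [
//     { role: "user", content: "Hello" },
//   ]);
//   const res = await fetch(url, init);
//   const data = await res.json();
//   console.log(data.choices[0].message.content);
```

Keeping request construction separate from `fetch` makes it easy to point an existing OpenAI client or test suite at a different base URL.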

IDE

Agents in your editor

Agents work alongside you inside VS Code or Blackbox IDE. Real-time code generation, refactoring, and testing, right where you write code.

  • Inline code generation
  • Context-aware refactoring
  • Integrated test runner
Chairman LLM

Agents compete.
Best output wins.

Dispatch the same task to multiple AI agents, then let Chairman LLM evaluate every candidate on correctness, performance, risk, and complexity.

Task

Implement rate limiting middleware with Redis backend for the API gateway

Claude Code (winner)

I'll use a sliding window algorithm with Redis MULTI/EXEC for atomicity. The middleware checks req count per IP in a 60s window, returns 429 when exceeded.

Codex

Implementing token bucket via Redis INCR + EXPIRE. Each request decrements the bucket; refill rate is configurable per route. Includes retry-after header.

Blackbox

I recommend a distributed rate limiter using Redis sorted sets for precise sliding windows. Supports per-user and per-endpoint limits with graceful degradation.

Winner Selected

claude code

confidence: 0.94

TESTS: 46/46 · correctness: 0.97
PR #218 opened
src/middleware/rate-limit.ts   +47 -12
src/config/redis.ts            +18 -3
tests/rate-limit.test.ts       +94 -0
3 files changed                +159 -15

Parallel Dispatch

Same task, multiple agents. Blackbox, Claude Code, and Codex work simultaneously, each producing an independent solution.

Weighted Evaluation

Chairman LLM scores every candidate across correctness, performance, risk, and complexity, fully configurable per task.

Ship Automatically

The winning solution is packaged into a PR with test results, evaluation breakdown, and diff, ready to merge in one click.
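
The dispatch-then-evaluate flow above can be sketched in a few lines. This is an illustration under stated assumptions, not the platform's implementation: agents are modeled as plain async functions, the per-criterion scores are given numbers rather than produced by Chairman LLM, every score is normalized to [0, 1] with higher meaning better (so a "risk" score of 0.9 means low risk), and `dispatch`, `weightedScore`, and `pickWinner` are hypothetical names.

```typescript
type Criterion = "correctness" | "performance" | "risk" | "complexity";
type Scores = Record<Criterion, number>; // each in [0, 1], higher is better

interface Candidate {
  agent: string;
  scores: Scores;
}

// An agent is modeled as any async function that returns a scored candidate.
type Agent = (task: string) => Promise<Candidate>;

// Run all agents concurrently; drop the ones that fail instead of aborting the batch.
async function dispatch(task: string, agents: Agent[]): Promise<Candidate[]> {
  const settled = await Promise.allSettled(agents.map((run) => run(task)));
  const succeeded: Candidate[] = [];
  for (const result of settled) {
    if (result.status === "fulfilled") succeeded.push(result.value);
  }
  return succeeded;
}

// Weighted average of the four criteria; weights need not sum to 1.
function weightedScore(scores: Scores, weights: Scores): number {
  const criteria: Criterion[] = ["correctness", "performance", "risk", "complexity"];
  let total = 0;
  let weightSum = 0;
  for (const c of criteria) {
    total += scores[c] * weights[c];
    weightSum += weights[c];
  }
  return weightSum === 0 ? 0 : total / weightSum;
}

// Return the candidate with the highest weighted score, or undefined if none.
function pickWinner(candidates: Candidate[], weights: Scores): Candidate | undefined {
  let best: Candidate | undefined;
  let bestScore = -Infinity;
  for (const candidate of candidates) {
    const score = weightedScore(candidate.scores, weights);
    if (score > bestScore) {
      best = candidate;
      bestScore = score;
    }
  }
  return best;
}
```

Using `Promise.allSettled` rather than `Promise.all` means one crashed agent does not discard the work of the others. Changing the weights per task is what lets a security fix favor risk while a hot path favors performance.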

Inference Engine

Inference that's 60% faster
than calling OpenAI direct.

The BLACKBOX gateway runs every request through an optimized inference engine: the same frontier models, served with dramatically higher output throughput. One encrypted endpoint, measurably faster tokens.

  • 60% higher output throughput vs. direct OpenAI

    The same model, served through the BLACKBOX inference engine, streams output tokens up to 60% faster than calling the provider directly, measured on identical prompts, side by side.

  • Drop-in OpenAI-compatible endpoint
  • End-to-end encrypted inference
  • Automatic multi-provider failover
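
Multi-provider failover generally means trying the next provider when one fails. The sketch below shows that pattern in its simplest form, assuming a fixed priority order and no retries or health checks. It illustrates the idea, not the gateway's actual routing logic, and `Provider` and `withFailover` are hypothetical names.

```typescript
// A provider is any async function that attempts the request.
interface Provider<T> {
  name: string;
  call: () => Promise<T>;
}

interface FailoverResult<T> {
  provider: string;   // the provider that finally answered
  value: T;
  failures: string[]; // providers that failed before the success
}

// Try providers in priority order and return the first success.
async function withFailover<T>(providers: Provider<T>[]): Promise<FailoverResult<T>> {
  const failures: string[] = [];
  for (const provider of providers) {
    try {
      const value = await provider.call();
      return { provider: provider.name, value, failures };
    } catch {
      // Record the failure and fall through to the next provider.
      failures.push(provider.name);
    }
  }
  throw new Error(`all providers failed: ${failures.join(", ")}`);
}
```

Returning the list of failed providers alongside the result keeps the failover visible to callers and logs instead of hiding it.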
Get API access
OpenAI direct vs. BLACKBOX inference
+60% output tps · Same models · E2EE by default
System Status

All models.
All online.

Live latency and uptime metrics across every model on the BLACKBOX platform, updated every few seconds, straight from the inference gateway.

8/8 models live
99.96% avg uptime
Last refresh: n/a
blackbox-gateway / model-health · Live

Model               Provider    Status        Uptime
claude-3-7-sonnet   Anthropic   Operational   99.98%
gpt-4o              OpenAI      Operational   99.95%
gemini-2.5-pro      Google      Operational   99.97%
blackbox-v3         Blackbox    Operational   99.99%
deepseek-v3         DeepSeek    Operational   99.91%
grok-3              xAI         Operational   99.94%
llama-3.3-70b       Meta        Operational   99.96%
qwen-2.5-72b        Alibaba     Operational   99.93%

Metrics sourced from the BLACKBOX inference gateway · TTFT = time to first token
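
Uptime percentages are easier to compare as allowed downtime. The helper below does that conversion. The page does not state the measurement window, so the 30-day window is an assumption, and `downtimeMinutes` is a hypothetical name.

```typescript
// Convert an uptime percentage into minutes of downtime over a window.
function downtimeMinutes(uptimePercent: number, windowDays: number): number {
  if (uptimePercent < 0 || uptimePercent > 100) {
    throw new RangeError("uptimePercent must be between 0 and 100");
  }
  const windowMinutes = windowDays * 24 * 60;
  return windowMinutes * (1 - uptimePercent / 100);
}

// 99.96% average uptime over an assumed 30-day window.
console.log(downtimeMinutes(99.96, 30).toFixed(1)); // prints 17.3
```

On the same assumed window, the spread between 99.91% and 99.99% is roughly 39 minutes versus 4 minutes of downtime.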

Live Comparison

Every agent harness.
One platform.

Claude Code, Codex, Blackbox: access every coding agent through a single API. Compare harnesses side by side and ship the best result.

src/middleware/rate-limiter.ts  +35 -8

  import { Redis } from '@upstash/redis';
  import type { NextRequest } from 'next/server';

  // Assumed client setup: the excerpt uses `redis` without showing where it is created.
  const redis = Redis.fromEnv();

- const RATE_LIMIT = 100;
- const WINDOW_MS = 60_000;
+ interface SlidingWindowConfig {
+   maxRequests: number;
+   windowMs: number;
+   keyPrefix?: string;
+ }
+
+ const DEFAULT_CONFIG: SlidingWindowConfig = {
+   maxRequests: 100,
+   windowMs: 60_000,
+   keyPrefix: 'rl:sw',
+ };

- export async function rateLimit(req) {
-   const ip = req.headers.get('x-forwarded-for');
-   const count = await redis.incr(ip);
-   if (count === 1) await redis.expire(ip, 60);
-   return count <= RATE_LIMIT;
- }
+ export async function rateLimit(
+   req: NextRequest,
+   config = DEFAULT_CONFIG
+ ) {
+   const ip = req.headers.get('x-forwarded-for') ?? '127.0.0.1';
+   const now = Date.now();
+   const windowStart = now - config.windowMs;
+   const key = `${config.keyPrefix}:${ip}`;
+
+   // Atomic sliding window via MULTI:
+   // drop expired entries, record this request, count, refresh the TTL.
+   const pipeline = redis.multi();
+   pipeline.zremrangebyscore(key, 0, windowStart);
+   pipeline.zadd(key, { score: now, member: crypto.randomUUID() });
+   pipeline.zcard(key);
+   pipeline.expire(key, Math.ceil(config.windowMs / 1000));
+   const results = await pipeline.exec();
+
+   // results[2] is the ZCARD reply: requests seen in the current window.
+   const count = results[2] as number;
+   return {
+     allowed: count <= config.maxRequests,
+     remaining: Math.max(0, config.maxRequests - count),
+     resetAt: now + config.windowMs,
+   };
+ }
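
The sliding-window logic in the diff can be exercised without Redis. Below is a minimal in-memory sketch of the same technique, assuming a single process and an injectable clock so the behavior is easy to test. `SlidingWindowLimiter` is a hypothetical name and is not part of the diff. As in the diff, rejected requests are still recorded and count toward the window.

```typescript
interface LimitResult {
  allowed: boolean;
  remaining: number;
  resetAt: number;
}

class SlidingWindowLimiter {
  private readonly maxRequests: number;
  private readonly windowMs: number;
  // key -> timestamps (ms) of requests seen in the current window
  private readonly hits = new Map<string, number[]>();

  constructor(maxRequests: number, windowMs: number) {
    this.maxRequests = maxRequests;
    this.windowMs = windowMs;
  }

  check(key: string, now: number = Date.now()): LimitResult {
    const windowStart = now - this.windowMs;
    // Drop timestamps that have slid out of the window (mirrors ZREMRANGEBYSCORE).
    const recent = (this.hits.get(key) ?? []).filter((t) => t > windowStart);
    recent.push(now); // record this request (mirrors ZADD)
    this.hits.set(key, recent);
    const count = recent.length; // requests in the window (mirrors ZCARD)
    return {
      allowed: count <= this.maxRequests,
      remaining: Math.max(0, this.maxRequests - count),
      resetAt: now + this.windowMs,
    };
  }
}

// Usage: allow 2 requests per second per key.
const limiter = new SlidingWindowLimiter(2, 1000);
console.log(limiter.check("203.0.113.7", 0).allowed);   // prints true
console.log(limiter.check("203.0.113.7", 100).allowed); // prints true
console.log(limiter.check("203.0.113.7", 200).allowed); // prints false
```

Unlike the fixed-window `INCR` + `EXPIRE` version being replaced, the window moves with each request, so a burst straddling a window boundary cannot get through at double the limit.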