# OfoxAI Documentation

> Complete developer documentation for OfoxAI — a unified LLM API gateway providing access to GPT-5.3 Codex, Claude Opus 4.6, Gemini 3.1 Pro, DeepSeek V3.2 and 100+ AI models through OpenAI / Anthropic / Gemini compatible APIs.

@canonical: https://docs.ofox.ai
@see-also: https://ofox.ai/llms.txt
@documentation: https://docs.ofox.ai
@last-updated: 2026-02-27

---

## Overview

OfoxAI is a unified LLM API gateway. One API key, 100+ models, three native protocols.

### API Endpoints

| Protocol | Base URL | Compatibility |
|----------|----------|---------------|
| OpenAI | https://api.ofox.ai/v1 | Full OpenAI SDK compatible |
| Anthropic | https://api.ofox.ai/anthropic | Native Anthropic SDK compatible |
| Gemini | https://api.ofox.ai/gemini | Native Google GenAI SDK compatible |

### Supported Models (100+)

**Flagship:** GPT-5.3 Codex, GPT-4.1, Claude Opus 4.6, Claude Sonnet 4.5, Gemini 3.1 Pro, Gemini 3 Flash

**Open Source:** DeepSeek V3.2, DeepSeek R1, Qwen3.5-Plus, Qwen3-235B, Kimi-K2.5, Llama 4 Maverick

**Specialized:** Grok 4, GLM-4-Plus, Yi-Lightning, Doubao Seed 2.0

**AIGC:** Sora, Kling, Flux (image/video), ElevenLabs, IndexTTS (voice)

---

## Quick Start

### 1. Get API Key

Sign up at https://app.ofox.ai to get your API key.

### 2. Configure Environment

**OpenAI compatible:**

```bash
export OPENAI_BASE_URL=https://api.ofox.ai/v1
export OPENAI_API_KEY=
```

**Anthropic compatible:**

```bash
export ANTHROPIC_BASE_URL=https://api.ofox.ai/anthropic
export ANTHROPIC_AUTH_TOKEN=
```

**Gemini compatible:**

```bash
export GEMINI_API_KEY=
# Base URL: https://api.ofox.ai/gemini
```

### 3. Make Your First Call

**Python (OpenAI SDK):**

```python
from openai import OpenAI

client = OpenAI(
    base_url="https://api.ofox.ai/v1",
    api_key="your-ofoxai-key"
)

response = client.chat.completions.create(
    model="gpt-5.3-codex",
    messages=[{"role": "user", "content": "Hello!"}]
)
print(response.choices[0].message.content)
```

**TypeScript (OpenAI SDK):**

```typescript
import OpenAI from "openai";

const client = new OpenAI({
  baseURL: "https://api.ofox.ai/v1",
  apiKey: "your-ofoxai-key",
});

const response = await client.chat.completions.create({
  model: "gpt-5.3-codex",
  messages: [{ role: "user", content: "Hello!" }],
});
console.log(response.choices[0].message.content);
```

**cURL:**

```bash
curl https://api.ofox.ai/v1/chat/completions \
  -H "Authorization: Bearer your-ofoxai-key" \
  -H "Content-Type: application/json" \
  -d '{"model":"gpt-5.3-codex","messages":[{"role":"user","content":"Hello!"}]}'
```

---

## Authentication

All requests require an API key. By default, pass it in the `Authorization: Bearer` header; each protocol also accepts its native scheme:

- **OpenAI protocol:** Standard `Authorization: Bearer` or `api-key` header
- **Anthropic protocol:** `x-api-key` or `Authorization: Bearer` header
- **Gemini protocol:** `key` query parameter or `Authorization: Bearer` header

API keys can be created and managed at https://app.ofox.ai.

---

## API Reference

### Chat Completions (OpenAI Compatible)

**POST** `/v1/chat/completions`

Creates a model response for the given chat conversation. Supports streaming, function calling, vision, and structured output.
Key parameters:

- `model` (required): Model ID (e.g., "openai/gpt-5.3-codex", "anthropic/claude-opus-4.6", "google/gemini-3.1-pro-preview")
- `messages` (required): Array of message objects
- `stream`: Enable SSE streaming (default: false)
- `temperature`: Sampling temperature (0-2)
- `max_tokens`: Maximum tokens to generate
- `tools`: Function definitions for tool use
- `response_format`: JSON mode or JSON schema

### Anthropic Messages

**POST** `/anthropic/v1/messages`

Native Anthropic Messages API. Supports all Anthropic features including extended thinking, tool use, and vision.

Key parameters:

- `model` (required): Model ID
- `messages` (required): Array of message objects
- `max_tokens` (required): Maximum tokens
- `stream`: Enable SSE streaming
- `system`: System prompt
- `tools`: Tool definitions

### Gemini Generate Content

**POST** `/gemini/v1beta/models/{model}:generateContent`

**POST** `/gemini/v1beta/models/{model}:streamGenerateContent`

Native Gemini protocol. Supports multimodal input, function calling, and grounding.

### Embeddings

**POST** `/v1/embeddings`

Generate vector embeddings for text input. Compatible with OpenAI embedding models.

### Image Generation

**POST** `/v1/images/generations`

Generate images using DALL-E, Flux, Sora, and other AIGC models.

### Models

**GET** `/v1/models`

List all available models with pricing and capability information.

---

## Guides

### Streaming

Use SSE (Server-Sent Events) for real-time token streaming. Set `stream: true` in your request. All three protocols support streaming.

### Function Calling / Tool Use

Enable models to call external tools. Define functions in the `tools` parameter, receive tool call requests, execute them, and return results.

### Vision (Multimodal)

Send images and videos alongside text for multimodal analysis. Supported by GPT-4o, Claude Opus, Gemini Pro Vision, and other multimodal models.
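As a concrete sketch of the vision guide above, the following builds a multimodal request for the OpenAI-compatible endpoint using only the Python standard library. The model ID, image URL, and API key are illustrative placeholders, not values confirmed by this documentation; the message shape is the standard OpenAI multimodal format.

```python
import json
import urllib.request

API_URL = "https://api.ofox.ai/v1/chat/completions"
API_KEY = "your-ofoxai-key"  # placeholder, substitute your real key


def build_vision_payload(model: str, prompt: str, image_url: str) -> dict:
    """One user message carrying a text part plus an image_url part,
    the standard OpenAI multimodal message shape."""
    return {
        "model": model,
        "messages": [{
            "role": "user",
            "content": [
                {"type": "text", "text": prompt},
                {"type": "image_url", "image_url": {"url": image_url}},
            ],
        }],
    }


def describe_image(image_url: str) -> str:
    """POST the multimodal payload and return the model's text reply."""
    payload = build_vision_payload(
        "gpt-4o",  # placeholder model ID
        "Describe this image in one sentence.",
        image_url,
    )
    req = urllib.request.Request(
        API_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {API_KEY}",
            "Content-Type": "application/json",
        },
    )
    with urllib.request.urlopen(req) as resp:
        body = json.load(resp)
    return body["choices"][0]["message"]["content"]
```

Calling `describe_image(...)` sends the same request as the cURL example in Quick Start, with an image part added to the user message.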
### Structured Output

Force model output into JSON format using `response_format: { type: "json_object" }` or JSON Schema constraints.

### Error Handling

Standard HTTP error codes: 400 (bad request), 401 (unauthorized), 429 (rate limited), 500 (server error). All errors return JSON with `error.message` and `error.code`.

### Rate Limits

OfoxAI aggregates capacity across multiple providers. Generous rate limits with automatic scaling. Enterprise plans available for higher throughput.

---

## Advanced Features

### Provider Routing

Configure multi-provider routing strategies for cost optimization, latency reduction, or geographic affinity.

### Model Routing (Auto Mode)

Use `model: "auto"` or model pools to let OfoxAI intelligently select the best model based on task type, cost, and availability.

### Fallback

Automatic failover to backup models when the primary provider is unavailable. Configurable fallback chains ensure 99.9% availability.

### Prompt Caching

Reduce cost and latency by caching repeated prompt prefixes. Supported for Claude and other models with native caching support.
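On the Anthropic protocol, prompt caching is typically expressed by marking the long, stable prefix with Anthropic's `cache_control` field. The sketch below builds such a request body for `POST /anthropic/v1/messages`; the model ID is a placeholder, and pass-through of the caching fields by the gateway is an assumption based on the note above, not a documented guarantee.

```python
import json


def build_cached_messages_body(system_prompt: str, user_text: str) -> dict:
    """Anthropic Messages request body whose system block carries an
    ephemeral cache_control marker, so the repeated prefix can be
    cached across calls."""
    return {
        "model": "claude-opus-4.6",  # placeholder model ID
        "max_tokens": 1024,
        "system": [
            {
                "type": "text",
                "text": system_prompt,  # the long, stable prefix worth caching
                "cache_control": {"type": "ephemeral"},
            }
        ],
        "messages": [{"role": "user", "content": user_text}],
    }


# Serialize for sending with the x-api-key header described under Authentication.
body = json.dumps(build_cached_messages_body(
    "You are a meticulous code reviewer. <long style guide goes here>",
    "Review this function for concurrency bugs.",
))
```

With Anthropic's native caching, only the first request pays the full input-token price for the marked span; subsequent requests repeating the identical prefix read it from cache at a reduced rate.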
---

## Tool Integrations

OfoxAI works with all major AI coding and productivity tools:

| Tool | Protocol | Setup Guide |
|------|----------|-------------|
| Claude Code | Anthropic | https://docs.ofox.ai/develop/integrations/claude-code |
| Cline (VS Code) | OpenAI/Anthropic | https://docs.ofox.ai/develop/integrations/cline |
| Cherry Studio | OpenAI | https://docs.ofox.ai/develop/integrations/cherry-studio |
| Zed Editor | OpenAI/Anthropic | https://docs.ofox.ai/develop/integrations/zed |
| OpenClaw | Anthropic | https://docs.ofox.ai/develop/integrations/openclaw |
| OpenCode | OpenAI/Anthropic | https://docs.ofox.ai/develop/integrations/opencode |
| Codex CLI | OpenAI | https://docs.ofox.ai/develop/integrations/codex |
| Gemini CLI | Gemini | https://docs.ofox.ai/develop/integrations/gemini-cli |
| OpenAI SDK | OpenAI | https://docs.ofox.ai/develop/integrations/openai-sdk |
| Windsurf | OpenAI | Change base URL in settings |
| Aider | OpenAI | Change base URL in config |

**Universal setup:** Just change the base URL and API key — no code changes needed.

---

## Observability

- **Dashboard:** Real-time overview of API usage, costs, and performance at https://app.ofox.ai
- **Usage Tracking:** Per-model token consumption, request counts, and latency metrics
- **Pricing:** Pay-as-you-go, no monthly fees. Flagship models 20% off, open-source up to 70% off. 10+ free models available.

---

## Key Facts

- **100+ models** from OpenAI, Anthropic, Google, DeepSeek, Alibaba, Moonshot, xAI, Meta, and more
- **Three native protocols:** OpenAI, Anthropic, Gemini — zero code changes to migrate
- **99.9% SLA** with automatic failover and multi-region deployment
- **China direct connect:** HK express routes, no VPN needed, low latency
- **Pay-as-you-go:** No monthly fees, token-level billing
- **10+ free models** available on free tier
- **Works with:** CherryStudio, Claude Code, Cline, Zed, OpenClaw, Codex, Gemini CLI, OpenCode, Windsurf, Aider