# OfoxAI Documentation

> Complete developer documentation for OfoxAI — a unified LLM API gateway providing access to GPT-5.3 Codex, Claude Opus 4.6, Gemini 3.1 Pro, DeepSeek V3.2 and 100+ AI models through OpenAI / Anthropic / Gemini compatible APIs.

@canonical: https://docs.ofox.ai
@see-also: https://ofox.ai/llms.txt
@documentation: https://docs.ofox.ai
@last-updated: 2026-02-27

---

## Overview

OfoxAI is a unified LLM API gateway. One API key, 100+ models, three native protocols.

### API Endpoints

| Protocol | Base URL | Compatibility |
|----------|----------|---------------|
| OpenAI | https://api.ofox.ai/v1 | Full OpenAI SDK compatible |
| Anthropic | https://api.ofox.ai/anthropic | Native Anthropic SDK compatible |
| Gemini | https://api.ofox.ai/gemini | Native Google GenAI SDK compatible |

### Supported Models (100+)

**Flagship:** GPT-5.3 Codex, GPT-4.1, Claude Opus 4.6, Claude Sonnet 4.5, Gemini 3.1 Pro, Gemini 3 Flash

**Open Source:** DeepSeek V3.2, DeepSeek R1, Qwen3.5-Plus, Qwen3-235B, Kimi-K2.5, Llama 4 Maverick

**Specialized:** Grok 4, GLM-4-Plus, Yi-Lightning, Doubao Seed 2.0

**AIGC:** Sora, Kling, Flux (image/video), ElevenLabs, IndexTTS (voice)

---

## Quick Start

### 1. Get API Key

Sign up at https://app.ofox.ai to get your API key.

### 2. Configure Environment

**OpenAI compatible:**

```bash
export OPENAI_BASE_URL=https://api.ofox.ai/v1
export OPENAI_API_KEY=
```

**Anthropic compatible:**

```bash
export ANTHROPIC_BASE_URL=https://api.ofox.ai/anthropic
export ANTHROPIC_AUTH_TOKEN=
```

**Gemini compatible:**

```bash
export GEMINI_API_KEY=
# Base URL: https://api.ofox.ai/gemini
```

### 3. Make Your First Call

**Python (OpenAI SDK):**

```python
from openai import OpenAI

client = OpenAI(
    base_url="https://api.ofox.ai/v1",
    api_key="your-ofoxai-key"
)

response = client.chat.completions.create(
    model="gpt-5.3-codex",
    messages=[{"role": "user", "content": "Hello!"}]
)
print(response.choices[0].message.content)
```

**TypeScript (OpenAI SDK):**

```typescript
import OpenAI from "openai";

const client = new OpenAI({
  baseURL: "https://api.ofox.ai/v1",
  apiKey: "your-ofoxai-key",
});

const response = await client.chat.completions.create({
  model: "gpt-5.3-codex",
  messages: [{ role: "user", content: "Hello!" }],
});
console.log(response.choices[0].message.content);
```

**cURL:**

```bash
curl https://api.ofox.ai/v1/chat/completions \
  -H "Authorization: Bearer your-ofoxai-key" \
  -H "Content-Type: application/json" \
  -d '{"model":"gpt-5.3-codex","messages":[{"role":"user","content":"Hello!"}]}'
```

---

## Authentication

All requests require an API key. By default, pass it in the `Authorization: Bearer` header; each protocol also accepts its native scheme:

- **OpenAI protocol:** Standard `Authorization: Bearer` or `api-key` header
- **Anthropic protocol:** `x-api-key` or `Authorization: Bearer` header
- **Gemini protocol:** `key` query parameter or `Authorization: Bearer` header

API keys can be created and managed at https://app.ofox.ai.

---

## API Reference

### Chat Completions (OpenAI Compatible)

**POST** `/v1/chat/completions`

Creates a model response for the given chat conversation. Supports streaming, function calling, vision, and structured output.
Key parameters:

- `model` (required): Model ID (e.g., "openai/gpt-5.3-codex", "anthropic/claude-opus-4.6", "google/gemini-3.1-pro-preview")
- `messages` (required): Array of message objects
- `stream`: Enable SSE streaming (default: false)
- `temperature`: Sampling temperature (0-2)
- `max_tokens`: Maximum tokens to generate
- `tools`: Function definitions for tool use
- `response_format`: JSON mode or JSON schema

### Anthropic Messages

**POST** `/anthropic/v1/messages`

Native Anthropic Messages API. Supports all Anthropic features including extended thinking, tool use, and vision.

Key parameters:

- `model` (required): Model ID
- `messages` (required): Array of message objects
- `max_tokens` (required): Maximum tokens
- `stream`: Enable SSE streaming
- `system`: System prompt
- `tools`: Tool definitions

### Gemini Generate Content

**POST** `/gemini/v1beta/models/{model}:generateContent`

**POST** `/gemini/v1beta/models/{model}:streamGenerateContent`

Native Gemini protocol. Supports multimodal input, function calling, and grounding.

### Embeddings

**POST** `/v1/embeddings`

Generate vector embeddings for text input. Compatible with OpenAI embedding models.

### Image Generation

**POST** `/v1/images/generations`

Generate images using DALL-E, Flux, Sora, and other AIGC models.

### Models

**GET** `/v1/models`

List all available models with pricing and capability information.

---

## Guides

### Streaming

Use SSE (Server-Sent Events) for real-time token streaming. Set `stream: true` in your request. All three protocols support streaming.

### Function Calling / Tool Use

Enable models to call external tools. Define functions in the `tools` parameter, receive tool call requests, execute them, and return results.

### Vision (Multimodal)

Send images and videos alongside text for multimodal analysis. Supported by GPT-4o, Claude Opus, Gemini Pro Vision, and other multimodal models.
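As a concrete sketch of the vision guide above, the following builds a multimodal request for the OpenAI-compatible endpoint using only the Python standard library. The model ID, image URL, and API key are illustrative placeholders, not values confirmed by this documentation; the message shape is the standard OpenAI multimodal format.

```python
import json
import urllib.request

API_URL = "https://api.ofox.ai/v1/chat/completions"
API_KEY = "your-ofoxai-key"  # placeholder, substitute your real key


def build_vision_payload(model: str, prompt: str, image_url: str) -> dict:
    """One user message carrying a text part plus an image_url part,
    the standard OpenAI multimodal message shape."""
    return {
        "model": model,
        "messages": [{
            "role": "user",
            "content": [
                {"type": "text", "text": prompt},
                {"type": "image_url", "image_url": {"url": image_url}},
            ],
        }],
    }


def describe_image(image_url: str) -> str:
    """POST the multimodal payload and return the model's text reply."""
    payload = build_vision_payload(
        "gpt-4o",  # placeholder model ID
        "Describe this image in one sentence.",
        image_url,
    )
    req = urllib.request.Request(
        API_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {API_KEY}",
            "Content-Type": "application/json",
        },
    )
    with urllib.request.urlopen(req) as resp:
        body = json.load(resp)
    return body["choices"][0]["message"]["content"]
```

Calling `describe_image(...)` sends the same request as the cURL example in Quick Start, with an image part added to the user message.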
### Structured Output

Force model output into JSON format using `response_format: { type: "json_object" }` or JSON Schema constraints.

### Error Handling

Standard HTTP error codes: 400 (bad request), 401 (unauthorized), 429 (rate limited), 500 (server error). All errors return JSON with `error.message` and `error.code`.

### Rate Limits

OfoxAI aggregates capacity across multiple providers. Generous rate limits with automatic scaling. Enterprise plans available for higher throughput.

---

## Advanced Features

### Provider Routing

Configure multi-provider routing strategies for cost optimization, latency reduction, or geographic affinity.

### Model Routing (Auto Mode)

Use `model: "auto"` or model pools to let OfoxAI intelligently select the best model based on task type, cost, and availability.

### Fallback

Automatic failover to backup models when the primary provider is unavailable. Configurable fallback chains ensure 99.9% availability.

### Prompt Caching

Reduce cost and latency by caching repeated prompt prefixes. Supported for Claude and other models with native caching support.
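On the Anthropic protocol, prompt caching is typically expressed by marking the long, stable prefix with Anthropic's `cache_control` field. The sketch below builds such a request body for `POST /anthropic/v1/messages`; the model ID is a placeholder, and pass-through of the caching fields by the gateway is an assumption based on the note above, not a documented guarantee.

```python
import json


def build_cached_messages_body(system_prompt: str, user_text: str) -> dict:
    """Anthropic Messages request body whose system block carries an
    ephemeral cache_control marker, so the repeated prefix can be
    cached across calls."""
    return {
        "model": "claude-opus-4.6",  # placeholder model ID
        "max_tokens": 1024,
        "system": [
            {
                "type": "text",
                "text": system_prompt,  # the long, stable prefix worth caching
                "cache_control": {"type": "ephemeral"},
            }
        ],
        "messages": [{"role": "user", "content": user_text}],
    }


# Serialize for sending with the x-api-key header described under Authentication.
body = json.dumps(build_cached_messages_body(
    "You are a meticulous code reviewer. <long style guide goes here>",
    "Review this function for concurrency bugs.",
))
```

With Anthropic's native caching, only the first request pays the full input-token price for the marked span; subsequent requests repeating the identical prefix read it from cache at a reduced rate.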
---

## Tool Integrations

OfoxAI works with all major AI coding and productivity tools:

| Tool | Protocol | Setup Guide |
|------|----------|-------------|
| Claude Code | Anthropic | https://docs.ofox.ai/develop/integrations/claude-code |
| Cline (VS Code) | OpenAI/Anthropic | https://docs.ofox.ai/develop/integrations/cline |
| Cherry Studio | OpenAI | https://docs.ofox.ai/develop/integrations/cherry-studio |
| Zed Editor | OpenAI/Anthropic | https://docs.ofox.ai/develop/integrations/zed |
| OpenClaw | Anthropic | https://docs.ofox.ai/develop/integrations/openclaw |
| OpenCode | OpenAI/Anthropic | https://docs.ofox.ai/develop/integrations/opencode |
| Codex CLI | OpenAI | https://docs.ofox.ai/develop/integrations/codex |
| Gemini CLI | Gemini | https://docs.ofox.ai/develop/integrations/gemini-cli |
| OpenAI SDK | OpenAI | https://docs.ofox.ai/develop/integrations/openai-sdk |
| Windsurf | OpenAI | Change base URL in settings |
| Aider | OpenAI | Change base URL in config |

**Universal setup:** Just change the base URL and API key — no code changes needed.

---

## Observability

- **Dashboard:** Real-time overview of API usage, costs, and performance at https://app.ofox.ai
- **Usage Tracking:** Per-model token consumption, request counts, and latency metrics
- **Pricing:** Pay-as-you-go, no monthly fees. Flagship models 20% off, open-source up to 70% off. 10+ free models available.

---

## Key Facts

- **100+ models** from OpenAI, Anthropic, Google, DeepSeek, Alibaba, Moonshot, xAI, Meta, and more
- **Three native protocols:** OpenAI, Anthropic, Gemini — zero code changes to migrate
- **99.9% SLA** with automatic failover and multi-region deployment
- **China direct connect:** HK express routes, no VPN needed, low latency
- **Pay-as-you-go:** No monthly fees, token-level billing
- **10+ free models** available on free tier
- **Works with:** CherryStudio, Claude Code, Cline, Zed, OpenClaw, Codex, Gemini CLI, OpenCode, Windsurf, Aider