
Chat Completions

Create a model response for a chat conversation. Supports text generation, multimodal input, Function Calling, streaming, and more.

Endpoint

POST https://api.ofox.ai/v1/chat/completions

Request Parameters

| Parameter | Type | Required | Description |
| --- | --- | --- | --- |
| model | string | Yes | Model identifier, e.g. openai/gpt-4o |
| messages | array | Yes | Message array |
| temperature | number | No | Sampling temperature, 0 to 2 (default 1) |
| max_tokens | number | No | Maximum tokens to generate |
| stream | boolean | No | Enable streaming response |
| top_p | number | No | Nucleus sampling parameter |
| frequency_penalty | number | No | Frequency penalty, -2 to 2 |
| presence_penalty | number | No | Presence penalty, -2 to 2 |
| tools | array | No | Tool definitions (Function Calling) |
| tool_choice | string/object | No | Tool selection strategy |
| response_format | object | No | Response format (JSON Mode) |
| provider | object | No | OfoxAI extension: routing and fallback config |

Message Format

```typescript
interface Message {
  role: 'system' | 'user' | 'assistant' | 'tool'
  content: string | ContentPart[]   // Text or multimodal content
  name?: string
  tool_calls?: ToolCall[]           // Tool calls in assistant messages
  tool_call_id?: string             // Call ID in tool messages
}

// Multimodal content
type ContentPart =
  | { type: 'text'; text: string }
  | { type: 'image_url'; image_url: { url: string; detail?: 'auto' | 'low' | 'high' } }
```
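For instance, a conversation that mixes plain-text turns with a multimodal ContentPart turn might look like this (a sketch; the image URL is a placeholder):

```python
# A messages array matching the Message interface above: plain-text
# system/user/assistant turns plus a multimodal user turn built from
# ContentPart entries (text + image_url).
messages = [
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "Hi!"},
    {"role": "assistant", "content": "Hello! How can I help?"},
    {
        "role": "user",
        "content": [
            {"type": "text", "text": "Describe this image."},
            {
                "type": "image_url",
                "image_url": {"url": "https://example.com/cat.jpg", "detail": "low"},
            },
        ],
    },
]
```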

Request Examples

Terminal
```shell
curl https://api.ofox.ai/v1/chat/completions \
  -H "Authorization: Bearer $OFOX_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "openai/gpt-4o",
    "messages": [
      {"role": "system", "content": "You are a helpful assistant."},
      {"role": "user", "content": "Explain what an API Gateway is"}
    ],
    "temperature": 0.7
  }'
```

Response Format

```json
{
  "id": "chatcmpl-abc123",
  "object": "chat.completion",
  "created": 1703123456,
  "model": "openai/gpt-4o",
  "choices": [
    {
      "index": 0,
      "message": {
        "role": "assistant",
        "content": "An API Gateway is a..."
      },
      "finish_reason": "stop"
    }
  ],
  "usage": {
    "prompt_tokens": 25,
    "completion_tokens": 150,
    "total_tokens": 175
  }
}
```
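As a quick sketch, the reply text and token usage can be read out of this response with Python's standard json module:

```python
import json

# The example response body from above
body = """
{
  "id": "chatcmpl-abc123",
  "object": "chat.completion",
  "created": 1703123456,
  "model": "openai/gpt-4o",
  "choices": [
    {
      "index": 0,
      "message": {"role": "assistant", "content": "An API Gateway is a..."},
      "finish_reason": "stop"
    }
  ],
  "usage": {"prompt_tokens": 25, "completion_tokens": 150, "total_tokens": 175}
}
"""

resp = json.loads(body)
answer = resp["choices"][0]["message"]["content"]   # the assistant's reply
finish = resp["choices"][0]["finish_reason"]        # why generation stopped
usage = resp["usage"]                               # token accounting
```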

Streaming

Set stream: true to enable SSE streaming responses:

stream.py
```python
import os
from openai import OpenAI  # any OpenAI-compatible SDK works

client = OpenAI(base_url="https://api.ofox.ai/v1", api_key=os.environ["OFOX_API_KEY"])

stream = client.chat.completions.create(
    model="openai/gpt-4o",
    messages=[{"role": "user", "content": "Tell me a story"}],
    stream=True,
)

for chunk in stream:
    content = chunk.choices[0].delta.content
    if content:
        print(content, end="", flush=True)
```

Streaming Response Format

Each chunk is sent via SSE:

```
data: {"id":"chatcmpl-abc123","object":"chat.completion.chunk","choices":[{"index":0,"delta":{"content":"Hello"},"finish_reason":null}]}

data: {"id":"chatcmpl-abc123","object":"chat.completion.chunk","choices":[{"index":0,"delta":{"content":" there"},"finish_reason":null}]}

data: {"id":"chatcmpl-abc123","object":"chat.completion.chunk","choices":[{"index":0,"delta":{},"finish_reason":"stop"}]}

data: [DONE]
```
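If you consume the raw HTTP stream instead of using an SDK, the chunks can be assembled client-side. A minimal sketch, assuming the chunk shape shown above:

```python
import json

def accumulate_sse(lines):
    """Collect delta.content from raw SSE 'data:' lines until [DONE]."""
    parts = []
    for line in lines:
        if not line.startswith("data: "):
            continue  # skip blank lines and comments between events
        payload = line[len("data: "):]
        if payload == "[DONE]":
            break  # end-of-stream sentinel
        chunk = json.loads(payload)
        delta = chunk["choices"][0]["delta"]
        if delta.get("content"):  # the final chunk has an empty delta
            parts.append(delta["content"])
    return "".join(parts)

# Using the example chunks above:
events = [
    'data: {"id":"chatcmpl-abc123","object":"chat.completion.chunk","choices":[{"index":0,"delta":{"content":"Hello"},"finish_reason":null}]}',
    'data: {"id":"chatcmpl-abc123","object":"chat.completion.chunk","choices":[{"index":0,"delta":{"content":" there"},"finish_reason":null}]}',
    'data: {"id":"chatcmpl-abc123","object":"chat.completion.chunk","choices":[{"index":0,"delta":{},"finish_reason":"stop"}]}',
    'data: [DONE]',
]
text = accumulate_sse(events)  # "Hello there"
```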

Multimodal Input (Vision)

Send images for model analysis:

```python
# `client` is an OpenAI-compatible client pointed at https://api.ofox.ai/v1
response = client.chat.completions.create(
    model="openai/gpt-4o",
    messages=[{
        "role": "user",
        "content": [
            {"type": "text", "text": "What's in this image?"},
            {"type": "image_url", "image_url": {"url": "https://example.com/image.jpg"}},
        ],
    }],
)
```

Models with vision capabilities include openai/gpt-4o, anthropic/claude-sonnet-4.5, google/gemini-3-flash-preview, and more. See the Vision guide for details.
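For local files, images are commonly passed as base64 data URLs in image_url.url. A minimal helper sketch (assuming the upstream model accepts base64 data URLs, as most OpenAI-compatible vision models do):

```python
import base64

def to_data_url(path: str, mime: str = "image/jpeg") -> str:
    """Read a local image file and wrap it as a base64 data: URL,
    usable in place of an https URL in image_url.url."""
    with open(path, "rb") as f:
        encoded = base64.b64encode(f.read()).decode("ascii")
    return f"data:{mime};base64,{encoded}"
```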

Function Calling

See the Function Calling guide for details.
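As a preview, a tools array in the OpenAI-compatible shape used by the tools parameter above might look like this (get_weather is a hypothetical example function, not part of the API):

```python
# A hypothetical tool definition: the model can request a call to
# get_weather, and your code returns the result in a `tool` message.
tools = [
    {
        "type": "function",
        "function": {
            "name": "get_weather",  # hypothetical example
            "description": "Get the current weather for a city",
            "parameters": {
                "type": "object",
                "properties": {
                    "city": {"type": "string", "description": "City name"},
                },
                "required": ["city"],
            },
        },
    }
]
```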
