
Streaming Responses

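Streaming lets you receive output incrementally while the model is still generating it, which improves the user experience and the perceived speed of the response.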

How It Works

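OfoxAI implements streaming with the Server-Sent Events (SSE) protocol:

  1. The client sets stream: true in the request.
  2. The server returns the generated content incrementally, as a series of chunks.
  3. Each chunk is sent over SSE with a data: prefix.
  4. When generation finishes, the server sends data: [DONE].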
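The steps above can be sketched as a small parser for the raw SSE lines. This is a minimal illustration, not the SDK's implementation: `iter_sse_payloads` is a hypothetical helper, and the sample payloads assume the OpenAI-style chunk shape (`choices[0].delta.content`).

```python
import json
from typing import Iterable, Iterator


def iter_sse_payloads(lines: Iterable[str]) -> Iterator[dict]:
    """Yield the JSON payload of each SSE `data:` line until `[DONE]`."""
    for raw in lines:
        line = raw.strip()
        # Blank lines separate events; lines starting with ":" are SSE comments.
        if not line or line.startswith(":"):
            continue
        if not line.startswith("data:"):
            continue  # ignore other SSE fields such as "event:" or "id:"
        payload = line[len("data:"):].strip()
        if payload == "[DONE]":
            return  # end-of-stream sentinel
        yield json.loads(payload)


# Hand-written sample lines (assumed shape), standing in for a real response body.
sample = [
    'data: {"choices": [{"delta": {"content": "Hel"}}]}',
    "",
    'data: {"choices": [{"delta": {"content": "lo"}}]}',
    "",
    "data: [DONE]",
]
text = "".join(
    p["choices"][0]["delta"].get("content", "") for p in iter_sse_payloads(sample)
)
print(text)  # prints "Hello"
```

In practice the official SDKs do this parsing for you; the sketch is only meant to show what travels over the wire.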

Streaming with the OpenAI Protocol

Terminal
curl https://api.ofox.ai/v1/chat/completions \
  -H "Authorization: Bearer $OFOX_API_KEY" \
  -H "Content-Type: application/json" \
  -N \
  -d '{
    "model": "openai/gpt-4o",
    "messages": [{"role": "user", "content": "Write a poem about programming"}],
    "stream": true
  }'

Streaming with the Anthropic Protocol

stream_anthropic.py
import anthropic

client = anthropic.Anthropic(
    base_url="https://api.ofox.ai/anthropic",
    api_key="<your OFOXAI_API_KEY>",
)

# messages.stream() is a context manager that closes the connection on exit.
with client.messages.stream(
    model="anthropic/claude-sonnet-4.5",
    max_tokens=1024,
    messages=[{"role": "user", "content": "Write a poem about programming"}],
) as stream:
    # text_stream yields only the text deltas, in arrival order.
    for text in stream.text_stream:
        print(text, end="", flush=True)

Streaming + Function Calling

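Streaming also works with function calling. The model first streams its tool call request; once you have executed the tool, you continue the conversation with the result: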

stream_with_tools.py
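from openai import OpenAI

# Assumes the OpenAI-compatible base URL implied by the curl example above.
client = OpenAI(base_url="https://api.ofox.ai/v1", api_key="<your OFOXAI_API_KEY>")

stream = client.chat.completions.create(
    model="openai/gpt-4o",
    messages=[{"role": "user", "content": "What's the weather like in Beijing today?"}],
    tools=[{
        "type": "function",
        "function": {
            "name": "get_weather",
            "description": "Get the weather for a given city",
            "parameters": {
                "type": "object",
                "properties": {
                    "city": {"type": "string", "description": "City name"}
                },
                "required": ["city"],
            },
        },
    }],
    stream=True,
)

for chunk in stream:
    if not chunk.choices:
        continue  # some chunks carry no choices
    delta = chunk.choices[0].delta
    if delta.tool_calls:
        # Handle the tool call; name and arguments arrive as fragments across chunks.
        print(f"Tool call: {delta.tool_calls[0].function}")
    elif delta.content:
        print(delta.content, end="", flush=True)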
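Because tool call arguments arrive as string fragments spread over several chunks, they usually need to be merged before they can be parsed as JSON. The following is a minimal sketch of that accumulation, assuming OpenAI-style deltas where each fragment carries an `index`. `accumulate_tool_calls` and `fake_chunk` are hypothetical names, and the chunks are simulated with `SimpleNamespace` rather than fetched from the API.

```python
import json
from types import SimpleNamespace as NS


def accumulate_tool_calls(chunks):
    """Merge streamed tool-call fragments into complete calls, keyed by index."""
    calls = {}
    for chunk in chunks:
        if not chunk.choices:
            continue
        for tc in chunk.choices[0].delta.tool_calls or []:
            entry = calls.setdefault(tc.index, {"id": None, "name": "", "arguments": ""})
            if tc.id:
                entry["id"] = tc.id
            if tc.function and tc.function.name:
                entry["name"] = tc.function.name
            if tc.function and tc.function.arguments:
                entry["arguments"] += tc.function.arguments  # append the JSON fragment
    return [calls[i] for i in sorted(calls)]


def fake_chunk(index, call_id=None, name=None, arguments=None):
    """Build a stand-in for one streamed chunk (illustration only)."""
    tc = NS(index=index, id=call_id, function=NS(name=name, arguments=arguments))
    return NS(choices=[NS(delta=NS(tool_calls=[tc], content=None))])


chunks = [
    fake_chunk(0, call_id="call_1", name="get_weather", arguments=""),
    fake_chunk(0, arguments='{"city": '),
    fake_chunk(0, arguments='"Beijing"}'),
]
calls = accumulate_tool_calls(chunks)
args = json.loads(calls[0]["arguments"])  # parse only after the stream has ended
print(calls[0]["name"], args)
```

Parsing is deferred until the stream ends because an individual fragment such as `{"city": ` is not valid JSON on its own.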

Error Handling and Reconnection

A streaming connection can drop because of network problems, so it is worth implementing retry logic.

stream_retry.py
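import time


def stream_with_retry(client, max_retries=3, **kwargs):
    """Yield chunks from a streaming completion, retrying on failure.

    A retry starts a new generation from the beginning, so the caller
    should discard any partial output received before the failure.
    """
    for attempt in range(max_retries):
        try:
            stream = client.chat.completions.create(stream=True, **kwargs)
            for chunk in stream:
                yield chunk
            return  # completed successfully
        except Exception:
            if attempt < max_retries - 1:
                wait = 2 ** attempt  # exponential backoff: 1s, 2s, 4s, ...
                print(f"\nConnection lost, retrying in {wait}s...")
                time.sleep(wait)
            else:
                raise  # out of retries: re-raise with the original traceback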
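One consequence of retrying a stream is that a new attempt regenerates the response from the start, so fragments from a failed attempt can end up duplicated. A possible way around this, sketched below under stated assumptions, is to buffer each attempt and return the text only once an attempt completes. `collect_with_retry` and `make_stream` are hypothetical names; the flaky stream is simulated, and `ConnectionError` stands in for whatever exception your client raises on a dropped connection.

```python
import time


def collect_with_retry(make_stream, max_retries=3, sleep=time.sleep):
    """Return the full text of one successful stream, restarting on failure.

    `make_stream` is a zero-argument callable that opens a new stream and
    returns an iterable of text fragments.
    """
    for attempt in range(max_retries):
        parts = []  # reset so fragments from a failed attempt are dropped
        try:
            for fragment in make_stream():
                parts.append(fragment)
            return "".join(parts)
        except ConnectionError:
            if attempt == max_retries - 1:
                raise
            sleep(2 ** attempt)  # exponential backoff: 1s, 2s, 4s, ...


attempts = {"count": 0}


def flaky_stream():
    """Simulated stream that drops mid-response on its first attempt."""
    attempts["count"] += 1
    yield "Hel"
    if attempts["count"] == 1:
        raise ConnectionError("simulated drop")
    yield "lo"


# Pass a no-op sleep so the demo does not actually wait.
result = collect_with_retry(flaky_stream, sleep=lambda seconds: None)
print(result)  # prints "Hello"
```

The trade-off is that buffering gives up incremental display. If you need both, one option is to clear the already rendered text in the UI when a retry begins.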

Best Practices

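  1. Always set a timeout: avoid waiting indefinitely.
  2. Handle incomplete chunks: some chunks may carry no content.
  3. Implement reconnection: use exponential backoff.
  4. Flush output on the front end: make sure content is displayed immediately.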
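As an illustration of the second practice, the sketch below filters a stream down to its non-empty text fragments. It assumes OpenAI-style chunk objects; `iter_text` is a hypothetical helper, and the chunks are simulated with `SimpleNamespace` so the example runs without network access.

```python
from types import SimpleNamespace as NS


def iter_text(chunks):
    """Yield only the non-empty text fragments from OpenAI-style chunks."""
    for chunk in chunks:
        if not chunk.choices:
            continue  # a chunk may carry no choices at all
        content = getattr(chunk.choices[0].delta, "content", None)
        if content:  # skips both None and the empty string
            yield content


simulated = [
    NS(choices=[NS(delta=NS(content=""))]),    # e.g. an opening chunk with no text
    NS(choices=[NS(delta=NS(content="Hel"))]),
    NS(choices=[NS(delta=NS(content=None))]),  # e.g. a chunk with no text delta
    NS(choices=[NS(delta=NS(content="lo"))]),
    NS(choices=[]),                            # e.g. a trailing chunk without choices
]
pieces = list(iter_text(simulated))
print("".join(pieces), flush=True)  # prints "Hello"
```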