CheapTokens is beta/experimental software. Use at your own risk.

OpenAI Compatibility

Venice.ai exposes an OpenAI-compatible API. Any code, library, or tool that works with the OpenAI Chat Completions API can be pointed at Venice with minimal changes.

What's compatible

The Venice API supports the core OpenAI Chat Completions interface:

  • POST /chat/completions — create a chat completion
  • GET /models — list available models
  • Streaming via stream: true (Server-Sent Events)
  • Standard message roles: system, user, assistant
  • Parameters: model, messages, temperature, max_tokens, top_p, stop

Venice-specific parameters: Venice adds a venice_parameters object for features like disable_thinking. See docs.venice.ai for the full list.
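On the wire, venice_parameters sits alongside the standard Chat Completions fields as a top-level key in the JSON body. A minimal sketch of the payload shape (the model ID and flag values here are illustrative, not a complete parameter list):

```python
import json

# Standard Chat Completions fields, plus the Venice-specific
# venice_parameters object as a sibling top-level key.
payload = {
    "model": "claude-sonnet-45",
    "messages": [{"role": "user", "content": "Hello"}],
    "temperature": 0.7,
    "venice_parameters": {"disable_thinking": True},
}

body = json.dumps(payload)
print(body)
```

The SDK examples below produce this same body; the Python SDK needs extra_body to pass the non-standard key through, while the Node SDK forwards unknown top-level params as-is.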

Drop-in replacement examples

To switch from OpenAI to Venice, change the base URL and the API key, and set model to a Venice-supported model ID. Everything else stays the same.

Python (OpenAI SDK)
```python
from openai import OpenAI

# Change base_url and api_key; everything else stays the same
client = OpenAI(
    base_url="https://api.venice.ai/api/v1",
    api_key="your_venice_key",
)

response = client.chat.completions.create(
    model="claude-sonnet-45",
    messages=[
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "Explain quantum computing in one sentence."},
    ],
    temperature=0.7,
    max_tokens=256,
    extra_body={"venice_parameters": {"disable_thinking": True}},
)

print(response.choices[0].message.content)
```
Node.js (OpenAI SDK)
```javascript
import OpenAI from "openai";

const client = new OpenAI({
  baseURL: "https://api.venice.ai/api/v1",
  apiKey: "your_venice_key",
});

const response = await client.chat.completions.create({
  model: "claude-sonnet-45",
  messages: [
    { role: "system", content: "You are a helpful assistant." },
    { role: "user", content: "Explain quantum computing in one sentence." },
  ],
  temperature: 0.7,
  max_tokens: 256,
  venice_parameters: { disable_thinking: true },
});

console.log(response.choices[0].message.content);
```
cURL
```bash
curl https://api.venice.ai/api/v1/chat/completions \
  -H "Authorization: Bearer your_venice_key" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "claude-sonnet-45",
    "messages": [{"role": "user", "content": "Hello"}],
    "venice_parameters": {"disable_thinking": true}
  }'
```

Streaming

Streaming works identically to OpenAI. Set stream: true and iterate over the response chunks.

Python streaming
```python
stream = client.chat.completions.create(
    model="claude-sonnet-45",
    messages=[{"role": "user", "content": "Write a haiku"}],
    stream=True,
    extra_body={"venice_parameters": {"disable_thinking": True}},
)

for chunk in stream:
    content = chunk.choices[0].delta.content
    if content:
        print(content, end="")
```

Key differences from OpenAI

| Aspect | OpenAI | Venice (via CheapTokens) |
| --- | --- | --- |
| Base URL | api.openai.com/v1 | api.venice.ai/api/v1 |
| Auth | OpenAI API key | Venice API key (from CheapTokens) |
| Models | OpenAI models only | All Venice models (Claude, GPT, Gemini, DeepSeek, etc.) |
| Key expiry | Permanent until revoked | Midnight UTC (daily) |
| Extra params | N/A | venice_parameters |