# OpenAI Compatibility
Venice.ai exposes an OpenAI-compatible API. Any code, library, or tool that works with the OpenAI Chat Completions API can be pointed at Venice with minimal changes.
## What's compatible
The Venice API supports the core OpenAI Chat Completions interface:
- `POST /chat/completions` — create a chat completion
- `GET /models` — list available models
- Streaming via `stream: true` (Server-Sent Events)
- Standard message roles: `system`, `user`, `assistant`
- Parameters: `model`, `messages`, `temperature`, `max_tokens`, `top_p`, `stop`

**Venice-specific parameters:** Venice adds a `venice_parameters` object for features like `disable_thinking`. See docs.venice.ai for the full list.
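Because `venice_parameters` rides alongside the standard fields, the request body is ordinary OpenAI-shaped JSON plus one extra key. A minimal sketch of how the two merge (the `build_request` helper is illustrative, not part of any SDK):

```python
import json

def build_request(model, messages, **venice_parameters):
    """Build a Chat Completions payload; attach venice_parameters only if given."""
    body = {"model": model, "messages": messages}
    if venice_parameters:
        body["venice_parameters"] = venice_parameters
    return body

body = build_request(
    "claude-sonnet-45",
    [{"role": "user", "content": "Hello"}],
    disable_thinking=True,
)
print(json.dumps(body, indent=2))
```

Any client that lets you add arbitrary fields to the request body (as the official SDKs do) can therefore pass Venice-specific options without modification.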
## Drop-in replacement examples
To switch from OpenAI to Venice, change two things: the base URL and the API key. You will also need to set the `model` parameter to a Venice-supported model ID.
```python
from openai import OpenAI

# Change base_url and api_key — everything else stays the same
client = OpenAI(
    base_url="https://api.venice.ai/api/v1",
    api_key="your_venice_key",
)

response = client.chat.completions.create(
    model="claude-sonnet-45",
    messages=[
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "Explain quantum computing in one sentence."},
    ],
    temperature=0.7,
    max_tokens=256,
    # The Python SDK sends unknown fields via extra_body
    extra_body={"venice_parameters": {"disable_thinking": True}},
)

print(response.choices[0].message.content)
```
```javascript
import OpenAI from "openai";

const client = new OpenAI({
  baseURL: "https://api.venice.ai/api/v1",
  apiKey: "your_venice_key",
});

const response = await client.chat.completions.create({
  model: "claude-sonnet-45",
  messages: [
    { role: "system", content: "You are a helpful assistant." },
    { role: "user", content: "Explain quantum computing in one sentence." },
  ],
  temperature: 0.7,
  max_tokens: 256,
  // Extra fields are passed through in the request body
  venice_parameters: { disable_thinking: true },
});

console.log(response.choices[0].message.content);
```
```bash
curl https://api.venice.ai/api/v1/chat/completions \
  -H "Authorization: Bearer your_venice_key" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "claude-sonnet-45",
    "messages": [{"role": "user", "content": "Hello"}],
    "venice_parameters": {"disable_thinking": true}
  }'
```
## Streaming
Streaming works identically to OpenAI: set `stream: true` and iterate over the response chunks.
```python
stream = client.chat.completions.create(
    model="claude-sonnet-45",
    messages=[{"role": "user", "content": "Write a haiku"}],
    stream=True,
    extra_body={"venice_parameters": {"disable_thinking": True}},
)

for chunk in stream:
    content = chunk.choices[0].delta.content
    if content:
        print(content, end="")
```
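If you need the full reply after streaming (for logging, caching, and so on), accumulate the deltas as they arrive. A sketch of that accumulation, using stand-in dataclasses that mirror the shape of the SDK's streaming chunks so the logic can be shown without a live call:

```python
from dataclasses import dataclass
from typing import List, Optional

# Stand-ins for the SDK's chunk objects (illustrative, not the real classes)
@dataclass
class Delta:
    content: Optional[str]

@dataclass
class Choice:
    delta: Delta

@dataclass
class Chunk:
    choices: List[Choice]

def collect(stream) -> str:
    """Join the non-empty delta.content fields of a chunk iterator."""
    return "".join(
        chunk.choices[0].delta.content
        for chunk in stream
        if chunk.choices[0].delta.content
    )

chunks = [Chunk([Choice(Delta(t))]) for t in ["Hel", "lo", None, "!"]]
print(collect(chunks))  # Hello!
```

Note the `None` check: some chunks (for example, the final one carrying `finish_reason`) have no `content`, so skipping empty deltas is required with both OpenAI and Venice.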
## Key differences from OpenAI
| Aspect | OpenAI | Venice (via CheapTokens) |
|---|---|---|
| Base URL | `api.openai.com/v1` | `api.venice.ai/api/v1` |
| Auth | OpenAI API key | Venice API key (from CheapTokens) |
| Models | OpenAI models only | All Venice models (Claude, GPT, Gemini, DeepSeek, etc.) |
| Key expiry | Permanent until revoked | Expires daily at midnight UTC |
| Extra params | N/A | `venice_parameters` |
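The daily key expiry is the difference most likely to surprise long-running jobs. Given the midnight-UTC behavior noted above, a small sketch of computing how long the current key remains valid (the helper is illustrative; there is no such function in any SDK):

```python
from datetime import datetime, timedelta, timezone

def seconds_until_midnight_utc(now=None):
    """Seconds until the next midnight UTC, when the daily key expires."""
    now = now or datetime.now(timezone.utc)
    next_midnight = (now + timedelta(days=1)).replace(
        hour=0, minute=0, second=0, microsecond=0
    )
    return (next_midnight - now).total_seconds()

print(f"Key valid for ~{seconds_until_midnight_utc() / 3600:.1f} more hours")
```

Jobs that may run past midnight UTC should be prepared to fetch a fresh key and retry on an authentication error rather than assuming the key lives for the duration of the process.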