CheapTokens is beta/experimental software. Use at your own risk.

OpenAI Compatibility

Venice.ai exposes an OpenAI-compatible API. Any code, library, or tool that works with the OpenAI Chat Completions API can be pointed at Venice with minimal changes.

What's compatible

The Venice API supports the core OpenAI Chat Completions interface:

  • POST /chat/completions — create a chat completion
  • GET /models — list available models
  • Streaming via stream: true (Server-Sent Events)
  • Standard message roles: system, user, assistant
  • Parameters: model, messages, temperature, max_tokens, top_p, stop

Venice-specific parameters: Venice adds a venice_parameters object for features like disable_thinking. See docs.venice.ai for the full list.
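On the wire, venice_parameters sits alongside the standard Chat Completions fields as a top-level key in the JSON body. A minimal sketch of the payload shape (the model ID and flag values here are illustrative, not a complete parameter list):

```python
import json

# Standard Chat Completions fields, plus the Venice-specific
# venice_parameters object as a sibling top-level key.
payload = {
    "model": "claude-sonnet-45",
    "messages": [{"role": "user", "content": "Hello"}],
    "temperature": 0.7,
    "venice_parameters": {"disable_thinking": True},
}

body = json.dumps(payload)
print(body)
```

The SDK examples below produce this same body; the Python SDK needs extra_body to pass the non-standard key through, while the Node SDK forwards unknown top-level params as-is.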

Drop-in replacement examples

To switch from OpenAI to Venice, change the base URL and the API key, and set model to a Venice-supported model ID. Everything else stays the same.

Python (OpenAI SDK)
```python
from openai import OpenAI

# Change base_url and api_key; everything else stays the same
client = OpenAI(
    base_url="https://api.venice.ai/api/v1",
    api_key="your_venice_key",
)

response = client.chat.completions.create(
    model="claude-sonnet-45",
    messages=[
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "Explain quantum computing in one sentence."},
    ],
    temperature=0.7,
    max_tokens=256,
    extra_body={"venice_parameters": {"disable_thinking": True}},
)

print(response.choices[0].message.content)
```
Node.js (OpenAI SDK)
```javascript
import OpenAI from "openai";

const client = new OpenAI({
  baseURL: "https://api.venice.ai/api/v1",
  apiKey: "your_venice_key",
});

const response = await client.chat.completions.create({
  model: "claude-sonnet-45",
  messages: [
    { role: "system", content: "You are a helpful assistant." },
    { role: "user", content: "Explain quantum computing in one sentence." },
  ],
  temperature: 0.7,
  max_tokens: 256,
  venice_parameters: { disable_thinking: true },
});

console.log(response.choices[0].message.content);
```
cURL
```bash
curl https://api.venice.ai/api/v1/chat/completions \
  -H "Authorization: Bearer your_venice_key" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "claude-sonnet-45",
    "messages": [{"role": "user", "content": "Hello"}],
    "venice_parameters": {"disable_thinking": true}
  }'
```

Streaming

Streaming works identically to OpenAI. Set stream: true and iterate over the response chunks.

Python streaming
```python
stream = client.chat.completions.create(
    model="claude-sonnet-45",
    messages=[{"role": "user", "content": "Write a haiku"}],
    stream=True,
    extra_body={"venice_parameters": {"disable_thinking": True}},
)

for chunk in stream:
    content = chunk.choices[0].delta.content
    if content:
        print(content, end="")
```

Key differences from OpenAI

| Aspect | OpenAI | Venice (via CheapTokens) |
| --- | --- | --- |
| Base URL | api.openai.com/v1 | api.venice.ai/api/v1 |
| Auth | OpenAI API key | Venice API key (from CheapTokens) |
| Models | OpenAI models only | All Venice models (Claude, GPT, Gemini, DeepSeek, etc.) |
| Key expiry | Permanent until revoked | Midnight UTC (daily) |
| Extra params | N/A | venice_parameters |