API Documentation
AITokenPass issues real Venice AI API keys. Use them directly with Venice's OpenAI-compatible API — no proxy, no middleware.
How It Works
AITokenPass is a marketplace for discounted Venice AI API credits. When you purchase credits, we generate a real Venice INFERENCE API key for you via Venice's key management API. You then call Venice's API directly — we are not a proxy.
Flow
- Buy credits on AITokenPass (choose diem/day and dates)
- Receive a Venice API key with your diem consumption limit
- Call Venice's API at api.venice.ai
- Key auto-expires at the end of your purchased dates
Authentication
Use the Venice API key you received from AITokenPass in the Authorization header as a Bearer token.
Authorization: Bearer YOUR_VENICE_API_KEY
Your key has a diem consumption limit set by your purchase. Venice tracks usage automatically. The key expires at the end of your last purchased date.
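As a minimal sketch of the header described here, the snippet below reads the key from an environment variable and builds the request headers. The variable name `VENICE_API_KEY` and the helper `auth_headers` are illustrative assumptions, not names defined by Venice or AITokenPass.

```python
import os


def auth_headers(env_var: str = "VENICE_API_KEY") -> dict[str, str]:
    """Build request headers that carry the key as a Bearer token.

    The environment variable name is a hypothetical choice for this sketch.
    """
    api_key = os.environ.get(env_var, "").strip()
    if not api_key:
        raise RuntimeError(f"Set {env_var} to the key you received from AITokenPass")
    return {
        "Authorization": f"Bearer {api_key}",
        "Content-Type": "application/json",
    }
```

Keeping the key in the environment rather than in source code makes it easier to swap in a new key once the purchased dates end.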
Base URL
All API calls go directly to Venice. The base URL below is Venice's own endpoint, not an AITokenPass URL.
https://api.venice.ai/api/v1
If you are using the OpenAI SDK, set the base_url (Python) or baseURL (Node.js) configuration option.
# Python
from openai import OpenAI

client = OpenAI(
    api_key="YOUR_VENICE_API_KEY",
    base_url="https://api.venice.ai/api/v1"
)

// Node.js
import OpenAI from 'openai';

const client = new OpenAI({
  apiKey: 'YOUR_VENICE_API_KEY',
  baseURL: 'https://api.venice.ai/api/v1',
});

Chat Completions
Create a chat completion. The request format is identical to the OpenAI Chat Completions API.
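For readers who prefer not to install an SDK, the endpoint can also be reached with Python's standard library. The sketch below is one possible shape, not an official client: the helper names `build_chat_request` and `chat` are hypothetical, and the reply is read from `choices[0].message.content` on the assumption that the response follows the OpenAI-compatible format.

```python
import json
import urllib.request

BASE_URL = "https://api.venice.ai/api/v1"


def build_chat_request(api_key, messages, model="llama-3.3-70b", **params):
    """Return a urllib Request for POST /chat/completions (nothing is sent yet)."""
    payload = {"model": model, "messages": messages, **params}
    return urllib.request.Request(
        f"{BASE_URL}/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


def chat(api_key, messages, **params):
    """Send the request and return the assistant's reply text."""
    request = build_chat_request(api_key, messages, **params)
    with urllib.request.urlopen(request, timeout=60) as response:
        body = json.load(response)
    # Assumes the OpenAI-compatible response shape.
    return body["choices"][0]["message"]["content"]
```

With the OpenAI SDK client configured as in the Base URL section, the equivalent call is `client.chat.completions.create(model="llama-3.3-70b", messages=[...])`.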
curl https://api.venice.ai/api/v1/chat/completions \
  -H "Authorization: Bearer YOUR_VENICE_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "llama-3.3-70b",
    "messages": [
      {"role": "system", "content": "You are a helpful assistant."},
      {"role": "user", "content": "What is the capital of France?"}
    ],
    "temperature": 0.7,
    "max_tokens": 256
  }'

Example response:
{
  "id": "chatcmpl-abc123",
  "object": "chat.completion",
  "model": "llama-3.3-70b",
  "choices": [
    {
      "index": 0,
      "message": {
        "role": "assistant",
        "content": "The capital of France is Paris."
      },
      "finish_reason": "stop"
    }
  ],
  "usage": {
    "prompt_tokens": 25,
    "completion_tokens": 8,
    "total_tokens": 33
  }
}

Models
Venice offers a variety of open-source and proprietary models. Use the models endpoint to list available options.
GET https://api.venice.ai/api/v1/models
Authorization: Bearer YOUR_VENICE_API_KEY
Popular models include llama-3.3-70b, deepseek-r1-671b, and others. Check Venice's documentation for the full list.
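The models request can be sketched in Python as follows. This assumes the endpoint returns an OpenAI-style listing of the form `{"data": [{"id": ...}, ...]}`; that shape is not confirmed by this page, so check it against a real response. The helper names `model_ids` and `fetch_model_ids` are hypothetical.

```python
import json
import urllib.request

MODELS_URL = "https://api.venice.ai/api/v1/models"


def model_ids(listing: dict) -> list[str]:
    """Pull model identifiers out of a decoded /models response.

    Assumes the OpenAI-style shape {"data": [{"id": ...}, ...]}.
    """
    return sorted(entry["id"] for entry in listing.get("data", []) if "id" in entry)


def fetch_model_ids(api_key: str) -> list[str]:
    """Call GET /models with the Bearer key and return the sorted IDs."""
    request = urllib.request.Request(
        MODELS_URL, headers={"Authorization": f"Bearer {api_key}"}
    )
    with urllib.request.urlopen(request, timeout=30) as response:
        return model_ids(json.load(response))
```

Fetching the list at startup and validating the configured model name against it avoids hard-coding a model that may later be retired.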
Streaming
Set "stream": true to receive responses as Server-Sent Events (SSE).
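Consuming the stream means reading it line by line and decoding each event. The parser below is a sketch that assumes Venice emits OpenAI-style chunks: each event arrives as a `data: {json}` line whose `choices[0].delta` may carry a `content` fragment, and the stream ends with `data: [DONE]`. The function name `iter_stream_text` is hypothetical.

```python
import json
from typing import Iterable, Iterator


def iter_stream_text(lines: Iterable[bytes]) -> Iterator[str]:
    """Yield text fragments from an SSE stream of chat completion chunks.

    Assumes OpenAI-style chunks terminated by a "data: [DONE]" line.
    """
    for raw in lines:
        line = raw.decode("utf-8").strip()
        if not line.startswith("data:"):
            continue  # skip blank separators and SSE comment lines
        data = line[len("data:"):].strip()
        if data == "[DONE]":
            break
        chunk = json.loads(data)
        choices = chunk.get("choices") or []
        if not choices:
            continue
        text = (choices[0].get("delta") or {}).get("content")
        if text:
            yield text
```

The response object returned by `urllib.request.urlopen` iterates over lines as bytes, so it can be passed to `iter_stream_text` directly and each fragment printed as it arrives.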
curl https://api.venice.ai/api/v1/chat/completions \
  -H "Authorization: Bearer YOUR_VENICE_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "llama-3.3-70b",
    "messages": [{"role": "user", "content": "Hello!"}],
    "stream": true
  }'

Rate Limits
Rate limits are managed by Venice based on your API key type. Your key has a diem consumption limit set at purchase time.
| Limit | Details |
|---|---|
| Diem consumption | Per your purchase |
| Key expiry | End of last purchased date |
| Request limits | Set by Venice |
Error Codes
Venice uses standard HTTP status codes. Error responses include a JSON body with details.
| Code | Description |
|---|---|
| 401 | Invalid or missing API key |
| 403 | Key expired or consumption limit reached |
| 429 | Rate limit exceeded |
| 500 | Internal server error |
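One way a client might act on these codes is sketched below. The descriptions are copied from the table above. The retry policy is an assumption of this example rather than Venice guidance: it retries only 429 and 500, and treats 401 and 403 as final because a new request cannot fix a bad or exhausted key. All helper names are hypothetical.

```python
import random

# Descriptions copied from the error code table.
ERROR_DESCRIPTIONS = {
    401: "Invalid or missing API key",
    403: "Key expired or consumption limit reached",
    429: "Rate limit exceeded",
    500: "Internal server error",
}

# Assumption for this sketch: only these statuses are worth retrying.
RETRYABLE = {429, 500}


def describe(status: int) -> str:
    """Return a readable message for an HTTP status code."""
    return ERROR_DESCRIPTIONS.get(status, f"Unexpected HTTP status {status}")


def should_retry(status: int) -> bool:
    """True when repeating the same request might succeed later."""
    return status in RETRYABLE


def backoff_delay(attempt: int, base: float = 1.0, cap: float = 30.0) -> float:
    """Exponential backoff with full jitter; attempt counts from 0."""
    return random.uniform(0, min(cap, base * 2 ** attempt))
```

A caller would typically sleep for `backoff_delay(attempt)` seconds between tries while `should_retry(status)` holds, and stop after a small fixed number of attempts.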
Ready to get started?
Buy discounted Venice AI API credits and start making calls in minutes.