CheapTokens is beta/experimental software. Use at your own risk.

API Documentation

CheapTokens.ai sells discounted Venice.ai API credits with transparent, real-time pricing. You get a real Venice INFERENCE API key and call Venice's API directly. The full Venice API documentation is at docs.venice.ai.

How It Works

CheapTokens.ai sells discounted Venice.ai API credits using time-decay dynamic pricing. When you purchase credits, we generate a real Venice INFERENCE API key for you via Venice's key management API. You then call Venice's API directly — we are not a proxy.

Flow

  1. Buy credits on CheapTokens.ai (choose your credit amount)
  2. Receive a Venice API key with your credit usage limit
  3. Call the Venice API at api.venice.ai/api/v1
  4. Use your credits before they expire at the next midnight UTC. For example, a key purchased on January 15th is valid until January 16th at 00:00:00 UTC.
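
The expiry rule in step 4 can be sketched in a few lines of Python. This is only an illustration of the rule as stated above, not CheapTokens' own code: the function name key_expiry is hypothetical, the year in the sample date is arbitrary, and the handling of a purchase made exactly at midnight (it rolls over to the following midnight) is an assumption.

```python
from datetime import datetime, timedelta, timezone


def key_expiry(purchased_at: datetime) -> datetime:
    """Return the first midnight UTC strictly after the purchase time."""
    # Normalize to UTC so a purchase timestamp in another zone is handled.
    utc_time = purchased_at.astimezone(timezone.utc)
    midnight_today = utc_time.replace(hour=0, minute=0, second=0, microsecond=0)
    # Assumption: a purchase at exactly 00:00:00 expires at the next midnight.
    return midnight_today + timedelta(days=1)


# A key bought on January 15th at 18:30 UTC.
purchase = datetime(2025, 1, 15, 18, 30, tzinfo=timezone.utc)
expiry = key_expiry(purchase)
print("Expires at:", expiry.isoformat())
print("Time remaining at purchase:", expiry - purchase)
```

Checking the remaining time before buying may be useful, since a purchase late in the UTC day leaves only a short window to spend the credits.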

Authentication

Use the Venice API key you received from CheapTokens.ai in the Authorization header as a Bearer token.

Header
Authorization: Bearer your_venice_key

Your key has a credit usage limit set by your purchase. Venice tracks usage automatically. Credits expire at midnight UTC.
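
If you are not using an SDK, the header can be built by hand. The sketch below uses only the Python standard library; the helper name auth_headers and the environment variable VENICE_API_KEY are hypothetical names chosen for this example, not something Venice or CheapTokens.ai defines.

```python
import os


def auth_headers(api_key: str) -> dict[str, str]:
    """Build request headers carrying the key as a Bearer token."""
    if not api_key:
        raise ValueError("API key is empty")
    return {
        "Authorization": f"Bearer {api_key}",
        "Content-Type": "application/json",
    }


# VENICE_API_KEY is a hypothetical variable name; reading the key from the
# environment keeps it out of source control.
api_key = os.environ.get("VENICE_API_KEY", "your_venice_key")
headers = auth_headers(api_key)
```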

Base URL

Your key works directly with Venice's API. Use their OpenAI-compatible endpoint:

https://api.venice.ai/api/v1

If you are using the OpenAI SDK, set the base_url (Python) or baseURL (Node.js) configuration option. See the full Venice API docs for all available parameters.

Python
from openai import OpenAI

client = OpenAI(
    api_key="your_venice_key",
    base_url="https://api.venice.ai/api/v1",
)

Node.js
import OpenAI from 'openai';

const client = new OpenAI({
  apiKey: 'your_venice_key',
  baseURL: 'https://api.venice.ai/api/v1',
});

Chat Completions

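Create a chat completion. The request format follows the OpenAI Chat Completions API. The example also passes venice_parameters, a Venice-specific object; see the Venice API docs for the options it accepts.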

cURL
curl https://api.venice.ai/api/v1/chat/completions \
  -H "Authorization: Bearer your_venice_key" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "claude-sonnet-45",
    "messages": [
      {"role": "system", "content": "You are a helpful assistant."},
      {"role": "user", "content": "What is the capital of France?"}
    ],
    "max_tokens": 256,
    "venice_parameters": {"disable_thinking": true}
  }'
Response
{
  "id": "chatcmpl-abc123",
  "object": "chat.completion",
  "model": "claude-sonnet-45",
  "choices": [
    {
      "index": 0,
      "message": {
        "role": "assistant",
        "content": "The capital of France is Paris."
      },
      "finish_reason": "stop"
    }
  ],
  "usage": {
    "prompt_tokens": 25,
    "completion_tokens": 8,
    "total_tokens": 33
  }
}
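
The same request can be made from Python without the OpenAI SDK. The following is a minimal sketch using only the standard library. The function names (build_chat_request, extract_reply, chat) are hypothetical, and the response handling assumes the body has the shape shown in the sample response above.

```python
import json
import urllib.request

BASE_URL = "https://api.venice.ai/api/v1"


def build_chat_request(api_key: str, model: str, messages: list[dict],
                       max_tokens: int = 256) -> urllib.request.Request:
    """Prepare (but do not send) a POST to the chat completions endpoint."""
    payload = {
        "model": model,
        "messages": messages,
        "max_tokens": max_tokens,
        # Venice-specific option, copied from the cURL example.
        "venice_parameters": {"disable_thinking": True},
    }
    return urllib.request.Request(
        f"{BASE_URL}/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


def extract_reply(response_body: dict) -> str:
    """Return the assistant text from the first choice."""
    return response_body["choices"][0]["message"]["content"]


def chat(api_key: str, model: str, messages: list[dict]) -> str:
    """Send the request and return the assistant's reply."""
    request = build_chat_request(api_key, model, messages)
    with urllib.request.urlopen(request, timeout=60) as response:
        return extract_reply(json.load(response))


# Example call (requires a valid key and network access):
# chat("your_venice_key", "claude-sonnet-45",
#      [{"role": "user", "content": "What is the capital of France?"}])
```

Splitting request construction from sending makes the payload easy to inspect or log before any credits are spent.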

Models

Venice offers a variety of open-source and proprietary models. Use the models endpoint to list available options.

Request
GET https://api.venice.ai/api/v1/models
Authorization: Bearer your_venice_key

Popular models include claude-sonnet-4-6, claude-opus-4-6, claude-sonnet-45, deepseek-v32, and others. See the full list at docs.venice.ai.
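
A short Python sketch of the models request follows. It assumes the response uses the OpenAI-style list shape, {"data": [{"id": "..."}, ...]}; that shape is not shown in this document, so check it against docs.venice.ai before relying on it. The function names are hypothetical.

```python
import json
import urllib.request

MODELS_URL = "https://api.venice.ai/api/v1/models"


def model_ids(response_body: dict) -> list[str]:
    """Extract sorted model IDs.

    Assumption: the body looks like {"data": [{"id": "..."}, ...]}.
    """
    return sorted(entry["id"] for entry in response_body.get("data", []))


def list_models(api_key: str) -> list[str]:
    """Fetch the model list with a Bearer-authenticated GET."""
    request = urllib.request.Request(
        MODELS_URL, headers={"Authorization": f"Bearer {api_key}"}
    )
    with urllib.request.urlopen(request, timeout=30) as response:
        return model_ids(json.load(response))


# Example call (requires a valid key and network access):
# print(list_models("your_venice_key"))
```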

Streaming

Set "stream": true to receive responses as Server-Sent Events (SSE).

cURL Example
curl https://api.venice.ai/api/v1/chat/completions \
  -H "Authorization: Bearer your_venice_key" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "claude-sonnet-45",
    "messages": [{"role": "user", "content": "Hello!"}],
    "stream": true,
    "venice_parameters": {"disable_thinking": true}
  }'
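
On the client side, an SSE stream arrives as lines prefixed with "data:". The sketch below shows one way to reassemble the text in Python. It assumes OpenAI-style chunks (text under choices[0].delta.content) and a final "data: [DONE]" line; this document does not show the chunk format, so treat both as assumptions. The sample lines are made up for illustration.

```python
import json
from typing import Iterable, Iterator


def iter_stream_text(lines: Iterable[str]) -> Iterator[str]:
    """Yield text fragments from SSE lines of a streamed chat completion."""
    for raw in lines:
        line = raw.strip()
        if not line.startswith("data:"):
            continue  # skip blank separator lines and SSE comments
        data = line[len("data:"):].strip()
        if data == "[DONE]":  # assumed end-of-stream sentinel
            break
        chunk = json.loads(data)
        choices = chunk.get("choices") or []
        if not choices:
            continue
        text = choices[0].get("delta", {}).get("content")
        if text:
            yield text


# Illustrative stream, not captured from a real response.
sample = [
    'data: {"choices": [{"delta": {"role": "assistant"}}]}',
    "",
    'data: {"choices": [{"delta": {"content": "Hel"}}]}',
    'data: {"choices": [{"delta": {"content": "lo!"}}]}',
    "data: [DONE]",
]
print("".join(iter_stream_text(sample)))
```

With the OpenAI SDK, passing stream=True to the chat completions call returns an iterator that does this parsing for you.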

Rate Limits

Rate limits are managed by Venice based on your API key type. Your key has a credit usage limit set at purchase time.

Limit           Details
Credit usage    Per your purchase
Key expiry      Midnight UTC
Request limits  Set by Venice

Error Codes

Venice uses standard HTTP status codes. Error responses include a JSON body with details.

Code  Description
401   Invalid or missing API key
403   Key expired or credit limit reached
429   Rate limit exceeded
500   Internal server error
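
A client can turn the codes listed above into a simple decision about what to do next. The Python sketch below is one possible approach. Treating 429 and 500 as worth retrying, and 401 and 403 as not, is an assumption made for this example rather than guidance from Venice. The names describe_error and RETRYABLE are hypothetical.

```python
# Meanings copied from the error code table.
ERROR_MEANINGS = {
    401: "Invalid or missing API key",
    403: "Key expired or credit limit reached",
    429: "Rate limit exceeded",
    500: "Internal server error",
}

# Assumption: rate limits and server errors may clear up on a later attempt,
# while key and credit problems will not until the key is replaced.
RETRYABLE = {429, 500}


def describe_error(status: int) -> tuple[str, bool]:
    """Return (meaning, should_retry) for an HTTP status code."""
    meaning = ERROR_MEANINGS.get(status, "Unexpected status")
    return meaning, status in RETRYABLE


for code in (401, 403, 429, 500):
    meaning, retry = describe_error(code)
    print(code, meaning, "retry" if retry else "do not retry")
```

A 403 after midnight UTC most likely means the key has expired, in which case a new purchase is needed rather than a retry.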