CheapTokens Agent Skill

Add CheapTokens to any agent runtime that can read a Markdown skill and make HTTPS calls. Paste a CheapTokens key and the skill discovers Venice's live models across text, code, image, video, music, audio, and embeddings, then routes work through discounted credits with honest attribution. Works with OpenClaw, Hermes, Claude Code, Codex-style harnesses, Cursor, Cline, and similar runtimes.

Why this skill exists: CheapTokens is optimized for short-lived, same-day credits. Manually configuring a new API provider every day is friction, especially when you buy near midnight UTC for the biggest discount and have minutes, not hours, to use the credits. The skill turns a fresh key into immediate action: paste the key, ask normally, and the agent handles model discovery, routing, and attribution before the credits expire.

Quick start for any AI agent

If your agent runtime can load Markdown skills or persistent instructions and can make HTTPS requests, it can use CheapTokens. Use the full pack for best Venice coverage, or the single-file router for the smallest install.

Full skill pack (recommended)
git clone https://github.com/alde1022/cheaptokens-skills.git
# Point your agent runtime at ./cheaptokens-skills/skills
Single CheapTokens router skill
https://raw.githubusercontent.com/alde1022/cheaptokens-skills/main/skills/cheaptokens/SKILL.md

After the skill is loaded, paste your CheapTokens key and ask for work normally, for example: “use CheapTokens for this coding task,” “use CheapTokens to generate image prompts,” or “route this through Venice.”

Choose the right install path

Full skill pack

Best default. Includes the CheapTokens router plus synced Venice skills for chat, code, image, video, audio, music, transcription, embeddings, model traits, errors, and utility endpoints.

Single router skill

Best when a runtime asks for one SKILL.md URL. It is standalone for normal text/code and core multimodal routing, with links to deeper Venice references.

Runtime examples

  • OpenClaw: save SKILL.md under ~/.openclaw/skills/cheaptokens/, or point it at the full repo's skills/ directory if supported.
  • Hermes: add the raw skill URL or clone the repo and register the skills/ directory according to the harness's skill/instruction loader.
  • Claude Code / Codex-style harnesses: copy the CheapTokens router skill into the runtime's skills or persistent-instructions directory; use the full pack when directory-based skills are supported.
  • Cursor, Cline, OpenCode, custom agents: add the skill as persistent instructions and make sure the agent has an HTTPS tool, fetch, or curl path. The key is only spent when the agent calls Venice directly.

Honest execution model (read first)

The hosting agent (OpenClaw, Claude Code, etc.) runs on some default provider — that's what produces the tokens you see in chat. The agent cannot swap the provider behind its own conversational reply mid-session. The only mechanism that actually spends a CheapTokens/Venice key is an outbound HTTPS call to api.venice.ai made by the agent.

This skill is a single Markdown file that teaches the agent (1) when to make that call, (2) which Venice endpoint matches the user's ask, and (3) how to print an attribution line so you can verify, after the fact, which provider produced which bytes.

If you ever see the agent claim “switched to Venice” or “using your CheapTokens key” on a reply that does not carry an attribution footer like [via CheapTokens → Venice:<model> · ...], treat that claim as false: the key was not used for that reply.
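The footer rule above can be checked mechanically. The following is a minimal sketch, assuming the footer always starts with `[via CheapTokens → Venice:<model>` as in the example; the function name and the regular expression are illustrative and are not part of the skill.

```python
import re
from typing import Optional

# Assumed footer shape, taken from the example in this document:
# "[via CheapTokens → Venice:<model> · ...]" somewhere in the reply text.
FOOTER_RE = re.compile(r"\[via CheapTokens → Venice:(?P<model>[^\s·\]]+)[^\]]*\]")


def venice_model_from_footer(reply: str) -> Optional[str]:
    """Return the model named in the attribution footer, or None if there is no footer."""
    match = FOOTER_RE.search(reply)
    return match.group("model") if match else None
```

A reply for which this returns `None` should be read as produced by the host provider, whatever the reply text claims.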

Add the skill to your agent

The public skill pack is github.com/alde1022/cheaptokens-skills. The single-file CheapTokens router skill URL is https://raw.githubusercontent.com/alde1022/cheaptokens-skills/main/skills/cheaptokens/SKILL.md. Add it to your agent runtime as the CheapTokens skill. There is no helper binary, no npm package, and no PATH change.

For OpenClaw, save the skill file into the watched skills directory:

OpenClaw (skill placement)
mkdir -p ~/.openclaw/skills/cheaptokens
curl -fsSL https://raw.githubusercontent.com/alde1022/cheaptokens-skills/main/skills/cheaptokens/SKILL.md -o ~/.openclaw/skills/cheaptokens/SKILL.md

For the full CheapTokens + Venice skill pack, clone https://github.com/alde1022/cheaptokens-skills and point your runtime at its skills/ directory. For Hermes, Claude Code, Codex-style harnesses, Cursor, Cline, OpenCode, and similar runtimes, save the same skill files wherever that runtime watches for skills or persistent instructions.

Once loaded, paste your CheapTokens key or say “use CheapTokens for this coding task / image / video / audio / embedding job” and the skill takes over.

Safe key handling

The fastest workflow is to paste a CheapTokens key into a trusted local agent and ask it to use CheapTokens. That is fine for speed, but remember: anyone who can see the key can spend the remaining credits until it expires.

  • Fast path: paste the key directly into a trusted private agent session.
  • Safer path: store the key in an environment variable such as VENICE_API_KEY and tell the agent to read it from there.
  • Safest path: keep the key in a secret manager or runtime secret store and give the agent the secret name, not the raw key.

Do not paste keys into public or shared agents, commit keys to repos, or include them in screenshots or logs. CheapTokens reduces blast radius because keys are budget-capped and expire at 23:59:59 UTC on the purchase date, but pasted keys are still bearer credentials. If a key is exposed, reissue it from wallet recovery.

Safer local workflow
export VENICE_API_KEY="VENICE_INFERENCE_KEY_..."
# Then tell your agent: use CheapTokens with $VENICE_API_KEY for this task
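For a custom agent or script, the safer path might look like the sketch below. It assumes the key lives in `VENICE_API_KEY` as described above; `load_key` and `mask_key` are hypothetical helper names, not functions shipped with the skill.

```python
import os


def load_key(env_var: str = "VENICE_API_KEY") -> str:
    """Read the key from the environment instead of from pasted chat text."""
    key = os.environ.get(env_var, "").strip()
    if not key:
        raise RuntimeError(f"{env_var} is not set; export it before starting the agent")
    return key


def mask_key(key: str) -> str:
    """Keep only the last six characters so logs never contain the full key."""
    return "***" + key[-6:]
```

Logging `mask_key(key)` rather than the key itself keeps the bearer credential out of transcripts and screenshots.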

What triggers the skill

The skill's description matches any of the following:

  • The user pastes a Venice-looking API key (a long opaque bearer token).
  • The user says anything like “use this key,” “use CheapTokens,” “use cheap tokens,” “use my CheapTokens key,” “use this Venice key,” “run this on CheapTokens,” “route this through Venice,” “do this with this Venice key,” “swap to Venice for this,” or “use my cheap credits.”
  • The user invokes /cheaptokens.

What happens after a key is provided

  1. Detect. The skill calls GET https://cheaptokens.ai/api/status/{last6} once and caches { status, creditsIssuedUsd, expiresAt }. HTTP 404 means it's a plain Venice key: still usable, just without CheapTokens-specific messaging.
  2. Discover live capabilities across every Venice modality. The skill can query /models?type=all or targeted filters: text, code, image, inpaint, upscale, video, music, tts, asr, and embedding, plus /models/traits. Code is selected from code-optimized text/chat models and still runs through /chat/completions; it is not a separate API.
  3. Triage the ask. If the request fits Venice fully, the agent calls Venice directly. Coding and text use /chat/completions; media requests route to the relevant image, video, audio/music, TTS, transcription, or embedding endpoint. If the request is hybrid, the skill spends the key on the parts Venice can do and uses the host model for the rest. If Venice can't do any of it, the skill says exactly what's missing and still spends the key on adjacent text artifacts so the credits do not expire unused.
  4. Spend over HTTPS. The agent calls the matching Venice endpoint (/chat/completions, /image/generate, /audio/speech, /video/queue, …) using whatever HTTP tool the runtime already provides. No helper required.
  5. Attribute every reply. Each reply that used Venice ends with a footer like:
    [via CheapTokens → Venice:<model> · $1.75 issued · expires 2026-04-23T23:59:59.999Z · 412 tok]
    For hybrid replies, the skill prints one footer per provider. No footer = the key was not used for that reply.
  6. Fall back transparently. On Venice 401 / 402 / 5xx after one retry, on status !== "active", or on a past expiresAt, the skill tells you once and continues on the host provider. Never silent.
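Steps 5 and 6 can be sketched as two small pure functions. This is an illustration under stated assumptions, not the skill's actual logic: the function names and parameters are hypothetical, and the cached status fields are assumed to have already been parsed into Python values.

```python
from datetime import datetime, timezone
from typing import Optional


def should_fall_back(status: str, expires_at: datetime, now: datetime,
                     http_status: Optional[int] = None) -> bool:
    """Step 6: True when the host provider should take over.

    http_status is Venice's response code after the single retry, if a call was made.
    """
    if status != "active" or expires_at <= now:
        return True
    if http_status is not None:
        return http_status in (401, 402) or 500 <= http_status <= 599
    return False


def format_footer(model: str, issued_usd: float, expires_at: datetime, tokens: int) -> str:
    """Step 5: build an attribution line in the shape of the example above."""
    stamp = (expires_at.astimezone(timezone.utc)
             .isoformat(timespec="milliseconds")
             .replace("+00:00", "Z"))
    return f"[via CheapTokens → Venice:{model} · ${issued_usd:.2f} issued · expires {stamp} · {tokens} tok]"
```

Keeping both functions free of network calls makes the fallback rule easy to unit-test separately from the HTTP tool the runtime provides.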

Venice skills are optional, not required

CheapTokens is designed to work standalone. It includes enough Venice API logic to detect available text/code models, route coding and chat through /chat/completions, and route image generation/editing, video, music, audio, TTS, transcription, embeddings, document parsing, scrape/search, and character workflows from the live Venice model registry.
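As a rough picture of the routing, the sketch below maps a task kind to an endpoint. It covers only the endpoint paths named in this document; the base URL is pieced together from the api.venice.ai host and the /api/v1/models path mentioned elsewhere, the pairing of `tts` with /audio/speech is an assumption, and `endpoint_for` is a hypothetical helper. Other modalities are left unmapped rather than guessed.

```python
from typing import Optional

# Assumed base URL: host and /api/v1 prefix both appear in this document,
# but their combination here is an assumption.
VENICE_BASE = "https://api.venice.ai/api/v1"

# Only the paths this document names. Code shares the chat endpoint
# because it is not a separate API.
ENDPOINTS = {
    "text": "/chat/completions",
    "code": "/chat/completions",
    "image": "/image/generate",
    "tts": "/audio/speech",   # assumed pairing
    "video": "/video/queue",
}


def endpoint_for(kind: str) -> Optional[str]:
    """Return the full URL for a task kind, or None when this sketch has no mapping."""
    path = ENDPOINTS.get(kind.lower())
    return VENICE_BASE + path if path else None
```

A `None` result here means "consult the live model registry or the Venice reference skills", not "Venice cannot do this".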

Venice-specific skills can still improve advanced workflows: provider-specific parameters, media endpoint quirks, error handling, model traits, and best practices. Treat them as an optional expert pack, not a dependency.

Mental model
CheapTokens skill = key detection, credit expiry, routing, spend, attribution
Venice skills = deeper Venice endpoint expertise and edge cases

Expiry urgency (use it or lose it)

CheapTokens credits expire at 23:59:59 UTC on the purchase date. The skill biases toward acting as that deadline approaches:

  • > 6h: normal pace.
  • ≤ 6h: bias toward acting now on anything Venice can satisfy.
  • ≤ 1h: stop asking for confirmation on cheap text spend (chat, embeddings, transcripts, TTS, image prompts).
  • ≤ 30m: last-call mode — produce something useful or surface a hard blocker. Idle expiry is a failure.
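The tiers above reduce to a comparison against the time remaining. A minimal sketch follows; the tier labels and function names are illustrative, and the expiry is computed as 23:59:59 UTC on the purchase date as stated above.

```python
from datetime import datetime, time, timedelta, timezone


def expiry_for(purchase: datetime) -> datetime:
    """23:59:59 UTC on the purchase date (the purchase time is converted to UTC first)."""
    day = purchase.astimezone(timezone.utc).date()
    return datetime.combine(day, time(23, 59, 59), tzinfo=timezone.utc)


def urgency(expires_at: datetime, now: datetime) -> str:
    """Map the time left before expiry to one of the tiers listed above."""
    remaining = expires_at - now
    if remaining <= timedelta(0):
        return "expired"
    if remaining <= timedelta(minutes=30):
        return "last-call"
    if remaining <= timedelta(hours=1):
        return "skip-confirmation"
    if remaining <= timedelta(hours=6):
        return "act-now"
    return "normal"
```

Checking the tightest threshold first keeps the tiers mutually exclusive, so a key with 20 minutes left is reported as last-call rather than as one of the looser tiers.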

Slash command

OpenClaw may expose the skill as a slash command:

/cheaptokens <paste key>

But you don't need it — pasting a Venice-looking key anywhere in chat, or saying “use this key,” matches the skill's trigger description directly.

No telemetry

The skill never phones home on its own. The only endpoint it hits with your key is Venice itself; the only endpoint it hits on CheapTokens.ai is the public /api/status/{last6}, which returns only the data tied to the last-6 of the key you already hold. CheapTokens keeps no server-side record of which agent is using which key — wallet address is the only identity.
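To make the privacy claim concrete, here is a sketch of the status lookup in which only the last six characters of the key leave the machine. It assumes the endpoint returns JSON with the fields listed earlier (status, creditsIssuedUsd, expiresAt); `status_url` and `fetch_status` are hypothetical names.

```python
import json
import urllib.error
import urllib.parse
import urllib.request
from typing import Optional

STATUS_URL = "https://cheaptokens.ai/api/status/{last6}"


def status_url(key: str) -> str:
    """Build the lookup URL; only the key's last six characters appear in it."""
    return STATUS_URL.format(last6=urllib.parse.quote(key[-6:], safe=""))


def fetch_status(key: str, timeout: float = 10.0) -> Optional[dict]:
    """Return the status record, or None on HTTP 404 (a plain Venice key)."""
    try:
        with urllib.request.urlopen(status_url(key), timeout=timeout) as resp:
            return json.load(resp)  # assumes a JSON response body
    except urllib.error.HTTPError as err:
        if err.code == 404:
            return None
        raise
```

The full key is never placed in the URL or request body of this call; it is sent only to Venice as the bearer credential.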

Troubleshooting

  • Skill doesn't activate. Some runtimes snapshot skills at session start. After saving SKILL.md, restart your agent runtime so it picks up the new skill.
  • No attribution footer on replies. The key was not used for that reply. Either the agent has no HTTP tool (the skill should say so), or it failed to call Venice. Ask the skill to retry the spend explicitly.
  • “insufficient credits” on the first call. The key's balance is at zero — buy a fresh one at /buy.
  • 401 from Venice. Key is expired or revoked. CheapTokens keys die at 23:59:59 UTC on the purchase date. The skill falls back automatically.
  • “Venice can't do that”. The skill should never claim that without checking /api/v1/models first. If you see it refuse a video / image / audio request without referencing the live capability map, the skill is stale — reload from cheaptokens-skills.