How to Pay for AI API Calls in USDC — No Account, No API Key, No Subscription

The shortest path from a wallet to a Claude completion is one HTTP request.

Most LLM gateways make you do three things before you can call a model:

1. Create an account
2. Pay into a credit balance with a card or bank transfer
3. Generate and rotate an API key

That's fine for a developer at a desk. It's a wall for an autonomous agent — and increasingly, it's friction even for humans who just want to write curl and get an answer.

There's a faster path. Pay for each call directly with USDC. No account, no balance, no key. Just a wallet and a header, and the request goes through.

This post is the 5-minute tutorial.

What You Need

- A wallet with USDC on Base or Solana: Coinbase Wallet, Phantom, MetaMask, or anything else that can sign.
- A few cents. Most calls cost between $0.001 (the minimum charge per request) and $0.05.
- curl, Python, TypeScript, or any other HTTP client.

That's the entire prerequisite list. No sign-up form, no email verification, no billing setup.

The Protocol: x402

The mechanism is a 1990s-era HTTP status code that was reserved but never standardized — 402 Payment Required. Coinbase, Cloudflare, and the Linux Foundation revived it in 2025 as the basis for x402, an open protocol where:

1. You call an endpoint
2. The server returns 402 with a price quote in a header
3. You sign a payment authorization with your wallet
4. You re-send the request with the signature in X-PAYMENT
5. The server settles the payment on-chain and returns the result

It's one extra round trip on the first call. From there, every subsequent call is a single request.
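The five steps can be traced offline with a toy model. This is only a sketch of the control flow: `fake_server`, `call_with_payment`, the quote fields, and the HMAC "signature" are all hypothetical stand-ins, and a real x402 client signs with a wallet key rather than a shared secret.

```python
import base64
import hashlib
import hmac
import json

# Stand-in for a wallet key. HMAC is NOT how x402 signs; it only keeps the demo offline.
DEMO_KEY = b"demo-wallet-key"
# Hypothetical quote shape, invented for this sketch.
QUOTE = {"amount_usdc": "0.001", "pay_to": "0xRecipientPlaceholder"}


def sign(quote: dict) -> str:
    """Authorize the exact amount to the exact recipient (toy signature)."""
    message = f"{quote['amount_usdc']}:{quote['pay_to']}".encode()
    return hmac.new(DEMO_KEY, message, hashlib.sha256).hexdigest()


def fake_server(headers: dict, body: dict):
    """Toy endpoint: quote a price when unpaid, serve the result when paid."""
    payment = headers.get("X-PAYMENT")
    if payment is None:
        return 402, QUOTE  # step 2: price quote
    claim = json.loads(base64.b64decode(payment))
    if not hmac.compare_digest(claim.get("signature", ""), sign(QUOTE)):
        return 402, QUOTE  # authorization does not match the quote
    return 200, {"answer": "echo: " + body["prompt"]}  # step 5: "settle", then serve


def call_with_payment(server, body: dict):
    """Run the handshake and report how many round trips it took."""
    status, payload = server({}, body)  # step 1: plain request
    round_trips = 1
    if status == 402:
        signed = {**payload, "signature": sign(payload)}  # step 3: sign the quote
        header = base64.b64encode(json.dumps(signed).encode()).decode()
        status, payload = server({"X-PAYMENT": header}, body)  # step 4: re-send
        round_trips += 1
    return status, payload, round_trips


status, payload, round_trips = call_with_payment(fake_server, {"prompt": "hi"})
print(status, payload, round_trips)
```

The retry lives entirely in the client, which is why the wrappers in the later methods can hide it behind an ordinary request call.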

Method 1: Pay Per Call with `curl`

This is the lowest-level version. You'll see the full handshake.

Step 1 — Probe the price

curl -i https://blockrun.ai/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
    "model": "anthropic/claude-opus-4.7",
    "messages": [{"role": "user", "content": "Explain x402 in one sentence."}]
  }'

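You'll get back HTTP/1.1 402 Payment Required together with the payment requirements: the price (in USDC, on Base or Solana) and the recipient address. Because of -i, curl prints the response headers as well as the body, so the full quote is visible in the output.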
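Reading the quote programmatically might look like the sketch below. It assumes the quote is available as JSON in the x402 v1 shape, with an `accepts` list whose entries carry `network`, `maxAmountRequired` (in atomic units), and `payTo`. That shape, the sample data, and the `pick_quote` helper are assumptions for illustration; check them against what the gateway actually returns.

```python
from decimal import Decimal

USDC_DECIMALS = 6  # USDC uses 6 decimal places on both Base and Solana


def pick_quote(body: dict, network: str = "base") -> dict:
    """Return the price and recipient offered for one network."""
    for option in body.get("accepts", []):
        if option.get("network") == network:
            atomic = int(option["maxAmountRequired"])  # atomic USDC units
            return {
                "network": network,
                "pay_to": option["payTo"],
                "price_usdc": Decimal(atomic) / (10 ** USDC_DECIMALS),
            }
    raise ValueError(f"no payment option offered for network {network!r}")


# Hypothetical 402 payload, written by hand for this example.
sample = {
    "x402Version": 1,
    "accepts": [
        {
            "scheme": "exact",
            "network": "base",
            "maxAmountRequired": "1000",
            "payTo": "0xRecipientPlaceholder",
        }
    ],
}

quote = pick_quote(sample)
print(quote["price_usdc"], "USDC to", quote["pay_to"])
```

Using `Decimal` rather than `float` keeps sub-cent amounts exact, which matters when the number ends up inside a signed authorization.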

Step 2 — Sign and re-send

The X-PAYMENT header carries a base64-encoded payment payload: a wallet signature authorizing the exact amount to the exact recipient, plus the authorization details it covers. Any wallet library can produce the signature, such as viem on Base, @solana/web3.js on Solana, or the official x402 SDK.

curl https://blockrun.ai/v1/chat/completions \
  -H "Content-Type: application/json" \
  -H "X-PAYMENT: <base64-encoded payment payload>" \
  -d '{
    "model": "anthropic/claude-opus-4.7",
    "messages": [{"role": "user", "content": "Explain x402 in one sentence."}]
  }'

The gateway verifies the signature, settles the payment on Base (or Solana), forwards the request to Anthropic, and streams the response. The on-chain receipt is your invoice.
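For readers building the header by hand, here is a rough sketch of the encoding step only. It assumes the x402 v1 "exact" scheme on an EVM chain, where the header value is base64-encoded JSON wrapping a signature and the authorization it covers. The field names reflect that format as best understood here, the addresses and signature are placeholders, and `encode_x_payment` is a hypothetical helper. Producing the real signature needs a wallet library and is left out, and the Solana payload is shaped differently.

```python
import base64
import json


def encode_x_payment(signature: str, authorization: dict, network: str = "base") -> str:
    """Wrap a signature and its authorization into an X-PAYMENT header value."""
    payload = {
        "x402Version": 1,
        "scheme": "exact",
        "network": network,
        "payload": {"signature": signature, "authorization": authorization},
    }
    return base64.b64encode(json.dumps(payload).encode("utf-8")).decode("ascii")


# Placeholder values; a real authorization comes from the 402 quote and your wallet.
authorization = {
    "from": "0xPayerPlaceholder",
    "to": "0xRecipientPlaceholder",
    "value": "1000",              # atomic USDC units, i.e. 0.001 USDC
    "validAfter": "0",
    "validBefore": "1767225600",  # Unix timestamp after which the payment expires
    "nonce": "0x" + "00" * 32,    # use a random 32-byte value in practice
}

header = encode_x_payment("0xSignaturePlaceholder", authorization)
decoded = json.loads(base64.b64decode(header))
print(decoded["payload"]["authorization"]["value"])
```

Decoding the header back, as the last lines do, is a quick way to check what you are about to send before spending real USDC.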

Method 2: TypeScript with the x402 Client

If you don't want to handle the handshake yourself, the x402 client wraps it:

import { wrapFetchWithPayment } from "x402-fetch";
import { privateKeyToAccount } from "viem/accounts";

const account = privateKeyToAccount(process.env.WALLET_KEY as `0x${string}`);
const fetchWithPayment = wrapFetchWithPayment(fetch, account);

const res = await fetchWithPayment("https://blockrun.ai/v1/chat/completions", {
  method: "POST",
  headers: { "Content-Type": "application/json" },
  body: JSON.stringify({
    model: "openai/gpt-5",
    messages: [{ role: "user", content: "What's the cheapest LLM today?" }],
  }),
});

const data = await res.json();
console.log(data.choices[0].message.content);

That's the full integration. wrapFetchWithPayment handles the 402 retry, signs the payment, and returns the final response. No SDK to learn beyond your existing fetch.

Method 3: Python

The Python flow is symmetric:

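import os

from eth_account import Account
from x402.clients.requests import x402_requests

# WALLET_KEY holds the hex private key of the wallet that pays for each call
wallet = Account.from_key(os.environ["WALLET_KEY"])
session = x402_requests(wallet)  # requests session that handles the 402 handshake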

res = session.post(
    "https://blockrun.ai/v1/chat/completions",
    json={
        "model": "deepseek/deepseek-v4-pro",
        "messages": [{"role": "user", "content": "Compare USDC payment rails."}],
    },
)

print(res.json()["choices"][0]["message"]["content"])

The wallet pays per call. Your USDC balance goes down. Your code never touches a billing dashboard.

What This Actually Costs

The minimum per-request charge is $0.001. Above that, you pay the model's native rate plus a small gateway fee — typically 1-3% of the underlying cost. There is no subscription, no monthly minimum, no inactive-account fee.

A few example calls (May 2026 prices):

| Model | Input ($/M tokens) | Output ($/M tokens) | Typical call cost |
| --- | --- | --- | --- |
| `deepseek/deepseek-v4-pro` | $0.27 | $1.10 | $0.001 – $0.005 |
| `anthropic/claude-haiku-4.5` | $1.00 | $5.00 | $0.002 – $0.02 |
| `openai/gpt-5` | $1.25 | $10 | $0.003 – $0.05 |
| `anthropic/claude-opus-4.7` | $15 | $75 | $0.05 – $1.00 |

You only pay for the tokens you use, subject to the per-request minimum. The wallet's USDC balance is your hard ceiling: when it hits zero, payments stop clearing and the gateway stops accepting calls. No overdraft, no surprise invoice, no runaway agent loop that racks up a $47K bill.
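The pricing rule can be turned into a quick estimate. The sketch below uses the rates listed in this section and makes two assumptions that are not stated outright: the gateway fee is taken as 2% (inside the quoted 1-3% range), and the $0.001 minimum is treated as a floor on the final charge. `estimate_cost` and `calls_affordable` are hypothetical helpers, not part of any SDK.

```python
from decimal import Decimal

MIN_CHARGE = Decimal("0.001")  # per-request minimum quoted in this section
GATEWAY_FEE = Decimal("0.02")  # ASSUMED 2%; the text says typically 1-3%

# (input, output) prices in USD per million tokens, from the examples in this section
RATES = {
    "deepseek/deepseek-v4-pro": (Decimal("0.27"), Decimal("1.10")),
    "anthropic/claude-haiku-4.5": (Decimal("1.00"), Decimal("5.00")),
    "openai/gpt-5": (Decimal("1.25"), Decimal("10")),
    "anthropic/claude-opus-4.7": (Decimal("15"), Decimal("75")),
}


def estimate_cost(model: str, input_tokens: int, output_tokens: int) -> Decimal:
    """Token cost plus the assumed fee, never below the per-request minimum."""
    rate_in, rate_out = RATES[model]
    token_cost = (rate_in * input_tokens + rate_out * output_tokens) / Decimal(1_000_000)
    return max(MIN_CHARGE, token_cost * (1 + GATEWAY_FEE))


def calls_affordable(balance_usdc: Decimal, cost_per_call: Decimal) -> int:
    """How many identical calls fit under a wallet balance (the hard ceiling)."""
    return int(balance_usdc // cost_per_call)


cost = estimate_cost("openai/gpt-5", input_tokens=2000, output_tokens=500)
print(cost, calls_affordable(Decimal("5"), cost))
```

Under these assumptions a 2,000-token prompt with a 500-token reply on `openai/gpt-5` comes to $0.00765, so a $5 wallet covers 653 such calls before the balance runs out.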

Why USDC, and Why Per-Call

USDC because it settles in seconds, is fully reserve-backed, and works on Base and Solana, both of which support sub-cent payments without onerous gas fees. Because it is a stablecoin, there is no exchange-rate volatility between the moment you authorize and the moment the call lands.

Per-call because that's the natural unit of LLM consumption. You don't subscribe to a CDN per month — you pay per byte. LLM inference is the same shape: per token, per request. Subscription pricing exists because credit card rails can't economically settle a $0.001 transaction. Stablecoin rails can.

Together, those two choices remove every reason to maintain a balance, an account, or a key.

When Per-Call Beats Subscription

- Bursty workloads. You spend $200 one day, $0 the next. Subscriptions force you to pay for the peak as if it were the average.
- Multi-model evaluation. Trying GPT-5 vs Claude vs DeepSeek vs Gemini for a benchmark? With per-call you swap a model string; with subscriptions you sign up four times.
- Autonomous agents. An agent can't fill out a Stripe form. It can sign an EIP-712 message.
- One-off scripts. You needed a model once, for one task. Why are you maintaining an account?
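The multi-model case is easy to make concrete. This sketch only builds the request payloads and sends nothing. The model IDs and the endpoint are the ones used in this post, and `build_requests` is a hypothetical helper.

```python
# Endpoint and model IDs as used elsewhere in this post.
ENDPOINT = "https://blockrun.ai/v1/chat/completions"
MODELS = [
    "openai/gpt-5",
    "anthropic/claude-opus-4.7",
    "deepseek/deepseek-v4-pro",
]


def build_requests(models: list, prompt: str) -> list:
    """One chat-completions payload per model; only the model string changes."""
    return [
        {
            "url": ENDPOINT,
            "json": {
                "model": model,
                "messages": [{"role": "user", "content": prompt}],
            },
        }
        for model in models
    ]


requests_to_send = build_requests(MODELS, "Summarize x402 in one sentence.")
for request in requests_to_send:
    print(request["json"]["model"])
```

Each entry could then be passed to a paying session, for example `session.post(request["url"], json=request["json"])` with the session from Method 3.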

When subscription still wins: a fleet of human users hammering a single model 24/7. There the volume amortizes the lock-in. For everything else, pay-per-call in USDC is faster to set up, usually cheaper, and far less paperwork.

Going Live

The BlockRun gateway is at blockrun.ai/v1. It's OpenAI-compatible — every endpoint maps to the OpenAI SDK shape — and supports x402 on Base (blockrun.ai) and Solana (sol.blockrun.ai).

Pick a model from the catalog. Top up a wallet with $5 of USDC. Send the request. Watch it land on-chain.

The internet wasn't built for machines to pay. We're building the layer that is — and it works for humans too.