How to Pay for AI API Calls in USDC — No Account, No API Key, No Subscription

The shortest path from a wallet to a Claude completion is one HTTP request.
Most LLM gateways make you do three things before you can call a model:
- Create an account
- Pay into a credit balance with a card or bank transfer
- Generate and rotate an API key
That's fine for a developer at a desk. It's a wall for an autonomous agent — and increasingly, it's friction even for humans who just want to write curl and get an answer.
There's a faster path. Pay for each call directly with USDC. No account, no balance, no key. Just a wallet, a header, and the request goes through.
This post is the 5-minute tutorial.
What You Need
- A wallet with USDC on Base or Solana. Coinbase Wallet, Phantom, MetaMask, or anything that can sign.
- A few cents. Most calls use between $0.0001 and $0.05 worth of tokens, and each request is billed at a minimum of $0.001.
- curl, Python, TypeScript, or any other HTTP client.
That's the entire prerequisite list. No sign-up form, no email verification, no billing setup.
The Protocol: x402
The mechanism is a 1990s-era HTTP status code that was reserved but never standardized — 402 Payment Required. Coinbase, Cloudflare, and the Linux Foundation revived it in 2025 as the basis for x402, an open protocol where:
- You call an endpoint
- The server returns 402 with a price quote in a header
- You sign a payment authorization with your wallet
- You re-send the request with the signature in X-PAYMENT
- The server settles the payment on-chain and returns the result
It's one extra round trip on the first call. From there, every subsequent call is a single request.
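The steps above can be sketched as a small retry loop. This is a toy model, not the gateway's implementation: `Response`, `fake_server`, and `fake_sign` are hypothetical stand-ins so the flow runs without a network or a wallet.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class Response:
    status: int
    quote: Optional[str] = None  # price quote, present on a 402
    body: Optional[str] = None   # model output, present on a 200


def call_with_payment(
    send: Callable[[Optional[str]], Response],
    sign: Callable[[str], str],
) -> Response:
    """Send once unpaid; on a 402, sign the quote and retry once."""
    first = send(None)
    if first.status != 402:
        return first
    payment = sign(first.quote)
    return send(payment)


# Hypothetical server: demands payment unless a signature is attached.
def fake_server(payment: Optional[str]) -> Response:
    if payment is None:
        return Response(status=402, quote="0.001 USDC")
    return Response(status=200, body="paid with " + payment)


# Hypothetical signer: a real wallet would produce a cryptographic signature.
def fake_sign(quote: str) -> str:
    return "signed(" + quote + ")"


result = call_with_payment(fake_server, fake_sign)
print(result.status, result.body)
```

The client libraries in Methods 2 and 3 wrap this same loop, which is why their calling code looks like an ordinary HTTP request.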
Method 1: Pay Per Call with curl
This is the lowest-level version. You'll see the full handshake.
Step 1 — Probe the price
curl -i https://blockrun.ai/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
    "model": "anthropic/claude-opus-4.7",
    "messages": [{"role": "user", "content": "Explain x402 in one sentence."}]
  }'
You'll get back HTTP/1.1 402 Payment Required with a WWW-Authenticate header describing the price (in USDC, on Base or Solana) and the recipient address.
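Before signing, it can be worth checking the quoted price against a spending cap. On-chain, USDC amounts are integers with six decimal places, so the check involves a unit conversion. A minimal sketch, assuming the quote exposes the amount as a string of atomic units (the exact header format isn't shown in this post); `within_budget` and the cap value are hypothetical:

```python
from decimal import Decimal

USDC_DECIMALS = 6  # USDC uses six decimal places on Base and Solana


def atomic_to_usdc(atomic_amount: str) -> Decimal:
    """Convert an integer string of atomic units to whole USDC."""
    return Decimal(atomic_amount) / (Decimal(10) ** USDC_DECIMALS)


def usdc_to_atomic(usdc: str) -> int:
    """Convert a USDC amount to atomic units."""
    return int(Decimal(usdc) * (Decimal(10) ** USDC_DECIMALS))


def within_budget(quoted_atomic: str, max_usdc: str) -> bool:
    """Return True if the quoted price does not exceed the caller's cap."""
    return atomic_to_usdc(quoted_atomic) <= Decimal(max_usdc)


print(atomic_to_usdc("1000"))          # the $0.001 minimum charge
print(within_budget("50000", "0.01"))  # a $0.05 quote against a $0.01 cap
```

An agent could refuse to sign whenever `within_budget` returns False, which puts a per-call ceiling under the wallet-balance ceiling.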
Step 2 — Sign and re-send
The X-PAYMENT header carries a wallet signature authorizing the exact amount to the exact recipient. Any wallet library can produce it — viem on Base, @solana/web3.js on Solana, or the official x402 SDK.
curl https://blockrun.ai/v1/chat/completions \
  -H "Content-Type: application/json" \
  -H "X-PAYMENT: <base64-encoded EIP-712 signature>" \
  -d '{
    "model": "anthropic/claude-opus-4.7",
    "messages": [{"role": "user", "content": "Explain x402 in one sentence."}]
  }'
The gateway verifies the signature, settles the payment on Base (or Solana), forwards the request to Anthropic, and streams the response. The on-chain receipt is your invoice.
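If you build the header by hand, the encoding step is plain base64. A rough sketch, assuming the header value is a base64-encoded JSON object that wraps the signature, as public x402 reference clients appear to do; the field names and placeholder signature below are illustrative, not the gateway's confirmed schema:

```python
import base64
import json


def encode_payment_header(payload: dict) -> str:
    """Serialize the payload to compact JSON, then base64-encode it."""
    raw = json.dumps(payload, separators=(",", ":")).encode("utf-8")
    return base64.b64encode(raw).decode("ascii")


def decode_payment_header(header_value: str) -> dict:
    """Reverse of encode_payment_header, handy for debugging."""
    return json.loads(base64.b64decode(header_value))


# Hypothetical payload: the signature would come from your wallet library.
payload = {
    "scheme": "exact",
    "network": "base",
    "payload": {"signature": "0x<signature from your wallet>"},
}

header_value = encode_payment_header(payload)
print(header_value)
print(decode_payment_header(header_value) == payload)
```

In practice the x402 client libraries produce this value for you, so hand-encoding is mostly useful for inspecting what a client sent.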
Method 2: TypeScript with the x402 Client
If you don't want to handle the handshake yourself, the x402 client wraps it:
import { wrapFetchWithPayment } from "x402-fetch";
import { privateKeyToAccount } from "viem/accounts";

// Load the signing account from a private key kept in the environment.
const account = privateKeyToAccount(process.env.WALLET_KEY as `0x${string}`);
// Wrap fetch so a 402 response triggers signing and an automatic retry.
const fetchWithPayment = wrapFetchWithPayment(fetch, account);

const res = await fetchWithPayment("https://blockrun.ai/v1/chat/completions", {
  method: "POST",
  headers: { "Content-Type": "application/json" },
  body: JSON.stringify({
    model: "openai/gpt-5",
    messages: [{ role: "user", content: "What's the cheapest LLM today?" }],
  }),
});

const data = await res.json();
console.log(data.choices[0].message.content);
That's the full integration. wrapFetchWithPayment handles the 402 retry, signs the payment, and returns the final response. No SDK to learn beyond your existing fetch.
Method 3: Python
The Python flow is symmetric:
import os

from eth_account import Account
from x402.clients.requests import x402_requests

# Load the signing wallet from a private key kept in the environment.
wallet = Account.from_key(os.environ["WALLET_KEY"])
# The session handles the 402 response, signs the payment, and retries.
session = x402_requests(wallet)

res = session.post(
    "https://blockrun.ai/v1/chat/completions",
    json={
        "model": "deepseek/deepseek-v4-pro",
        "messages": [{"role": "user", "content": "Compare USDC payment rails."}],
    },
)
print(res.json()["choices"][0]["message"]["content"])
The wallet pays per call. Your USDC balance goes down. Your code never touches a billing dashboard.
What This Actually Costs
The minimum per-request charge is $0.001. Above that, you pay the model's native rate plus a small gateway fee — typically 1-3% of the underlying cost. There is no subscription, no monthly minimum, no inactive-account fee.
A few example calls (May 2026 prices):
| Model | Input (per 1M tokens) | Output (per 1M tokens) | Typical call cost |
|---|---|---|---|
| deepseek/deepseek-v4-pro | $0.27 | $1.10 | $0.0005 – $0.005 |
| anthropic/claude-haiku-4.5 | $1.00 | $5.00 | $0.002 – $0.02 |
| openai/gpt-5 | $1.25 | $10 | $0.003 – $0.05 |
| anthropic/claude-opus-4.7 | $15 | $75 | $0.05 – $1.00 |
You only pay for the tokens you use. The wallet's USDC balance is your hard ceiling — when it hits zero, the gateway stops accepting calls. No overdraft, no surprise invoice, no $47K loop.
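To estimate what a given call will cost, the pricing rules above reduce to a few lines of arithmetic. A rough sketch, assuming a 2% gateway fee (a value picked from the quoted 1-3% range) and assuming the $0.001 minimum applies to the total after the fee; both are guesses about billing details this post doesn't spell out:

```python
from decimal import Decimal

MIN_CHARGE = Decimal("0.001")  # per-request minimum, from the post
GATEWAY_FEE = Decimal("0.02")  # assumed 2%, within the quoted 1-3% range

# USD per 1M tokens as (input, output), copied from the table above.
PRICES = {
    "deepseek/deepseek-v4-pro": (Decimal("0.27"), Decimal("1.10")),
    "anthropic/claude-haiku-4.5": (Decimal("1.00"), Decimal("5.00")),
    "openai/gpt-5": (Decimal("1.25"), Decimal("10")),
    "anthropic/claude-opus-4.7": (Decimal("15"), Decimal("75")),
}


def estimate_cost(model: str, input_tokens: int, output_tokens: int) -> Decimal:
    """Token cost plus the assumed fee, floored at the minimum charge."""
    in_rate, out_rate = PRICES[model]
    token_cost = (in_rate * input_tokens + out_rate * output_tokens) / Decimal(1_000_000)
    total = token_cost * (1 + GATEWAY_FEE)
    return max(total, MIN_CHARGE)


def calls_within_balance(balance_usdc: str, cost_per_call: Decimal) -> int:
    """How many identical calls a wallet balance covers before it hits zero."""
    return int(Decimal(balance_usdc) // cost_per_call)


gpt5_cost = estimate_cost("openai/gpt-5", 1000, 500)
print(gpt5_cost)
print(calls_within_balance("5", gpt5_cost))
```

Small calls to cheap models land on the minimum charge, so for those the per-request floor matters more than the token rates.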
Why USDC, and Why Per-Call
USDC because it settles in seconds, is fully reserve-backed, and works on Base and Solana, both of which support sub-cent payments without onerous gas fees. A stablecoin also means no exchange-rate volatility between the moment you authorize and the moment the call lands.
Per-call because that's the natural unit of LLM consumption. You don't subscribe to a CDN per month — you pay per byte. LLM inference is the same shape: per token, per request. Subscription pricing exists because credit card rails can't economically settle a $0.001 transaction. Stablecoin rails can.
Together, those two choices remove every reason to maintain a balance, an account, or a key.
When Per-Call Beats Subscription
- Bursty workloads. You spend $200 one day, $0 the next. Subscriptions force you to pay for the peak as if it were the average.
- Multi-model evaluation. Trying GPT-5 vs Claude vs DeepSeek vs Gemini for a benchmark? With per-call, you swap models by changing a string; with subscriptions, you sign up four times.
- Autonomous agents. An agent can't fill out a Stripe form. It can sign an EIP-712 message.
- One-off scripts. You needed a model once, for one task. Why are you maintaining an account?
When subscription still wins: a fleet of human users hammering a single model 24/7. There the volume amortizes the lock-in. For most other workloads, pay-per-call in USDC is faster to set up, involves less paperwork, and costs nothing while you're idle.
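The bursty-workload point is easiest to see with numbers. A toy comparison, using made-up daily spend figures and a deliberately simple model in which a fixed plan must be sized for the busiest day:

```python
def per_call_total(daily_spend: list) -> int:
    """Pay only for what was used each day."""
    return sum(daily_spend)


def peak_sized_total(daily_spend: list) -> int:
    """Pay every day for capacity sized to the busiest day."""
    return max(daily_spend) * len(daily_spend)


# Hypothetical week in USD: one heavy day, mostly idle otherwise.
week = [200, 0, 0, 15, 0, 0, 40]

print(per_call_total(week))    # 255
print(peak_sized_total(week))  # 1400
```

Real subscription pricing is rarely this blunt, but the gap grows with the ratio of peak to average usage, which is what makes per-call billing attractive for spiky traffic.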
Going Live
The BlockRun gateway is at blockrun.ai/v1. It's OpenAI-compatible — every endpoint maps to the OpenAI SDK shape — and supports x402 on Base (blockrun.ai) and Solana (sol.blockrun.ai).
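Because the request body follows the OpenAI chat-completions shape, comparing models comes down to changing one string. A small sketch that only builds the request bodies and sends nothing; `build_request` is a hypothetical helper and the model IDs are the ones used earlier in this post:

```python
import json

GATEWAY_URL = "https://blockrun.ai/v1/chat/completions"  # endpoint from the post


def build_request(model: str, prompt: str) -> dict:
    """Build an OpenAI-style chat completion body for the given model."""
    return {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
    }


MODELS = [
    "openai/gpt-5",
    "anthropic/claude-opus-4.7",
    "deepseek/deepseek-v4-pro",
]

# Same prompt, three models: only the "model" field differs.
bodies = [build_request(m, "Explain x402 in one sentence.") for m in MODELS]
for body in bodies:
    print(json.dumps(body))
```

Each body can be posted to `GATEWAY_URL` with the paying session from Method 3, so a benchmark loop needs no per-provider setup.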
Pick a model from the catalog. Top up a wallet with $5 of USDC. Send the request. Watch it land on-chain.
The internet wasn't built for machines to pay. We're building the layer that is — and it works for humans too.