The State of AI Agent Payments — Q1 2026 Industry Report

BlockRun Research · Volume I · May 2026

Autonomous software is now paying for its own operations at scale. In the first quarter of 2026, BlockRun.ai's multi-chain gateway routed approximately 2.6 million API calls on behalf of more than 1,500 autonomous agents distributed across four continents, settling hundreds of thousands of on-chain USDC transactions on Base and Solana via the x402 protocol. This report describes, from production data, the structural shape of that traffic — and what it implies for the payment-commerce category being built underneath the agent economy.

Observation window: Feb 1 – Apr 30, 2026 · 41+ models routed · multi-chain settlement (Base + Solana) · per-request stablecoin payments via x402.

Metric	Q1 2026
API calls routed	~2.6M
Paid settlements	~590K (on-chain USDC)
Paying agents	1,500+ unique autonomous wallets
Cost reduction vs all-Sonnet baseline	89%

The behavioral signatures documented here are the first systematic empirical description of a payment-commerce category whose buyer is software, not human. The five findings that follow are not preliminary observations. They are structural properties of the agent economy, surfaced from on-chain telemetry and verifiable by any third party.

74% of all routed API calls go to free-tier models. The bulk of agent inference is structurally not a paid token market. This is a defining feature of the category, not a bug or a transitional state.
97% of dollar volume comes from wallets with sub-minute median inter-arrival time. The unmistakable temporal signature of autonomous software. No human-initiated workflow produces it at sustained scale.
The median single payment is $0.0175. The largest single payment in 89 days was $3.59. Legacy payment rails (Stripe, ACH, cards) cannot serve this distribution. Per-call settlement at micropayment sizes requires stablecoin programmability — there is no other path.
The most common message in the paid traffic stream is a heartbeat probe returning two tokens. The dominant unit of agent commerce is the privilege of remaining vigilant. Substantive work is the long tail.
Wallet-level retention behaves like infrastructure, not SaaS. Agent-driven wallets pay continuously for the operational lifespan of the underlying process — typically months — and cease only when the process itself terminates. The relevant time-constant is not user engagement; it is software uptime.

Behavioral analysis is computed on the on-chain settlement subset where per-transaction telemetry is exhaustive, supplemented by a stratified random sample of 17,989 API call logs with full prompt, model, and token telemetry drawn from one anchor day per week across the observation window. All wallet addresses are anonymized; system prompts are paraphrased to remove operator-identifying details. Numerical claims reflect aggregate dataset statistics.

§ 1 — The shape of paid demand

The distribution of single-payment sizes is the first observation that distinguishes agent-driven payment traffic from any payment category designed for human-payable transactions.

Figure 1 · Payment-size distribution (Q1 2026, default Base treasury). 51% of payments fall in $0.01–$0.05; 46% of revenue falls in $0.05–$0.25.

Bucket	% of transactions	% of revenue
< $0.001 (sub-tenth-cent)	0.33%	0.00%
$0.001 – $0.01 (sub-cent)	23.58%	1.72%
$0.01 – $0.05	50.90%	18.39%
$0.05 – $0.25	20.48%	45.51%
$0.25 – $1	4.60%	31.74%
$1 – $10	0.11%	2.65%
$10+	0.00%	0.00%

By transaction count, the modal bucket is 1–5¢. By dollar weight, the modal bucket is 5–25¢. 24% of all transactions are sub-cent. Just 0.11% of transactions exceed one dollar. The largest single payment in 89 days was $3.59. The median is 1.75 cents.

The dollar weight of agent payment traffic lives in the cents-to-quarter-dollar range. That is small enough that legacy payment rails cannot serve it (Stripe's per-transaction minimum fee of ~$0.30 exceeds the full value of more than 95% of transactions in this distribution, since 95.29% fall below $0.25), but large enough that aggregate volume from cumulative paid traffic is material.
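The cumulative shares quoted in this section can be recomputed directly from the Figure 1 table. The sketch below is illustrative only: the bucket values are copied from the table, and the helper name `share_at_or_below` is hypothetical rather than part of any BlockRun tooling.

```python
# Upper bound of each bucket (USD) and its share of transactions (%),
# copied from the Figure 1 table. The empty $10+ bucket is omitted.
BUCKETS = [
    (0.001, 0.33),
    (0.01, 23.58),
    (0.05, 50.90),
    (0.25, 20.48),
    (1.00, 4.60),
    (10.00, 0.11),
]


def share_at_or_below(threshold_usd: float) -> float:
    """Percent of transactions in buckets whose upper bound is <= threshold_usd."""
    return round(sum(pct for upper, pct in BUCKETS if upper <= threshold_usd), 2)


sub_cent_share = share_at_or_below(0.01)       # buckets below one cent
below_quarter_share = share_at_or_below(0.25)  # buckets below 25 cents

print(f"sub-cent: {sub_cent_share}%")          # sub-cent: 23.91%
print(f"below $0.25: {below_quarter_share}%")  # below $0.25: 95.29%
```

Because every payment under $0.25 is also under a $0.30 fixed fee, the second figure is a lower bound on the share of transactions that such a fee would exceed.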

§ 2 — Temporal behavior: the cron fingerprint

Every paid transaction in the dataset is timestamped to the second. Aggregating all Q1 settlements by minute-of-the-hour reveals a strong concentration around clock boundaries that does not appear in human-initiated payment streams.

Figure 2 · Paid transactions, by minute-of-hour (UTC). The top of the hour (minute :00) is 2.4× as busy as a uniform random baseline; the 30-minute mark spikes to 1.5×.

If payments were uniformly distributed, each minute of the hour would receive 1/60 of the traffic, or 1.67%. Minute :00 receives 4.06% (2.4× baseline). Minutes :01–:05 stay at 1.06–1.72× baseline (the post-cron burst tail). Minute :30 receives 1.5× baseline (half-hour-aligned schedules).

Traffic at 2.4 times baseline at the top of every hour is not how human-initiated payment traffic behaves.
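The minute-of-hour aggregation described in this section can be sketched as follows. This is a minimal illustration, not BlockRun's pipeline: the function name `minute_of_hour_ratio` is hypothetical, the input is assumed to be timezone-aware timestamps, and the sample data is synthetic.

```python
from collections import Counter
from datetime import datetime, timezone


def minute_of_hour_ratio(timestamps):
    """Map each minute 0..59 to its share of payments divided by the uniform share (1/60).

    Assumes timezone-aware datetimes; they are normalized to UTC before bucketing.
    """
    counts = Counter(ts.astimezone(timezone.utc).minute for ts in timestamps)
    total = sum(counts.values())
    if total == 0:
        raise ValueError("no timestamps supplied")
    return {m: counts.get(m, 0) / total * 60 for m in range(60)}


# Synthetic sample (not report data): 6 payments at :00 plus one at each of :01..:54.
sample = [datetime(2026, 3, 1, h, 0, tzinfo=timezone.utc) for h in range(6)]
sample += [datetime(2026, 3, 1, 0, m, tzinfo=timezone.utc) for m in range(1, 55)]
ratios = minute_of_hour_ratio(sample)

# The report's own figure: 4.06% of payments at :00 against a 1/60 baseline.
reported_ratio = round(0.0406 * 60, 2)
print(reported_ratio)  # 2.44
```

A ratio of 1.0 means a minute receives exactly its uniform share. Values well above 1.0 at :00 and :30 are the clock-aligned signature discussed above.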

§ 3 — The free/paid split

The most contrarian observation in this dataset concerns the routing destination of API calls.

Figure 3a · Call-count distribution: free vs paid endpoints, share of all routed traffic. Free endpoints absorb 74.35% of calls; paid endpoints 25.65%.

Free endpoints are NVIDIA-hosted (gpt-oss-120b, nemotron-ultra-253b, qwen3-next-80b-thinking, qwen3-coder-480b, deepseek-v3.2) and Z.AI's flat-billed glm-5.1. Paid endpoints are per-token-priced models from Anthropic, OpenAI, Google, xAI, Moonshot, MiniMax, and DeepSeek.

Figure 3b · Top 12 models by call count. Five of the twelve are free-tier, including the largest by a wide margin.

Model	Share	Tier
nvidia/gpt-oss-120b	60.23%	free
xai/grok-code-fast-1	7.02%	paid
nvidia/nemotron-ultra-253b	6.84%	free
moonshot/kimi-k2.5	3.59%	paid
anthropic/claude-sonnet-4.6	2.84%	paid
google/gemini-2.5-flash-lite	2.16%	paid
xai/grok-4-fast-reasoning	1.44%	paid
nvidia/qwen3-next-80b-a3b-thinking	1.30%	free
anthropic/claude-sonnet-4	1.28%	paid
zai/glm-5.1	1.21%	free
nvidia/deepseek-v3.2	0.98%	free
google/gemini-3.1-flash-lite	0.93%	paid

Among the 12 most-frequently-routed models, five are free, and the largest of them (nvidia/gpt-oss-120b) alone handles 60% of sampled calls. Claude Sonnet 4.6 handles 2.8% of all sampled calls. The two largest paid models by share are xai/grok-code-fast-1 ($0.20/$1.50 per million) and moonshot/kimi-k2.5 ($0.60/$3.00 per million).
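The tier breakdown of the Figure 3b table can be tallied mechanically. The sketch below uses only the shares printed in the table; the variable names are illustrative.

```python
# (model, share of sampled calls in %, tier), copied from the Figure 3b table.
TOP_12 = [
    ("nvidia/gpt-oss-120b", 60.23, "free"),
    ("xai/grok-code-fast-1", 7.02, "paid"),
    ("nvidia/nemotron-ultra-253b", 6.84, "free"),
    ("moonshot/kimi-k2.5", 3.59, "paid"),
    ("anthropic/claude-sonnet-4.6", 2.84, "paid"),
    ("google/gemini-2.5-flash-lite", 2.16, "paid"),
    ("xai/grok-4-fast-reasoning", 1.44, "paid"),
    ("nvidia/qwen3-next-80b-a3b-thinking", 1.30, "free"),
    ("anthropic/claude-sonnet-4", 1.28, "paid"),
    ("zai/glm-5.1", 1.21, "free"),
    ("nvidia/deepseek-v3.2", 0.98, "free"),
    ("google/gemini-3.1-flash-lite", 0.93, "paid"),
]

free_models = [name for name, _, tier in TOP_12 if tier == "free"]
free_share = round(sum(share for _, share, tier in TOP_12 if tier == "free"), 2)
paid_share = round(sum(share for _, share, tier in TOP_12 if tier == "paid"), 2)

print(len(free_models), free_share, paid_share)  # 5 70.56 19.26
```

The top 12 cover 89.82% of sampled calls, so these subtotals are not expected to match the 74.35% / 25.65% split over all routed traffic exactly.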

The widely-held assumption is that AI agent inference represents a fast-growing paid market for frontier model providers. The data contradicts that on a per-call basis: the bulk of agent inference workload — high-frequency scanning, JSON extraction, routine cron probes, structured output — routes to zero-cost endpoints.

§ 4 — Cadence as a software signature

The cleanest behavioral cut in the dataset is the median inter-arrival time (IAT) between consecutive paid calls per wallet.

Figure 4 · Median inter-arrival time per wallet, weighted by share of revenue. The 6–30 second bucket holds 60% of wallets and 82% of revenue; 97% of paid Q1 revenue comes from wallets with sub-minute median IAT.

The 6–30 second cluster alone contains 60% of paying wallets and accounts for 81.6% of revenue. Combined with the <6s and 30s–1min buckets, sub-minute IAT wallets are 76% of the population and 97% of the revenue.

Sub-minute median inter-arrival time, sustained, is an unambiguous signature of autonomous software.

This single diagnostic resolves the "who is paying" question without requiring prompt analysis, model selection inspection, or any other higher-level inference. The dollar volume is overwhelmingly being produced by software, not humans.
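As an illustration of the diagnostic, the per-wallet median IAT and its bucket can be computed from nothing more than wallet identifiers and payment timestamps. The sketch below is an assumption-laden example: the function names are hypothetical, timestamps are taken to be Unix seconds, and the treatment of bucket edges (for example, whether exactly 30 seconds falls in the upper bucket) is a choice made here, not one the report specifies.

```python
from collections import defaultdict
from statistics import median


def median_iat_seconds(events):
    """events: iterable of (wallet, unix_seconds). Returns {wallet: median gap in seconds}."""
    by_wallet = defaultdict(list)
    for wallet, ts in events:
        by_wallet[wallet].append(ts)
    result = {}
    for wallet, stamps in by_wallet.items():
        stamps.sort()
        gaps = [later - earlier for earlier, later in zip(stamps, stamps[1:])]
        if gaps:  # a wallet with a single payment has no inter-arrival time
            result[wallet] = median(gaps)
    return result


def iat_bucket(seconds):
    """Assign a median IAT to one of the buckets named in this section (edges assumed)."""
    if seconds < 6:
        return "<6s"
    if seconds < 30:
        return "6-30s"
    if seconds < 60:
        return "30s-1min"
    return ">=1min"


# Synthetic events (not report data).
events = [("a", 0), ("a", 10), ("a", 22), ("a", 30), ("b", 0), ("b", 3600), ("c", 5)]
medians = median_iat_seconds(events)
buckets = {wallet: iat_bucket(gap) for wallet, gap in medians.items()}
print(buckets)  # {'a': '6-30s', 'b': '>=1min'}
```

Wallet "c" is absent from the output because a single payment yields no gap to measure.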

§ 5 — Workspace and integration profile

The typical production agent on the platform is not a chat interface — it is a long-running, file-stateful, message-bridged process.

Figure 5 · Integration footprint among top-100 most-active paying wallets. 86% include monitoring patterns. 85% include workspace state. 70% touch git.

Integration	% of top-100 wallets
monitoring (heartbeat / status probes)	86%
workspace (persistent filesystem state)	85%
git	70%
email	62%
cron	61%
web_browsing	59%
deploy	53%
twitter	52%
trading	41%
discord	30%

"Monitoring" denotes a recurring pattern in which the agent reads a small status file from its workspace on each cron iteration and replies with an acknowledgement token if no work is pending. "Workspace" denotes persistent filesystem state. Many wallets exhibit multiple integrations simultaneously; values are not mutually exclusive.

The single most common message in the paid traffic stream by count is a heartbeat probe: a short prompt instructing the agent to check a status file in its workspace and either execute pending work or return a fixed acknowledgement string. Most of these probes return the acknowledgement: the agent has no work to do and exits silently. The dominant unit of agent commerce, by call count, is the privilege of remaining vigilant. The substantive work is the long tail.
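The decision logic behind a heartbeat probe is simple enough to sketch. In production the check is performed by a model call (which is what gets paid for); the example below models only the branch it takes. The acknowledgement string `HEARTBEAT_OK`, the file name `status.txt`, and the function name are all hypothetical, since the report does not disclose the actual values.

```python
import tempfile
from pathlib import Path

ACK = "HEARTBEAT_OK"  # hypothetical acknowledgement string, not taken from the report


def heartbeat(status_file: Path) -> str:
    """Return ACK when no work is pending, otherwise return the pending task text."""
    if not status_file.exists():
        return ACK
    pending = status_file.read_text(encoding="utf-8").strip()
    return pending if pending else ACK


with tempfile.TemporaryDirectory() as tmp:
    status = Path(tmp) / "status.txt"  # hypothetical workspace status file
    idle_reply = heartbeat(status)     # no file yet, so nothing is pending
    status.write_text("rebuild index\n", encoding="utf-8")
    busy_reply = heartbeat(status)     # pending work is surfaced instead of the ACK

print(idle_reply, "|", busy_reply)  # HEARTBEAT_OK | rebuild index
```

The idle branch is the one that dominates paid call counts: a very short reply produced on every cron iteration.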

§ 6 — Use-case archetypes

Classifying paying wallets in the on-chain subset by their cadence and call-pattern signatures yields the following revenue distribution across behavioral archetypes.

Figure 6 · Q1 revenue share by behavioral archetype. Roughly 90% of paid revenue concentrates in three autonomous-agent archetypes.

Archetype	Revenue share	Wallets
autonomous agent	59.9%	375
coding agent	19.6%	81
reasoning agent	10.5%	17
scanner	2.3%	9
unknown	2.0%	154
extraction	1.6%	29
reasoning	1.4%	17
cron agent	1.1%	23
routine	0.9%	18
one shot	0.4%	164

Archetypes are computed from on-chain cadence signals only — median IAT, payment-size distribution, active-day span, and per-day call volume — without prompt inspection. "Autonomous agent": sub-minute IAT, mixed paid models. "Coding agent": sub-minute IAT, grok-code-fast-1 dominant. "Reasoning agent": sub-minute IAT, Sonnet or Opus dominant.

The dollar share split is striking: autonomous-agent workloads (sub-minute IAT, mixed paid models) contribute approximately 60% of revenue; coding agents and reasoning agents together contribute another 30%. A payment-infrastructure platform serving this dataset is, in dollar terms, primarily serving sustained-software agents.
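The three headline archetype definitions can be expressed as a small rule set. This is a sketch under stated assumptions, not the classifier used for Figure 6: the 50% dominance threshold is invented for illustration, the per-wallet model mix is taken as a given input (the report does not say how it is derived), and the remaining archetypes are collapsed into "other" because their definitions are not given.

```python
def classify(median_iat_s: float, model_share: dict, dominance: float = 0.5) -> str:
    """Classify a wallet from its median IAT (seconds) and its per-model call shares (0..1).

    `dominance` is an assumed threshold; the report does not define "dominant".
    """
    if median_iat_s >= 60:
        return "other"  # scanner, cron agent, one shot, etc. are not defined precisely enough
    coding = sum(s for m, s in model_share.items() if "grok-code-fast-1" in m)
    reasoning = sum(s for m, s in model_share.items() if "sonnet" in m or "opus" in m)
    if coding >= dominance:
        return "coding agent"
    if reasoning >= dominance:
        return "reasoning agent"
    return "autonomous agent"  # sub-minute IAT with a mixed set of paid models


examples = {
    "coder": classify(12, {"xai/grok-code-fast-1": 0.8, "moonshot/kimi-k2.5": 0.2}),
    "thinker": classify(12, {"anthropic/claude-sonnet-4.6": 0.7, "moonshot/kimi-k2.5": 0.3}),
    "mixed": classify(12, {"xai/grok-code-fast-1": 0.4, "moonshot/kimi-k2.5": 0.6}),
    "slow": classify(300, {"anthropic/claude-sonnet-4.6": 1.0}),
}
print(examples)
```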

§ 7 — The cost-reduction implication

The structural dominance of free-tier routing combined with cost-density concentration on a small share of paid calls creates dramatic cost-side leverage for any user routing through the platform.

For a 10,000-request monthly workload at real production token volumes (averaging ~25,000 input tokens and ~230 output tokens per call):

Figure 7 · Estimated inference cost for a 10,000-request mixed-workload month. All-Sonnet baseline: $776 · ClawRouter production routing: $97.10 · with compression: $87.39 · with cache: $82.53.

Configuration	$ / 10K mixed requests
Direct Claude Opus	$1,292.00
Direct Claude Sonnet	$776.00
ClawRouter (paid subtotal)	$97.10
ClawRouter (+ compression)	$87.39
ClawRouter (+ cache)	$82.53

"ClawRouter production routing" reflects the actual model mix observed in production (74% free-tier, 26% paid). Compression refers to multi-layer token-compression; cache refers to local response caching for repeated requests. The 89% reduction compares the final configuration ($82.53) against the all-Sonnet baseline ($776) and reflects the average outcome across production traffic.

For users whose workloads are unlike the production aggregate (e.g., 100% complex reasoning that all genuinely needs Sonnet), savings will be lower.
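The headline reduction follows from the table by simple arithmetic. The sketch below recomputes it from the listed dollar figures only; the helper name `reduction_pct` is illustrative.

```python
# Cost in USD per 10,000 mixed requests, copied from the table above.
COSTS = {
    "Direct Claude Opus": 1292.00,
    "Direct Claude Sonnet": 776.00,
    "ClawRouter (paid subtotal)": 97.10,
    "ClawRouter (+ compression)": 87.39,
    "ClawRouter (+ cache)": 82.53,
}


def reduction_pct(baseline: float, routed: float) -> float:
    """Percentage saved relative to the baseline, rounded to one decimal place."""
    return round(100 * (1 - routed / baseline), 1)


sonnet = COSTS["Direct Claude Sonnet"]
full_vs_sonnet = reduction_pct(sonnet, COSTS["ClawRouter (+ cache)"])
routing_only_vs_sonnet = reduction_pct(sonnet, COSTS["ClawRouter (paid subtotal)"])
full_vs_opus = reduction_pct(COSTS["Direct Claude Opus"], COSTS["ClawRouter (+ cache)"])

print(full_vs_sonnet, routing_only_vs_sonnet, full_vs_opus)  # 89.4 87.5 93.6
```

Routing alone accounts for most of the saving; compression and caching move the figure from roughly 87.5% to roughly 89.4% against the Sonnet baseline.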

§ 8 — Category outlook

The data establishes that agent payments are now a distinct category of commerce — measurable, characterizable, and structurally incompatible with the payment infrastructure that preceded it. It is not enterprise SaaS. It is not consumer subscription. It is not crypto remittance. It is not Web2 micropayment. It is something new, with its own retention dynamics, its own cost structure, and its own scaling laws.

Market sizing

Total annual spend on LLM-style APIs was on the order of $20 billion in 2025 and is growing at roughly 100% year-over-year. By 2028, the agent-driven share will plausibly reach $100 to $200 billion in annual programmatic AI spend — the majority concentrated in workloads with the behavioral signatures documented in this report: long-running, software-paying-software, sub-cent-to-low-dollar payment sizes, sustained sub-minute cadence.
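The growth assumption in this paragraph can be checked with a short compounding calculation. The sketch below takes the stated inputs at face value ($20 billion in 2025, 100% year-over-year growth) and is not an independent forecast; the function name `project_spend` is illustrative.

```python
def project_spend(base_usd_bn: float, growth: float, start_year: int, end_year: int) -> dict:
    """Compound a base-year figure forward at a constant annual growth rate."""
    return {
        year: base_usd_bn * (1 + growth) ** (year - start_year)
        for year in range(start_year, end_year + 1)
    }


projection = project_spend(20, 1.0, 2025, 2028)
print(projection)  # {2025: 20.0, 2026: 40.0, 2027: 80.0, 2028: 160.0}
```

Under these inputs total LLM-API spend reaches about $160 billion in 2028, which sits inside the $100 to $200 billion range quoted above.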

The share of that programmatic spend routed through per-call stablecoin settlement is small today and structurally favored to grow rapidly. The drivers are unambiguous: autonomous software lacks the human accountable party legacy payment rails require; existing card-network minimums exclude micropayment traffic by construction; agents need programmable per-action settlement to coordinate sub-agent and agent-to-agent commerce. The addressable settlement-volume range by 2030 is in the tens of billions of dollars annually. The platforms that will capture that volume are being built today.

Two forward patterns already visible in the data

Sub-agent payment flows. Multi-agent compositions — where a parent agent's wallet funds the inference costs of dependent child agents — already appear in the data and are growing as a share of total throughput. Every major agent harness shipping today will standardize sub-agent spawning over the next twelve months. The per-call payment volume from sub-agent traffic will become the dominant component of platform throughput before the end of 2027.

Agent-to-agent commerce. The next layer is one agent paying another agent for a specialist service — not an LLM call but a research task, a sourcing job, a verification step. The current settlement primitive supports this natively. It will become a first-class operation as harness tooling matures over the next eighteen months, and it is the substrate on which the next decade of autonomous software-to-software commerce will run.

The structural asymptote: every autonomous software action with marginal cost has a wallet.

Inference. Search. Data lookups. Trade execution. Storage. Compute. Specialist agent work. The thing that gets paid is the action, not the API. The thing that pays is the agent, not the operator. The settlement primitive is stablecoin, per-call, programmable.

BlockRun.ai's position

BlockRun.ai is the payment substrate for the autonomous agent economy. The platform was designed for this category from inception: multi-chain settlement spanning Base and Solana, native x402 protocol implementation, automatic routing across 41+ models with a 14-dimension local scoring algorithm executing in under a millisecond, and the OpenClaw agent harness shipped as default-installed distribution to operators worldwide. The Q1 2026 dataset documented here is the empirical confirmation of that design hypothesis. The next phase is scale.

§ 9 — Conclusion

This report documents the first three months of a payment-commerce category that did not exist eighteen months ago and will dominate the next decade. Sub-cent payment sizes, cron-aligned temporal bursts, free-tier-dominant routing, wallet-as-identity retention, and sustained sub-minute paid-API cadence are not behaviors that appear in any human-payable category. They are signatures of autonomous software paying for its own operations — at scale, in production, on programmable rails, today.

The infrastructure that will run this category will not be retrofitted from Web2 payment networks. It will not be borrowed from consumer crypto. It will be purpose-built for a buyer that does not sleep, does not have a credit card, does not wait for monthly invoicing, and does not exist in any regulatory framework designed before 2024. BlockRun.ai is building that infrastructure.

Software is no longer waiting for permission to transact. The next layer of commerce will be built underneath it.

All wallet addresses anonymized. System prompts and user messages paraphrased to remove operator-identifying details. Numerical claims reflect aggregate dataset statistics; no individual wallet, operator, or third-party partner is identified.