Free LLM API.
No key. No wallet.
17 open-source models, hosted free on NVIDIA and routed through BlockRun. Six ways to call any of them.
curl https://blockrun.ai/api/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
    "model": "nvidia/llama-4-maverick",
    "messages": [{"role": "user", "content": "Hello"}]
  }'

6 dedicated pages.
One free endpoint.
Free Llama 4 API.
Meta's Llama 4 Maverick (17B × 128 experts MoE), 131K context. No key, no wallet, no subscription. Just call it.
Free DeepSeek V4 API.
DeepSeek V4 Flash (284B / 13B active, 1M context, ~5× faster than V4 Pro) and V4 Pro (1.6T MoE, 1M context). Frontier reasoning. No key.
Free Qwen3 API.
Qwen3-Next 80B Thinking (3B active, 116 tok/s, the fastest free reasoning model we ship) and Qwen3 Coder 480B (35B active, code-tuned). No key.
Free Mistral API.
Mistral Small 4 119B (114 tok/s, fastest free chat). Plus Mistral Large 3 675B (the largest Mistral ever) and Devstral 2 123B for code. All free.
Free Nemotron API. With vision.
Nemotron 3 Nano Omni — 31B / 3.2B active MoE, 256K context. The only free model that takes images, video, and audio. ChartQA 90.3, DocVQA 95.6, MMMU 70.8.
Free GPT-OSS API.
OpenAI's GPT-OSS: its first open-weight language models since GPT-2. 120B and 20B variants, 128K context. Hosted free on NVIDIA, called through BlockRun.
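The curl call at the top of this page translates to Python roughly as follows. This is a minimal sketch using only the standard library, not the official SDK. The endpoint URL and the `nvidia/llama-4-maverick` model ID come from the curl example; the helper names `build_request` and `chat` are hypothetical, and the response body is returned unparsed because its schema is not described here. Model IDs for the other families are not listed on this page, so none are guessed.

```python
import json
import urllib.request

# Endpoint and model ID are copied from the curl example on this page.
API_URL = "https://blockrun.ai/api/v1/chat/completions"
DEFAULT_MODEL = "nvidia/llama-4-maverick"


def build_request(prompt: str, model: str = DEFAULT_MODEL) -> urllib.request.Request:
    """Build the same POST request the curl example sends (hypothetical helper)."""
    payload = {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
    }
    return urllib.request.Request(
        API_URL,
        data=json.dumps(payload).encode("utf-8"),
        # No Authorization header: the page states the free models need no key.
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def chat(prompt: str, model: str = DEFAULT_MODEL, timeout: float = 60.0) -> dict:
    """Send the request and return the decoded JSON body as-is.

    The response schema is an unknown here, so nothing is unpacked.
    """
    request = build_request(prompt, model)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.load(response)
```

Calling `chat("Hello")` performs the network request. To try another model, pass its ID as the `model` argument.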
We don't share your data.
Your prompt goes only to the AI provider you picked. No training, no retention beyond the request, no profile linking.
- No training, no retention beyond the request. Your prompt is forwarded only to the AI provider you select.
- Wallet in, prompt out. Pseudonymous by default — no email, no phone number, no identity documents.
- Read the code, audit the wire format, run it yourself. @blockrun/llm on npm and blockrun-llm on PyPI.
Frontier models are pay-per-call. Same endpoint.
Claude Opus, GPT-5.5, Gemini 2.5 Pro, Grok, Kimi — 55+ paid models on the same API. No subscription. No monthly minimum. Top up $5 in USDC and call anything.