Free Reasoning API.
No key. No subscription.
Qwen3-Next 80B (262K context, fast) and Nemotron-3 Super 120B — frontier-grade free reasoning. No key, no signup.
Try it now.
No API key. No wallet. No signup. Paste this into any terminal: the response comes back from a reasoning model hosted free on NVIDIA and routed through BlockRun.
curl https://blockrun.ai/api/v1/chat/completions \
-H "Content-Type: application/json" \
-d '{
"model": "nvidia/qwen3-next-80b-a3b-instruct",
"messages": [{"role": "user", "content": "Solve: 7 friends share 23 cookies equally; how many are left over?"}]
}'

- Context: 262K
- Price: free
- Best for: reasoning · coding

- Context: 131K
- Price: free
- Best for: reasoning · coding
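The sample prompt has a known answer, which makes it handy for sanity-checking whatever the model replies. A quick local check in Python (nothing here touches the API):

```python
# Local check of the sample prompt: 23 cookies shared equally among 7 friends.
cookies, friends = 23, 7
per_friend, left_over = divmod(cookies, friends)
print(f"{per_friend} each, {left_over} left over")  # 3 each, 2 left over
```

A correct reply to the sample prompt should therefore report 2 cookies left over.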
6 ways to use free reasoning models.
BlockRun is the access layer. Pick the surface that matches how you build — terminal, notebook, IDE, agent runtime — and the same free models work everywhere.
- 01
Franklin Agent
Free reasoning model + Franklin Agent = strong terminal reasoning agent, free
Learn more →
shell
# Install Franklin Agent
curl -fsSL https://franklin.run/install | sh
# Run with this model
franklin chat --model nvidia/qwen3-next-80b-a3b-instruct "Summarize the README"
- python
# Works with the OpenAI SDK; no key required for free models
from openai import OpenAI

client = OpenAI(
    base_url="https://blockrun.ai/api/v1",
    api_key="not-needed-for-free-models",
)
response = client.chat.completions.create(
    model="nvidia/qwen3-next-80b-a3b-instruct",
    messages=[{"role": "user", "content": "Solve: 7 friends share 23 cookies equally; how many are left over?"}],
)
print(response.choices[0].message.content)
- shell
curl https://blockrun.ai/api/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
    "model": "nvidia/qwen3-next-80b-a3b-instruct",
    "messages": [{"role": "user", "content": "Solve: 7 friends share 23 cookies equally; how many are left over?"}]
  }'
- 04
ClawRouter
smart router for OpenClaw / Claude Code — auto-picks free models when possible
Learn more →
shell
# Install once
npm install -g @blockrun/clawrouter
# Then point any OpenAI-compatible client at the local proxy.
# ClawRouter routes to nvidia/qwen3-next-80b-a3b-instruct (or the cheapest capable model)
# without changing your code.
- 05
Claude Code MCP
18 tools for Claude Code, Cursor & ChatGPT — call any free model from inside your editor
Learn more →
shell
# Add the BlockRun MCP server (Claude Code, Cursor, or ChatGPT desktop)
claude mcp add blockrun -s user -- npx -y @blockrun/mcp@latest
# Then call from inside the editor:
# blockrun_chat(model="nvidia/qwen3-next-80b-a3b-instruct", messages=[{role:"user", content:"…"}])
- typescript
// Works with the OpenAI SDK; no key required for free models
import OpenAI from "openai";

const client = new OpenAI({
  baseURL: "https://blockrun.ai/api/v1",
  apiKey: "not-needed-for-free-models",
});
const r = await client.chat.completions.create({
  model: "nvidia/qwen3-next-80b-a3b-instruct",
  messages: [{ role: "user", content: "Solve: 7 friends share 23 cookies equally; how many are left over?" }],
});
console.log(r.choices[0].message.content);
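The curl example sends nothing but a JSON body, so the same call should also work from Python's standard library with no SDK installed. The sketch below is a minimal illustration under two assumptions: that the endpoint accepts the request without an Authorization header (as the curl example implies), and that the response follows the OpenAI-style shape the SDK examples read (`choices[0].message.content`). The helper names `build_request` and `ask` are illustrative, not part of any BlockRun package.

```python
import json
import urllib.request

BLOCKRUN_URL = "https://blockrun.ai/api/v1/chat/completions"
MODEL = "nvidia/qwen3-next-80b-a3b-instruct"


def build_request(prompt: str, model: str = MODEL) -> urllib.request.Request:
    """Build the same POST the curl example sends (no Authorization header)."""
    body = json.dumps({
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
    }).encode("utf-8")
    return urllib.request.Request(
        BLOCKRUN_URL,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def ask(prompt: str, timeout: float = 60.0) -> str:
    """Send the prompt and return the reply text.

    Assumes an OpenAI-style response body: choices[0].message.content.
    """
    with urllib.request.urlopen(build_request(prompt), timeout=timeout) as resp:
        data = json.load(resp)
    return data["choices"][0]["message"]["content"]


# Usage (makes a network call):
# print(ask("Solve: 7 friends share 23 cookies equally; how many are left over?"))
```

Keeping request construction separate from the network call makes the payload easy to inspect or log before anything is sent.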
We don't share your data.
Your prompt goes to the AI provider you picked. Nothing else, nowhere else. No training, no retention beyond the request, no profile linking.
- No training, no retention beyond the request. Your prompt is forwarded only to the AI provider you select.
- Wallet in, prompt out. Pseudonymous by default — no email, no phone number, no identity documents.
- Read the code, audit the wire format, run it yourself. The SDKs are published as @blockrun/llm on npm and blockrun-llm on PyPI.
Want Claude, GPT-5, or Gemini too?
No subscription. No monthly minimum. Pay per call in USDC via x402, using the same endpoint, same SDK, and same model IDs. Connect a wallet, top up $5, call any frontier model. No credit card.
Everything you might be wondering.
- Which free reasoning model should I pick?
- Default to Qwen3-Next 80B Instruct — fast (~1s), 262K context, strong general + coding reasoning. Nemotron-3 Super 120B is a thinking-mode MoE for harder multi-step reasoning. Both are free, no key.
- Where's DeepSeek V4?
- DeepSeek V4 Flash/Pro were retired from our free upstream (NVIDIA's on-demand tier stopped serving them). Calls to nvidia/deepseek-v4-* are auto-rerouted to a healthy free reasoning model so they still return 200. For dedicated DeepSeek, use the paid deepseek/* models with a wallet.
- Is it really free?
- Yes. Hosted free on NVIDIA's build.nvidia.com tier, passed straight through BlockRun. No signup, no wallet.
- How much context can I use for free?
- Up to the model's window (262K for Qwen3-Next), but the request body is capped at 128 KB on the free tier. For multi-megabyte context dumps, switch to paid models with a wallet (5 MB cap).
- What if NVIDIA's upstream is down?
- BlockRun's free-model health gate auto-routes around any dead free model in real time, so your call still returns 200 from a healthy model.
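The 128 KB request-body cap mentioned above is easy to hit with long prompts, so it can help to measure the body before sending. The sketch below is a rough pre-flight check, assuming "128 KB" means 128 × 1024 bytes of UTF-8 encoded JSON; the exact byte count the server enforces, and the exact bytes an SDK puts on the wire, may differ slightly. The names `body_size` and `fits_free_tier` are hypothetical helpers.

```python
import json

# Assumption: "128 KB" is interpreted as 128 * 1024 bytes.
FREE_TIER_BODY_LIMIT = 128 * 1024


def body_size(model: str, messages: list[dict]) -> int:
    """Return the size in bytes of the JSON request body for these messages."""
    payload = {"model": model, "messages": messages}
    return len(json.dumps(payload).encode("utf-8"))


def fits_free_tier(model: str, messages: list[dict],
                   limit: int = FREE_TIER_BODY_LIMIT) -> bool:
    """True if the serialized body is within the assumed free-tier cap."""
    return body_size(model, messages) <= limit


model = "nvidia/qwen3-next-80b-a3b-instruct"
short = [{"role": "user", "content": "Summarize the README"}]
huge = [{"role": "user", "content": "x" * 200_000}]

print(fits_free_tier(model, short))  # True
print(fits_free_tier(model, huge))   # False
```

When the check fails, the options from the answer above apply: trim the prompt, or move to a paid model with the larger 5 MB cap.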