Free · Qwen3 Thinking + Coder

Free Qwen3 API.
No key. No subscription.

Qwen3-Next 80B Thinking (3B active parameters, 116 tok/s — the fastest free reasoning model we ship) and Qwen3 Coder 480B (35B active parameters, code-tuned).

Quickstart · 10 seconds

Try it now.

No API key. No wallet. No signup. Paste this into any terminal — the response streams back from Qwen hosted free on NVIDIA, routed through BlockRun.

```shell
curl https://blockrun.ai/api/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
    "model": "nvidia/qwen3-next-80b-a3b-thinking",
    "messages": [{"role": "user", "content": "Refactor this loop into a list comprehension: result = []\nfor x in items:\n  if x > 0: result.append(x*2)"}]
  }'
```
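For reference, the prompt above asks for a list-comprehension refactor; the transformation the model should produce looks like this (plain Python, independent of the API — the sample `items` list is illustrative):

```python
# Original loop from the quickstart prompt
items = [3, -1, 4, -5, 2]

result = []
for x in items:
    if x > 0:
        result.append(x * 2)

# Equivalent list comprehension — the refactor the model is asked for
refactored = [x * 2 for x in items if x > 0]

assert refactored == result  # both yield [6, 8, 4]
```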
Spotlight

| Model | ID | Context | Price | Best for |
| --- | --- | --- | --- | --- |
| Qwen3-Next 80B Thinking (Free) | nvidia/qwen3-next-80b-a3b-thinking | 131K | free | reasoning · coding |
| Qwen3 Coder 480B (Free) | nvidia/qwen3-coder-480b | 131K | free | coding |
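The "Best for" rows above amount to a one-line routing rule. A minimal sketch — the `pick_model` helper is hypothetical, not part of any BlockRun SDK:

```python
# Hypothetical helper: map a task type to the free model IDs listed above.
FREE_MODELS = {
    "reasoning": "nvidia/qwen3-next-80b-a3b-thinking",
    "coding": "nvidia/qwen3-coder-480b",
}

def pick_model(task: str) -> str:
    """Return the model ID suited to the task; default to the Thinking model."""
    return FREE_MODELS.get(task, "nvidia/qwen3-next-80b-a3b-thinking")

print(pick_model("coding"))  # nvidia/qwen3-coder-480b
```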
Six ways to call it

6 ways to use Qwen free.

BlockRun is the access layer. Pick the surface that matches how you build — terminal, notebook, IDE, agent runtime — and the same free models work everywhere.

  1. ClawRouter

     Drop ClawRouter into Cursor or Continue and get free Qwen3 Coder.

     ```shell
     # Install once
     npm install -g @blockrun/clawrouter

     # Then point any OpenAI-compatible client at the local proxy.
     # ClawRouter routes to nvidia/qwen3-coder-480b (or the cheapest capable model)
     # without changing your code.
     ```
  2. Claude Code MCP

     From Claude Code, call Qwen3 Coder for free as a second-opinion model.

     ```shell
     # Add the BlockRun MCP server (Claude Code, Cursor, or ChatGPT desktop)
     claude mcp add blockrun --transport http https://mcp.blockrun.ai/mcp

     # Then call from inside the editor:
     #   blockrun_chat(model="nvidia/qwen3-coder-480b", messages=[{role:"user", content:"…"}])
     ```
  3. cURL

     No key, no wallet; paste into any terminal.

     ```shell
     curl https://blockrun.ai/api/v1/chat/completions \
       -H "Content-Type: application/json" \
       -d '{
         "model": "nvidia/qwen3-next-80b-a3b-thinking",
         "messages": [{"role": "user", "content": "Refactor this loop into a list comprehension"}]
       }'
     ```
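If the endpoint follows the OpenAI streaming convention (the SDK examples on this page suggest it does), adding `"stream": true` to the request body returns server-sent `data:` chunks instead of one JSON object. A stdlib-only sketch of reassembling the streamed text — the sample chunks are illustrative, not a recorded response:

```python
import json

def extract_deltas(sse_lines):
    """Pull streamed text out of OpenAI-style 'data:' chunks."""
    out = []
    for line in sse_lines:
        if not line.startswith("data: "):
            continue
        payload = line[len("data: "):]
        if payload == "[DONE]":  # stream terminator sentinel
            break
        chunk = json.loads(payload)
        delta = chunk["choices"][0]["delta"].get("content", "")
        out.append(delta)
    return "".join(out)

# Illustrative chunks, not a recorded response
sample = [
    'data: {"choices": [{"delta": {"content": "[x*2 "}}]}',
    'data: {"choices": [{"delta": {"content": "for x in items if x > 0]"}}]}',
    "data: [DONE]",
]
print(extract_deltas(sample))  # [x*2 for x in items if x > 0]
```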
  4. Python SDK

     pip install blockrun-llm — or any OpenAI-compatible client.

     ```python
     # Works with the OpenAI SDK — no key required for free models
     from openai import OpenAI

     client = OpenAI(
         base_url="https://blockrun.ai/api/v1",
         api_key="not-needed-for-free-models",
     )

     response = client.chat.completions.create(
         model="nvidia/qwen3-next-80b-a3b-thinking",
         messages=[{"role": "user", "content": "Refactor this loop into a list comprehension"}],
     )
     print(response.choices[0].message.content)
     ```
  5. TypeScript SDK

     npm install @blockrun/llm — or any OpenAI-compatible client.

     ```typescript
     // Works with the OpenAI SDK — no key required for free models
     import OpenAI from "openai";

     const client = new OpenAI({
       baseURL: "https://blockrun.ai/api/v1",
       apiKey: "not-needed-for-free-models",
     });

     const r = await client.chat.completions.create({
       model: "nvidia/qwen3-coder-480b",
       messages: [{ role: "user", content: "Refactor this loop into a list comprehension" }],
     });
     console.log(r.choices[0].message.content);
     ```
  6. Franklin

     The AI agent with a wallet — free OSS models for routine tasks, paid models on demand.

     ```shell
     # Install Franklin
     curl -fsSL https://franklin.run/install | sh

     # Run with this model
     franklin chat --model nvidia/qwen3-next-80b-a3b-thinking "Summarize the README"
     ```
Trust / Defaults

We don't share your data.

Your prompt goes to the AI provider you picked. Nothing else, nowhere else. No training, no retention beyond the request, no profile linking.

We don't share your data
No training, no retention beyond the request. Your prompt is forwarded only to the AI provider you select.
No accounts, no KYC
Wallet in, prompt out. Pseudonymous by default — no email, no phone number, no identity documents.
Open-source SDKs, MIT
Read the code, audit the wire format, run it yourself. @blockrun/llm and blockrun-llm on npm and PyPI.
When free isn't enough

Want Claude, GPT-5, or Gemini too?

No subscription. No monthly minimum. Pay per call in USDC via x402 — same endpoint, same SDK, same model IDs. Connect a wallet, top up $5, and call any frontier model. No credit card.

FAQ

Everything you might be wondering.

Why two Qwen3 models?
Qwen3-Next 80B Thinking is the speed-tuned reasoning workhorse (116 tok/s, 3B active params). Qwen3 Coder 480B is code-specialised. Pick Thinking for general reasoning, Coder for IDE / refactor / code-gen tasks.
Is Qwen3 Coder good enough for Cursor / Continue / Claude Code?
Yes — 480B MoE with 35B active. It's competitive with closed-source code models on most refactoring and review tasks. Drop it in via ClawRouter or the BlockRun MCP server.
How does it compare to GPT-4 / Claude on code?
Frontier closed models are still ahead on the hardest novel problems, but Qwen3 Coder is dramatically cheaper (free, here) and matches them on most production refactor / explain / lint workflows.
Will my prompts be used to train Qwen?
NVIDIA's free tier (which hosts these models) reserves the right to use prompts for service improvement. Don't send proprietary code you wouldn't paste into a public Discord. For private inference, switch to paid models.
Does the Thinking model show its reasoning?
Yes — set reasoning_effort=high or use the OpenAI thinking_content extension. Reasoning tokens stream as they generate.
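The `reasoning_effort` setting above travels in the request body; with the OpenAI Python SDK, non-standard fields like this go through `extra_body`. A sketch of the request shape — the exact wire format of the thinking extension is an assumption, only the `reasoning_effort` name comes from the FAQ:

```python
import json

# Request body asking the Thinking model to expose its reasoning.
# "reasoning_effort" is taken from the FAQ above; the rest is the
# standard chat-completions shape.
body = {
    "model": "nvidia/qwen3-next-80b-a3b-thinking",
    "messages": [{"role": "user", "content": "Why is the sky blue?"}],
    "reasoning_effort": "high",
}

# With the OpenAI SDK, the same field would be passed as:
#   client.chat.completions.create(..., extra_body={"reasoning_effort": "high"})
print(json.dumps(body, indent=2))
```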