Free Mistral API.
No key. No subscription.
Mistral Small 4 119B (114 tok/s, fastest free chat). Plus Mistral Large 3 675B (the largest Mistral ever) and Devstral 2 123B for code. All free.
Try it now.
No API key. No wallet. No signup. Paste this into any terminal — the response streams back from Mistral hosted free on NVIDIA, routed through BlockRun.
curl https://blockrun.ai/api/v1/chat/completions \
-H "Content-Type: application/json" \
-d '{
"model": "nvidia/mistral-small-4-119b",
"messages": [{"role": "user", "content": "Write a one-paragraph bio for an open-source maintainer"}]
}'

| Model | Context | Price | Best for |
| --- | --- | --- | --- |
| Mistral Small 4 119B | 131K | free | coding |
| Mistral Large 3 675B | 131K | free | reasoning · coding |
| Devstral 2 123B | 131K | free | coding |
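The intro says the response streams back into the terminal. The endpoint is OpenAI-compatible, so streaming presumably means setting `"stream": true` and reading server-sent `data:` lines in the standard OpenAI chunk shape — that shape is an assumption here, not something this page documents. A minimal parser sketch:

```python
import json

def collect_stream(sse_lines):
    """Reassemble assistant text from OpenAI-style SSE chat chunks.

    Each line looks like 'data: {...json...}' and the stream ends with
    'data: [DONE]'. The chunk shape is assumed, not confirmed by BlockRun.
    """
    parts = []
    for line in sse_lines:
        line = line.strip()
        if not line.startswith("data:"):
            continue  # skip blank keep-alive lines and comments
        payload = line[len("data:"):].strip()
        if payload == "[DONE]":
            break
        chunk = json.loads(payload)
        delta = chunk["choices"][0].get("delta", {})
        parts.append(delta.get("content") or "")
    return "".join(parts)

# Two fabricated chunks, for illustration only:
sample = [
    'data: {"choices": [{"delta": {"content": "Hel"}}]}',
    'data: {"choices": [{"delta": {"content": "lo"}}]}',
    "data: [DONE]",
]
print(collect_stream(sample))  # Hello
```

In practice you would feed this the line iterator from your HTTP client instead of a hard-coded list.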
6 ways to use Mistral free.
BlockRun is the access layer. Pick the surface that matches how you build — terminal, notebook, IDE, agent runtime — and the same free models work everywhere.
- shell

curl https://blockrun.ai/api/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
    "model": "nvidia/mistral-small-4-119b",
    "messages": [{"role": "user", "content": "Write a one-paragraph bio for an open-source maintainer"}]
  }'

- python

# Works with the OpenAI SDK — no key required for free models
from openai import OpenAI

client = OpenAI(
    base_url="https://blockrun.ai/api/v1",
    api_key="not-needed-for-free-models",
)
response = client.chat.completions.create(
    model="nvidia/mistral-small-4-119b",
    messages=[{"role": "user", "content": "Write a one-paragraph bio for an open-source maintainer"}],
)
print(response.choices[0].message.content)

- 03
ClawRouter
smart router for OpenClaw / Claude Code — auto-picks free models when possible
Learn more →

# Install once
npm install -g @blockrun/clawrouter
# Then point any OpenAI-compatible client at the local proxy.
# ClawRouter routes to nvidia/mistral-small-4-119b (or the cheapest capable model)
# without changing your code.

- typescript
// Works with the OpenAI SDK — no key required for free models
import OpenAI from "openai";

const client = new OpenAI({
  baseURL: "https://blockrun.ai/api/v1",
  apiKey: "not-needed-for-free-models",
});
const r = await client.chat.completions.create({
  model: "nvidia/mistral-small-4-119b",
  messages: [{ role: "user", content: "Write a one-paragraph bio for an open-source maintainer" }],
});
console.log(r.choices[0].message.content);

- 05
Franklin
the AI agent with a wallet — free OSS models for routine tasks, paid models on demand
Learn more →

# Install Franklin
curl -fsSL https://franklin.run/install | sh
# Run with this model
franklin chat --model nvidia/mistral-small-4-119b "Summarize the README"

- 06
Claude Code MCP
8 tools for Claude Code, Cursor & ChatGPT — call any free model from inside your editor
Learn more →

# Add the BlockRun MCP server (Claude Code, Cursor, or ChatGPT desktop)
claude mcp add blockrun --transport http https://mcp.blockrun.ai/mcp
# Then call from inside the editor:
# blockrun_chat(model="nvidia/mistral-small-4-119b", messages=[{role:"user", content:"…"}])
We don't share your data.
Your prompt goes to the AI provider you picked. Nothing else, nowhere else. No training, no retention beyond the request, no profile linking.
- No training, no retention beyond the request. Your prompt is forwarded only to the AI provider you select.
- Wallet in, prompt out. Pseudonymous by default — no email, no phone number, no identity documents.
- Read the code, audit the wire format, run it yourself. @blockrun/llm and blockrun-llm on npm and PyPI.
Want Claude, GPT-5, or Gemini too?
No subscription. No monthly minimum. Pay per call in USDC via x402: same endpoint, same SDK, same model IDs. Connect a wallet, top up $5, and call any frontier model. No credit card.
Everything you might be wondering.
- Which Mistral should I pick?
- Default to Mistral Small 4 119B — it's our fastest free chat model at 114 tok/s. Use Mistral Large 3 675B when you need more raw capacity. Use Devstral 2 123B when the workload is code-specific.
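That guidance fits in a tiny routing helper. A sketch, with two caveats: the keyword heuristic is my own illustration, and since only `nvidia/mistral-small-4-119b` appears as a model ID on this page, the helper returns display names rather than inventing IDs for the other two models.

```python
def pick_mistral(task: str) -> str:
    """Map a task description to the free Mistral suggested in the FAQ.

    Keyword heuristic is illustrative only, not part of BlockRun's API.
    """
    t = task.lower()
    if any(k in t for k in ("code", "refactor", "debug", "test")):
        return "Devstral 2 123B"       # code-specific workloads
    if any(k in t for k in ("reason", "analyze", "plan", "long")):
        return "Mistral Large 3 675B"  # more raw capacity
    return "Mistral Small 4 119B"      # fastest default, 114 tok/s

print(pick_mistral("debug a flaky test"))  # Devstral 2 123B
```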
- Is the Mistral API free for production use?
- BlockRun routes free requests through NVIDIA's free tier with no per-call billing. Throughput is subject to NVIDIA's quota; for a guaranteed SLA, switch to paid models with a wallet.
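Since free throughput rides NVIDIA's quota, a production client should expect throttling. A minimal exponential-backoff sketch, assuming quota errors surface as HTTP 429 (the page doesn't say which status code the free tier returns); the `RateLimited` exception is a hypothetical stand-in:

```python
import time

class RateLimited(Exception):
    """Stand-in for an HTTP 429 from the free tier (assumed, not documented)."""

def with_backoff(call, retries=4, base_delay=1.0, sleep=time.sleep):
    """Retry `call` on RateLimited, doubling the delay between attempts."""
    for attempt in range(retries):
        try:
            return call()
        except RateLimited:
            if attempt == retries - 1:
                raise  # out of retries, surface the error
            sleep(base_delay * (2 ** attempt))

# Simulate a call that gets throttled twice, then succeeds:
attempts = []
def flaky():
    attempts.append(1)
    if len(attempts) < 3:
        raise RateLimited()
    return "ok"

result = with_backoff(flaky, sleep=lambda s: None)  # skip real sleeping in the demo
print(result)  # ok
```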
- Does Devstral support function calling?
- Yes — Devstral 2 emits OpenAI-compatible tool_calls. Same wire format as paid models, just route through the free endpoint.
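In the OpenAI wire format, a tool call arrives as a `tool_calls` array whose arguments are JSON-encoded strings. A minimal dispatch sketch against a fabricated assistant message — the shape shown is the standard OpenAI one, since this page doesn't include a Devstral sample response:

```python
import json

def run_tool_calls(message, tools):
    """Execute each tool_call in an assistant message via a local registry."""
    results = []
    for call in message.get("tool_calls", []):
        fn = call["function"]
        args = json.loads(fn["arguments"])  # arguments arrive JSON-encoded
        results.append(tools[fn["name"]](**args))
    return results

# Fabricated assistant message in the OpenAI tool_calls shape:
message = {
    "role": "assistant",
    "tool_calls": [{
        "id": "call_0",
        "type": "function",
        "function": {"name": "add", "arguments": '{"a": 2, "b": 3}'},
    }],
}
results = run_tool_calls(message, {"add": lambda a, b: a + b})
print(results)  # [5]
```

A real loop would append each result back as a `tool` role message and re-call the model.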
- What's the context window?
- 131K tokens for all three Mistral free models, with a 128 KB request-body cap on the free tier.
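The 128 KB body cap is easy to check client-side before sending. A sketch that measures the serialized request body; what the server returns when you exceed the cap isn't documented here, so checking up front is the safe move:

```python
import json

FREE_TIER_BODY_CAP = 128 * 1024  # bytes, per the free-tier limit above

def fits_free_tier(model, messages):
    """Return (ok, size_in_bytes) for the JSON body this request would send."""
    body = json.dumps({"model": model, "messages": messages})
    size = len(body.encode("utf-8"))
    return size <= FREE_TIER_BODY_CAP, size

ok, size = fits_free_tier(
    "nvidia/mistral-small-4-119b",
    [{"role": "user", "content": "hi"}],
)
print(ok)  # True for a tiny prompt
```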