BlockRun

Responses API

POST /v1/responses is a drop-in replacement for api.openai.com/v1/responses — the native protocol for Codex CLI and agents that need tools + reasoning together on GPT-5.x (a combination OpenAI rejects on /v1/chat/completions). Same body shape, same response bytes, same SSE event stream; you pay per request in USDC over x402 instead of with an API key.

If your client speaks Chat Completions, use Chat Completions instead — both endpoints serve the same models at the same prices.

Endpoint

POST https://blockrun.ai/api/v1/responses

Request

Headers

HeaderRequiredDescription
Content-TypeYesMust be application/json
PAYMENT-SIGNATUREConditionalBase64-encoded x402 payment payload (required after 402, x402 v2)

Body Parameters

ParameterTypeRequiredDescription
modelstringYesOpenAI model ID (e.g., gpt-5.5, openai/gpt-5.4-pro, gpt-5.3-codex)
inputstring | arrayYesA prompt string, or an array of Responses input items (messages, function_call_output, reasoning replays, …)
instructionsstringNoSystem/developer instructions
max_output_tokensintegerNoMaximum tokens to generate (bounds the x402 quote)
streambooleanNoStream native Responses SSE events (default: false)
toolsarrayNoResponses tool definitions — works together with reasoning on all GPT-5.x models
tool_choicestring | objectNoTool selection strategy
reasoningobjectNoReasoning config, e.g. {"effort": "high"}
includearrayNoExtra output fields, e.g. ["reasoning.encrypted_content"] for reasoning continuity across turns
textobjectNoOutput format options (json_object, json_schema, verbosity)
temperature / top_pnumberNoSampling parameters (legacy models only; reasoning models ignore/reject them upstream)

Stateless gateway — what is different from OpenAI

BlockRun's upstream calls run under one org key shared by all payers, so server-side state is disabled. store: false is enforced on every request, and these parameters are rejected with a 400:

ParameterWhy
store: trueResponses are never retained upstream
previous_response_idNo stored responses to reference — resend full context in input
conversationNo server-side conversation state
promptNo stored prompt templates
background: trueBackground jobs complete after the HTTP exchange — nothing to bill against

This is the same model Codex CLI uses by default (store: false + full-context resend). For reasoning continuity across turns, request include: ["reasoning.encrypted_content"] and replay the returned reasoning items in the next call's input.

Supported models

All paid OpenAI models: GPT-5.x (including -pro tiers), o-series, and the codex family. GET /v1/models lists current IDs and prices; the openai/ prefix is optional. Other providers (Claude, Gemini, DeepSeek, …) are served via Chat Completions and Anthropic Messages.

Payment flow

Identical to Chat Completions: send the request without payment, receive a 402 with a USDC quote (estimated input tokens + 10% of max_output_tokens, minimum $0.001), sign the x402 authorization, and resend with the PAYMENT-SIGNATURE header. The SDKs handle this automatically.

Examples

Non-streaming

curl https://blockrun.ai/api/v1/responses \
  -H "Content-Type: application/json" \
  -H "PAYMENT-SIGNATURE: <base64-x402-payload>" \
  -d '{
    "model": "gpt-5.2",
    "input": "Explain x402 in one sentence.",
    "max_output_tokens": 200
  }'

Tools + reasoning (the combination Chat Completions rejects)

curl https://blockrun.ai/api/v1/responses \
  -H "Content-Type: application/json" \
  -H "PAYMENT-SIGNATURE: <base64-x402-payload>" \
  -d '{
    "model": "gpt-5.4",
    "input": "What is the weather in SF? Use the tool.",
    "reasoning": {"effort": "high"},
    "tools": [{
      "type": "function",
      "name": "get_weather",
      "parameters": {
        "type": "object",
        "properties": {"city": {"type": "string"}},
        "required": ["city"]
      }
    }]
  }'

Streaming

Set "stream": true and consume native Responses SSE events (response.created, response.output_text.delta, …, response.completed). The stream is piped through byte-for-byte, so the official OpenAI SDK's event parsing works unchanged.

Response

The native OpenAI Responses object (or SSE event stream) is returned unmodified — id (resp_…), output array with reasoning / message / function_call items, and usage with input_tokens, output_tokens, input_tokens_details.cached_tokens, and output_tokens_details.reasoning_tokens. The X-Payment-Response header carries the settlement transaction hash on non-streaming calls.