Pay-per-request access to GPT-5.2, Claude 4, Gemini 2.5, Grok, and more via x402 micropayments.
BlockRun assumes Claude Code as the agent runtime.
| Chain | Network | Payment | Status |
|---|---|---|---|
| Base | Base Mainnet (Chain ID: 8453) | USDC | ✅ Primary |
| Base Testnet | Base Sepolia (Chain ID: 84532) | Testnet USDC | ✅ Development |
| XRPL | XRP Ledger Mainnet | RLUSD | ✅ New |
Protocol: x402 v2
```bash
pip install blockrun-llm
```

```python
from blockrun_llm import LLMClient

client = LLMClient()  # Uses BLOCKRUN_WALLET_KEY (never sent to server)
response = client.chat("openai/gpt-5.2", "Hello!")
```

That's it. The SDK handles the x402 payment automatically.
Let the SDK automatically pick the cheapest capable model for each request:
```python
from blockrun_llm import LLMClient

client = LLMClient()

# Auto-routes to cheapest capable model
result = client.smart_chat("What is 2+2?")
print(result.response)  # '4'
print(result.model)     # 'nvidia/kimi-k2.5' (cheap, fast)
print(f"Saved {result.routing.savings * 100:.0f}%")  # 'Saved 94%'

# Complex reasoning task -> routes to reasoning model
result = client.smart_chat("Prove the Riemann hypothesis step by step")
print(result.model)  # 'xai/grok-4-1-fast-reasoning'
```

| Profile | Description | Best For |
|---|---|---|
| `free` | `nvidia/gpt-oss-120b` only (FREE) | Testing, development |
| `eco` | Cheapest models per tier (DeepSeek, xAI) | Cost-sensitive production |
| `auto` | Best balance of cost/quality (default) | General use |
| `premium` | Top-tier models (OpenAI, Anthropic) | Quality-critical tasks |
```python
# Use premium models for complex tasks
result = client.smart_chat(
    "Write production-grade async Python code",
    routing_profile="premium"
)
print(result.model)  # 'anthropic/claude-opus-4.5'
```

ClawRouter uses a 14-dimension rule-based classifier to analyze each request:
- Token count - Short vs long prompts
- Code presence - Programming keywords
- Reasoning markers - "prove", "step by step", etc.
- Technical terms - Architecture, optimization, etc.
- Creative markers - Story, poem, brainstorm, etc.
- Agentic patterns - Multi-step, tool use indicators
The classifier runs in <1ms, 100% locally, and routes to one of four tiers:
| Tier | Example Tasks | Auto Profile Model |
|---|---|---|
| SIMPLE | "What is 2+2?", definitions | nvidia/kimi-k2.5 |
| MEDIUM | Code snippets, explanations | xai/grok-code-fast-1 |
| COMPLEX | Architecture, long documents | google/gemini-3-pro-preview |
| REASONING | Proofs, multi-step reasoning | xai/grok-4-1-fast-reasoning |
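For intuition, here is a toy sketch of keyword-and-length classification in this style. The markers and thresholds below are invented for illustration; they are not ClawRouter's actual 14 dimensions:

```python
import re

# Toy illustration only: real classifiers weigh many more signals
# (code presence, creative markers, agentic patterns, etc.).
REASONING_MARKERS = re.compile(r"\b(prove|derive|step by step|theorem)\b", re.I)
CODE_MARKERS = re.compile(r"\b(def|class|async|function|refactor|bug)\b", re.I)

def classify_tier(prompt: str) -> str:
    tokens = len(prompt.split())
    if REASONING_MARKERS.search(prompt):
        return "REASONING"
    if tokens > 400:          # long documents -> COMPLEX
        return "COMPLEX"
    if CODE_MARKERS.search(prompt):
        return "MEDIUM"
    return "SIMPLE"

assert classify_tier("What is 2+2?") == "SIMPLE"
assert classify_tier("Prove the Riemann hypothesis step by step") == "REASONING"
```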
1. You send a request to BlockRun's API
2. The API returns a `402 Payment Required` with the price
3. The SDK automatically signs a USDC payment on Base
4. The request is retried with the payment proof
5. You receive the AI response
Your private key never leaves your machine - it's only used for local signing.
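To make the flow concrete, here is a minimal sketch of steps 1-5 over raw HTTP. The endpoint path and the `sign_payment()` helper are hypothetical stand-ins; in practice the SDK does all of this for you, and the exact proof encoding (carried in the `X-PAYMENT` header per the x402 spec) is handled internally:

```python
import requests

API = "https://blockrun.ai/api/v1/chat/completions"  # hypothetical path, for illustration

def sign_payment(quote: dict) -> str:
    """Hypothetical stand-in for the SDK's local signer: it would sign the
    quoted USDC amount with BLOCKRUN_WALLET_KEY on your machine and encode
    the proof as the x402 spec requires."""
    raise NotImplementedError("use the SDK instead of hand-rolling payments")

body = {"model": "openai/gpt-5.2", "messages": [{"role": "user", "content": "Hello!"}]}

resp = requests.post(API, json=body)        # 1. send the request
if resp.status_code == 402:                 # 2. server quotes a price
    proof = sign_payment(resp.json())       # 3. sign USDC payment locally
    resp = requests.post(                   # 4. retry with the payment proof
        API, json=body, headers={"X-PAYMENT": proof}
    )
print(resp.json())                          # 5. the AI response
```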
| Model | Input Price | Output Price |
|---|---|---|
| `openai/gpt-5.2` | $1.75/M | $14.00/M |
| `openai/gpt-5-mini` | $0.25/M | $2.00/M |
| `openai/gpt-5-nano` | $0.05/M | $0.40/M |
| `openai/gpt-5.2-pro` | $21.00/M | $168.00/M |
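To estimate a request's cost from these per-million-token rates: cost = input_tokens / 1,000,000 × input price + output_tokens / 1,000,000 × output price. For example:

```python
# Worked example: one openai/gpt-5.2 call with 1,200 input / 800 output tokens
input_tokens, output_tokens = 1_200, 800
cost = (input_tokens / 1e6) * 1.75 + (output_tokens / 1e6) * 14.00
print(f"${cost:.4f}")  # -> $0.0133
```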
| Model | Input Price | Output Price |
|---|---|---|
| `openai/gpt-4.1` | $2.00/M | $8.00/M |
| `openai/gpt-4.1-mini` | $0.40/M | $1.60/M |
| `openai/gpt-4.1-nano` | $0.10/M | $0.40/M |
| `openai/gpt-4o` | $2.50/M | $10.00/M |
| `openai/gpt-4o-mini` | $0.15/M | $0.60/M |
| Model | Input Price | Output Price |
|---|---|---|
| `openai/o1` | $15.00/M | $60.00/M |
| `openai/o1-mini` | $1.10/M | $4.40/M |
| `openai/o3` | $2.00/M | $8.00/M |
| `openai/o3-mini` | $1.10/M | $4.40/M |
| `openai/o4-mini` | $1.10/M | $4.40/M |
| Model | Price |
|---|---|
| `openai/gpt-oss-20b` | $0.001/request |
| `openai/gpt-oss-120b` | $0.002/request |
Testnet models use flat pricing (no token counting) for simplicity.
| Model | Input Price | Output Price |
|---|---|---|
| `anthropic/claude-opus-4.5` | $5.00/M | $25.00/M |
| `anthropic/claude-opus-4` | $15.00/M | $75.00/M |
| `anthropic/claude-sonnet-4` | $3.00/M | $15.00/M |
| `anthropic/claude-haiku-4.5` | $1.00/M | $5.00/M |
| Model | Input Price | Output Price |
|---|---|---|
| `google/gemini-3-pro-preview` | $2.00/M | $12.00/M |
| `google/gemini-2.5-pro` | $1.25/M | $10.00/M |
| `google/gemini-2.5-flash` | $0.15/M | $0.60/M |
| Model | Input Price | Output Price |
|---|---|---|
| `deepseek/deepseek-chat` | $0.28/M | $0.42/M |
| `deepseek/deepseek-reasoner` | $0.28/M | $0.42/M |
| Model | Input Price | Output Price | Context | Notes |
|---|---|---|---|---|
| `xai/grok-3` | $3.00/M | $15.00/M | 131K | Flagship |
| `xai/grok-3-fast` | $5.00/M | $25.00/M | 131K | Tool calling optimized |
| `xai/grok-3-mini` | $0.30/M | $0.50/M | 131K | Fast & affordable |
| `xai/grok-4-1-fast-reasoning` | $0.20/M | $0.50/M | 2M | Latest, chain-of-thought |
| `xai/grok-4-1-fast-non-reasoning` | $0.20/M | $0.50/M | 2M | Latest, direct response |
| `xai/grok-4-fast-reasoning` | $0.20/M | $0.50/M | 2M | Step-by-step reasoning |
| `xai/grok-4-fast-non-reasoning` | $0.20/M | $0.50/M | 2M | Quick responses |
| `xai/grok-code-fast-1` | $0.20/M | $1.50/M | 256K | Code generation |
| `xai/grok-4-0709` | $0.20/M | $1.50/M | 256K | Premium quality |
| `xai/grok-2-vision` | $2.00/M | $10.00/M | 32K | Vision capabilities |
| Model | Input Price | Output Price |
|---|---|---|
| `moonshot/kimi-k2.5` | $0.50/M | $2.40/M |
| Model | Input Price | Output Price | Notes |
|---|---|---|---|
| `nvidia/gpt-oss-120b` | FREE | FREE | OpenAI open-weight 120B (Apache 2.0) |
| `nvidia/kimi-k2.5` | $0.55/M | $2.50/M | Moonshot 1T MoE with vision |
All models below have been tested end-to-end via the Python SDK (Feb 2026):
| Provider | Model | Status |
|---|---|---|
| OpenAI | `openai/gpt-4o-mini` | Passed |
| Anthropic | `anthropic/claude-sonnet-4` | Passed |
| Google | `google/gemini-2.5-flash` | Passed |
| DeepSeek | `deepseek/deepseek-chat` | Passed |
| xAI | `xai/grok-3-fast` | Passed |
| Moonshot | `moonshot/kimi-k2.5` | Passed |
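A minimal script in the same spirit (a sketch, not the project's actual verification suite) loops over those models with the documented `chat()` call; note that each iteration spends real USDC:

```python
from blockrun_llm import LLMClient

VERIFIED = [
    "openai/gpt-4o-mini",
    "anthropic/claude-sonnet-4",
    "google/gemini-2.5-flash",
    "deepseek/deepseek-chat",
    "xai/grok-3-fast",
    "moonshot/kimi-k2.5",
]

client = LLMClient()  # requires a funded wallet; each call costs real USDC
for model in VERIFIED:
    reply = client.chat(model, "Reply with the single word: ok")
    print(f"{model}: {reply[:40]!r}")
```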
| Model | Price |
|---|---|
| `openai/dall-e-3` | $0.04-0.08/image |
| `openai/gpt-image-1` | $0.02-0.04/image |
| `black-forest/flux-1.1-pro` | $0.04/image |
| `google/nano-banana` | $0.05/image |
| `google/nano-banana-pro` | $0.10-0.15/image |
```python
from blockrun_llm import LLMClient

client = LLMClient()  # Uses BLOCKRUN_WALLET_KEY (never sent to server)
response = client.chat("openai/gpt-5.2", "Explain quantum computing")
print(response)

# With system prompt
response = client.chat(
    "anthropic/claude-sonnet-4",
    "Write a haiku",
    system="You are a creative poet."
)
```

Note: Live Search can take 30-120+ seconds as it searches multiple sources. The SDK automatically uses a 5-minute timeout for search requests.
```python
from blockrun_llm import LLMClient

client = LLMClient()

# Simple: Enable live search with search=True (default 10 sources, ~$0.26)
response = client.chat(
    "xai/grok-3",
    "What are the latest posts from @blockrunai?",
    search=True
)
print(response)

# Custom: Limit sources to reduce cost (5 sources, ~$0.13)
response = client.chat(
    "xai/grok-3",
    "What's trending on X?",
    search_parameters={"mode": "on", "max_search_results": 5}
)

# Custom timeout (if 5 min isn't enough)
client = LLMClient(search_timeout=600.0)  # 10 minutes
```
Check cumulative spending at any time:

```python
from blockrun_llm import LLMClient

client = LLMClient()
response = client.chat("openai/gpt-5.2", "Explain quantum computing")
print(response)

# Check how much was spent
spending = client.get_spending()
print(f"Spent ${spending['total_usd']:.4f} across {spending['calls']} calls")
```
For full control over the conversation, use the OpenAI-style messages format:

```python
from blockrun_llm import LLMClient

client = LLMClient()  # Uses BLOCKRUN_WALLET_KEY (never sent to server)

messages = [
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "How do I read a file in Python?"}
]

result = client.chat_completion("openai/gpt-5.2", messages)
print(result.choices[0].message.content)
```
The async client supports concurrent requests:

```python
import asyncio
from blockrun_llm import AsyncLLMClient

async def main():
    async with AsyncLLMClient() as client:
        # Simple chat
        response = await client.chat("openai/gpt-5.2", "Hello!")
        print(response)

        # Multiple requests concurrently
        tasks = [
            client.chat("openai/gpt-5.2", "What is 2+2?"),
            client.chat("anthropic/claude-sonnet-4", "What is 3+3?"),
            client.chat("google/gemini-2.5-flash", "What is 4+4?"),
        ]
        responses = await asyncio.gather(*tasks)
        for r in responses:
            print(r)

asyncio.run(main())
```
List available models and current prices:

```python
from blockrun_llm import LLMClient

client = LLMClient()
models = client.list_models()
for model in models:
    print(f"{model['id']}: ${model['inputPrice']}/M input, ${model['outputPrice']}/M output")
```
For development and testing without real USDC, use the testnet:

```python
from blockrun_llm import testnet_client

# Create testnet client (uses Base Sepolia)
client = testnet_client()  # Uses BLOCKRUN_WALLET_KEY

# Chat with testnet model
response = client.chat("openai/gpt-oss-20b", "Hello!")
print(response)

# Check testnet USDC balance
balance = client.get_balance()
print(f"Testnet USDC: ${balance:.4f}")
```

To get set up:

- Get testnet ETH from the Alchemy Base Sepolia Faucet
- Get testnet USDC from the Circle USDC Faucet
- Set your wallet key:

```bash
export BLOCKRUN_WALLET_KEY=0x...
```

Available testnet models:

- `openai/gpt-oss-20b` - $0.001/request (flat price)
- `openai/gpt-oss-120b` - $0.002/request (flat price)
```python
from blockrun_llm import LLMClient

# Or configure manually
client = LLMClient(api_url="https://testnet.blockrun.ai/api")
response = client.chat("openai/gpt-oss-20b", "Hello!")
```

BlockRun now supports payments with RLUSD on the XRP Ledger. Same models, same API - just a different payment rail.
```python
from blockrun_llm import xrpl_client

# Create XRPL client (pays with RLUSD)
client = xrpl_client()  # Uses BLOCKRUN_WALLET_KEY

# Chat with any model
response = client.chat("openai/gpt-4o", "Hello!")
print(response)

# Check RLUSD balance
balance = client.get_balance()
print(f"RLUSD: ${balance:.4f}")
```
An async variant is available as well:

```python
import asyncio
from blockrun_llm import async_xrpl_client

async def main():
    async with async_xrpl_client() as client:
        response = await client.chat("openai/gpt-4o", "Hello!")
        print(response)

asyncio.run(main())
```
```python
from blockrun_llm import LLMClient

# Or configure manually
client = LLMClient(api_url="https://xrpl.blockrun.ai/api")
response = client.chat("openai/gpt-4o", "Hello!")
```

| Variable | Description | Required |
|---|---|---|
| `BLOCKRUN_WALLET_KEY` | Your Base chain wallet private key | Yes (or pass to constructor) |
| `BLOCKRUN_API_URL` | API endpoint | No (default: `https://blockrun.ai/api`) |
- Create a wallet on Base network (Coinbase Wallet, MetaMask, etc.)
- Get some ETH on Base for gas (small amount, ~$1)
- Get USDC on Base for API payments
- Export your private key and set it as `BLOCKRUN_WALLET_KEY`

```bash
# .env file
BLOCKRUN_WALLET_KEY=0x...your_private_key_here
```
```python
from blockrun_llm import LLMClient, APIError, PaymentError

client = LLMClient()

try:
    response = client.chat("openai/gpt-5.2", "Hello!")
except PaymentError as e:
    print(f"Payment failed: {e}")
    # Check your USDC balance
except APIError as e:
    print(f"API error ({e.status_code}): {e}")
```

Unit tests do not require API access or funded wallets:
```bash
pytest tests/unit        # Run unit tests only
pytest tests/unit --cov  # Run with coverage report
pytest tests/unit -v     # Verbose output
```

Integration tests call the production API and require:
- A funded Base wallet with USDC ($1+ recommended)
- `BLOCKRUN_WALLET_KEY` environment variable set
- Estimated cost: ~$0.05 per test run
```bash
export BLOCKRUN_WALLET_KEY=0x...
pytest tests/integration  # Run integration tests only
pytest                    # Run all tests
```

Integration tests are automatically skipped if `BLOCKRUN_WALLET_KEY` is not set.
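That skip behavior matches the standard pytest pattern below; this is a generic sketch of how such a guard is typically written, not necessarily the project's exact implementation:

```python
import os
import pytest

# Skip any decorated test when no wallet key is configured
requires_wallet = pytest.mark.skipif(
    not os.getenv("BLOCKRUN_WALLET_KEY"),
    reason="BLOCKRUN_WALLET_KEY not set; skipping paid integration test",
)

@requires_wallet
def test_chat_roundtrip():
    from blockrun_llm import LLMClient
    assert LLMClient().chat("openai/gpt-4o-mini", "Say ok")
```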
- Private key stays local: Your key is only used for signing on your machine
- No custody: BlockRun never holds your funds
- Verify transactions: All payments are on-chain and verifiable
Private Key Management:
- Use environment variables, never hard-code keys
- Use dedicated wallets for API payments (separate from main holdings)
- Set spending limits by only funding payment wallets with small amounts
- Never commit `.env` files to version control
- Rotate keys periodically
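A common way to follow these rules is to load the key from a gitignored `.env` file. This sketch assumes the optional python-dotenv package, which is not part of the SDK:

```python
import os
from dotenv import load_dotenv  # pip install python-dotenv (assumed, not bundled)
from blockrun_llm import LLMClient

load_dotenv()  # loads .env from the working directory; keep .env gitignored
if not os.getenv("BLOCKRUN_WALLET_KEY"):
    raise RuntimeError("BLOCKRUN_WALLET_KEY is not configured")

client = LLMClient()  # reads the key from the environment; never hard-code it
```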
Input Validation: The SDK validates all inputs before API requests:
- Private keys (format, length, valid hex)
- API URLs (HTTPS required for production, HTTP allowed for localhost)
- Model names and parameters (ranges for `max_tokens`, `temperature`, `top_p`)
Error Sanitization: API errors are automatically sanitized to prevent sensitive information leaks.
Monitoring:
```python
address = client.get_wallet_address()
print(f"View transactions: https://basescan.org/address/{address}")
```

Keep Updated:

```bash
pip install --upgrade blockrun-llm  # Get security patches
```

MIT