OpenClaw spending control has three layers: provider-level monthly caps (Anthropic/OpenAI dashboards), runtime token limits (openclaw.json config), and proxy-level enforcement (real-time daily caps + pattern detection). Most developers only use the first two and wonder why they still get bill shock. You need all three. This guide shows you exactly how to set up each layer, what each one catches, and where the gaps are.
If you are running OpenClaw agents in 2026, you are probably spending somewhere between $50 and $500 per month on LLM API calls. That range is enormous, and the difference usually comes down to one thing: whether you have proper budget controls in place.
The problem is not that budget controls do not exist. The problem is that they exist at three different levels, each with different strengths and blind spots. Most developers set up one layer, assume they are covered, and then get surprised by a $100+ bill from an overnight session.
This guide covers all three layers in detail. By the end, you will know exactly what each one does, what it misses, and how to configure them to work together.
Think of budget protection as a layered defense. No single layer catches everything, but together they cover virtually every failure mode.
1. **Provider-level caps:** monthly spending limits set in the Anthropic, OpenAI, or Google Cloud dashboards
2. **Runtime limits:** settings in openclaw.json (maxTokens, context window ceilings)
3. **Proxy-level enforcement:** real-time daily caps and pattern detection sitting between OpenClaw and the providers

Let me walk through each one.
## Layer 1: Provider-Level Monthly Caps
Every major LLM provider lets you set a spending limit on your account. This is the most basic form of budget control and the one most developers set up first.
Anthropic's usage limits are configured in the Anthropic Console under Settings → Limits. You can set a monthly hard cap in dollars. When reached, all API calls return a 429 error until the next billing cycle.
The console is at console.anthropic.com.

Current pricing reference (March 2026): Claude Sonnet 4 runs $3/M input tokens and $15/M output tokens. Claude Haiku 3.5 runs $0.80/M input and $4/M output. Claude Opus 4 runs $5/M input and $25/M output.
OpenAI lets you set monthly budget limits in the API Settings dashboard. You can set both a hard cap (requests fail after this) and a soft cap (email notification only).
The dashboard is at platform.openai.com.

Current pricing reference: GPT-4o runs $2.50/M input and $10/M output. GPT-4o-mini runs $0.15/M input and $0.60/M output. o3-mini runs $1.10/M input and $4.40/M output.
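To make these per-million-token rates concrete, here is a small sketch that turns them into a per-request dollar cost. The model names and token counts are illustrative, and the rates are the ones quoted above:

```python
# Per-million-token rates from the pricing reference above (USD).
RATES = {
    "claude-sonnet-4": {"input": 3.00, "output": 15.00},
    "gpt-4o-mini": {"input": 0.15, "output": 0.60},
}

def request_cost(model: str, input_tokens: int, output_tokens: int) -> float:
    """Dollar cost of one request: tokens / 1M * rate, summed for both directions."""
    r = RATES[model]
    return input_tokens / 1e6 * r["input"] + output_tokens / 1e6 * r["output"]

# A typical coding-agent call: 20K tokens of context in, 2K tokens out.
sonnet_cost = request_cost("claude-sonnet-4", 20_000, 2_000)  # $0.06 in + $0.03 out = $0.09
mini_cost = request_cost("gpt-4o-mini", 20_000, 2_000)        # $0.003 in + $0.0012 out
```

The same call costs roughly 20x more on Sonnet than on GPT-4o-mini, which is why model choice matters as much as any cap.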
Google Cloud uses billing budgets that can be configured in the Cloud Console. These are more complex than Anthropic or OpenAI because they integrate with Google Cloud's broader billing infrastructure.
Important: Google Cloud budgets are alerts by default, not hard caps. You need additional automation to actually stop spending when the budget is reached. This is a common gotcha.
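Google's documented pattern for turning a budget alert into a hard stop is to route budget notifications to Pub/Sub and have a Cloud Function react to them. Here is a minimal sketch of the decision logic only; the field names `costAmount` and `budgetAmount` follow the budget notification format, and the actual billing-disable API call is deliberately left out:

```python
import base64
import json

def should_stop_spending(pubsub_message: dict) -> bool:
    """Decide whether to cut billing, given a budget notification from Pub/Sub.

    Budget notifications carry a base64-encoded JSON body with the current
    costAmount and the configured budgetAmount.
    """
    payload = json.loads(base64.b64decode(pubsub_message["data"]))
    return payload["costAmount"] >= payload["budgetAmount"]

# Simulated notification: $101.50 spent against a $100 budget.
msg = {"data": base64.b64encode(json.dumps(
    {"costAmount": 101.50, "budgetAmount": 100.00}).encode())}
stop = should_stop_spending(msg)  # True: your automation would now disable billing
```

The point is that none of this is automatic: without the extra function, a Google Cloud budget is a notification, not a brake.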
Provider caps have three significant blind spots:

1. **Monthly granularity.** A runaway agent can burn through most of a month's budget in a single overnight session without ever touching the cap.
2. **No pattern awareness.** Loops and heartbeat calls are billed at full price; the provider has no way to know the requests are waste.
3. **Per-provider silos.** Each provider only sees its own spending, so multi-provider setups have no unified budget.
## Layer 2: Runtime Token Limits (openclaw.json)
OpenClaw supports per-task token limits in its configuration file. These control how many tokens a single request or session can use.
```json
{
  "models": {
    "maxTokens": 4096,
    "contextWindow": 128000,
    "providers": [{
      "name": "anthropic",
      "apiKey": "sk-ant-...",
      "models": ["claude-sonnet-4-6"]
    }]
  }
}
```
maxTokens limits the output length per API call. Setting this to 4096 means the model will generate at most 4,096 output tokens per request. At Claude Sonnet pricing, that caps individual response cost at roughly $0.06.
contextWindow limits how much conversation history is sent per request. A lower context window means less input token cost per call, but also less context for the model to work with. The tradeoff here is quality vs. cost.
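The $0.06 figure follows directly from the rates: in the worst case, every one of the 4,096 allowed output tokens is generated. A quick check, which also shows why a per-request cap alone is not enough:

```python
MAX_TOKENS = 4096
SONNET_OUTPUT_RATE = 15.00  # USD per million output tokens

# Worst case for a single response: every allowed output token is generated.
per_response_cap = MAX_TOKENS / 1e6 * SONNET_OUTPUT_RATE  # ~$0.0614

# Per-request caps say nothing about volume: 2,000 such requests
# overnight still adds up to real money.
overnight = 2_000 * per_response_cap  # ~$123
```

Note this counts output tokens only; input tokens for the context window are billed on top of it.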
To set this up:

1. Open ~/.openclaw/openclaw.json in your editor
2. Set maxTokens to 4096 (a good default for most coding tasks)
3. Set contextWindow to 64000 instead of 128000 if you are cost-sensitive; this roughly halves input token costs per request at the expense of some context

Runtime limits control the size of individual requests, but they have critical gaps:
- Per-request caps are not cumulative caps. maxTokens: 4096 caps each request at ~$0.06, but if the agent makes 2,000 requests overnight, that is still $120. Per-request limits say nothing about cumulative spending.
- No pattern detection. A loop of identical failing requests stays within every per-request limit while burning money continuously.

## Layer 3: Proxy-Level Enforcement
This is the layer most developers are missing. A proxy-level cap sits between OpenClaw and the LLM API, intercepting every request in real time. It tracks cumulative spending, detects patterns, and enforces hard limits — not per-request, not per-month, but per-day.
ClawCap is a lightweight local proxy. Every API call from OpenClaw passes through it before reaching Anthropic, OpenAI, or any other provider. The proxy adds negligible overhead per request — a few milliseconds compared to the 500-3000ms of typical LLM response times.
Daily spending caps. You set a dollar amount per day (e.g., $5). When cumulative spending for the current day reaches that limit, all requests return 429. This is the single most important control for preventing overnight bill shock.
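The enforcement logic itself is simple to reason about. Here is a minimal sketch of a daily cap; the names and structure are illustrative, not ClawCap's actual internals:

```python
import datetime

class DailyCap:
    """Track cumulative spend per calendar day; refuse requests past the cap."""

    def __init__(self, cap_usd: float):
        self.cap_usd = cap_usd
        self.day = datetime.date.today()
        self.spent = 0.0

    def _roll_day(self):
        today = datetime.date.today()
        if today != self.day:  # cap resets at midnight
            self.day = today
            self.spent = 0.0

    def record(self, cost_usd: float) -> int:
        """Record a request's cost; return an HTTP status for the caller."""
        self._roll_day()
        if self.spent >= self.cap_usd:
            return 429  # cap reached: reject before forwarding upstream
        self.spent += cost_usd
        return 200

cap = DailyCap(5.00)
statuses = [cap.record(0.75) for _ in range(8)]  # 8 x $0.75 = $6 attempted
```

With a $5 cap, the first seven $0.75 requests pass and the eighth is rejected. The key property is that the check happens before the request reaches the provider, so a rejected request costs nothing.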
Loop detection. The proxy analyzes request patterns in a sliding window. If it detects a cluster of substantially similar requests within a short time period, it flags a loop and blocks further repetitions. This catches the "agent stuck on the same error" pattern within minutes instead of hours.
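One way to sketch that sliding-window check, as a toy version under my own assumptions rather than ClawCap's published heuristics: normalize each prompt, keep recent timestamps per fingerprint, and flag when too many near-identical requests land within the window.

```python
import collections
import hashlib

class LoopDetector:
    """Flag a prompt fingerprint seen `threshold` times within `window_s` seconds."""

    def __init__(self, window_s: float = 120.0, threshold: int = 5):
        self.window_s = window_s
        self.threshold = threshold
        self.seen = collections.defaultdict(list)  # fingerprint -> timestamps

    @staticmethod
    def fingerprint(prompt: str) -> str:
        # Crude normalization: collapse whitespace and case so retries of the
        # same failing request hash identically.
        return hashlib.sha256(" ".join(prompt.lower().split()).encode()).hexdigest()

    def check(self, prompt: str, now: float) -> bool:
        fp = self.fingerprint(prompt)
        times = [t for t in self.seen[fp] if now - t <= self.window_s]
        times.append(now)
        self.seen[fp] = times
        return len(times) >= self.threshold  # True means this looks like a loop

det = LoopDetector(window_s=60, threshold=3)
hits = [det.check("fix the failing test in auth.py", t) for t in (0, 5, 10)]
```

A real implementation would use fuzzier similarity than an exact hash, but the shape is the same: the third near-identical request inside the window trips the detector, minutes into the loop rather than hours.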
Heartbeat detection. Periodic low-value maintenance requests at regular intervals are identified as heartbeat calls. The proxy can block or reroute these to cheaper models, preventing the slow bleed that accounts for 40-60% of overnight waste.
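The heartbeat signature is the combination of two properties: tiny requests, arriving at near-constant intervals. A hedged sketch of that heuristic, with thresholds that are my own illustrative choices:

```python
def looks_like_heartbeat(events, max_tokens=50, jitter_s=2.0):
    """Heuristic: small requests arriving at near-constant intervals.

    `events` is a list of (timestamp_s, token_count) tuples, oldest first.
    """
    if len(events) < 4:
        return False
    if any(tokens > max_tokens for _, tokens in events):
        return False  # real work, not a keep-alive ping
    gaps = [b[0] - a[0] for a, b in zip(events, events[1:])]
    mean = sum(gaps) / len(gaps)
    # Near-constant spacing (within the jitter tolerance) is the signature.
    return all(abs(g - mean) <= jitter_s for g in gaps)

pings = [(0, 12), (30, 11), (60, 13), (90, 12), (120, 11)]  # every 30s, tiny
work = [(0, 800), (7, 1200), (40, 950), (40.5, 400)]        # bursty, large
```

Once a stream is classified as heartbeat traffic, blocking it or rerouting it to a cheap model is a pure win: the requests carry almost no information either way.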
Cross-provider tracking. Because the proxy sits in front of all providers, it tracks spending in a single database. An agent using Claude for complex tasks and GPT-4o-mini for simple tasks has one unified budget, not two separate ones.
```bash
# Install globally
npm install -g clawcap

# Interactive setup — sets daily cap, API keys, optional Telegram
clawcap init

# Start the proxy
clawcap start

# Verify it is running
clawcap status
```
During clawcap init, you will be prompted for your daily cap amount, your provider API keys, and an optional Telegram integration for alerts.
Then update your OpenClaw configuration to route through the proxy:
```jsonc
// ~/.openclaw/openclaw.json
{
  "models": {
    "providers": [{
      "name": "anthropic",
      "baseUrl": "http://localhost:PORT"
    }, {
      "name": "openai",
      "baseUrl": "http://localhost:PORT"
    }]
  }
}
```
That is the complete setup. Every API call from OpenClaw now passes through ClawCap's enforcement layer.
This table shows what each layer catches and what it misses. The key insight is that no single layer covers everything.
| Capability | Layer 1: Provider | Layer 2: Runtime | Layer 3: Proxy |
|---|---|---|---|
| Monthly spending cap | Yes | No | Yes |
| Daily spending cap | No | No | Yes |
| Per-request token limit | No | Yes | Partial |
| Loop detection | No | No | Yes |
| Heartbeat detection | No | No | Yes |
| Cross-provider tracking | No | No | Yes |
| Real-time alerts | Email only | No | Telegram/push |
| Remote kill switch | No | No | Yes |
| Context window control | No | Yes | No |
| Setup method | Dashboard UI | JSON config | 2-minute CLI setup |
| Works when proxy is down | Yes | Yes | No |
Notice that "Works when proxy is down" is a No for Layer 3. This is why you need Layer 1 as a backstop. If ClawCap crashes or your machine reboots and the proxy does not restart, provider-level caps are your safety net. Always have both.
Here is a concrete example for a developer spending roughly $150/month on OpenClaw with Claude Sonnet as the primary model and GPT-4o-mini as a secondary model for simple tasks.
- Layer 1: provider monthly caps totaling $150 across Anthropic and OpenAI
- Layer 2: maxTokens: 4096 (caps per-response cost at ~$0.06 for Sonnet) and contextWindow: 64000 (halves input cost vs. the 128K default)
- Layer 3: a ClawCap daily cap of $8, with loop detection and Telegram alerts enabled

With this configuration, the worst-case daily loss is $8. The worst-case monthly loss is $150. Loop patterns are caught in under 3 minutes. And you get a phone notification well before hitting any limit.
Understanding the failure modes helps you configure the layers to complement each other.
Layer 1 triggers (provider monthly cap): All requests to that provider return 429. OpenClaw surfaces the error and stops. Other providers still work. Recovery: wait for the next billing cycle, or increase the limit in the provider dashboard.
Layer 2 triggers (token limit): The model's response is truncated at the maxTokens limit. This is not an error — the response just ends. The agent may retry with a different approach, or it may not realize the response was truncated. This layer degrades gracefully but can cause subtle bugs.
Layer 3 triggers (proxy daily cap): ClawCap returns 429 with a clear error body: {"error": {"type": "cap_reached", "message": "Daily cap of $8.00 reached ($8.02 spent today)"}}. OpenClaw sees the error and stops. Recovery: wait until tomorrow (cap resets at midnight), or manually resume with clawcap resume or via Telegram.
Layer 3 triggers (loop detected): ClawCap returns 429 with a loop_detected error type and a message describing the pattern. The agent is paused, not killed — you can review what happened and resume if the loop was a false positive.
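These structured error bodies are easy to branch on from the client side. A sketch of how a wrapper script might react; the error shapes match the examples above, while the handling policy is illustrative:

```python
import json

def handle_proxy_rejection(status: int, body: str) -> str:
    """Map a proxy 429 body to an action for the calling script."""
    if status != 429:
        return "proceed"
    err = json.loads(body).get("error", {})
    if err.get("type") == "cap_reached":
        return "stop_until_tomorrow"  # daily budget exhausted
    if err.get("type") == "loop_detected":
        return "pause_for_review"     # paused, not killed: a human decides
    return "backoff_and_retry"        # generic rate limiting

action = handle_proxy_rejection(
    429, '{"error": {"type": "cap_reached", '
         '"message": "Daily cap of $8.00 reached ($8.02 spent today)"}}')
```

Distinguishing the two 429 types matters: a cap rejection should halt the session, while a loop flag is worth a human look before resuming.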
Based on real usage data from developers running OpenClaw with various configurations:
| Configuration | Average monthly spend | Worst single-day spike | Waste % |
|---|---|---|---|
| No caps at all | $320 | $103 | ~45% |
| Layer 1 only (provider cap) | $200 | $87 | ~35% |
| Layers 1 + 2 (provider + runtime) | $175 | $62 | ~28% |
| All three layers | $130 | $8 | ~5% |
The biggest jump comes from adding Layer 3. Going from Layers 1+2 to all three layers reduces the worst single-day spike from $62 to $8 — an 87% reduction. It also cuts waste from 28% to 5%, because loops and heartbeats are caught in real time instead of being billed at full price.
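Those percentages follow directly from the table:

```python
spike_before, spike_after = 62, 8  # worst single-day spike, USD
spike_reduction = (spike_before - spike_after) / spike_before * 100  # ~87%

spend_before, spend_after = 175, 130  # average monthly spend, USD
monthly_savings = spend_before - spend_after  # $45/month
```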
Many developers configure OpenClaw with multiple providers for cost optimization: Claude Sonnet for complex reasoning, GPT-4o-mini for simple tasks, and maybe DeepSeek for bulk processing. This is smart, but it creates a budget visibility problem.
With only Layer 1 (provider caps), each provider tracks spending independently. You might set $100 on Anthropic and $50 on OpenAI, intending to spend $150 total. But if the agent model-hops aggressively, you could hit $90 on Anthropic and $45 on OpenAI in the same day — $135 total, under each provider's limit but way over your daily budget.
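The numbers in that scenario make the gap explicit: each provider sees spending below its own cap, so neither trips, while the combined daily burn is far past any sane daily budget.

```python
caps = {"anthropic": 100, "openai": 50}      # per-provider monthly caps, USD
day_spend = {"anthropic": 90, "openai": 45}  # one bad day

per_provider_ok = all(day_spend[p] < caps[p] for p in caps)  # no cap trips
total = sum(day_spend.values())  # $135 burned in a single day
daily_budget = 8
over_budget = total > daily_budget  # past the daily budget by roughly 17x
```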
Layer 3 solves this with unified tracking. ClawCap calculates the dollar cost of every request regardless of which provider handles it, using its built-in pricing table covering 60+ models. Your $8 daily cap is $8 total across all providers.
The pricing table covers Anthropic (Claude Opus, Sonnet, Haiku), OpenAI (GPT-4o, GPT-4o-mini, o3-mini, o3), Google (Gemini 2.5 Pro, Flash), xAI (Grok), DeepSeek, Mistral, Groq, MiniMax, and Moonshot/Kimi. Token costs are calculated using each model's specific per-million-token rates.
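Unified tracking reduces to one idea: convert every request to dollars at its model's rate, then add it to a single counter. A sketch of that shape, using the rates quoted earlier; the table excerpt is illustrative, not ClawCap's actual data:

```python
# USD per million tokens (input, output), per model — a tiny excerpt of
# the kind of pricing table a cross-provider proxy needs.
PRICING = {
    "claude-sonnet-4": (3.00, 15.00),
    "gpt-4o-mini": (0.15, 0.60),
    "claude-haiku-3.5": (0.80, 4.00),
}

class UnifiedLedger:
    """One dollar counter, regardless of which provider served the request."""

    def __init__(self):
        self.total_usd = 0.0

    def add(self, model: str, input_tokens: int, output_tokens: int) -> float:
        in_rate, out_rate = PRICING[model]
        cost = input_tokens / 1e6 * in_rate + output_tokens / 1e6 * out_rate
        self.total_usd += cost
        return cost

ledger = UnifiedLedger()
ledger.add("claude-sonnet-4", 10_000, 2_000)  # Anthropic request
ledger.add("gpt-4o-mini", 10_000, 2_000)      # OpenAI request, same ledger
```

Because both requests land in the same `total_usd`, a single daily cap can be checked against it no matter how aggressively the agent model-hops.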
These are guidelines based on typical usage patterns. Adjust based on your actual workload.
| Use Case | Daily Cap | Monthly Cap | Recommended Model |
|---|---|---|---|
| Hobby / learning | $2-3 | $30 | Claude Haiku / GPT-4o-mini |
| Individual developer | $5-10 | $100-150 | Claude Sonnet / GPT-4o |
| Power user / full-time | $15-25 | $300 | Claude Sonnet + Haiku |
| Team (5 devs) | $50-80 | $800 | Mixed (per-agent caps) |
For most individual developers, a $5-10 daily cap with Claude Sonnet provides 1-3 hours of active agent work per day. That is enough for most workflows. If you consistently hit the cap before finishing your tasks, bump it up in $5 increments.
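To sanity-check the "1-3 hours" figure against your own workload, estimate your burn rate. The request cost and frequency below are assumptions for an active Sonnet session; substitute your own measurements:

```python
# Assumed workload — replace with numbers from your own sessions.
avg_request_cost = 0.06  # USD, a Sonnet response near the maxTokens cap
requests_per_hour = 60   # active agent session, roughly one call per minute

burn_per_hour = avg_request_cost * requests_per_hour  # ~$3.60/hour

def hours_until_cap(daily_cap: float) -> float:
    return daily_cap / burn_per_hour

low, high = hours_until_cap(5.0), hours_until_cap(10.0)  # roughly 1.4 to 2.8 hours
```

Under these assumptions a $5-10 cap does land in the 1-3 hour range; a heavier context window or a chattier agent shifts it down.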
Each layer has a specific job:

- Layer 1 (provider caps): the monthly backstop that keeps working even when everything else fails
- Layer 2 (runtime limits): per-request cost control via maxTokens and the context window
- Layer 3 (proxy enforcement): daily caps, loop and heartbeat detection, and real-time alerts
Remove any one layer and you have a gap. Without Layer 1, a proxy crash means unlimited spending. Without Layer 2, each request is more expensive than it needs to be, eating through your daily cap faster. Without Layer 3, you have no daily limits, no loop detection, and no way to stop a runaway session from your phone at 3 AM.
The three layers together give you defense in depth. Provider caps are your circuit breaker. Runtime limits are your per-request optimizer. Proxy enforcement is your real-time guardian.
Set up all three and your OpenClaw spending becomes predictable, bounded, and visible. That is the goal.
Daily caps, loop detection, heartbeat blocking, and Telegram alerts. Free tier includes $5/day cap enforcement. Setup takes 2 minutes.
Get Started with ClawCap