
How to Stop OpenClaw From Burning Your Money While You Sleep

[Image: Stacks of cash burning — what OpenClaw can do to your API bill overnight]
By Percy Kintu · March 5, 2026 · 9 min read
TL;DR

OpenClaw can silently burn $5-$30/day through heartbeats, session bloat, and retry loops. Monitoring dashboards show you the damage after it happens. The only reliable fix is a hard spending cap at the proxy layer that physically blocks API requests once your budget is hit.

You left your OpenClaw agent running overnight to process a batch of code reviews. You wake up, check your Anthropic dashboard, and see $47.83 in charges. For eight hours of work that should have cost maybe $6.

This is not an edge case. This is what happens to most OpenClaw users within their first week. The tool is powerful, but it has no built-in spending controls, and the default configuration is optimized for capability, not cost.

Let's break down exactly where your money goes and what you can do about it.

Why Does OpenClaw Cost So Much?

OpenClaw's costs come from four main sources, and most users only think about one of them. The obvious cost is the work your agent actually does -- reading files, writing code, answering questions. That's usually the smallest part of your bill.

The real money goes to background operations that run whether your agent is productive or not. Understanding these is the first step to controlling them.

What Are Heartbeat Pings and Why Do They Cost $5/Day?

OpenClaw sends a "heartbeat" to its configured model every 30 minutes to maintain session context and check for pending tasks. Each heartbeat includes the full conversation context -- system prompt, recent messages, tool definitions, and session state.

With Claude Sonnet, a typical heartbeat sends roughly 80,000-120,000 input tokens. At Anthropic's pricing of $3/million input tokens, that's about $0.24-$0.36 per heartbeat. The model responds with a short status message, adding another $0.01-$0.02 in output tokens.

Do the math: 48 heartbeats per day times $0.35 each equals roughly $16.80/day or $504/month if you leave your agent running 24/7. Even during the 8-hour window you're actually working, that's still $5.60/day just in heartbeats.
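The arithmetic above is easy to sanity-check yourself. This sketch uses the figures assumed in this post (roughly $0.35 per heartbeat including output tokens), not official pricing:

```python
# Back-of-envelope heartbeat cost, using this post's assumed figures:
# ~100k input tokens at $3/M plus a small output, rounded to $0.35.
COST_PER_HEARTBEAT = 0.35

heartbeats_24x7 = 48        # one every 30 minutes, around the clock
heartbeats_8h = 16          # an 8-hour working window

daily_24x7 = heartbeats_24x7 * COST_PER_HEARTBEAT   # ~$16.80/day
daily_8h = heartbeats_8h * COST_PER_HEARTBEAT       # ~$5.60/day
monthly_24x7 = daily_24x7 * 30                      # ~$504/month

print(f"24/7: ${daily_24x7:.2f}/day, ${monthly_24x7:.0f}/month")
print(f"8h/day: ${daily_8h:.2f}/day")
```

Plug in your own model's pricing and heartbeat interval to see where you land.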

What Is Session History Bloat?

Every time your agent makes an API call, the full conversation history gets sent along with it. OpenClaw appends tool results, file contents, and previous responses to the session context. After 20-30 interactions, your context window can easily hit 150,000+ tokens.

This means a call that would cost $0.05 with a fresh context now costs $0.40-$0.50 because of all the accumulated history. Over a busy work session with 100+ agent interactions, that's an extra $35-$45 in input token costs you didn't plan for.

OpenClaw does have context pruning, but the defaults are conservative. It keeps more history than most tasks require because dropping context risks breaking multi-step reasoning chains.
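If you want more aggressive pruning than the defaults, the core idea is simple: keep the system prompt and only the most recent exchanges. A minimal sketch, assuming a simplified message format (real session state is more structured than this):

```python
def prune_history(messages, keep_last=10):
    """Keep the system prompt plus only the most recent exchanges.

    `messages` is a list of {"role": ..., "content": ...} dicts,
    a simplified stand-in for a real agent session format.
    """
    system = [m for m in messages if m["role"] == "system"]
    rest = [m for m in messages if m["role"] != "system"]
    return system + rest[-keep_last:]
```

The trade-off discussed below applies directly to `keep_last`: too small and multi-step tasks lose their thread, too large and you're back to paying for bloat.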

What Happens When OpenClaw Gets Stuck in a Retry Loop?

This is the most dangerous cost driver because it's unbounded. When OpenClaw hits an error -- a failed tool call, a malformed response, a rate limit -- it retries. And retries. And retries.

There is no hard limit on retry attempts in the default configuration. A single stuck loop can generate 50-200 API calls in minutes, each carrying the full bloated session context. Users have reported single loop incidents costing $15-$80 depending on the model and context size.
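Breaking a loop requires noticing it first. One common approach is to hash each outgoing request body and refuse to forward it once the same body repeats too often inside a short window. A minimal sketch (not OpenClaw's internals; a production detector would hash a normalized form, since real retries often vary slightly):

```python
import hashlib
import time
from collections import deque


class LoopBreaker:
    """Trip when the same request body repeats too often, too fast."""

    def __init__(self, max_repeats=5, window_seconds=120):
        self.max_repeats = max_repeats
        self.window = window_seconds
        self.seen = {}  # body hash -> deque of timestamps

    def allow(self, body: bytes, now=None) -> bool:
        now = time.monotonic() if now is None else now
        key = hashlib.sha256(body).hexdigest()
        times = self.seen.setdefault(key, deque())
        # Drop timestamps that have aged out of the window.
        while times and now - times[0] > self.window:
            times.popleft()
        if len(times) >= self.max_repeats:
            return False  # break the loop: refuse to forward the call
        times.append(now)
        return True
```

Five identical requests in two minutes is almost never legitimate agent work, so a threshold like this stops runaway loops while leaving normal retries alone.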

The worst part: loops often happen when you're not watching. Your agent hits an edge case at 2 AM, starts retrying, and burns through your budget before you wake up.

How Much Does OpenClaw Actually Cost Per Month?

Here's a realistic cost breakdown for different usage patterns:

| Cost source | Light use (4h/day) | Regular (8h/day) | Heavy (24/7) |
| --- | --- | --- | --- |
| Heartbeats | $2.80/day | $5.60/day | $16.80/day |
| Productive work | $1.50/day | $4.00/day | $12.00/day |
| Session bloat overhead | $0.80/day | $3.00/day | $8.00/day |
| Retry loops (avg) | $0.50/day | $2.00/day | $5.00/day |
| Daily total | $5.60 | $14.60 | $41.80 |
| Monthly total | $168 | $438 | $1,254 |

These numbers assume Claude Sonnet pricing. GPT-4o is slightly cheaper per token but has similar patterns. The key insight is that productive work is often less than 30% of your total spend.

Can You Reduce Costs With Model Routing?

Yes, and this is the single highest-impact change you can make. The idea is simple: don't use your most expensive model for everything.

Heartbeats don't need Claude Sonnet. They're status checks. Route them to Gemini Flash (free tier available) or GPT-4o Mini ($0.15/million input tokens) and your heartbeat cost drops from $5.60/day to $0.08/day. That's a 98% reduction on what's typically your largest cost category.

Similarly, simple file reads and directory listings don't need frontier model intelligence. A routing layer that matches request patterns to appropriate models can cut your total spend by 40-60% without any loss in output quality for your actual coding tasks.
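The routing logic itself can be very small. A sketch of pattern-based routing, where the model names and the request shape are illustrative rather than OpenClaw's actual configuration:

```python
# Illustrative model names; substitute whatever your providers offer.
CHEAP_MODEL = "gpt-4o-mini"
FRONTIER_MODEL = "claude-sonnet"


def route_model(request: dict) -> str:
    """Send status checks and trivial tool calls to a cheap model,
    everything else to the frontier model."""
    if request.get("kind") == "heartbeat":
        return CHEAP_MODEL
    if request.get("tool") in {"read_file", "list_directory"}:
        return CHEAP_MODEL
    return FRONTIER_MODEL
```

The hard part in practice is classifying requests reliably; heartbeats are easy to spot (fixed interval, near-identical payloads), while "simple" tool calls need a conservative allowlist like the one above.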

Does Session Pruning Actually Help?

It helps, but less than you'd expect. Aggressive pruning -- keeping only the last 5-10 exchanges instead of the full history -- can reduce per-call input tokens by 30-50%. On a busy day, that saves $1-$3.

The tricky part is knowing what to prune. Drop the wrong context and your agent loses track of what it's doing, leading to repeated work (which costs more tokens) or incorrect outputs (which require corrections, costing even more).

Pruning is worth doing, but it's an optimization, not a solution. It reduces the rate of spend but doesn't cap it.

Why Doesn't Monitoring Your API Dashboard Work?

Every cloud provider gives you a usage dashboard. Anthropic shows your token consumption in near-real-time. OpenAI has spending reports. So why isn't that enough?

Three reasons. First, dashboards are reactive. By the time you see the spike, the money is already gone. A retry loop at 3 AM can burn $50 before you check your dashboard at 9 AM.

Second, dashboards don't have hard stops. Anthropic's usage limits are soft -- they throttle your rate but don't block requests outright. OpenAI removed hard budget limits in 2025 and now only offers spending alerts -- notifications that arrive after the money is already spent.

Third, you're not watching. The whole point of an AI agent is that it works while you do other things. Checking a dashboard every hour defeats the purpose of automation.

What About Anthropic's Built-In Rate Limits?

Anthropic's rate limits control requests per minute and tokens per minute, not dollars per day. They prevent you from DDoS-ing the API, not from spending too much money.

At higher usage tiers, rate limits can reach 2,000-4,000 requests per minute. Even at moderate rates, you could theoretically spend hundreds of dollars per hour if every request carried a large context window. Rate limits protect Anthropic's infrastructure, not your wallet.
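The gap between "rate-limited" and "cost-limited" is worth quantifying. Even a modest request rate, well under any rate limit, compounds quickly with a large context (illustrative figures, same assumed pricing as above):

```python
# Rate limits bound requests per minute, not dollars per day.
REQUESTS_PER_MINUTE = 10            # far below any provider rate limit
CONTEXT_TOKENS = 100_000            # a bloated session context
PRICE_PER_TOKEN = 3 / 1_000_000     # $3 per million input tokens

dollars_per_minute = REQUESTS_PER_MINUTE * CONTEXT_TOKENS * PRICE_PER_TOKEN
dollars_per_hour = dollars_per_minute * 60

print(f"${dollars_per_hour:.0f}/hour")  # ~$180/hour at just 10 req/min
```

At just 10 requests per minute with a 100k-token context, you're spending on the order of $180/hour -- and a retry loop can easily sustain that rate.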

Why Do You Need Hard Caps at the Proxy Layer?

A hard cap is different from a soft warning or a dashboard alert. A hard cap means the API request physically does not reach the provider once you've hit your limit. The HTTP request gets a 429 response before it ever leaves your machine.

This is the only approach that's truly safe against runaway costs because it doesn't depend on you being awake, checking a dashboard, or responding to an alert. The cap is enforced by a proxy sitting between OpenClaw and the API. No proxy approval, no API call.

Think of it like a prepaid phone plan versus a postpaid plan with "spending alerts." The prepaid plan physically cannot exceed your balance. The postpaid plan sends you a text message and hopes you notice.

How Do You Set Up a Proxy-Level Spending Cap?

The architecture is straightforward. A local proxy server runs on your machine (typically localhost:5858). You configure OpenClaw to send API requests to the proxy instead of directly to Anthropic or OpenAI. The proxy tracks every request's token usage, calculates cost in real-time, and blocks requests when the daily or monthly cap is reached.

The proxy also handles the secondary cost drivers: it detects heartbeat patterns and routes them to cheaper models, identifies retry loops and breaks them after a configurable threshold, and can pause the agent entirely during hours you define as "sleep" time.

Setting this up manually requires writing request interception logic, token counting for multiple providers, streaming SSE parsing for real-time cost tracking, and a persistence layer for spend history. It's roughly 2,000-3,000 lines of code to do properly.
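To make the core idea concrete, here is a toy sketch of the cap-enforcement gate at the heart of such a proxy. It is in-memory only and not ClawCap's implementation; a real proxy would persist spend across restarts and compute costs from the provider's streamed usage data:

```python
import time


class SpendCap:
    """Hard-cap gate: refuse requests once the daily budget is spent."""

    def __init__(self, daily_cap_usd: float):
        self.daily_cap = daily_cap_usd
        self.spent = 0.0
        self.day = time.strftime("%Y-%m-%d")

    def _roll_day(self):
        # Reset the counter at midnight.
        today = time.strftime("%Y-%m-%d")
        if today != self.day:
            self.day, self.spent = today, 0.0

    def check(self, estimated_cost_usd: float):
        """Return (status, reason); 429 means the call never leaves the box."""
        self._roll_day()
        if self.spent + estimated_cost_usd > self.daily_cap:
            return 429, "daily spend cap reached"
        return 200, "ok"

    def record(self, actual_cost_usd: float):
        """Record the true cost after parsing the provider's usage data."""
        self._roll_day()
        self.spent += actual_cost_usd
```

The proxy calls `check()` before forwarding each request and `record()` after the response completes. Everything else -- token counting per provider, SSE parsing, persistence -- is plumbing around this gate, and that plumbing is where the 2,000-3,000 lines go.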

What Should Your Daily Cap Be?

That depends on your usage, but the cost table above suggests reasonable starting points: around $8/day for light use (4h/day), $15-$18/day for regular use (8h/day), and $45-$50/day if you genuinely need 24/7 operation. Each is roughly the expected daily spend plus a modest buffer.

Start lower than you think you need. You can always raise the cap. You can't un-spend money.

What About Monthly Caps?

Monthly caps catch the scenarios that daily caps miss. If you set a $15/day cap but forget to pause your agent over a two-week vacation, you'd still burn $210. A $200/month cap would have cut it off after day 13.

Monthly caps are also useful for budgeting. If your team has a $500/month AI budget, a hard monthly cap guarantees you'll never exceed it, regardless of how many agents are running or how active they are.
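Combining the two is a single boolean check: a request only goes through if both caps have headroom. A sketch, using the vacation numbers from above:

```python
def within_budget(spent_today, spent_month, est_cost,
                  daily_cap, monthly_cap):
    """A request passes only if BOTH caps have headroom (sketch)."""
    return (spent_today + est_cost <= daily_cap and
            spent_month + est_cost <= monthly_cap)


# The vacation scenario: $15/day against a $200/month cap runs out
# of monthly headroom partway into day 14.
days_until_blocked = 200 / 15   # ~13.3
```

Either cap can trip independently, which is exactly the point: the daily cap limits the blast radius of one bad night, the monthly cap limits the blast radius of forgetting about the agent entirely.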

How Much Can You Actually Save?

Based on the cost breakdown table above, here's what proper cost controls typically save:

| Optimization | Monthly savings |
| --- | --- |
| Heartbeat rerouting to cheap model | $50 - $150 |
| Loop detection and breaking | $15 - $60 |
| Hard daily caps (prevent overnight burn) | $30 - $100 |
| Sleep-hour blocking | $20 - $80 |
| Total typical savings | $115 - $390/month |

For a regular user spending $438/month, that's a reduction to roughly $150-$200/month -- paying only for productive work, not waste.

What's the Catch With Manual Solutions?

You can absolutely build all of this yourself. Set up an nginx reverse proxy, write some Lua scripts for token counting, create a cron job to reset daily totals. People do this.

The catch is maintenance. API providers change their response formats. New models have different pricing. Streaming responses require different parsing logic for Anthropic versus OpenAI. When something breaks at 2 AM, your "spending cap" silently fails open and your agent burns money until you fix it.

Every manual solution the community has built eventually breaks when configs change or providers update their APIs. The cost of maintaining a DIY proxy exceeds the cost of a purpose-built tool within a few months.

Stop Guessing. Set a Hard Cap.

ClawCap enforces hard daily and monthly spending caps at the proxy layer. Your agent literally cannot spend more than you set. It also auto-detects heartbeats, breaks retry loops, and gives you a kill switch from your phone.

Get ClawCap -- Free Tier Available

Percy Kintu, creator of ClawCap. Building cost controls for AI agents because nobody should wake up to a surprise $200 API bill.