(Quick framing: Claude Code is Anthropic’s terminal CLI for Claude; Claude Max is the higher-tier subscription plan with weekly and 5-hour usage windows. HTTP 429 is “Too Many Requests” — the server’s polite way of saying “back off.” Retry-After is the response header that tells the client how long to wait. OAuth is the auth protocol Claude Code uses to talk to Anthropic on behalf of a logged-in user. And the statusline — the same one I covered in the statusline side-channel post — is the script Claude Code spawns every turn with a JSON blob on stdin.)
I built the wrong thing. Not in the “shipped it and it was slow” sense — in the “the API I was polling was never meant to be polled” sense. The piece of the system I needed had been sitting on /dev/stdin of a script I’d already installed, on every Claude turn, for the whole 18 hours I was watching the daemon log fill up with 429s.
This is the post-mortem.
Symptom: a banlist that never thawed
ClaudeDeck has a Stream Deck key that shows my Claude Max plan usage — the 5-hour block and the 7-day weekly window. Two percentages, a small countdown to the next reset, color-graded so I can tell from across the room whether I’m about to get rate-limited mid-flow.
The first version was a polling loop. Every 60 seconds, the daemon hit https://api.anthropic.com/api/oauth/usage with the same OAuth token Claude Code uses, parsed the response, and pushed the numbers over WebSocket to the Stream Deck plugin. Standard background-poller shape.
It never worked. Not once. Over 18 hours and change, the log line plan:error reason=429 repeated 111 times and plan:update never appeared. The token wasn’t expired (an expired token would be 401, not 429). The exponential backoff was pinned at its cap, MAX_BACKOFF_MS = 10 * 60_000, so the poller settled at one request every ten minutes: about 108 requests over 18 hours, which lines up with the 111 in the log. Every single one a 429.
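The request count is easy to sanity-check with a small simulation of a capped 2^N backoff. Only MAX_BACKOFF_MS is taken from the post. BASE_MS, backoffDelayMs, and countRequests are hypothetical names, and the schedule itself (start at the 60-second poll interval, double on every consecutive failure) is an assumption about how planUsagePoller.ts behaved, not a copy of it.

```typescript
// Sketch of a capped exponential backoff. Only MAX_BACKOFF_MS comes from the
// post; BASE_MS and the doubling rule are assumptions for illustration.
const BASE_MS = 60_000;
const MAX_BACKOFF_MS = 10 * 60_000;

/** Delay before the next attempt after `failures` consecutive failures (>= 1). */
function backoffDelayMs(failures: number): number {
  return Math.min(BASE_MS * 2 ** (failures - 1), MAX_BACKOFF_MS);
}

/** Requests sent within `windowMs` if every single one fails. */
function countRequests(windowMs: number): number {
  let elapsedMs = 0;
  let requests = 0;
  while (elapsedMs <= windowMs) {
    requests += 1;
    // Every request fails, so requests so far == consecutive failures so far.
    elapsedMs += backoffDelayMs(requests);
  }
  return requests;
}

const EIGHTEEN_HOURS_MS = 18 * 60 * 60_000;
console.log(countRequests(EIGHTEEN_HOURS_MS));
```

Under these assumptions the delays run 1, 2, 4, and 8 minutes, then sit at the 10-minute cap, and an 18-hour window comes out to 111 requests.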
When I finally curled the endpoint by hand to see what was coming back, Anthropic was being polite about it (the cf-ray and server: cloudflare headers show the response came back through Cloudflare, which Anthropic puts in front of the API; on their own they don’t say whether the edge or the origin issued the 429):
HTTP/2 429
date: Mon, 11 May 2026 01:55:01 GMT
retry-after: 272
content-type: application/json
cf-ray: 9f9d8f1e6db975ec-SEA
server: cloudflare
{"error": {"type": "rate_limit_error",
           "message": "Rate limited. Please try again later."}}

Retry-After: 272. Wait four and a half minutes, try again. My poller wasn’t reading that header at all. The error path in daemon/src/claudeAiFetcher.ts was returning a string, discarding the whole Response, and the backoff math in planUsagePoller.ts was running on its own 2^N schedule decoupled from what Anthropic was actually asking.
So I sat down to write a parseRetryAfter function. RFC 9110 (the current HTTP semantics spec, the document that defines what a header like Retry-After actually means) says the header is either delta-seconds (an integer like 272) or an HTTP-date (a full timestamp like Wed, 21 Oct 2026 07:28:00 GMT); both shapes need to work. The plan: a kind: "rate_limited"; retryAfterMs variant for the fetcher to return. New tests for the integer path, the IMF-fixdate path (IMF-fixdate is the RFC 9110 name for the fixed-width “day, DD Mon YYYY HH:MM:SS GMT” date format, the only one senders are allowed to generate), and the “garbage header” fallback. Wire updateCadence to honor it. Maybe two or three hours of careful work.
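For reference, a parser along those lines fits in a few lines. This is a sketch, not the code from the ClaudeDeck repo: the signature (an explicit nowMs argument, null for anything unparseable) is an assumption, and it leans on Date.parse accepting IMF-fixdate, which holds in Node/V8 but is implementation-defined in the language spec. The obsolete asctime date form is deliberately treated as garbage here.

```typescript
/**
 * Parse a Retry-After header value into milliseconds to wait.
 * Returns null when the header is missing or unparseable, so the caller can
 * fall back to its own backoff schedule.
 */
function parseRetryAfter(value: string | null, nowMs: number): number | null {
  if (value === null) return null;
  const trimmed = value.trim();

  // delta-seconds: a non-negative integer number of seconds.
  if (/^\d+$/.test(trimmed)) {
    return Number(trimmed) * 1000;
  }

  // HTTP-date: only the GMT-suffixed forms are handled (a simplification that
  // also keeps Date.parse from accepting loosely date-like garbage).
  if (!trimmed.endsWith("GMT")) return null;
  const at = Date.parse(trimmed);
  if (Number.isNaN(at)) return null;

  // A date in the past means "retry now", never a negative wait.
  return Math.max(0, at - nowMs);
}
```

With fetch, the call site would look like parseRetryAfter(response.headers.get("retry-after"), Date.now()). Passing the clock in explicitly keeps the date path testable without mocking Date.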
This was the wrong fix. None of it would have helped.
Investigation: what was I even polling?
Before I shipped a careful Retry-After parser, I checked whether the endpoint I was being so careful with was actually one I was supposed to be hitting.
/api/oauth/usage is not in Anthropic’s public REST API docs. It’s not in the Messages API (Anthropic’s main public endpoint — the one SDKs and most apps use), the OAuth flow docs, the workspace billing endpoints — none of them. The community knows about it because claude.ai’s web UI hits it to populate the little usage panel (you can confirm this yourself by opening DevTools’ Network tab on claude.ai). A handful of third-party tools — AgentDeck’s bridge, ohugonnot’s claude-code-statusline shell script, jens-duttke’s usage-monitor-for-claude — poll it from background processes, with care.
How much care? AgentDeck’s bridge/src/usage-api.ts runs a 120-second file-cache TTL, with an explicit comment saying it was raised from 60 seconds “to avoid 429”. ohugonnot’s tool defaults to 300 seconds and warns against setting it lower. jens-duttke’s tool offers 5-minute, 15-minute, and 30-minute presets and refuses to let you pick less. Among the three independent tools polling the endpoint, the consensus floor was somewhere between 2 and 5 minutes. Mine was hitting it every 60 seconds.
That explained the obvious problem (the poller was 2–5× too aggressive) but not the 100% 429 rate. Sixty seconds is loud, not radioactive. Three other repos were polling the same endpoint at cadences only a few times slower without continuous bans.
Then I went looking at the issue tracker. Specifically:
- anthropics/claude-code #31021: closed as not planned.
- anthropics/claude-code #31637: closed as invalid. Reports 30+ minute bans against this endpoint.
- anthropics/claude-code #30930: Retry-After: 0 bug, also closed without a fix.
Three separate issues against this endpoint, three “not our problem” closures. That’s not a triage backlog. That’s a signal: the surface isn’t being supported. It’s the URL claude.ai’s React app hits on page load to show you the little panel, and Anthropic is fine if it occasionally falls over because the web UI’s failure mode is “panel is blank for thirty seconds.” A daemon doing 60-second polls is not a use case they care to keep working.
The poller wasn’t broken. The endpoint was the wrong primitive.
Root cause: the data was already pushed, on stdin, every turn
While I was reading those closed issues, I tabbed over to Claude Code’s statusline docs for an unrelated reason — I was already using the statusline command as a per-turn telemetry side channel for context-window data — and noticed a field I’d never registered before.
The JSON Claude Code pipes to your statusline command’s stdin, per turn, includes this:
{
  "session_id": "01J9...",
  "model": { "id": "claude-opus-4-7", "display_name": "Opus 4.7" },
  "workspace": { "current_dir": "/Users/nick/repo" },
  "rate_limits": {
    "five_hour": {
      "used_percentage": 51,
      "resets_at": 1746920100
    },
    "seven_day": {
      "used_percentage": 25,
      "resets_at": 1747526400
    }
  },
  "context_window": {
    "context_window_size": 1000000,
    "used_percentage": 17,
    "total_input_tokens": 174903
  }
}

rate_limits.five_hour.used_percentage. rate_limits.seven_day.used_percentage. resets_at Unix timestamps. The exact two numbers and two countdowns my Stream Deck key wanted to render. Pushed to a shell script I’d already installed. Every turn. No HTTP, no token, no Cloudflare. The codelynx.dev write-up confirmed it shipped in Claude Code v1.2.80, which had been my installed version for weeks.
Read that again. The data was being shoved at me, push-style, by a documented contract, while I was busy tuning the backoff on an undocumented endpoint that returns the same data and bans you for asking.
The statusline command I’d already shipped — the one I’d written a whole separate post about as a telemetry forwarder — was already getting this JSON. I was reading context_window.used_percentage out of it and ignoring the rate_limits field two keys away.
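Reading those fields defensively takes only a little code. The sketch below follows the field names in the sample payload above; the type and function names (RateLimitWindow, readRateLimits, msUntilReset) are hypothetical, and treating rate_limits as possibly absent is an assumption, since the sample shows only one payload.

```typescript
// Field names mirror the sample payload; everything else is illustrative.
interface RateLimitWindow {
  used_percentage: number;
  resets_at: number; // Unix timestamp in seconds
}

interface RateLimits {
  five_hour: RateLimitWindow;
  seven_day: RateLimitWindow;
}

function isWindow(v: unknown): v is RateLimitWindow {
  if (typeof v !== "object" || v === null) return false;
  const w = v as Record<string, unknown>;
  return typeof w["used_percentage"] === "number" && typeof w["resets_at"] === "number";
}

/** Pull rate_limits out of raw statusline JSON; null if absent or malformed. */
function readRateLimits(rawJson: string): RateLimits | null {
  let parsed: unknown;
  try {
    parsed = JSON.parse(rawJson);
  } catch {
    return null;
  }
  if (typeof parsed !== "object" || parsed === null) return null;
  const limits = (parsed as Record<string, unknown>)["rate_limits"];
  if (typeof limits !== "object" || limits === null) return null;
  const fiveHour = (limits as Record<string, unknown>)["five_hour"];
  const sevenDay = (limits as Record<string, unknown>)["seven_day"];
  if (!isWindow(fiveHour) || !isWindow(sevenDay)) return null;
  return { five_hour: fiveHour, seven_day: sevenDay };
}

/** Milliseconds until a window resets, clamped at zero. */
function msUntilReset(win: RateLimitWindow, nowMs: number): number {
  return Math.max(0, win.resets_at * 1000 - nowMs);
}
```

The percentages feed the two numbers on the key, and msUntilReset feeds the countdown; resets_at is in seconds, so it needs the multiplication by 1000 before being compared with Date.now().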
That was the moment the design changed.
Fix: subscribe, don’t poll
The new architecture is one extra path in the statusline forwarder and a few fields in the daemon’s state store.
The statusline command runs every turn and POSTs the JSON to a daemon endpoint:
# cli/src/statusline.sh (abridged)
INPUT="$(cat)"
SID="$(printf '%s' "$INPUT" | jq -r '.session_id // empty')"
if [ -n "$SID" ]; then
  printf '%s' "$INPUT" \
    | curl -sf -m 1 -X POST "http://127.0.0.1:9127/context/$SID" \
        -H "Content-Type: application/json" \
        --data-binary @- > /dev/null 2>&1 &
fi

The daemon receives that POST, parses rate_limits.* along with the context_window.* it was already consuming, updates stateStore.rateLimits and stateStore.lastStatuslineAt, and broadcasts. The Stream Deck plugin reads rateLimits.five_hour.used_percentage from the WebSocket state and renders. End-to-end latency is a few milliseconds. The poller doesn’t run.
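The receiving side can be sketched without any HTTP or WebSocket plumbing. Only the field names rateLimits and lastStatuslineAt and the /context/<session id> path come from the post; the StatuslineIngest class, its subscribe/ingest methods, and the listener-based broadcast are hypothetical stand-ins for whatever the daemon actually does.

```typescript
// Hypothetical in-memory store; only the two field names come from the post.
interface StateStore {
  rateLimits: unknown;
  lastStatuslineAt: number; // ms since epoch, 0 = never
}

type Listener = (state: StateStore) => void;

class StatuslineIngest {
  readonly state: StateStore = { rateLimits: null, lastStatuslineAt: 0 };
  private readonly listeners = new Set<Listener>();

  /** Register a broadcast target; returns an unsubscribe function. */
  subscribe(listener: Listener): () => void {
    this.listeners.add(listener);
    return () => {
      this.listeners.delete(listener);
    };
  }

  /** Handle the body of POST /context/<sessionId>; returns false if ignored. */
  ingest(path: string, body: string, nowMs: number): boolean {
    if (!/^\/context\/[^/]+$/.test(path)) return false;

    let payload: unknown;
    try {
      payload = JSON.parse(body);
    } catch {
      return false;
    }
    if (typeof payload !== "object" || payload === null) return false;

    // Keep the previous numbers if this turn's payload has no rate_limits.
    const limits = (payload as Record<string, unknown>)["rate_limits"];
    if (limits !== undefined) this.state.rateLimits = limits;
    this.state.lastStatuslineAt = nowMs;

    this.listeners.forEach((listener) => listener(this.state));
    return true;
  }
}
```

In a real daemon the listener would be the WebSocket broadcast, and ingest would be called from the HTTP handler with Date.now(). Stamping lastStatuslineAt on every accepted POST is what later lets a poller tell whether the pushed data is still fresh.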
The poller doesn’t go away, though — it gates on freshness. When lastStatuslineAt is recent, the tick short-circuits before hitting Anthropic. When no Claude session has been active for ten minutes, the poller resumes covering the display, at a much saner cadence (5-minute base, 30-minute cap) and now actually honoring Retry-After. The freshness gate is the load-bearing change; the poller’s job shrank from “primary data source” to “fallback when nobody’s running Claude”:
// daemon/src/planUsagePoller.ts (the gate)
private static readonly STATUSLINE_FRESH_MS = 10 * 60_000;

const lastStatusline = this.getLastStatuslineAt();
if (lastStatusline > 0) {
  const age = this.clock() - lastStatusline;
  if (age < PlanUsagePoller.STATUSLINE_FRESH_MS) {
    console.log(`plan poll skipped — statusline fresh (${age}ms ago)`);
    return;
  }
}

Net result, looking at daemon.log on a regular workday: zero polls, zero 429s. Every turn through Claude refreshes the numbers, fresher than the polled endpoint ever was, with no rate-limit exposure. The Stream Deck key is now noticeably more responsive: the 5-hour bucket ticks up during the same turn that crossed the threshold, not on the next 60-second poll.
The endpoint I’d been hammering for 18 hours is still in the codebase, behind the gate, polling at five minutes when there’s no Claude session — exactly the cadence the other three tools converged on. It’s the fallback path now, not the primary one.
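Put together, the fallback path comes down to two small decisions: skip the tick when the statusline is fresh, and otherwise wait at least as long as the server asked. In the sketch below, STATUSLINE_FRESH_MS and the 5-minute base and 30-minute cap come from the post; the function names, the doubling rule, and the choice to let Retry-After exceed the local cap are assumptions.

```typescript
// Constants from the post; names other than STATUSLINE_FRESH_MS are illustrative.
const STATUSLINE_FRESH_MS = 10 * 60_000;
const BASE_POLL_MS = 5 * 60_000;
const MAX_POLL_MS = 30 * 60_000;

/** True when a recent statusline push makes polling unnecessary. */
function statuslineIsFresh(lastStatuslineAt: number, nowMs: number): boolean {
  return lastStatuslineAt > 0 && nowMs - lastStatuslineAt < STATUSLINE_FRESH_MS;
}

/**
 * Delay before the next fallback poll.
 * consecutiveFailures: 0 after a success, 1 after the first failure, and so on.
 * retryAfterMs: the wait requested via Retry-After, or null if there was none.
 */
function nextPollDelayMs(consecutiveFailures: number, retryAfterMs: number | null): number {
  const backoff = Math.min(BASE_POLL_MS * 2 ** consecutiveFailures, MAX_POLL_MS);
  // Never retry sooner than the server asked, even if that exceeds the local cap.
  return retryAfterMs === null ? backoff : Math.max(backoff, retryAfterMs);
}
```

Taking the larger of the two values means a Retry-After of 0, like the one reported in #30930, simply falls through to the local 5-minute floor instead of triggering an immediate retry.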
Why I trust the new primitive
/api/oauth/usage had three issues closed without a fix: one not planned, one invalid, one simply closed. That endpoint isn’t coming back as a supported surface. The statusline JSON contract is the opposite case:
- It’s publicly documented.
- It ships in stable Claude Code releases (rate_limits since v1.2.80, per the codelynx.dev write-up).
- Multiple ecosystem tools build on it: ccusage, claude-code-statusline, every custom statusline anyone’s written.
- Anthropic actively promotes the surface, including in their own examples.
If Anthropic ever ships a documented “consume usage data” REST endpoint or a webhook (the right long-term answer), the migration is a different feeder writing the same stateStore.rateLimits field. The daemon and plugin don’t change. Until then, I’m pulling from the channel they want me to pull from instead of the one they keep telling people not to.
Lessons
- When you hit sustained 429 from an undocumented endpoint, the answer is rarely “tune the backoff.” It’s “find the supported primitive.” I almost shipped a beautiful Retry-After parser to a surface that closes its issues as not planned.
- A GitHub issue closed as not planned is a load-bearing signal. That endpoint is not coming back to your use case. Read the closures before you build on top of it.
- Statusline is a side channel in both directions. You can write to it (telemetry forwarding, which I’d already done) and you can read from it (rate limits, context window, session id, model). Same JSON, same per-turn cadence, both directions.
- Push beats pull for data the user is generating. If something fires every time the user does the action, subscribe to that. Polling a derived endpoint to learn about activity that just happened is the wrong shape.
- The fallback poller still needs to be correct. Honor Retry-After, raise the floor, raise the cap. Just because it’s not the primary path doesn’t mean it gets to run on the old broken cadence.
The wider trap here is that “I have a daemon, daemons poll” was such a strong default that I never asked whether the data was already being pushed somewhere I owned. It was. It had been the whole time. The fix wasn’t a fix — it was a deletion of the question.
References
- Claude Code statusline docs: https://code.claude.com/docs/en/statusline (schema source of truth for the JSON piped to a statusLine command’s stdin; the older docs.claude.com/en/docs/claude-code/statusline URL redirects here)
- codelynx.dev, Claude Code usage limits via statusline, confirms rate_limits shipped in v1.2.80: https://codelynx.dev/posts/claude-code-usage-limits-statusline
- anthropics/claude-code #31021, /api/oauth/usage behavior, closed not planned: https://github.com/anthropics/claude-code/issues/31021
- anthropics/claude-code #31637, 30+ minute bans on /api/oauth/usage, closed invalid: https://github.com/anthropics/claude-code/issues/31637
- anthropics/claude-code #30930, Retry-After: 0 returned with rate-limit response, closed without fix: https://github.com/anthropics/claude-code/issues/30930
- AgentDeck bridge/src/usage-api.ts, 120-second TTL “to avoid 429”, honors Retry-After: https://github.com/puritysb/AgentDeck/blob/master/bridge/src/usage-api.ts
- ohugonnot’s claude-code-statusline, 300-second default, “undocumented endpoint” disclaimer in README: https://github.com/ohugonnot/claude-code-statusline
- jens-duttke’s usage-monitor-for-claude, 5/15/30-minute presets, no sub-5-minute option: https://github.com/jens-duttke/usage-monitor-for-claude
- RFC 9110 §10.2.3, Retry-After header semantics (delta-seconds OR HTTP-date): https://httpwg.org/specs/rfc9110.html#field.retry-after
- ClaudeDeck statusline forwarder: cli/src/statusline.sh, the per-turn POST that now also carries rate_limits
- ClaudeDeck poller with freshness gate: daemon/src/planUsagePoller.ts, where STATUSLINE_FRESH_MS = 10 * 60_000 short-circuits the tick when statusline is fresh
- ClaudeDeck SessionStart auto-patcher: daemon/src/statuslineAutoPatch.ts, installs the forwarder into project-level settings on first session
- Companion post on the same statusline primitive viewed from the write direction: The Claude Code statusline is a per-turn telemetry side channel