I polled an undocumented endpoint for 18 hours. The data was on stdin.

Nick Liu
Building infrastructure for Facebook Feed Ranking at Meta. Previously at Walmart, Twitter, AWS, and eBay. MS in Computer Science at Georgia Tech.
My daemon logged 111 consecutive HTTP 429s against `https://api.anthropic.com/api/oauth/usage` over an 18-hour stretch, without a single successful response in its lifetime. The responses carried a `Retry-After` header (`272` when I checked by hand), and the poller never read it. While I was arguing with the backoff, Claude Code was pushing the same `rate_limits.five_hour` and `rate_limits.seven_day` numbers to my statusline command every turn, on stdin, for free.

(Quick framing: Claude Code is Anthropic’s terminal CLI for Claude; Claude Max is the higher-tier subscription plan with weekly and 5-hour usage windows. HTTP 429 is “Too Many Requests” — the server’s polite way of saying “back off.” Retry-After is the response header that tells the client how long to wait. OAuth is the auth protocol Claude Code uses to talk to Anthropic on behalf of a logged-in user. And the statusline — the same one I covered in the statusline side-channel post — is the script Claude Code spawns every turn with a JSON blob on stdin.)

I built the wrong thing. Not in the “shipped it and it was slow” sense — in the “the API I was polling was never meant to be polled” sense. The piece of the system I needed had been sitting on /dev/stdin of a script I’d already installed, on every Claude turn, for the whole 18 hours I was watching the daemon log fill up with 429s.

This is the post-mortem.

Symptom: a banlist that never thawed

ClaudeDeck has a Stream Deck key that shows my Claude Max plan usage — the 5-hour block and the 7-day weekly window. Two percentages, a small countdown to the next reset, color-graded so I can tell from across the room whether I’m about to get rate-limited mid-flow.

The first version was a polling loop. Every 60 seconds, the daemon hit https://api.anthropic.com/api/oauth/usage with the same OAuth token Claude Code uses, parsed the response, and pushed the numbers over WebSocket to the Stream Deck plugin. Standard background-poller shape.

It never worked. Not once. Over 18 hours and change, the log line plan:error reason=429 repeated 111 times and plan:update never appeared. The token wasn’t expired (an expired token would be 401, not 429). The backoff was pinned at its ceiling, MAX_BACKOFF_MS = 10 * 60_000, so the steady state was one request every ten minutes: roughly 108 requests over 18 hours, which lines up with the 111 in the log. Every single one a 429.
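As a rough check on that arithmetic, here is a minimal sketch of a capped exponential backoff in which every request fails. The 60-second base and the 10-minute cap come from the numbers above; the doubling rule and the function names are assumptions for illustration, not the poller's actual code.

```typescript
// Hypothetical model of a capped 2^N backoff where every request gets a 429.
const BASE_MS = 60_000; // original 60-second poll interval
const MAX_BACKOFF_MS = 10 * 60_000; // cap quoted in the post

// Delay before the next attempt after `failures` consecutive failures.
function backoffDelayMs(failures: number): number {
  return Math.min(MAX_BACKOFF_MS, BASE_MS * Math.pow(2, failures));
}

// Count how many requests fit in a window if none of them ever succeeds.
function countFailedRequests(windowMs: number): number {
  let elapsedMs = 0;
  let failures = 0;
  let requests = 0;
  while (elapsedMs < windowMs) {
    requests += 1;
    failures += 1; // assume every response is a 429
    elapsedMs += backoffDelayMs(failures);
  }
  return requests;
}

const eighteenHoursMs = 18 * 60 * 60_000;
const totalRequests = countFailedRequests(eighteenHoursMs);
```

Under these assumptions the delay hits the cap after four failures, and `totalRequests` comes out at 110 for an 18-hour window, in the same range as the 111 in the log.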

When I finally curled the endpoint by hand to see what was coming back, Anthropic was being polite about it (those cf-ray/server: cloudflare headers mean the request is being rejected at Cloudflare’s edge, not the origin — Anthropic puts Cloudflare in front of the API):

HTTP/2 429
date: Mon, 11 May 2026 01:55:01 GMT
retry-after: 272
content-type: application/json
cf-ray: 9f9d8f1e6db975ec-SEA
server: cloudflare

{"error": {"type": "rate_limit_error",
           "message": "Rate limited. Please try again later."}}

Retry-After: 272. Wait four and a half minutes, try again. My poller wasn’t reading that header at all. The error path in daemon/src/claudeAiFetcher.ts was returning a string, discarding the whole Response, and the backoff math in planUsagePoller.ts was running on its own 2^N schedule decoupled from what Anthropic was actually asking.

So I sat down to write a parseRetryAfter function. RFC 9110 (the current HTTP semantics spec, the document that defines what a header like Retry-After actually means) says the header is either delay-seconds (a non-negative integer like 272) or an HTTP-date (a full timestamp like Wed, 21 Oct 2026 07:28:00 GMT); both shapes need to work. The plan: a kind: "rate_limited"; retryAfterMs variant for the fetcher to return. New tests for the integer path, the IMF-fixdate path (IMF-fixdate is the RFC 9110 name for the fixed-width “day, DD Mon YYYY HH:MM:SS GMT” date format, the only one senders are allowed to generate, though recipients must still accept two obsolete formats), and the “garbage header” fallback. Wire updateCadence to honor it. Maybe two or three hours of careful work.
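A minimal sketch of what that parser and result variant could look like follows. The `FetchResult` shape, `classify`, and the strict IMF-fixdate regex are assumptions for illustration; the sketch deliberately does not handle the two obsolete date formats, and it is not the code from claudeAiFetcher.ts.

```typescript
// Illustrative result type: only the "rate_limited" variant is named in the post.
type FetchResult =
  | { kind: "ok"; body: unknown }
  | { kind: "rate_limited"; retryAfterMs: number | null }
  | { kind: "error"; status: number };

// Strict match for "Wed, 21 Oct 2026 07:28:00 GMT"; obsolete formats are rejected.
const IMF_FIXDATE =
  /^(Mon|Tue|Wed|Thu|Fri|Sat|Sun), \d{2} (Jan|Feb|Mar|Apr|May|Jun|Jul|Aug|Sep|Oct|Nov|Dec) \d{4} \d{2}:\d{2}:\d{2} GMT$/;

// Returns the wait in milliseconds, or null when the header is absent or unparseable.
function parseRetryAfter(header: string | null, nowMs: number): number | null {
  if (header === null) return null;
  const value = header.trim();
  if (/^\d+$/.test(value)) {
    return Number(value) * 1000; // delay-seconds form
  }
  if (IMF_FIXDATE.test(value)) {
    const whenMs = Date.parse(value);
    // A date in the past means "retry now", so clamp at zero.
    if (!isNaN(whenMs)) return Math.max(0, whenMs - nowMs);
  }
  return null; // garbage header: let the caller fall back to its own backoff
}

// Keep the header instead of collapsing every failure into a string.
function classify(
  status: number,
  retryAfter: string | null,
  body: unknown,
  nowMs: number
): FetchResult {
  if (status === 429) {
    return { kind: "rate_limited", retryAfterMs: parseRetryAfter(retryAfter, nowMs) };
  }
  if (status >= 200 && status < 300) return { kind: "ok", body };
  return { kind: "error", status };
}
```

Passing `nowMs` in, rather than calling `Date.now()` inside, keeps the date path testable with a fixed clock.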

This was the wrong fix. None of it would have helped.

Investigation: what was I even polling?

Before I shipped a careful Retry-After parser, I checked whether the endpoint I was being so careful with was actually one I was supposed to be hitting.

/api/oauth/usage is not in Anthropic’s public REST API docs. It’s not in the Messages API (Anthropic’s main public endpoint — the one SDKs and most apps use), the OAuth flow docs, the workspace billing endpoints — none of them. The community knows about it because claude.ai’s web UI hits it to populate the little usage panel (you can confirm this yourself by opening DevTools’ Network tab on claude.ai). A handful of third-party tools — AgentDeck’s bridge, ohugonnot’s claude-code-statusline shell script, jens-duttke’s usage-monitor-for-claude — poll it from background processes, with care.

How much care? AgentDeck’s bridge/src/usage-api.ts runs a 120-second file-cache TTL, with an explicit comment saying it was raised from 60 seconds “to avoid 429”. ohugonnot’s tool defaults to 300 seconds and warns against setting it lower. jens-duttke’s tool offers 5-minute, 15-minute, and 30-minute presets and refuses to let you pick less. Industry consensus on the endpoint, among the three independent tools polling it, was somewhere between 2 and 5 minutes. Mine was hitting it at 60.

That accounted for the obvious problem (the poller was 2–5× too aggressive) but not for the 100% 429 rate. Sixty seconds is loud, not radioactive. The three other tools were polling the same endpoint at cadences only a few times slower than mine without continuous bans.

Then I went looking at the issue tracker.

Three separate issues against this endpoint, three “not our problem” closures. That’s not a triage backlog. That’s a signal: the surface isn’t being supported. It’s the URL claude.ai’s React app hits on page load to show you the little panel, and Anthropic is fine if it occasionally falls over because the web UI’s failure mode is “panel is blank for thirty seconds.” A daemon doing 60-second polls is not a use case they care to keep working.

The poller wasn’t broken. The endpoint was the wrong primitive.

Root cause: the data was already pushed, on stdin, every turn

While I was reading those closed issues, I tabbed over to Claude Code’s statusline docs for an unrelated reason — I was already using the statusline command as a per-turn telemetry side channel for context-window data — and noticed a field I’d never registered before.

The JSON Claude Code pipes to your statusline command’s stdin, per turn, includes this:

{
  "session_id": "01J9...",
  "model": { "id": "claude-opus-4-7", "display_name": "Opus 4.7" },
  "workspace": { "current_dir": "/Users/nick/repo" },
  "rate_limits": {
    "five_hour": {
      "used_percentage": 51,
      "resets_at": 1746920100
    },
    "seven_day": {
      "used_percentage": 25,
      "resets_at": 1747526400
    }
  },
  "context_window": {
    "context_window_size": 1000000,
    "used_percentage": 17,
    "total_input_tokens": 174903
  }
}

rate_limits.five_hour.used_percentage. rate_limits.seven_day.used_percentage. resets_at Unix timestamps. The exact two numbers and two countdowns my Stream Deck key wanted to render. Pushed to a shell script I’d already installed. Every turn. No HTTP, no token, no Cloudflare. The codelynx.dev write-up confirmed it shipped in Claude Code v1.2.80, which had been my installed version for weeks.

Read that again. The data was being shoved at me, push-style, by a documented contract, while I was busy tuning the backoff on an undocumented endpoint that returns the same data and bans you for asking.

The statusline command I’d already shipped — the one I’d written a whole separate post about as a telemetry forwarder — was already getting this JSON. I was reading context_window.used_percentage out of it and ignoring the rate_limits field two keys away.
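Reading those fields out of the payload could be sketched as follows. The field names follow the JSON shown above; the type names and helpers (`extractRateLimits`, `secondsUntilReset`) are hypothetical, and the sketch assumes `resets_at` is a Unix timestamp in seconds, as the sample values suggest.

```typescript
interface RateLimitWindow {
  used_percentage: number;
  resets_at: number; // Unix timestamp, assumed to be in seconds
}

interface RateLimits {
  five_hour: RateLimitWindow;
  seven_day: RateLimitWindow;
}

function isWindow(value: unknown): value is RateLimitWindow {
  if (typeof value !== "object" || value === null) return false;
  const w = value as { used_percentage?: unknown; resets_at?: unknown };
  return typeof w.used_percentage === "number" && typeof w.resets_at === "number";
}

// Returns null when the input is not JSON or has no usable rate_limits block.
function extractRateLimits(rawJson: string): RateLimits | null {
  let parsed: unknown;
  try {
    parsed = JSON.parse(rawJson);
  } catch {
    return null;
  }
  if (typeof parsed !== "object" || parsed === null) return null;
  const limits = (parsed as { rate_limits?: unknown }).rate_limits;
  if (typeof limits !== "object" || limits === null) return null;
  const { five_hour, seven_day } = limits as { five_hour?: unknown; seven_day?: unknown };
  if (!isWindow(five_hour) || !isWindow(seven_day)) return null;
  return { five_hour, seven_day };
}

// Countdown for the key face: whole seconds until the window resets, never negative.
function secondsUntilReset(win: RateLimitWindow, nowMs: number): number {
  return Math.max(0, win.resets_at - Math.floor(nowMs / 1000));
}
```

Returning null instead of throwing matters here because older Claude Code versions, or sessions without plan limits, may send a payload with no `rate_limits` key at all.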

That was the moment the design changed.

Fix: subscribe, don’t poll

The new architecture is one extra path in the statusline forwarder and a few fields in the daemon’s state store.

The statusline command runs every turn and POSTs the JSON to a daemon endpoint:

# cli/src/statusline.sh — abridged
INPUT="$(cat)"
SID="$(printf '%s' "$INPUT" | jq -r '.session_id // empty')"
if [ -n "$SID" ]; then
  printf '%s' "$INPUT" \
    | curl -sf -m 1 -X POST "http://127.0.0.1:9127/context/$SID" \
        -H "Content-Type: application/json" \
        --data-binary @- > /dev/null 2>&1 &
fi

The daemon receives that POST, parses rate_limits.* along with the context_window.* it was already consuming, updates stateStore.rateLimits and stateStore.lastStatuslineAt, and broadcasts. The Stream Deck plugin reads rateLimits.five_hour.used_percentage from the WebSocket state and renders. End-to-end latency is a few milliseconds. The poller doesn’t run.
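The daemon side of that flow might look roughly like this. The post names `stateStore.rateLimits` and `stateStore.lastStatuslineAt`; the class layout, `subscribe`, and `applyStatusline` are assumptions for illustration, and the WebSocket broadcast is stood in for by plain listener callbacks.

```typescript
interface RateLimitWindow { used_percentage: number; resets_at: number }
interface RateLimits { five_hour: RateLimitWindow; seven_day: RateLimitWindow }

interface DaemonState {
  rateLimits: RateLimits | null;
  lastStatuslineAt: number; // epoch ms of the last statusline POST, 0 if never
}

type Listener = (state: DaemonState) => void;

// Hypothetical in-memory store; listeners stand in for WebSocket clients.
class StateStore {
  private state: DaemonState = { rateLimits: null, lastStatuslineAt: 0 };
  private listeners: Listener[] = [];

  subscribe(listener: Listener): void {
    this.listeners.push(listener);
  }

  snapshot(): DaemonState {
    return this.state;
  }

  // Called by the HTTP handler once a POST body has been parsed and validated.
  applyStatusline(rateLimits: RateLimits, nowMs: number): void {
    this.state = { rateLimits: rateLimits, lastStatuslineAt: nowMs };
    const current = this.state;
    // Broadcast the new state to every subscriber.
    this.listeners.forEach(function (listener) {
      listener(current);
    });
  }
}
```

Recording `lastStatuslineAt` in the same write as the numbers is what lets a poller decide later whether the pushed data is still fresh.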

The poller doesn’t go away, though — it gates on freshness. When lastStatuslineAt is recent, the tick short-circuits before hitting Anthropic. When no Claude session has been active for ten minutes, the poller resumes covering the display, at a much saner cadence (5-minute base, 30-minute cap) and now actually honoring Retry-After. The freshness gate is the load-bearing change; the poller’s job shrank from “primary data source” to “fallback when nobody’s running Claude”:

// daemon/src/planUsagePoller.ts — the gate
private static readonly STATUSLINE_FRESH_MS = 10 * 60_000;

const lastStatusline = this.getLastStatuslineAt();
if (lastStatusline > 0) {
  const age = this.clock() - lastStatusline;
  if (age < PlanUsagePoller.STATUSLINE_FRESH_MS) {
    console.log(`plan poll skipped — statusline fresh (${age}ms ago)`);
    return;
  }
}
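For the fallback path, the saner cadence could be sketched like this. The 5-minute base and 30-minute cap come from the description above; the function name, the doubling rule, and the choice to let a long Retry-After exceed the cap are assumptions, not the code in planUsagePoller.ts.

```typescript
const BASE_INTERVAL_MS = 5 * 60_000; // 5-minute base from the post
const MAX_INTERVAL_MS = 30 * 60_000; // 30-minute cap from the post

// Delay before the next fallback poll.
// `retryAfterMs` is the parsed Retry-After value, or null if the server sent none.
function nextPollDelayMs(consecutiveFailures: number, retryAfterMs: number | null): number {
  const backoff = Math.min(
    MAX_INTERVAL_MS,
    BASE_INTERVAL_MS * Math.pow(2, consecutiveFailures)
  );
  // Never retry sooner than the server asked, even if that exceeds our own cap.
  return retryAfterMs === null ? backoff : Math.max(backoff, retryAfterMs);
}
```

With these constants, a `Retry-After: 272` is already covered by the 5-minute base, so the header only changes the schedule when the server asks for a longer wait than the local backoff would give.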

Net result, looking at daemon.log on a regular workday: zero polls, zero 429s. Every turn through Claude refreshes the numbers, fresher than the polled endpoint ever was, with no rate-limit exposure. The Stream Deck key is now noticeably more responsive — the 5-hour bucket ticks up during the same turn that crossed the threshold, not on the next 60-second poll.

The endpoint I’d been hammering for 18 hours is still in the codebase, behind the gate, polling at five minutes when there’s no Claude session — exactly the cadence the other three tools converged on. It’s the fallback path now, not the primary one.

Why I trust the new primitive
#

/api/oauth/usage got marked not planned three times. That endpoint isn’t coming back as a supported surface. The statusline JSON contract is the opposite case:

  • It’s publicly documented.
  • It ships in stable Claude Code releases (rate_limits since v1.2.80, per the codelynx.dev write-up).
  • Multiple ecosystem tools build on it — ccusage, claude-code-statusline, every custom statusline anyone’s written.
  • Anthropic actively promotes the surface, including in their own examples.

If Anthropic ever ships a documented “consume usage data” REST endpoint or a webhook (the right long-term answer), the migration is a different feeder writing the same stateStore.rateLimits field. The daemon and plugin don’t change. Until then, I’m pulling from the channel they want me to pull from instead of the one they keep telling people not to.
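One way to picture that migration is a small feeder interface. Everything here is hypothetical: the interface, both class names, and especially `DocumentedApiFeeder`, which stands in for an endpoint or webhook that does not exist today.

```typescript
interface RateLimitWindow { used_percentage: number; resets_at: number }
interface RateLimits { five_hour: RateLimitWindow; seven_day: RateLimitWindow }

// The sink is whatever writes stateStore.rateLimits; feeders know nothing about the plugin.
type RateLimitSink = (limits: RateLimits, source: string) => void;

interface RateLimitFeeder {
  readonly name: string;
  attach(sink: RateLimitSink): void;
}

// Today's feeder: driven by statusline POSTs.
class StatuslineFeeder implements RateLimitFeeder {
  readonly name = "statusline";
  private sink: RateLimitSink | null = null;

  attach(sink: RateLimitSink): void {
    this.sink = sink;
  }

  onStatusline(limits: RateLimits): void {
    if (this.sink !== null) this.sink(limits, this.name);
  }
}

// Hypothetical future feeder for a documented usage API or webhook.
class DocumentedApiFeeder implements RateLimitFeeder {
  readonly name = "documented-api";
  private sink: RateLimitSink | null = null;

  attach(sink: RateLimitSink): void {
    this.sink = sink;
  }

  onApiResponse(limits: RateLimits): void {
    if (this.sink !== null) this.sink(limits, this.name);
  }
}
```

Both feeders write through the same sink, so swapping one for the other leaves the store, the broadcast, and the plugin untouched.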

Lessons

  • When you hit sustained 429 from an undocumented endpoint, the answer is rarely “tune the backoff.” It’s “find the supported primitive.” I almost shipped a beautiful Retry-After parser to a surface that closes its issues as not planned.
  • A GitHub issue closed as not planned is a load-bearing signal. That endpoint is not coming back to your use case. Read the closures before you build on top of it.
  • Statusline is a side channel in both directions. You can write to it (telemetry forwarding, which I’d already done) and you can read from it (rate limits, context window, session id, model). Same JSON, same per-turn cadence, both directions.
  • Push beats pull for data the user is generating. If something fires every time the user does the action, subscribe to that. Polling a derived endpoint to learn about activity that just happened is the wrong shape.
  • The fallback poller still needs to be correct. Honor Retry-After, raise the floor, raise the cap. Just because it’s not the primary path doesn’t mean it gets to run on the old broken cadence.

The wider trap here is that “I have a daemon, daemons poll” was such a strong default that I never asked whether the data was already being pushed somewhere I owned. It was. It had been the whole time. The fix wasn’t a fix — it was a deletion of the question.

