OpenAI Batch API with Go: Offline Batching, Failure Replay, and Cost Boundaries

Short answer: if your workload is delay-tolerant, batchable, and replay-safe, move it from online calls to the Batch API. The savings are real, but only if you design splitting, failure routing, and replay first. Many teams treat the Batch API as a cheaper sync endpoint, which usually creates a replay mess instead of stable savings. A conservative rollout starts with cost boundaries and SLOs, then implements offline batching and controlled replay. ...

March 13, 2026 · 3 min · mengboy

OpenAI Responses + Go: Taming Retry Storms with Idempotency Keys, Jittered Backoff, and Circuit Breakers

The most expensive outage is not a single failure; it is a failure amplified by retries. In an OpenAI Responses + Go tool-calling stack, a lack of idempotency keys, jittered backoff, and breaker thresholds can turn 10 failing requests into 1,000 downstream calls in minutes. ...

March 4, 2026 · 3 min · mengboy