Taming Context Explosion in OpenAI Assistants/Responses with Go: Truncation, Summary Backfill, and Cost Caps

Long-running agent sessions usually fail the same way: context keeps growing, latency spikes, costs blow up, and answer quality degrades. That is rarely a model-quality issue; it is almost always missing context governance. ...

March 2, 2026 · 2 min · mengboy
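The truncation-with-summary-backfill idea that the entry above names can be sketched in a few lines of Go. This is a minimal illustration, not the article's implementation: the `Message` type, `TruncateWithSummary`, and the chars/4 token heuristic are all assumptions (a real version would use the SDK's message types and a proper tokenizer).

```go
package main

import "fmt"

// Message is a minimal chat turn. Field names are illustrative,
// not the OpenAI SDK's types.
type Message struct {
	Role    string
	Content string
}

// estimateTokens is a rough heuristic (~4 chars per token); a real
// implementation would use an actual tokenizer.
func estimateTokens(m Message) int {
	return len(m.Content)/4 + 4
}

// TruncateWithSummary keeps the newest messages under a token budget
// and replaces the evicted prefix with a single summary message.
// summarize stands in for an LLM summarization call.
func TruncateWithSummary(history []Message, budget int, summarize func([]Message) string) []Message {
	total := 0
	cut := len(history)
	for i := len(history) - 1; i >= 0; i-- {
		total += estimateTokens(history[i])
		if total > budget {
			break
		}
		cut = i
	}
	if cut == 0 {
		return history // everything fits under budget
	}
	evicted := history[:cut]
	out := []Message{{Role: "system", Content: "Summary of earlier turns: " + summarize(evicted)}}
	return append(out, history[cut:]...)
}

func main() {
	hist := []Message{
		{"user", "first long question about deployment strategy"},
		{"assistant", "a long detailed answer about rollouts"},
		{"user", "latest question"},
	}
	out := TruncateWithSummary(hist, 12, func(ms []Message) string {
		return fmt.Sprintf("%d earlier messages", len(ms))
	})
	fmt.Println(len(out), out[0].Role)
}
```

The cost cap falls out of the same loop: the budget is a hard ceiling on input tokens, so per-request spend is bounded regardless of session length.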

Go + OpenAI API Timeout Troubleshooting: DNS, TLS, Proxy, and Connection Pool

When OpenAI API calls start timing out in production, the real problem is usually not that "OpenAI is down." The real problem is that you don't know which hop is failing: DNS, TLS handshake, proxy path, or your own connection pool. ...

March 2, 2026 · 2 min · mengboy

OpenAI Agents SDK with Go: Tool Calling, Session Memory, and Error Recovery

Most teams can wire up an LLM in a demo. The real pain starts in production: multi-step tasks, flaky tool calls, unclear retry semantics, and rising cost. This guide gives you a pragmatic, Go-first blueprint for shipping an agent workflow that can survive real incidents. ...

February 25, 2026 · 3 min · mengboy

OpenAI Responses API Streaming in Go: Timeouts, Retries, and Observability

Production streaming fails in two predictable ways: users wait while the stream silently drops, and your logs say “timeout” without telling you where it actually broke. This guide gives you a practical Go pattern for OpenAI Responses API streaming with strict timeout boundaries, safe retries, and useful telemetry. ...

February 23, 2026 · 2 min · mengboy

Go Memory Leak Triage in Production: pprof + FlameGraph Step by Step

If your Go service's RSS keeps climbing, drops after a restart, then climbs again, you likely have a memory retention problem (or an actual leak pattern). Do not start with random code edits. Run a clean evidence chain: metrics trend check → pprof snapshots → FlameGraph comparison → object growth path → regression validation. ...

February 14, 2026 · 3 min · mengboy