OpenAI Responses + Go Stream Recovery: Delta Persistence, Resume Tokens, and Duplicate Chunk Dedup

In production, the painful part is not “streaming is slow.” It’s “streaming breaks and then duplicates output after reconnect.” This guide gives you a practical recovery loop: delta persistence + resume token + idempotent dedup, so reconnection does not replay garbage. ...

March 23, 2026 · 4 min · mengboy

OpenAI Responses Structured Outputs + Go: Schema Evolution, Bad-Sample Fallbacks, and Canary Rollback

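Where Structured Outputs most often goes wrong is not "the model won't obey." It is that you treated the schema as an immutable decree. Once production enters a period of version evolution, the most common incidents are: a newly added field crashes older consumers, an expanded enum gets valid responses rejected by validation, and a bad sample drags down the whole pipeline, until the only option left is a midnight rollback, like writing your own thriller. ...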

March 11, 2026 · 4 min · mengboy

OpenAI Responses Structured Outputs with Go: Schema Evolution, Bad-Case Fallbacks, and Gradual Rollback

The hardest part of Structured Outputs is not getting JSON once. It is surviving schema changes without turning production into a small fire with excellent logs and terrible business results. Once a Go service starts evolving prompts and response contracts, the usual failure modes show up fast: a new required field breaks older consumers, an enum expands and strict validation kills valid requests, or one bad sample drags the whole chain into retries and rollback panic. ...

March 11, 2026 · 6 min · mengboy

OpenAI Responses + Go Tool-Call Retry Storm Control: Idempotency Keys, Jittered Backoff, and Circuit-Breaker Thresholds

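The scariest thing in production is not a single failure; it is a failure amplified by retries. In an OpenAI Responses + Go tool-calling chain, without idempotency keys, jittered backoff, and circuit-breaker thresholds, 10 requests can quickly balloon into 1000 downstream calls, and the bill and latency blow up together. ...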

March 4, 2026 · 2 min · mengboy

OpenAI Responses + Go: Taming Retry Storms with Idempotency Keys, Jittered Backoff, and Circuit Breakers

The most expensive outage is not a single failure — it is a failure amplified by retries. In an OpenAI Responses + Go tool-calling stack, missing idempotency, jittered backoff, and breaker thresholds can turn 10 failing requests into 1000 downstream calls in minutes. ...

March 4, 2026 · 3 min · mengboy

Go Memory Leak Triage in Production: pprof + FlameGraph Step by Step

If your Go service RSS keeps climbing, drops after restart, then climbs again, you likely have a memory retention problem (or an actual leak pattern). Do not start with random code edits. Run a clean evidence chain: metrics trend check → pprof snapshots → FlameGraph comparison → object growth path → regression validation. ...

February 14, 2026 · 3 min · mengboy