Governing Context Explosion in OpenAI Assistants/Responses in Go: Truncation Strategies, Summary Backfill, and Cost Caps

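Any production agent that runs long enough hits the same pitfall: the context keeps growing, latency spikes, costs spiral out of control, and answers become more likely to drift off target. This is not the model "getting dumber"; it usually means context governance was never done: what should have been kept was dropped, what should have been dropped was kept, and what should have been summarized was summarized badly. ...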

March 2, 2026 · 2 min · mengboy

Taming Context Explosion in OpenAI Assistants/Responses with Go: Truncation, Summary Backfill, and Cost Caps

Long-running agent sessions usually fail the same way: context keeps growing, latency spikes, costs blow up, and answer quality gets worse. That is rarely a model-quality issue. It is almost always missing context governance. ...

March 2, 2026 · 2 min · mengboy