AI Engineering

Go + OpenAI Responses Agent Memory Layering: Short-Term Context, Long-Term Index, and Cost Caps

In production Go agents, the first thing that breaks is usually not model quality. It is memory management: context grows, bills spike, and answers drift. Use a 3-layer memory design: L1, a short-term conversational window (seconds); L2, a rolling summary (minutes); L3, long-term retrieval memory (days) ...

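Go + OpenAI Responses Agent Memory Layering in Practice: Short-Term Context, Long-Term Index, and Cost Caps (Chinese edition)

When you build an agent in Go, what most often goes wrong is not reasoning ability but memory that has gotten out of control: the context keeps growing, the bill keeps climbing, and the answers keep drifting. This post gives you a practical three-layer design: L1, short-term conversational context (seconds, highly relevant); L2, mid-term summary memory (minutes, compressed); L3, long-term retrieval memory (days, vector index) ...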

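Handling 429/5xx Storms When a Go Service Calls OpenAI: Token Bucket, Exponential Backoff, and Circuit-Breaker Recovery (Chinese edition)

You are not beaten by the OpenAI API "occasionally erroring"; you are beaten by a retry storm amplified by concurrency. ...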

Handling OpenAI 429/5xx Storms in Go: Token Bucket, Exponential Backoff, and Circuit Breakers

Most Go teams are not killed by a single API error. They are killed by a retry storm they created themselves. ...

OpenAI Responses + GitHub Actions PR Risk Gate: Automated Evals, Tiered Blocking, and One-Click Rollback

You don’t need an AI reviewer that “sounds smart.” You need a gate that stops risky PRs before they hit main. This post shows a production-ready minimum setup: OpenAI Responses generates structured risk output, GitHub Actions enforces tiered policies, and critical failures can trigger a one-click rollback. ...

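OpenAI Responses + GitHub Actions PR Risk Gate: Automated Evals, Tiered Blocking, and One-Click Rollback (Chinese edition)

You don't need an AI reviewer that "can chat"; you need a risk gate that keeps bad changes out of the main branch. This post gives a minimal, deployable setup: OpenAI Responses generates structured review conclusions, GitHub Actions enforces tiered blocking, and when high risk is detected it automatically rolls back to a safe commit. ...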

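OpenAI Batch API + Go Cost Reduction in Practice: Offline Batch Splitting, Failure Replay, and Cost Boundaries (Chinese edition)

One-sentence conclusion: if your calls can be delayed, batched, and replayed, move online requests down to the Batch API. The savings are most visible there, provided you build the batch splitting, failure routing, and replay path first. Many teams use the Batch API as a "cheaper synchronous endpoint", and instead of saving money they pile failed samples into an incident pool. The genuinely conservative approach is to define cost boundaries and SLOs first, then do offline batch splitting and failure replay. ...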

OpenAI Batch API with Go: Offline Batching, Failure Replay, and Cost Boundaries

Short answer: if your workload is delay-tolerant, batchable, and replay-safe, move it from online calls to Batch API. The savings are real, but only if you design splitting, failure routing, and replay first. Many teams treat Batch API as a cheaper sync endpoint. That usually creates a replay mess instead of stable savings. A conservative rollout starts with cost boundaries and SLOs, then implements offline batching and controlled replay. ...
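As a rough sketch of the splitting and failure-routing steps mentioned above, the following Go code chunks work items into size-capped batches and sorts failed items into a replay queue or a dead-letter list. The `Item` type, the attempt limit, and both function names are hypothetical, and nothing here calls the Batch API itself.

```go
package main

import "fmt"

// Item is a hypothetical unit of work; Attempts counts failed submissions.
type Item struct {
	ID       string
	Attempts int
}

// splitBatches chunks items into consecutive batches of at most size items.
func splitBatches(items []Item, size int) [][]Item {
	if size <= 0 {
		return nil
	}
	var out [][]Item
	for start := 0; start < len(items); start += size {
		end := start + size
		if end > len(items) {
			end = len(items)
		}
		out = append(out, items[start:end])
	}
	return out
}

// routeFailures records one more failed attempt per item, then sends items
// still under maxAttempts to the replay queue and the rest to dead-letter.
func routeFailures(failed []Item, maxAttempts int) (replay, dead []Item) {
	for _, it := range failed {
		it.Attempts++
		if it.Attempts < maxAttempts {
			replay = append(replay, it)
		} else {
			dead = append(dead, it)
		}
	}
	return replay, dead
}

func main() {
	items := []Item{{ID: "a"}, {ID: "b"}, {ID: "c"}, {ID: "d"}, {ID: "e"}}
	for i, b := range splitBatches(items, 2) {
		fmt.Println("batch", i, "size", len(b))
	}
	replay, dead := routeFailures([]Item{{ID: "a"}, {ID: "b", Attempts: 2}}, 3)
	fmt.Println("replay:", len(replay), "dead:", len(dead))
}
```

Bounding attempts per item is what keeps a replay queue from growing without limit when a subset of inputs can never succeed.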

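OpenAI Responses Structured Outputs + Go: Schema Evolution, Bad-Sample Fallbacks, and Canary Rollback (Chinese edition)

Where Structured Outputs most often goes wrong is not that "the model won't listen" but that you treated the schema as an immutable decree. Once production enters a period of version evolution, the most common incidents are: older consumers crash after a field is added, validation rejects valid data after an enum is extended, and bad samples drag down the whole chain, until the only option is a midnight rollback, like writing your own thriller. ...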

OpenAI Responses Structured Outputs with Go: Schema Evolution, Bad-Case Fallbacks, and Gradual Rollback

The hardest part of Structured Outputs is not getting JSON once. It is surviving schema changes without turning production into a small fire with excellent logs and terrible business results. Once a Go service starts evolving prompts and response contracts, the usual failure modes show up fast: a new required field breaks older consumers, an enum expands and strict validation kills valid requests, or one bad sample drags the whole chain into retries and rollback panic. ...
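One way to soften the failure modes listed above is a tolerant consumer: ignore fields it does not know, map unrecognized enum values to a safe default, and return a fallback value for unparseable output instead of retrying. The sketch below assumes a hypothetical `Verdict` contract and `decodeVerdict` helper; it relies only on the standard `encoding/json` behavior of ignoring unknown object keys.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// Risk is an enum-like string; RiskUnknown is the consumer-side fallback.
type Risk string

const (
	RiskLow     Risk = "low"
	RiskHigh    Risk = "high"
	RiskUnknown Risk = "unknown"
)

// Verdict is a hypothetical response contract used only for illustration.
type Verdict struct {
	SchemaVersion int    `json:"schema_version"`
	Risk          Risk   `json:"risk"`
	Reason        string `json:"reason"`
}

// decodeVerdict parses raw model output without failing on schema drift.
// The boolean reports whether the payload was parseable at all; callers can
// use it to route bad samples to a side channel rather than retrying.
func decodeVerdict(raw []byte) (Verdict, bool) {
	var v Verdict
	// json.Unmarshal ignores keys that have no matching struct field, so a
	// newer producer that adds fields does not break this older consumer.
	if err := json.Unmarshal(raw, &v); err != nil {
		return Verdict{Risk: RiskUnknown, Reason: "unparseable model output"}, false
	}
	switch v.Risk {
	case RiskLow, RiskHigh:
		// known value, keep as is
	default:
		// enum value added by a newer schema (or missing): degrade, don't reject
		v.Risk = RiskUnknown
	}
	return v, true
}

func main() {
	newer := []byte(`{"schema_version":2,"risk":"medium","reason":"touches auth","owner":"team-a"}`)
	v, ok := decodeVerdict(newer)
	fmt.Println(v.Risk, v.Reason, ok)

	v, ok = decodeVerdict([]byte(`{"risk":`))
	fmt.Println(v.Risk, ok)
}
```

Whether an unknown enum value should degrade or block is a policy decision; the point of the sketch is that the choice is made explicitly in one place.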