Go + OpenAI Responses Agent Memory Layering: Short-Term Context, Long-Term Index, and Cost Caps

In production Go agents, the first thing that breaks is usually not model quality. It is memory management: context grows, bills spike, and answers drift. Use a 3-layer memory design: L1, a short-term conversational window (seconds); L2, a rolling summary (minutes); L3, long-term retrieval memory (days) ...

March 18, 2026 · 3 min · mengboy
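
The three layers named in the excerpt above can be sketched roughly as follows. This is a minimal illustration, not the post's implementation: every name here (`Message`, `Memory`, `Add`, `BuildContext`, `estimateTokens`) is hypothetical, the "summary" is a plain string concatenation standing in for a model-generated summary, the long-term layer is a slice standing in for a vector index, and the token estimate assumes roughly four characters per token.

```go
package main

import (
	"fmt"
	"strings"
)

// Message is one conversational turn (hypothetical type for this sketch).
type Message struct {
	Role string
	Text string
}

// Memory holds the three layers: a verbatim window, a rolling summary,
// and a long-term store that stands in for a retrieval index.
type Memory struct {
	Window    []Message // L1: most recent turns, kept verbatim
	Summary   string    // L2: compressed form of evicted turns
	LongTerm  []string  // L3: placeholder for a vector index
	MaxWindow int       // maximum number of turns kept in L1
}

// Add appends a turn. When the window overflows, the oldest turn is folded
// into the summary and archived in the long-term store.
func (m *Memory) Add(msg Message) {
	m.Window = append(m.Window, msg)
	for len(m.Window) > m.MaxWindow {
		old := m.Window[0]
		m.Window = m.Window[1:]
		// Assumption: a real agent would ask a model to summarize here.
		m.Summary = strings.TrimSpace(m.Summary + " " + old.Role + ": " + old.Text)
		m.LongTerm = append(m.LongTerm, old.Text)
	}
}

// estimateTokens is a crude heuristic (about 4 characters per token).
func estimateTokens(s string) int { return (len(s) + 3) / 4 }

// BuildContext returns the summary plus as many recent turns as fit in the
// token budget. The newest turns are kept first; older ones are dropped.
func (m *Memory) BuildContext(budget int) []string {
	var out []string
	used := 0
	if m.Summary != "" {
		line := "summary: " + m.Summary
		if c := estimateTokens(line); c <= budget {
			out = append(out, line)
			used += c
		}
	}
	var recent []string
	for i := len(m.Window) - 1; i >= 0; i-- {
		line := m.Window[i].Role + ": " + m.Window[i].Text
		c := estimateTokens(line)
		if used+c > budget {
			break
		}
		used += c
		recent = append([]string{line}, recent...)
	}
	return append(out, recent...)
}

func main() {
	m := &Memory{MaxWindow: 2}
	m.Add(Message{Role: "user", Text: "hello"})
	m.Add(Message{Role: "assistant", Text: "hi"})
	m.Add(Message{Role: "user", Text: "what is Go?"})
	for _, line := range m.BuildContext(50) {
		fmt.Println(line)
	}
}
```

The budget passed to `BuildContext` acts as the cost cap: the summary is admitted first, then recent turns from newest to oldest until the budget is spent. Giving the summary priority over older verbatim turns is a design choice of this sketch, not something the post states.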

Go + OpenAI Responses Agent Memory Layering in Practice: Short-Term Context, Long-Term Index, and Cost Caps (Chinese edition)

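When you build an agent in Go, what most often goes wrong is not reasoning ability but runaway "memory": the context keeps growing, the bill keeps climbing, and the answers keep drifting. This post gives you a practical three-layer scheme: L1, short-term conversation context (seconds, highly relevant); L2, mid-term summary memory (minutes, compressed); L3, long-term retrieval memory (days, vector index) ...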

March 18, 2026 · 3 min · mengboy

OpenAI Responses API + MCP in Practice: From Function Calling to Agent Workflows

If you’ve already used function calling but keep writing glue code for every non-trivial task, you’re likely at the point where Responses API + MCP makes more sense. This guide is practical: how to move from single tool calls to a scalable agent workflow where retrieval, execution, validation, and write-back follow a consistent structure. ...

February 11, 2026 · 3 min · mengboy
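
The "consistent structure" of retrieval, execution, validation, and write-back mentioned in the excerpt above could be modeled as an ordered list of stages. The sketch below is only an illustration under stated assumptions: `State`, `Stage`, `RunWorkflow`, and `DefaultStages` are hypothetical names, the stage bodies are stubs, and no real Responses API or MCP calls are made.

```go
package main

import (
	"errors"
	"fmt"
)

// State carries data from one stage to the next (hypothetical type).
type State struct {
	Query   string
	Docs    []string
	Result  string
	Written bool
}

// Stage is one named step of the workflow.
type Stage struct {
	Name string
	Run  func(*State) error
}

// RunWorkflow executes stages in order and stops at the first error,
// wrapping it with the name of the failing stage.
func RunWorkflow(s *State, stages []Stage) error {
	for _, st := range stages {
		if err := st.Run(s); err != nil {
			return fmt.Errorf("%s: %w", st.Name, err)
		}
	}
	return nil
}

// DefaultStages returns stub implementations of the four steps.
// Assumption: in a real agent each stub would call a tool or a model.
func DefaultStages() []Stage {
	return []Stage{
		{Name: "retrieve", Run: func(s *State) error {
			s.Docs = []string{"doc about " + s.Query}
			return nil
		}},
		{Name: "execute", Run: func(s *State) error {
			s.Result = fmt.Sprintf("answer from %d doc(s)", len(s.Docs))
			return nil
		}},
		{Name: "validate", Run: func(s *State) error {
			if s.Result == "" {
				return errors.New("empty result")
			}
			return nil
		}},
		{Name: "write-back", Run: func(s *State) error {
			s.Written = true
			return nil
		}},
	}
}

func main() {
	s := &State{Query: "go"}
	if err := RunWorkflow(s, DefaultStages()); err != nil {
		fmt.Println("workflow failed:", err)
		return
	}
	fmt.Println(s.Result, s.Written)
}
```

Keeping each step behind the same `Stage` signature is what replaces per-task glue code in this sketch: adding or reordering steps changes the slice, not the runner.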

OpenAI Responses API + MCP in Practice: From Function Calling to Agent Workflows (Chinese edition)

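If you have already done function calling but end up writing piles of glue code as soon as the flow gets complex, you have probably reached the point where Responses API + MCP is the right tool. This post skips the abstract concepts and goes straight to a practical route: upgrade "the model calls a tool" into an extensible agent workflow, so that retrieval, execution, validation, and write-back become a standard process. ...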

February 11, 2026 · 3 min · mengboy