LLM | Mengboy 技术笔记

RAG Accuracy Playbook: Retrieval Recall, Re-Ranking, and Evaluation Loop

If your RAG system feels unreliable, switching to a more expensive LLM is usually the wrong first move. In most cases, the bottleneck is retrieval quality: weak recall, poor ranking, and no measurement loop. This guide gives a practical path: broaden recall, sharpen ranking, then close the loop with offline and online evaluation. ...

What to Do When RAG Is Inaccurate: A Practical Guide to Retrieval Recall, Re-Ranking, and the Evaluation Loop

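For many teams building RAG, the first reaction is to "swap the embedding for a more expensive model"; the cost goes up, but the results stay unstable. The real problem is usually not generation but the retrieval pipeline: incomplete recall, inaccurate ranking, and missing evaluation. This post lays out a set of practices you can apply directly: first broaden recall, then sharpen re-ranking, and finally use offline and online metrics to form a continuous optimization loop. ...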

Software Engineering History: From Software Crisis to AI Co-Creation

Large language models are changing how we clarify requirements, generate code, and design tests, and many teams feel that traditional workflows are being rewritten. To understand what is truly changing, it helps to place today inside the longer history of software engineering. This article walks through the major stages of software engineering and ends with the AI-era variables and a simple checklist so you can map your current problems to the right time scale. ...