Articles tagged “evaluation”
2 articles

Knowledge & Memory·12 min read
Your Agent Completed the Task. It Also Forgot 87% of What It Knew.
Task completion hides a silent failure: agents forget 87% of stored knowledge under complexity. New research reveals why standard evals miss this entirely.

Operations·15 min read
74% of Production Agents Still Rely on Human Evaluation
A survey of 306 practitioners reveals most production agents are far simpler than expected. The eval gap isn't a tooling problem. It's a trust problem.
Learn Agentic AI
One lesson per week: practical techniques for building, testing, and shipping AI agents. From prompt engineering to production monitoring. Learn by doing.