#performance
2 posts tagged with "performance".
-
Cost and Latency Engineering for Agent Systems
• 7 min read
Agents bill non-linearly. The patterns that matter (prompt caching, tiered model routing, parallel tool calls, retrieval budgets) and the dashboards that catch waste before it ships.
-
Cost-Optimized Agent Architectures: Cutting Spend 10x Without Losing Quality
• 9 min read
Caching, routing, distillation, and per-task model selection. The four moves that take a $0.40/task agent to $0.04/task without anyone noticing the difference.