
v0.97.0

Annotation Queues

The most useful thing you can do when building an LLM app is read your traces. You find the failures, label what went wrong, and turn the worst ones into test cases. That loop used to happen in spreadsheets. Annotation queues bring it into Agenta.

Build a queue from traces or test set rows, attach a scoring schema (ratings, dropdowns, rubrics, or free text), and route it to reviewers. When the queue is done, export it as a labeled test set. The annotations come along as columns, so the work feeds straight into your evaluators.
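To make "the annotations come along as columns" concrete, here is a minimal sketch in plain Python. It does not use the Agenta SDK: `QueueItem` and `export_test_set` are hypothetical names invented for illustration, and the real export format may differ. It only shows the shape of the idea, where each reviewer label becomes a column next to the original inputs and output.

```python
from dataclasses import dataclass, field


@dataclass
class QueueItem:
    """Hypothetical stand-in for one reviewed trace or test set row."""
    inputs: dict
    output: str
    annotations: dict = field(default_factory=dict)


def export_test_set(items):
    """Flatten annotated items into rows, one column per annotation key.

    Assumption: annotation keys do not collide with input keys or "output".
    """
    # Collect annotation keys in first-seen order so every row has the same columns.
    columns = []
    for item in items:
        for key in item.annotations:
            if key not in columns:
                columns.append(key)

    rows = []
    for item in items:
        row = {**item.inputs, "output": item.output}
        for key in columns:
            # Items a reviewer did not label for this key get None.
            row[key] = item.annotations.get(key)
        rows.append(row)
    return rows


queue = [
    QueueItem({"question": "What is 2 + 2?"}, "5",
              {"rating": 1, "failure_mode": "arithmetic"}),
    QueueItem({"question": "Capital of France?"}, "Paris", {"rating": 5}),
]

rows = export_test_set(queue)
for row in rows:
    print(row)
```

Every row ends up with the same columns, so an evaluator can read `rating` or `failure_mode` as an ordinary test set field.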
