Running Evaluations
How to run the evaluation suite and interpret results.
Basic Usage
Command Options
Example Output
✅ Evaluation completed
├── Test cases: 15
├── Passed: 14
├── Failed: 1
└── Pass rate: 93.3%
Metrics:
├── Mention Precision: 95.2%
├── Mention Recall: 91.8%
├── Rank Accuracy: 88.5%
└── F1 Score: 93.5%
See Metrics for metric definitions.
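
As a sanity check when interpreting results, the summary figures in the example output can be reproduced by hand. The sketch below is illustrative only: `pass_rate` and `f1_score` are hypothetical helper names, not part of the evaluation suite, and it assumes the reported F1 Score is the standard harmonic mean of Mention Precision and Mention Recall, which matches the sample numbers.

```python
def pass_rate(passed: int, total: int) -> float:
    """Percentage of test cases that passed (hypothetical helper)."""
    if total == 0:
        return 0.0
    return 100.0 * passed / total


def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall, both given as percentages.

    Assumption: the suite derives F1 from Mention Precision and Mention Recall.
    """
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Values taken from the example output above.
print(f"Pass rate: {pass_rate(14, 15):.1f}%")    # Pass rate: 93.3%
print(f"F1 Score: {f1_score(95.2, 91.8):.1f}%")  # F1 Score: 93.5%
```

If a reported F1 Score does not match this calculation, the suite may be averaging per-test-case F1 values instead of combining the aggregate precision and recall; the metric definitions are the place to confirm which applies.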