Meeting claim evaluation on city_council & private_data overall (test)
[Chart: Accuracy over time. Plotted point: gpt-5.4, Accuracy 75.4, Apr 23, 2026. Selectable metrics: Accuracy, Completeness, Coverage.]
Updated 1mo ago
Evaluation Results
| Method | Method details | Links | Accuracy | Completeness | Coverage |
|---|---|---|---|---|---|
| gpt-5.4 | Meetings=56, Evaluator... | 2026.04 | 75.4 | 84.4 | 86.8 |
| gpt-4.1 | Meetings=56, Evaluator... | 2026.04 | 73.9 | 78.9 | 79.4 |