Interactive Question Answering on IQA-EVAL MMLU-derived (TextBabbage)
[Chart: metric trends over time (Helpfulness, Fluency, Avg Queries, Accuracy); best Helpfulness 3.87 by IQA-EVAL-GPT3.5, as of Aug 24, 2024]
Evaluation Results

| Method | Evaluator Backbone | Links | Helpfulness | Fluency | Avg Queries | Accuracy |
|---|---|---|---|---|---|---|
| IQA-EVAL-GPT3.5 | GPT... | 2024.08 | 3.87 | 3.67 | 1.77 | 47 |
| Human | — | 2024.08 | 3.84 | 3.84 | 2.57 | 52 |
| IQA-EVAL-Claude | Claude | 2024.08 | 3.03 | 3.47 | 2.67 | 53 |
| IQA-EVAL-GPT4 | GPT-4 | 2024.08 | 2.3 | 3.87 | 2.27 | 83 |
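For readers who want to work with these results programmatically, the table can be encoded as plain records and ranked by any metric. This is a minimal sketch; the field names are our own choice (they simply mirror the table columns), and the values are copied verbatim from the table above.

```python
# Evaluation results from the table above, one dict per method.
# Field names are illustrative; values copied from the leaderboard.
rows = [
    {"method": "IQA-EVAL-GPT3.5", "helpfulness": 3.87, "fluency": 3.67, "avg_queries": 1.77, "accuracy": 47},
    {"method": "Human",           "helpfulness": 3.84, "fluency": 3.84, "avg_queries": 2.57, "accuracy": 52},
    {"method": "IQA-EVAL-Claude", "helpfulness": 3.03, "fluency": 3.47, "avg_queries": 2.67, "accuracy": 53},
    {"method": "IQA-EVAL-GPT4",   "helpfulness": 2.3,  "fluency": 3.87, "avg_queries": 2.27, "accuracy": 83},
]

# Rank methods by accuracy, highest first.
by_accuracy = sorted(rows, key=lambda r: r["accuracy"], reverse=True)
print([r["method"] for r in by_accuracy])
# → ['IQA-EVAL-GPT4', 'IQA-EVAL-Claude', 'Human', 'IQA-EVAL-GPT3.5']
```

Note how the rankings differ per metric: IQA-EVAL-GPT4 leads on accuracy, while IQA-EVAL-GPT3.5 leads on helpfulness.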