Report Generation on Mental Health Social Media Twitter
[Chart: Trajectory Coverage; GPT-traj, 3.9, May 14, 2026. Metrics available: Trajectory Coverage, Temporal Coherence, Sensitivity to Change Points, Segment-Level Specificity, Overall Preference Score.]
Evaluation Results
| Method | Evaluator Model | Date | Trajectory Coverage | Temporal Coherence | Sensitivity to Change Points | Segment-Level Specificity | Overall Preference Score |
|---|---|---|---|---|---|---|---|
| GPT-traj | GPT 5.2 | 2026.05 | 3.9 | 3.4 | 3.2 | 3.6 | 3.5 |
| GPT-traj | Claude... | 2026.05 | 3.9 | 3.6 | 3.4 | 3.8 | 3.6 |
| GPT-traj | Gemini... | 2026.05 | 3.8 | 3.5 | 3.3 | 3.5 | 3.4 |
| GPT-traj | DeepSe... | 2026.05 | 3.3 | 3.6 | 2.9 | 3.7 | 3.6 |
| Baseline | Gemini... | 2026.05 | 2.7 | 3 | 2.2 | 2.9 | 2.6 |
| Baseline | GPT 5.2 | 2026.05 | 2.5 | 2.8 | 1.9 | 2.6 | 2.6 |
| Baseline | Claude... | 2026.05 | 2.4 | 2.7 | 2.1 | 2.4 | 2.4 |
| Baseline | DeepSe... | 2026.05 | 1.7 | 2 | 1.3 | 1.3 | 1.8 |
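The per-evaluator scores above can be summarized by averaging each metric across the four evaluator models for each method. The following is a minimal sketch of that aggregation, not something the leaderboard itself reports: the function name `mean_scores_by_method` and the `ROWS` layout are hypothetical, the evaluator labels are kept truncated exactly as displayed, and an unweighted mean is assumed to be a reasonable summary.

```python
from statistics import mean

# Metric columns, in the order they appear in the results.
METRICS = [
    "Trajectory Coverage",
    "Temporal Coherence",
    "Sensitivity to Change Points",
    "Segment-Level Specificity",
    "Overall Preference Score",
]

# Rows copied from the results: (method, evaluator label as displayed, scores).
# Truncated evaluator labels are left as shown on the page.
ROWS = [
    ("GPT-traj", "GPT 5.2",   [3.9, 3.4, 3.2, 3.6, 3.5]),
    ("GPT-traj", "Claude...", [3.9, 3.6, 3.4, 3.8, 3.6]),
    ("GPT-traj", "Gemini...", [3.8, 3.5, 3.3, 3.5, 3.4]),
    ("GPT-traj", "DeepSe...", [3.3, 3.6, 2.9, 3.7, 3.6]),
    ("Baseline", "Gemini...", [2.7, 3.0, 2.2, 2.9, 2.6]),
    ("Baseline", "GPT 5.2",   [2.5, 2.8, 1.9, 2.6, 2.6]),
    ("Baseline", "Claude...", [2.4, 2.7, 2.1, 2.4, 2.4]),
    ("Baseline", "DeepSe...", [1.7, 2.0, 1.3, 1.3, 1.8]),
]


def mean_scores_by_method(rows):
    """Return {method: {metric: unweighted mean across evaluators}}."""
    grouped = {}
    for method, _evaluator, scores in rows:
        grouped.setdefault(method, []).append(scores)
    return {
        method: {
            metric: mean(column)
            for metric, column in zip(METRICS, zip(*score_lists))
        }
        for method, score_lists in grouped.items()
    }


if __name__ == "__main__":
    for method, averages in mean_scores_by_method(ROWS).items():
        print(method)
        for metric, value in averages.items():
            print(f"  {metric}: {value:.2f}")
```

An unweighted mean treats every evaluator model as equally reliable; if some evaluators are known to be harsher or noisier, a median or per-evaluator normalization may be a better fit.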