General Tool-Augmented LLM Capabilities Qualitative Comparison Survey
Metric
Image Understanding Success (SCALAR)
Browser Search Success (SCALAR)
Knowledge Retrieval Success (SCALAR)
Math Reasoning Success (SCALAR)
Table Understanding Success (SCALAR)
Updated 1mo ago
Evaluation Results
Method
Links
Image Understanding Success
Browser Search Success
Knowledge Retrieval Success
Math Reasoning Success
Table Understanding Success
No evaluation results found.