Time-series reasoning on ECG-QA-CoT 1.0 (test)
[Chart: F1 Score over time; selectable metrics: F1 Score, Accuracy]
Best result: OpenTSLM Flamingo (Llama3.2-3B), F1 Score 40.25 (Oct 2, 2025)
Updated 1mo ago
Evaluation Results
| Method | Category | Date | F1 Score | Accuracy |
|---|---|---|---|---|
| OpenTSLM Flamingo (Llama3.2-3B) | OpenTSLM Flam... | 2025.10 | 40.25 | 46.25 |
| OpenTSLM Flamingo (Gemma3-1B-pt) | OpenTSLM Flam... | 2025.10 | 35.31 | 37.79 |
| OpenTSLM Flamingo (Llama3.2-1B) | OpenTSLM Flam... | 2025.10 | 34.62 | 38.14 |
| OpenTSLM SoftPrompt (Llama3.2-3B) | OpenTSLM Soft... | 2025.10 | 33.67 | 36.25 |
| OpenTSLM SoftPrompt (Llama3.2-1B) | OpenTSLM Soft... | 2025.10 | 32.84 | 35.49 |
| OpenTSLM Flamingo (Gemma3-270M) | OpenTSLM Flam... | 2025.10 | 32.71 | 35.5 |
| OpenTSLM SoftPrompt (Gemma3-1B-pt) | OpenTSLM Soft... | 2025.10 | 27.86 | 34.76 |
| Gemma3-4B FT | Image (Plot),... | 2025.10 | 26.17 | 38.1 |
| GPT-4o | Image (Plot),... | 2025.10 | 24.95 | 33.3 |
| GPT-4o | Tokenized Tim... | 2025.10 | 18.19 | 28.76 |
| Random Baseline | Random Baseline | 2025.10 | 16.47 | 20.18 |
| Gemma3-4B-pt | Image (Plot),... | 2025.10 | 1.9 | 1.03 |
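The results listed above are sorted by F1 Score, which hides two things: the Accuracy ordering differs slightly, and one entry falls below the Random Baseline. The following is a minimal sketch for re-ranking the rows and computing each method's margin over the baseline. The numbers are transcribed from the results above; the helper names `rank_by` and `margin_over_baseline` are hypothetical and not part of any benchmark tooling, and the truncated category labels are kept exactly as the page shows them.

```python
# Rows transcribed from the leaderboard: (method, category, F1 score, accuracy).
# Category labels are truncated on the page and are left as-is here.
RESULTS = [
    ("OpenTSLM Flamingo (Llama3.2-3B)", "OpenTSLM Flam...", 40.25, 46.25),
    ("OpenTSLM Flamingo (Gemma3-1B-pt)", "OpenTSLM Flam...", 35.31, 37.79),
    ("OpenTSLM Flamingo (Llama3.2-1B)", "OpenTSLM Flam...", 34.62, 38.14),
    ("OpenTSLM SoftPrompt (Llama3.2-3B)", "OpenTSLM Soft...", 33.67, 36.25),
    ("OpenTSLM SoftPrompt (Llama3.2-1B)", "OpenTSLM Soft...", 32.84, 35.49),
    ("OpenTSLM Flamingo (Gemma3-270M)", "OpenTSLM Flam...", 32.71, 35.5),
    ("OpenTSLM SoftPrompt (Gemma3-1B-pt)", "OpenTSLM Soft...", 27.86, 34.76),
    ("Gemma3-4B FT", "Image (Plot),...", 26.17, 38.1),
    ("GPT-4o", "Image (Plot),...", 24.95, 33.3),
    ("GPT-4o", "Tokenized Tim...", 18.19, 28.76),
    ("Random Baseline", "Random Baseline", 16.47, 20.18),
    ("Gemma3-4B-pt", "Image (Plot),...", 1.9, 1.03),
]

F1, ACCURACY = 2, 3  # tuple positions of the two metrics


def rank_by(results, metric_index):
    """Return the rows sorted from best to worst on the chosen metric."""
    return sorted(results, key=lambda row: row[metric_index], reverse=True)


def margin_over_baseline(results, baseline_name="Random Baseline"):
    """Map (method, category) to its (F1, accuracy) difference from the baseline row.

    The key includes the category because GPT-4o appears twice with different inputs.
    """
    base = next(row for row in results if row[0] == baseline_name)
    return {
        (row[0], row[1]): (
            round(row[F1] - base[F1], 2),
            round(row[ACCURACY] - base[ACCURACY], 2),
        )
        for row in results
        if row[0] != baseline_name
    }


if __name__ == "__main__":
    for method, category, f1, acc in rank_by(RESULTS, ACCURACY):
        print(f"{acc:6.2f}  {f1:6.2f}  {method} [{category}]")
    below = [key for key, (d_f1, _) in margin_over_baseline(RESULTS).items() if d_f1 < 0]
    print("Below random baseline on F1:", below)
```

Ranking by Accuracy instead of F1 Score moves Gemma3-4B FT from eighth place to third, so the choice of metric matters when comparing the image-based and OpenTSLM entries.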