Behavior Abstraction Detection on DARPA TC AppStarter
[Chart: Precision by method over time. Labeled point: Extractor, Jun 20, 2025. Available metrics: Precision, Recall, F1 Score.]
Updated 1mo ago
Evaluation Results
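| Method | Configuration | Links | Precision | Recall | F1 Score |
| --- | --- | --- | --- | --- | --- |
| Extractor | Method=Extractor | 2025.06 | 100 | 88 | 93.6 |
| SmartGuard | Backbone=OPT-1.3b | 2025.06 | 97.8 | 96.1 | 96.9 |
| SmartGuard | Backbone=LLaMa2-3b | 2025.06 | 94.2 | 97.4 | 96.8 |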
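
The F1 Score column can be related to the Precision and Recall columns through the standard definition of F1 as the harmonic mean of the two. The snippet below is a minimal sketch of that calculation, assuming the leaderboard values are percentages; the function name `f1_from_pr` is hypothetical and not part of any benchmark tooling. Reported scores may differ from a direct recomputation if the source averaged per-class or per-run results before reporting.

```python
def f1_from_pr(precision: float, recall: float) -> float:
    """Return the harmonic mean of precision and recall.

    Inputs and output share the same scale (percentages here,
    which is an assumption about the leaderboard values).
    """
    if precision + recall == 0:
        return 0.0  # F1 is conventionally 0 when both inputs are 0
    return 2 * precision * recall / (precision + recall)


# Recompute F1 for two leaderboard rows from their Precision and Recall.
extractor_f1 = round(f1_from_pr(100, 88), 1)
smartguard_opt_f1 = round(f1_from_pr(97.8, 96.1), 1)

print(extractor_f1)        # 93.6
print(smartguard_opt_f1)   # 96.9
```

Because the harmonic mean is pulled toward the smaller input, Extractor's perfect precision does not offset its lower recall, which is why its F1 falls below that of the SmartGuard OPT-1.3b entry.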