Primus: A Pioneering Collection of Open-Source Datasets for Cybersecurity LLM Training
About
Large Language Models (LLMs) have shown remarkable advancements in specialized fields such as finance, law, and medicine. However, cybersecurity still lacks open-source datasets, and high-quality cybersecurity pretraining corpora in particular, even though much research indicates that LLMs acquire their knowledge during pretraining. To address this, we present a comprehensive suite of datasets covering all major training stages, including pretraining, instruction fine-tuning, and reasoning distillation with cybersecurity-specific self-reflection data. Extensive ablation studies demonstrate their effectiveness on public cybersecurity benchmarks. In particular, continual pretraining on our dataset yields a 15.9% improvement in the aggregate score, while reasoning distillation leads to a 15.8% gain in security certification (CISSP). We will release all datasets and trained cybersecurity LLMs under the ODC-BY and MIT licenses to encourage further research in the community. For access to all datasets and model weights, please refer to https://huggingface.co/collections/trendmicro-ailab/primus-67b1fd27052b802b4af9d243.
Related benchmarks
| Task | Dataset | Metric | Score | Rank |
|---|---|---|---|---|
| Cybersecurity Knowledge Question Answering | MMLU CSec | CSec Score | 79 | 17 |
| Cybersecurity Evaluation | ScEva | MCQ Score | 61.15 | 17 |
| Cybersecurity Knowledge and Malware Extraction Analysis | SECURE | KCV | 82.65 | 17 |
| General Language Understanding and Reasoning | Open LLM Leaderboard Lighteval (test) | Mean Accuracy | 66.71 | 17 |
| Cybersecurity Knowledge Evaluation | CyMtc (500) | CyMtc (500) Score | 83.8 | 17 |
| Cybersecurity Benchmarking | ScBen En | En | 64.91 | 17 |
| Cybersecurity Multiple Choice Question Answering | RedSage-MCQ 0-shot (test) | Macro Accuracy | 77.02 | 17 |
| Overall Cybersecurity Performance | Cybersecurity Multi-Benchmark Suite | Overall Mean Score | 71.69 | 17 |
| Cybersecurity Threat Intelligence Analysis | CTI-Bench | MCQ Score | 55.92 | 17 |
| Cybersecurity Evaluation | Cybersecurity Benchmarks CTI-MCQ, CyberMetric, SecEval (test) | CTI-MCQ Score | 66.6 | 5 |