JT-SAFE-V2: Safety-by-Design Foundation Model with World-Context Data
About
We introduce JT-Safe-V2, a large language model designed to advance the safety and trustworthiness of foundation models, extending our previous JT-Safe model toward a more comprehensive safety-by-design paradigm. JT-Safe-V2 emphasizes the joint optimization of general intelligence and safety-by-design through several key innovations: pre-training data enriched with contextual world knowledge, high-certainty pre-training procedures, and safety-strengthening post-training mechanisms for enterprise-oriented agentic capabilities. Building on these safety-enhanced foundation models, we propose Safe-MoMA (Safe Mixture of Models and Agents), a framework that enables traceable and efficient inference through the orchestrated deployment of multiple models and agents. Extensive evaluations demonstrate that JT-Safe-V2 achieves state-of-the-art performance across both general intelligence and safety benchmarks. Moreover, Safe-MoMA reduces inference costs by more than 30% compared with the largest standalone model baseline while maintaining comparable performance. To facilitate future research on safety-by-design foundation models, we publicly release the post-trained JT-Safe-V2-35B model checkpoint.
Related benchmarks
| Task | Dataset | Metric | Result | Rank |
|---|---|---|---|---|
| Bias Evaluation | BBQ | -- | -- | 171 |
| Coding | HumanEval+ | Pass@1 | 90.24 | 164 |
| Reasoning | PIQA | Accuracy | 84.22 | 164 |
| Mathematics | GSM8K | GSM8K Score | 94.62 | 87 |
| Reasoning | StrategyQA | Accuracy | 83.19 | 52 |
| Agent | τ2-bench | Accuracy | 70 | 41 |
| Coding | MultiPL-E | Score | 82.47 | 31 |
| Long Context | MRCR | Score | 21.6 | 25 |
| Financial Knowledge | Fineval | AVG Score | 78.89 | 16 |
| Coding | HumanEval | Score | 96.3 | 13 |