Sabiá-4 Technical Report

About

This technical report presents Sabiá-4 and Sabiazinho-4, a new generation of Portuguese language models focused on Brazilian Portuguese. The models were developed through a four-stage training pipeline: continued pre-training on Portuguese and Brazilian legal corpora; long-context extension to 128K tokens; supervised fine-tuning on instruction data spanning chat, code, legal tasks, and function calling; and preference alignment. We evaluate the models on six benchmark categories: conversational capabilities in Brazilian Portuguese, knowledge of Brazilian legislation, long-context understanding, instruction following, standardized exams, and agentic capabilities, including tool use and web navigation. Results show that Sabiá-4 and Sabiazinho-4 achieve a favorable cost-performance trade-off compared to other models, placing them in the upper-left region of the pricing-accuracy chart. The models also improve over previous generations in legal document drafting, multi-turn dialogue quality, and agentic task completion.

Thiago Laitz, Thales Sales Almeida, Hugo Abonizio, Roseval Malaquias Junior, Giovana Kerche Bonás, Marcos Piau, Celio Larcher, Ramon Pires, Rodrigo Nogueira • 2026

Related benchmarks

Task	Dataset	Result
Multiple-choice Question Answering	EXAMS	Accuracy: 86.6	29
Legal Knowledge Assessment	Laws	Accuracy: 97.4	15
Conversation Evaluation	Braceval	Accuracy: 66	15
Lawyer Evaluation	OAB-Bench	Score: 7.49	15
Judge Evaluation	Magis-Bench	Score: 5.08	15
Agentic Task Performance	Agent Capabilities	Success Rate: 72.2	15
Instruction Following	Multi-IF PT	Accuracy: 82	15
