
FBS: Modeling Native Parallel Reading inside a Transformer

About

Large language models (LLMs) excel across many tasks, yet inference is still dominated by strictly token-by-token autoregression. Existing acceleration methods largely patch this pipeline and miss core human-reading ingredients: content-adaptive foresight, chunk-structure-aware compute allocation, and train-test consistency for preview/skimming. We propose the Fovea-Block-Skip Transformer (FBS), which injects a causal, trainable loop into Transformers via a Parafovea-Attention Window (PAW), a Chunk-Head (CH), and a Skip-Gate (SG). Across diverse benchmarks, FBS improves the quality-efficiency trade-off without increasing the parameter count, and ablations show the three modules are complementary.

Tongxi Wang • 2026
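For a concrete picture of the abstract above, the sketch below shows one way the three named modules could be attached to a standard Transformer block in PyTorch. It is a minimal illustration under assumptions, not the paper's method: the class names, the banded causal mask standing in for the Parafovea-Attention Window, and the sigmoid gates standing in for the Chunk-Head and Skip-Gate are all hypothetical.

```python
# Hypothetical sketch of an FBS-style block. All module internals below are
# assumptions for illustration; the paper's actual PAW/CH/SG designs are not
# reproduced here.
import torch
import torch.nn as nn


class SkipGate(nn.Module):
    """Per-token scalar gate that can softly skip a block's contribution (assumed design)."""
    def __init__(self, d_model: int):
        super().__init__()
        self.proj = nn.Linear(d_model, 1)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # (batch, seq, 1) gate in [0, 1]; 0 ~ skip the block, 1 ~ run it fully.
        return torch.sigmoid(self.proj(x))


class ChunkHead(nn.Module):
    """Lightweight head scoring chunk boundaries so compute can follow chunk structure (assumed)."""
    def __init__(self, d_model: int):
        super().__init__()
        self.proj = nn.Linear(d_model, 1)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # Probability that each position ends a chunk.
        return torch.sigmoid(self.proj(x))


def parafovea_mask(seq_len: int, window: int, device=None) -> torch.Tensor:
    """Causal attention mask limited to a local window (one assumed reading of PAW)."""
    i = torch.arange(seq_len, device=device).unsqueeze(1)
    j = torch.arange(seq_len, device=device).unsqueeze(0)
    allowed = (j <= i) & (j >= i - window)                 # causal and banded
    mask = torch.full((seq_len, seq_len), float("-inf"), device=device)
    return mask.masked_fill(allowed, 0.0)                  # additive attention mask


class FBSBlock(nn.Module):
    """Standard Transformer block with the three assumed FBS modules attached."""
    def __init__(self, d_model: int = 256, n_heads: int = 4, window: int = 32):
        super().__init__()
        self.window = window
        self.attn = nn.MultiheadAttention(d_model, n_heads, batch_first=True)
        self.ff = nn.Sequential(nn.Linear(d_model, 4 * d_model), nn.GELU(),
                                nn.Linear(4 * d_model, d_model))
        self.norm1 = nn.LayerNorm(d_model)
        self.norm2 = nn.LayerNorm(d_model)
        self.skip_gate = SkipGate(d_model)
        self.chunk_head = ChunkHead(d_model)

    def forward(self, x: torch.Tensor):
        seq_len = x.size(1)
        mask = parafovea_mask(seq_len, self.window, device=x.device)
        gate = self.skip_gate(x)                           # (B, T, 1)
        h = self.norm1(x)
        attn_out, _ = self.attn(h, h, h, attn_mask=mask, need_weights=False)
        x = x + gate * attn_out                            # gated residual ~ soft skip
        x = x + gate * self.ff(self.norm2(x))
        chunk_boundaries = self.chunk_head(x)              # auxiliary signal, unused here
        return x, chunk_boundaries


if __name__ == "__main__":
    block = FBSBlock()
    tokens = torch.randn(2, 64, 256)                       # (batch, seq, d_model)
    out, boundaries = block(tokens)
    print(out.shape, boundaries.shape)                     # (2, 64, 256) and (2, 64, 1)
```

In this sketch the gated residuals stand in for the "skip" behaviour and the chunk head only emits an auxiliary signal; a real implementation would presumably train these with an efficiency-aware objective, which is beyond the scope of the illustration.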

Related benchmarks

Task                                             | Dataset                  | Metric      | Score | Rank
Code Generation                                  | HumanEval-X              | --          | --    | 20
Chinese Language Understanding                   | CMMLU (test)             | CMMLU Score | 0.574 | 13
Language Modeling                                | Language Modeling (test) | PPL         | 6.2   | 7
Massive Multitask Language Understanding         | MMLU                     | MMLU        | 56.6  | 7
Chinese Massive Multitask Language Understanding | CMMLU                    | CMMLU Score | 57.4  | 2
Chinese Mathematical Reasoning                   | CMath                    | CMath Score | 40.5  | 1
Comprehensive Chinese Transformer Evaluation     | C-Eval                   | C-Eval Score| 55.5  | 1
Mathematical Reasoning                           | GSM8K                    | GSM8K Score | 39.4  | 1
Python Programming                               | MBPP                     | MBPP Score  | 46.3  | 1
Reasoning                                        | BBH                      | BBH Score   | 41.5  | 1
