
SubQuad: Near-Quadratic-Free Structure Inference with Distribution-Balanced Objectives in Adaptive Receptor framework

About

Comparative analysis of adaptive immune repertoires at population scale is hampered by two practical bottlenecks: the near-quadratic cost of pairwise affinity evaluations and dataset imbalances that obscure clinically important minority clonotypes. We introduce SubQuad, an end-to-end pipeline that addresses these challenges by combining antigen-aware, near-subquadratic retrieval with GPU-accelerated affinity kernels, learned multimodal fusion, and fairness-constrained clustering. The system employs compact MinHash prefiltering to sharply reduce candidate comparisons, a differentiable gating module that adaptively weights complementary alignment and embedding channels on a per-pair basis, and an automated calibration routine that enforces proportional representation of rare antigen-specific subgroups. On large viral and tumor repertoires, SubQuad achieves measured gains in throughput and peak memory usage while preserving or improving recall@k, cluster purity, and subgroup equity. By co-designing indexing, similarity fusion, and equity-aware objectives, SubQuad offers a scalable, bias-aware platform for repertoire mining and downstream translational tasks such as vaccine target prioritization and biomarker discovery.
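
The abstract names MinHash prefiltering as the step that cuts down pairwise comparisons but gives no configuration. As a rough illustration of the general technique only (not SubQuad's implementation), the sketch below builds MinHash signatures over sequence k-mers and uses locality-sensitive banding to propose candidate pairs. The k-mer length, number of hashes, band count, and all function names are assumptions chosen for the example.

```python
import hashlib
from collections import defaultdict
from itertools import combinations


def kmers(seq, k=3):
    """Return the set of overlapping k-mers; fall back to the whole string if too short."""
    grams = {seq[i:i + k] for i in range(len(seq) - k + 1)}
    return grams or {seq}


def _hash(token, seed):
    """Deterministic 64-bit hash of a token, keyed by an integer seed."""
    digest = hashlib.blake2b(
        token.encode(), digest_size=8, salt=seed.to_bytes(8, "little")
    ).digest()
    return int.from_bytes(digest, "little")


def minhash_signature(seq, num_hashes=32, k=3):
    """One minimum per seeded hash function over the sequence's k-mer set."""
    grams = kmers(seq, k)
    return tuple(min(_hash(g, seed) for g in grams) for seed in range(num_hashes))


def estimated_jaccard(sig_a, sig_b):
    """Fraction of matching signature positions, an estimate of k-mer Jaccard similarity."""
    return sum(a == b for a, b in zip(sig_a, sig_b)) / len(sig_a)


def candidate_pairs(seqs, num_hashes=32, bands=16, k=3):
    """Propose index pairs (i, j), i < j, whose signatures agree on at least one band."""
    rows = num_hashes // bands
    buckets = defaultdict(list)
    for idx, seq in enumerate(seqs):
        sig = minhash_signature(seq, num_hashes, k)
        for b in range(bands):
            # Sequences sharing every row of a band land in the same bucket.
            buckets[(b, sig[b * rows:(b + 1) * rows])].append(idx)
    pairs = set()
    for members in buckets.values():
        pairs.update(combinations(members, 2))
    return pairs


if __name__ == "__main__":
    # Hypothetical CDR3-like strings, used only to exercise the functions.
    repertoire = ["CASSLGQAYEQYF", "CASSLGQAYEQYF", "WWWWWWWWWW"]
    print(sorted(candidate_pairs(repertoire)))
```

Only the pairs returned by `candidate_pairs` would then be passed to a more expensive affinity computation, which is how a prefilter of this kind avoids scoring all pairs. More bands with fewer rows each raise recall at the cost of more candidates.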

Rong Fu, Zijian Zhang, Wenxin Zhang, Kun Liu, Jiekai Wu, Xianda Li, Simon Fong • 2026

Related benchmarks

Task | Dataset | Result | Rank
T-Cell Receptor (TCR) Similarity Search | VDJdb random slices (10K sequences) | Recall (AUC): 0.985 | 9
Rare Subpopulation Retrieval | McPAS-TCR database (test) | Recall@100 (Rare): 0.594 | 3
TCR Antigen Classification | VDJdb 2024.03 (test) | Macro-F1 (Antigen): 71.2 | 3
Similarity search (affinity computation) | 10^7 sequences extrapolated from 10K-sequence component-level benchmark | Kernel throughput (k seq/s): 89.4 | 2
