
AgentGC: Evolutionary Learning-based Lossless Compression for Genomics Data with LLM-driven Multiple Agent

About

Lossless compression has significantly advanced the storage, sharing, and management of Genomics Data (GD). However, current learning-based methods are non-evolvable and suffer from low-level compression modeling, limited adaptability, and user-unfriendly interfaces. To this end, we propose AgentGC, the first evolutionary Agent-based GD Compressor, consisting of 3 layers operated by multiple agents named Leader and Worker. Specifically, 1) the User layer provides a user-friendly interface via the Leader combined with an LLM; 2) the Cognitive layer, driven by the Leader, integrates an LLM to jointly optimize algorithm, dataset, and system, addressing the issues of low-level modeling and limited adaptability; and 3) the Compression layer, headed by the Worker, performs compression and decompression via an automated multi-knowledge learning-based compression framework. On top of AgentGC, we design 3 modes to support diverse scenarios: CP for compression-ratio priority, TP for throughput priority, and BM for balanced mode. Compared with 14 baselines on 9 datasets, the average compression ratio gains are 16.66%, 16.11%, and 16.33%, and the throughput gains are 4.73x, 9.23x, and 9.15x, for the three modes respectively.
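
The abstract does not say how a mode maps to a concrete configuration, so the following is only a toy sketch of the trade-off the three modes name, not AgentGC's actual selection logic. The `Candidate` type, the `pick` function, the multiplicative BM score, and all numbers are hypothetical and chosen purely for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Candidate:
    """A hypothetical compressor configuration with measured results."""
    name: str
    bits_per_base: float    # compressed size per base; lower is better
    throughput_mbps: float  # processing speed; higher is better


def pick(candidates: list[Candidate], mode: str) -> Candidate:
    """Choose a configuration for one of the three modes (illustrative only)."""
    if not candidates:
        raise ValueError("no candidates to choose from")
    if mode == "CP":  # compression-ratio priority
        return min(candidates, key=lambda c: c.bits_per_base)
    if mode == "TP":  # throughput priority
        return max(candidates, key=lambda c: c.throughput_mbps)
    if mode == "BM":  # balanced mode: assumed score, not taken from the paper
        best_bpb = min(c.bits_per_base for c in candidates)
        best_tp = max(c.throughput_mbps for c in candidates)
        # Each factor lies in (0, 1]; 1.0 means "as good as the best candidate".
        return max(
            candidates,
            key=lambda c: (best_bpb / c.bits_per_base)
            * (c.throughput_mbps / best_tp),
        )
    raise ValueError(f"unknown mode: {mode!r}")


# Made-up measurements, for demonstration only.
configs = [
    Candidate("slow-strong", bits_per_base=1.80, throughput_mbps=1.0),
    Candidate("fast-weak", bits_per_base=2.20, throughput_mbps=10.0),
    Candidate("middle", bits_per_base=1.85, throughput_mbps=9.0),
]
```

With these invented values, `pick(configs, "CP")` returns the strongest but slowest configuration, `pick(configs, "TP")` the fastest, and `pick(configs, "BM")` the one that gives up a little of each.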

Sun Hui, Ding Yanfeng, Huidong Ma, Chang Xu, Keyan Jin, Lizheng Zu, Cheng Zhong, Xiaoguang Liu, Gang Wang, Wentong Cai • 2026

Related benchmarks

| Task | Dataset | Result | Rank |
| --- | --- | --- | --- |
| Lossless Genomics Data Compression | PlFa | Compression Ratio (bits/base): 1.817 | 18 |
| Lossless Genomics Data Compression | DrMe | Compression Ratio (bits/base): 1.904 | 18 |
| Lossless Genomics Data Compression | SnSt | Compression Ratio (bits/base): 1.869 | 18 |
| Lossless Genomics Data Compression | AcSc | Compression Ratio (bits/base): 1.866 | 18 |
| Lossless Genomics Data Compression | Genomics Dataset Suite Aggregate | Avg Compression Ratio (bits/base): 1.85 | 18 |
| Lossless Genomics Data Compression | WaMe | Compression Ratio (bits/base): 1.95 | 18 |
| Lossless Genomics Data Compression | GaGa | Compression Ratio (bits/base): 1.859 | 18 |
| Lossless Genomics Data Compression | MoGu | Compression Ratio (bits/base): 1.65 | 18 |
| Lossless Genomics Data Compression | ArTh | Compression Ratio (bits/base): 1.895 | 18 |
| Lossless Genomics Data Compression | TaGu | Compression Ratio (bits/base): 1.844 | 18 |
Showing 10 of 21 rows
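
To read the bits/base figures above, it can help to see how such a number is typically computed. The sketch below assumes the usual definition (compressed size in bits divided by the number of bases) and compares it against the naive 2 bits/base needed to store a four-letter A/C/G/T alphabet. The function names are illustrative, and the page does not state exactly how the listed results were measured.

```python
def bits_per_base(compressed_bytes: int, num_bases: int) -> float:
    """Compressed size in bits divided by the number of bases (lower is better)."""
    if num_bases <= 0:
        raise ValueError("num_bases must be positive")
    return compressed_bytes * 8 / num_bases


def saving_vs_two_bit(bpb: float) -> float:
    """Fractional size reduction relative to a fixed 2 bits/base encoding."""
    return 1.0 - bpb / 2.0


# Hypothetical file: 1,000,000 bases compressed to 231,250 bytes.
example_bpb = bits_per_base(231_250, 1_000_000)
example_saving = saving_vs_two_bit(example_bpb)
```

Under this definition, a value of 1.85 bits/base corresponds to a file about 7.5% smaller than a plain 2-bit packing of the same sequence.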
