MergePrint: Merge-Resistant Fingerprints for Robust Black-box Ownership Verification of Large Language Models

About

Protecting the intellectual property of Large Language Models (LLMs) has become increasingly critical due to the high cost of training. Model merging, which integrates multiple expert models into a single multi-task model, introduces a new risk of unauthorized use of LLMs because the merging process is so efficient. While fingerprinting techniques have been proposed for verifying model ownership, their resistance to model merging remains unexplored. To address this gap, we propose MergePrint, a novel fingerprinting method that embeds robust fingerprints capable of surviving model merging. MergePrint enables black-box ownership verification: owners only need to check whether a model produces target outputs for specific fingerprint inputs, without accessing model weights or intermediate outputs. By optimizing against a pseudo-merged model that simulates merged behavior, MergePrint produces fingerprints that remain detectable after merging. Additionally, to minimize performance degradation, we pre-optimize the fingerprint inputs. MergePrint pioneers a practical solution for black-box ownership verification, protecting LLMs from misappropriation via merging while also showing strong resistance to broader model theft threats.
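The black-box check described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation. The names `verification_success_rate` and `pseudo_merge`, the prefix-match criterion, and the linear interpolation used for the pseudo-merged model are all assumptions made for illustration; the abstract only says that the pseudo-merged model "simulates merged behavior". Reading the table's "VSR" as a verification success rate is likewise an assumption.

```python
from typing import Callable, Dict, List, Sequence, Tuple


def verification_success_rate(
    generate: Callable[[str], str],
    fingerprint_pairs: Sequence[Tuple[str, str]],
    trials: int = 1,
) -> float:
    """Fraction of queries whose output begins with the target string.

    `generate` is any text-in, text-out wrapper around the suspect model
    (for example an API call), so no weights or logits are needed.
    The prefix-match criterion is an assumption, not taken from the paper.
    """
    if not fingerprint_pairs or trials < 1:
        raise ValueError("need at least one fingerprint pair and one trial")
    hits = 0
    total = 0
    for prompt, target in fingerprint_pairs:
        for _ in range(trials):
            output = generate(prompt)
            hits += int(output.strip().startswith(target))
            total += 1
    return hits / total


def pseudo_merge(
    base: Dict[str, List[float]],
    owner: Dict[str, List[float]],
    alpha: float,
) -> Dict[str, List[float]]:
    """Hypothetical pseudo-merged weights: base + alpha * (owner - base).

    Linear interpolation is one common merging scheme; whether MergePrint
    uses exactly this form is an assumption of this sketch.
    """
    return {
        name: [b + alpha * (o - b) for b, o in zip(base[name], owner[name])]
        for name in base
    }


if __name__ == "__main__":
    # Toy stand-in for a suspect model: answers the fingerprint input
    # with the target output and everything else with a generic reply.
    def toy_model(prompt: str) -> str:
        return "transformer" if prompt == "Decrypt: x9!" else "hello"

    pairs = [("Decrypt: x9!", "transformer")]
    print(verification_success_rate(toy_model, pairs, trials=3))
    print(pseudo_merge({"w": [0.0, 2.0]}, {"w": [1.0, 4.0]}, alpha=0.5))
```

In practice `generate` would wrap the suspect model's inference endpoint, and the fingerprint pairs would be the secret input and target output held by the model owner.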

Shojiro Yamabe, Futa Waseda, Tsubasa Takahashi, Koki Wataoka • 2024

Related benchmarks

Task | Dataset | Result | Rank
Mathematical Reasoning | GSM8K | Math Score: 42 | 171
Mathematical Reasoning | MGSM | Accuracy: 42 | 114
Safety Evaluation | Toxigen | Safety: 54 | 71
Fingerprint Removal | LLM Fingerprinting Evaluation Alpaca-GPT4-52k | ASR Error Rate: 0.00e+0 | 66
Safety Evaluation | LLaMA-2-7B-CHAT Safety (test) | Safety Score: 0.54 | 60
Japanese Language Understanding | JAQKET | Japanese Score: 77 | 60
Fingerprint Verification | Shisa-7B and Abel-7B-002 Merged | VSR: 1 | 60
Fingerprint Verification | Fingerprint Verification | VSR: 100 | 60
Mathematical Reasoning | WizardMath (test) | Math Score: 42 | 60
Fingerprint Verification | Embedded Fingerprints (test) | VSR: 1 | 60
Showing 10 of 19 rows
