LLMmap: Fingerprinting For Large Language Models

About

We introduce LLMmap, a first-generation fingerprinting technique targeted at LLM-integrated applications. LLMmap employs an active fingerprinting approach, sending carefully crafted queries to the application and analyzing the responses to identify the specific LLM version in use. Our query selection is informed by domain expertise on how LLMs generate uniquely identifiable responses to thematically varied prompts. With as few as 8 interactions, LLMmap can accurately identify 42 different LLM versions with over 95% accuracy. More importantly, LLMmap is designed to be robust across different application layers, allowing it to identify LLM versions--whether open-source or proprietary--from various vendors, operating under various unknown system prompts, stochastic sampling hyperparameters, and even complex generation frameworks such as RAG or Chain-of-Thought. We discuss potential mitigations and demonstrate that, against resourceful adversaries, effective countermeasures may be challenging or even unrealizable.

Dario Pasquini, Evgenios M. Kornaropoulos, Giuseppe Ateniese• 2024

Related benchmarks

Task	Dataset	Result
Model Fingerprinting Robustness Evaluation	Pruning Robustness Evaluation Dataset	Similarity Score0.9585	127
Model Fingerprinting Robustness	Structured Pruning Suspects Sheared-Llama	Similarity Score94	42
Fingerprint Similarity	LLaMA2-7B	Similarity Score0.7459	24
Model Fingerprinting Robustness	Unstructured Pruning Suspects Llama-2-7b	Similarity Score91.45	21
LLM black-box fingerprinting	LLM instances	Query Efficiency (Verification)8	12
Model Fingerprinting	Qwen2.5-derived suspects v0.1	Similarity Score0.8231	12
Model Fingerprinting Detection	Mistral-7B black-box setting v0.3	True Positive Rate (TPR)74.8	10
Model Fingerprinting Robustness	FuseLLM 7b Distribution Merging Openllama-2-7b	Similarity Score57.42	7
Model Fingerprinting Robustness	Fusellm-7b Distribution Merging - Mpt-7b	Similarity Score24.13	7
Model Fingerprinting Robustness	CodeLLaMA 7B	Similarity Score0.9555	7

Showing 10 of 33 rows

Other info

Follow for update

@wizwand_team Discord