HyperNet-Adaptation for Diffusion-Based Test Case Generation

About

The increasing deployment of deep learning systems requires systematic evaluation of their reliability in real-world scenarios. Traditional gradient-based adversarial attacks introduce small perturbations that rarely correspond to realistic failures and mainly assess robustness rather than functional behavior. Generative test generation methods offer an alternative but are often limited to simple datasets or constrained input domains. Although diffusion models enable high-fidelity image synthesis, their computational cost and limited controllability restrict their applicability to large-scale testing. We present HyNeA, a generative testing method that enables direct and efficient control over diffusion-based generation. HyNeA provides dataset-free controllability through hypernetworks, allowing targeted manipulation of the generative process without relying on architecture-specific conditioning mechanisms or dataset-driven adaptations such as fine-tuning. HyNeA employs a distinct training strategy that supports instance-level tuning to identify failure-inducing test cases without requiring datasets that explicitly contain examples of similar failures. This approach enables the targeted generation of realistic failure cases at substantially lower computational cost than search-based methods. Experimental results show that HyNeA improves controllability and test diversity compared to existing generative test generators and generalizes to domains where failure-labeled training data is unavailable.

Oliver Weißl, Vincenzo Riccio, Severin Kacianka, Andrea Stocco • 2026

Related benchmarks

Task	Dataset	Result
Binary Classification	ImageNet	Runtime (s): 94.41	4
Binary Classification	CelebA	Runtime (s): 220.9	4
Image Perturbation Quality Assessment	ImageNet Human Evaluation	Ambiguity Cases: 2.5	3
Targeted Failure Generation	ImageNet	Misclassification Rate: 1	3
Targeted Failure Generation	CelebA	Misclassification Rate: 100	3
Object Detection	Driving	Runtime (s): 113	2
Targeted Failure Generation	Driving	Diversity T: 0.094	2
