HyperNet-Adaptation for Diffusion-Based Test Case Generation

About

The increasing deployment of deep learning systems requires systematic evaluation of their reliability in real-world scenarios. Traditional gradient-based adversarial attacks introduce small perturbations that rarely correspond to realistic failures and mainly assess robustness rather than functional behavior. Generative test generation methods offer an alternative but are often limited to simple datasets or constrained input domains. Although diffusion models enable high-fidelity image synthesis, their computational cost and limited controllability restrict their applicability to large-scale testing. We present HyNeA, a generative testing method that enables direct and efficient control over diffusion-based generation. HyNeA provides dataset-free controllability through hypernetworks, allowing targeted manipulation of the generative process without relying on architecture-specific conditioning mechanisms or dataset-driven adaptations such as fine-tuning. HyNeA employs a distinct training strategy that supports instance-level tuning to identify failure-inducing test cases without requiring datasets that explicitly contain examples of similar failures. This approach enables the targeted generation of realistic failure cases at substantially lower computational cost than search-based methods. Experimental results show that HyNeA improves controllability and test diversity compared to existing generative test generators and generalizes to domains where failure-labeled training data is unavailable.
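As a rough illustration of the general hypernetwork idea mentioned above (a small network that produces weights for another, frozen network), the following sketch adapts a single frozen linear layer with a low-rank weight delta. It is not HyNeA's architecture or training procedure. The layer sizes, the low-rank form, the zero initialisation, and all names (`HyperNetwork`, `adapted_forward`, `W_frozen`) are assumptions made for this example only.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical sizes, chosen only to keep the example small.
D_IN, D_OUT, D_COND, RANK = 8, 4, 3, 2

# Frozen "base model" layer: these weights are never modified.
W_frozen = rng.normal(size=(D_OUT, D_IN))


class HyperNetwork:
    """Maps a conditioning vector to a low-rank weight delta for the frozen layer."""

    def __init__(self):
        # Projection that produces factor A (RANK x D_IN), randomly initialised.
        self.P_a = rng.normal(scale=0.1, size=(RANK * D_IN, D_COND))
        # Projection that produces factor B (D_OUT x RANK), zero-initialised
        # (an assumption) so the adapted layer starts identical to the frozen one.
        self.P_b = np.zeros((D_OUT * RANK, D_COND))

    def delta(self, cond):
        A = (self.P_a @ cond).reshape(RANK, D_IN)
        B = (self.P_b @ cond).reshape(D_OUT, RANK)
        return B @ A  # shape (D_OUT, D_IN), rank at most RANK


def adapted_forward(x, cond, hyper):
    # Effective weights = frozen weights + hypernetwork-generated delta.
    return (W_frozen + hyper.delta(cond)) @ x


hyper = HyperNetwork()
x = rng.normal(size=D_IN)
cond = rng.normal(size=D_COND)

base_out = W_frozen @ x
start_out = adapted_forward(x, cond, hyper)  # equals base_out at initialisation

# Stand-in for a tuning step: only hypernetwork parameters change.
hyper.P_b += 0.1
tuned_out = adapted_forward(x, cond, hyper)
```

In this sketch only `P_a` and `P_b` would be tuned, so the base layer stays untouched and no dataset of failure examples is involved. How HyNeA actually attaches hypernetworks to a diffusion model, and what objective it optimises per instance, is not specified in the abstract.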

Oliver Weißl, Vincenzo Riccio, Severin Kacianka, Andrea Stocco • 2026

Related benchmarks

Task | Dataset | Metric | Result | Rank
Binary Classification | ImageNet | Runtime (s) | 94.41 | 4
Binary Classification | CelebA | Runtime (s) | 220.9 | 4
Image Perturbation Quality Assessment | ImageNet Human Evaluation | Ambiguity Cases | 2.5 | 3
Targeted Failure Generation | ImageNet | Misclassification Rate | 1 | 3
Targeted Failure Generation | CelebA | Misclass Rate | 100 | 3
Object Detection | Driving | Runtime (s) | 113 | 2
Targeted Failure Generation | Driving | Diversity T | 0.094 | 2