Fast & Efficient Normalizing Flows and Applications of Image Generative Models

About

This thesis presents novel contributions in two primary areas: advancing the efficiency of generative models, particularly normalizing flows, and applying generative models to solve real-world computer vision challenges. The first part introduces significant improvements to normalizing flow architectures through six key innovations: 1) invertible 3x3 convolution layers with mathematically proven necessary and sufficient conditions for invertibility; 2) a more efficient Quad-coupling layer; 3) a fast and efficient parallel inversion algorithm for kxk convolutional layers; 4) a fast and efficient backpropagation algorithm for the inverse of convolution; 5) Inverse-Flow, which uses the inverse of convolution in the forward pass and is trained with the proposed backpropagation algorithm; and 6) Affine-StableSR, a compact and efficient super-resolution model that leverages pre-trained weights and normalizing flow layers to reduce parameter count while maintaining performance. The second part presents five applications: 1) an automated quality assessment system for agricultural produce that uses conditional GANs to address class imbalance, data scarcity, and annotation challenges, achieving good accuracy in seed purity testing; 2) an unsupervised geological mapping framework that uses stacked autoencoders for dimensionality reduction, showing improved feature extraction compared to conventional methods; 3) a privacy-preserving method for autonomous driving datasets based on face detection and image inpainting; 4) Stable Diffusion-based image inpainting that replaces detected faces and license plates, advancing privacy-preserving techniques and ethical considerations in the field; and 5) an adapted diffusion model for art restoration that effectively handles multiple types of degradation through unified fine-tuning.
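To make the idea of an invertible 3x3 convolution concrete, here is a minimal single-channel sketch. It is not the thesis's implementation. It assumes zero padding on the top and left only, so each output pixel depends on the matching input pixel plus pixels that come earlier in raster order. Under that assumption the map is triangular and can be inverted by back substitution whenever the bottom-right kernel entry is non-zero. The function names (`conv_top_left`, `invert_conv_top_left`, `log_abs_det`) are hypothetical, and the exact conditions and multi-channel layers in the thesis may differ.

```python
import math


def conv_top_left(x, k):
    """3x3 cross-correlation with zero padding on the top and left only.

    x is a list of rows (single channel), k is a 3x3 list of lists.
    The output has the same height and width as x.
    """
    h, w = len(x), len(x[0])
    # Two zero rows on top, two zero columns on the left.
    xp = [[0.0] * (w + 2) for _ in range(2)]
    xp += [[0.0, 0.0] + list(row) for row in x]
    return [
        [sum(k[a][b] * xp[i + a][j + b] for a in range(3) for b in range(3))
         for j in range(w)]
        for i in range(h)
    ]


def invert_conv_top_left(y, k):
    """Recover x from y = conv_top_left(x, k) by raster-order back substitution."""
    if k[2][2] == 0:
        raise ValueError("k[2][2] must be non-zero for this inversion")
    h, w = len(y), len(y[0])
    xp = [[0.0] * (w + 2) for _ in range(h + 2)]
    for i in range(h):
        for j in range(w):
            # Every term except k[2][2] * x[i][j] uses pixels already recovered.
            known = sum(
                k[a][b] * xp[i + a][j + b]
                for a in range(3) for b in range(3)
                if (a, b) != (2, 2)
            )
            xp[i + 2][j + 2] = (y[i][j] - known) / k[2][2]
    return [row[2:] for row in xp[2:]]


def log_abs_det(k, h, w):
    """Log |det Jacobian| of the map above: the Jacobian is triangular
    with k[2][2] on every diagonal entry."""
    return h * w * math.log(abs(k[2][2]))


# Illustrative values only.
kernel = [[0.5, -0.25, 0.0],
          [0.25, 0.5, -0.5],
          [-0.5, 0.25, 2.0]]
image = [[1.0, 2.0, 3.0],
         [4.0, 5.0, 6.0]]
encoded = conv_top_left(image, kernel)
decoded = invert_conv_top_left(encoded, kernel)
```

The sequential loop in `invert_conv_top_left` is the naive baseline. The parallel inversion algorithm mentioned in the abstract is a separate contribution and is not reproduced here.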

Sandeep Nagar • 2025

Related benchmarks

| Task | Dataset | Result | Rank |
|---|---|---|---|
| Density Estimation | CIFAR-10 (test) | Bits/dim 3.3471 | 134 |
| Density Estimation | ImageNet 32x32 (test) | Bits per Sub-pixel 4.014 | 66 |
| Density Estimation | ImageNet 64x64 (test) | Bits Per Sub-Pixel 3.8514 | 62 |
| Image Generation | MNIST (test) | -- | 13 |
| Face Detection | Pvt-IDD real faces (test) | -- | 8 |
| Face Detection | Pvt-IDD anonymized faces (test) | -- | 8 |
| Density Estimation | Galaxy (test) | Bits-per-Dimension 2.2591 | 3 |
| Object Detection | Missing Traffic Signs Video Dataset (MTSVD) | mAP 90 | 3 |
| Scene Categorization | Missing Traffic Signs Video Dataset (MTSVD) | Top-1 Accuracy 60.5 | 3 |
| Clustering | Landsat 8 (ground truth data (30 rock samples)) | -- | 3 |
Showing 10 of 12 rows
