Fast & Efficient Normalizing Flows and Applications of Image Generative Models

About

This thesis presents novel contributions in two primary areas: advancing the efficiency of generative models, particularly normalizing flows, and applying generative models to real-world computer vision challenges. The first part introduces significant improvements to normalizing flow architectures through six key innovations: (1) invertible 3x3 convolution layers with mathematically proven necessary and sufficient conditions for invertibility; (2) a more efficient Quad-coupling layer; (3) a fast and efficient parallel inversion algorithm for kxk convolutional layers; (4) a fast and efficient backpropagation algorithm for the inverse of convolution; (5) Inverse-Flow, which uses the inverse of convolution in the forward pass and is trained with the proposed backpropagation algorithm; and (6) Affine-StableSR, a compact and efficient super-resolution model that leverages pre-trained weights and Normalizing Flow layers to reduce parameter count while maintaining performance. The second part presents five applications: (1) an automated quality assessment system for agricultural produce that uses Conditional GANs to address class imbalance, data scarcity, and annotation challenges, achieving good accuracy in seed purity testing; (2) an unsupervised geological mapping framework that uses stacked autoencoders for dimensionality reduction, showing improved feature extraction compared to conventional methods; (3) a privacy-preserving method for autonomous driving datasets based on face detection and image inpainting; (4) Stable Diffusion based image inpainting that replaces detected faces and license plates, advancing privacy-preserving techniques and ethical considerations in the field; and (5) an adapted diffusion model for art restoration that effectively handles multiple types of degradation through unified fine-tuning.
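For readers unfamiliar with coupling layers, the following is a minimal NumPy sketch of a standard affine coupling layer in the style of RealNVP. It is not the thesis's Quad-coupling layer or its invertible convolution; it only illustrates the general property the abstract relies on, namely that a flow layer has an exact inverse and a cheap log-determinant. The function names, the `tanh` bound on the log-scale, and the linear conditioners `w_s` and `w_t` are assumptions made for illustration.

```python
import numpy as np

def coupling_forward(x, w_s, w_t):
    """Affine coupling: pass x1 through unchanged, transform x2 conditioned on x1."""
    x1, x2 = np.split(x, 2, axis=-1)
    log_s = np.tanh(x1 @ w_s)        # log-scale, bounded for numerical stability (assumed choice)
    t = x1 @ w_t                     # shift (a linear conditioner, assumed for brevity)
    y2 = x2 * np.exp(log_s) + t
    log_det = log_s.sum(axis=-1)     # log|det J|: the Jacobian is triangular
    return np.concatenate([x1, y2], axis=-1), log_det

def coupling_inverse(y, w_s, w_t):
    """Exact inverse: y1 equals x1, so the same scale and shift can be recomputed."""
    y1, y2 = np.split(y, 2, axis=-1)
    log_s = np.tanh(y1 @ w_s)
    t = y1 @ w_t
    x2 = (y2 - t) * np.exp(-log_s)
    return np.concatenate([y1, x2], axis=-1)

# Hypothetical toy data: 3 samples with 4 features each.
rng = np.random.default_rng(0)
d = 4
w_s = rng.normal(size=(d // 2, d // 2))
w_t = rng.normal(size=(d // 2, d // 2))
x = rng.normal(size=(3, d))

y, log_det = coupling_forward(x, w_s, w_t)
x_rec = coupling_inverse(y, w_s, w_t)
print(np.allclose(x, x_rec))
```

Because half of the input passes through unchanged, the inverse needs no iterative solve; the invertible convolutions described in the abstract address layers where that shortcut is not available.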

Sandeep Nagar • 2025

Related benchmarks

Task	Dataset	Result
Density Estimation	CIFAR-10 (test)	Bits/dim: 3.3471	134
Density Estimation	ImageNet 64x64 (test)	Bits Per Sub-Pixel: 3.8514	71
Density Estimation	ImageNet 32x32 (test)	Bits per Sub-pixel: 4.014	69
Image Generation	MNIST (test)	--	17
Face Detection	Pvt-IDD real faces (test)	--	8
Face Detection	Pvt-IDD anonymized faces (test)	--	8
Density Estimation	Galaxy (test)	Bits-per-Dimension: 2.2591	3
Object Detection	Missing Traffic Signs Video Dataset (MTSVD)	mAP: 90	3
Scene Categorization	Missing Traffic Signs Video Dataset (MTSVD)	Top-1 Accuracy: 60.5	3
Clustering	Landsat 8 (ground truth data (30 rock samples))	--	3

Showing 10 of 12 rows
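The density estimation rows above report bits per dimension. As a rough guide to reading those numbers, the sketch below converts a per-image negative log-likelihood in nats to bits per dimension by dividing by the number of dimensions and by ln 2. The helper names and the example NLL value are hypothetical, and the sketch assumes the NLL already accounts for whatever dequantization convention the evaluation uses.

```python
import math

def bits_per_dim(nll_nats: float, num_dims: int) -> float:
    """Convert a per-example negative log-likelihood (nats) to bits per dimension."""
    return nll_nats / (num_dims * math.log(2))

def nats_from_bits_per_dim(bpd: float, num_dims: int) -> float:
    """Inverse conversion: bits per dimension back to total nats per example."""
    return bpd * num_dims * math.log(2)

# CIFAR-10 images have 3 channels of 32x32 pixels.
cifar_dims = 3 * 32 * 32

# Hypothetical NLL value, chosen only to exercise the function.
example_nll = 7000.0
print(bits_per_dim(example_nll, cifar_dims))

# Total nats per image implied by the CIFAR-10 figure reported in the table.
print(nats_from_bits_per_dim(3.3471, cifar_dims))
```

Lower values are better: a model with fewer bits per dimension assigns higher likelihood to the test images.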
