A Bayesian Perspective on the Deep Image Prior

Zezhou Cheng, Matheus Gadelha, Subhransu Maji, Daniel Sheldon

University of Massachusetts - Amherst

The deep image prior was recently introduced as a prior for natural images. It represents images as the output of a convolutional network with random inputs. For inference, gradient descent is performed to adjust network parameters to make the output match observations. This approach yields good performance on a range of image reconstruction tasks. We show that the deep image prior is asymptotically equivalent to a stationary Gaussian process prior in the limit as the number of channels in each layer of the network goes to infinity, and derive the corresponding kernel. This informs a Bayesian approach to inference. We show that by conducting posterior inference using stochastic gradient Langevin we avoid the need for early stopping, which is a drawback of the current approach, and improve results for denoising and impainting tasks. We illustrate these intuitions on a number of 1D and 2D signal reconstruction tasks.

Publication

A Bayesian Perspective on the Deep Image Prior
Zezhou Cheng, Matheus Gadelha, Subhransu Maji, Daniel Sheldon
Computer Vision and Pattern Recognition (CVPR), 2019
arXiv, pdf, supplementary, poster, bibtex

Code

GitHub Link

Main Discovery

1. Deep Image Prior (DIP) is asymptotically equivalent to a stationary Gaussian Process (GP) prior

We derive the analytical form of the GP kernel and analyze the effect of convolution, upsampling, downsampling and skip connections in the resulting GP kernel.

Priors and posterior with 1D convolutional networks for AutoEncoder and Conv architectures.

The samples drawn from the DIP and GP prior with equivalent stationary kernel are shown below.

The posterior mean estimated by SGD with the DIP matches the GP posterior mean as the number of channels in the network increases. However posterior inference with long-tail GP kernels is slow for large images compared to SGD inference of the DIP.

Inpainting with a Gaussian process and deep image prior

2. SGLD: a Bayesian inference method for deep image prior

Inference with SGD requires early stopping since the MSE with respect to the input eventually goes to zero, thus overfitting to the noise. SGLD on the other hand does not and posterior samples provide a notion of uncertainty.

Denoising and inpainting results with the deep image prior

SGLD performs better than vanilla gradient descent on image denoising and inpainting tasks. The PSNR for various images are shown below. See the paper for details.

Input, SGD, and SGLD.

	House	Peppers	Lena	Baboon	F16	Kodak1	Kodak2	Kodak3	Kodak12	Avg.
SGD	26.74	28.42	29.17	23.50	29.76	26.61	28.68	30.07	29.78	28.08
SGLD	30.86	30.82	32.05	24.54	32.90	27.96	32.05	33.29	32.79	30.81

Image denoising task.

Input, SGD (19.23 dB), and SGLD (21.86 dB).

	Barb.	Boat	House	Lena	Peppers	C.man	Couple	Finger	Hill	Man	Mont.	Avg.
SGD	28.48	31.54	35.34	35.00	30.40	27.05	30.55	32.24	31.37	31.32	30.21	28.08
SGLD	33.82	34.26	40.13	37.73	33.97	30.33	33.72	33.41	34.03	33.54	34.65	34.51

Image inpainting task.

Acknowledgements

This research was supported in part by NSF grants #1749833, #1749854, and #1661259, and the MassTech Collaborative for funding the UMass GPU cluster.