StyleGAN2: Analyzing and Improving the Image Quality of StyleGAN

Authors

Tero Karras, Samuli Laine, Miika Aittala, Janne Hellsten, Jaakko Lehtinen, Timo Aila

NVIDIA; Aalto University

Summary

We expose and analyze several characteristic artifacts of the style-based GAN architecture (StyleGAN), and propose changes in both model architecture and training methods to address them. In particular, we redesign the generator normalization, revisit progressive growing, and regularize the generator to encourage good conditioning in the mapping from latent codes to images.

Abstract

The style-based GAN architecture (StyleGAN) yields state-of-the-art results in data-driven unconditional generative image modeling. We expose and analyze several of its characteristic artifacts, and propose changes in both model architecture and training methods to address them. In particular, we redesign the generator normalization, revisit progressive growing, and regularize the generator to encourage good conditioning in the mapping from latent codes to images. In addition to improving image quality, this path length regularizer yields the additional benefit that the generator becomes significantly easier to invert. This makes it possible to reliably attribute a generated image to a particular network. We furthermore visualize how well the generator utilizes its output resolution, and identify a capacity problem, motivating us to train larger models for additional quality improvements. Overall, our improved model redefines the state of the art in unconditional image modeling, both in terms of existing distribution quality metrics as well as perceived image quality.
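The abstract names a path length regularizer without stating its form. As a rough illustration only, the sketch below assumes the penalty is the squared deviation of ||J^T y||_2 from a target length, where J is the Jacobian of the generator with respect to the latent code and y is a random image with normally distributed pixels. The toy linear generator g(w) = A w (so that J = A), the function name path_length_penalty, and the use of a batch mean as the target are all assumptions made for this example and are not taken from the text above.

```python
import numpy as np

def path_length_penalty(jacobian, y, target):
    """Toy path length penalty for a generator with an explicit Jacobian.

    jacobian: (n_pixels, w_dim) Jacobian of the image w.r.t. the latent code.
    y:        (batch, n_pixels) random images with normally distributed pixels.
    target:   scalar length the regularizer pulls ||J^T y||_2 towards.
    """
    # Each row of (y @ jacobian) is (J^T y)^T for one random image.
    lengths = np.linalg.norm(y @ jacobian, axis=1)
    penalty = np.mean((lengths - target) ** 2)
    return penalty, lengths

rng = np.random.default_rng(0)
n_pixels, w_dim, batch = 64, 8, 16

# Hypothetical linear generator g(w) = A @ w, whose Jacobian is simply A.
A = rng.standard_normal((n_pixels, w_dim)) / np.sqrt(n_pixels)
y = rng.standard_normal((batch, n_pixels))

# First pass only measures the lengths; the batch mean then serves as a
# stand-in target (an assumption of this sketch).
_, lengths = path_length_penalty(A, y, target=0.0)
target = lengths.mean()
penalty, _ = path_length_penalty(A, y, target)
print("penalty:", penalty)
```

With the batch mean as the target, the penalty reduces to the variance of the measured lengths, so it is small when a step in latent space changes the image by roughly the same amount regardless of direction. A real generator would need automatic differentiation to obtain J^T y instead of an explicit matrix.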

Overview

We redesign the architecture of the StyleGAN synthesis network. (a) The original StyleGAN, where A denotes a learned affine transform from W that produces a style and B is a noise broadcast operation. (b) The same diagram with full detail. Here we have broken AdaIN into explicit normalization followed by modulation, both operating on the mean and standard deviation per feature map. We have also annotated the learned weights (w), biases (b), and constant input (c), and redrawn the gray boxes so that one style is active per box. The activation function (leaky ReLU) is always applied right after adding the bias. (c) We make several changes to the original architecture that are justified in the main text. We remove some redundant operations at the beginning, move the addition of b and B to be outside the active area of a style, and adjust only the standard deviation per feature map. (d) The revised architecture enables us to replace instance normalization with a “demodulation” operation, which we apply to the weights associated with each convolution layer.
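The "demodulation" in (d) acts on convolution weights and not on feature maps. The sketch below is one minimal reading of that idea, assuming the style scales each input channel of the kernel (modulation) and each output channel's kernel is then divided by its L2 norm (demodulation). The function name modulate_demodulate, the tensor layout, and the eps value are assumptions made for illustration, not details given in the caption.

```python
import numpy as np

def modulate_demodulate(weight, style, eps=1e-8):
    """Scale conv weights by a style, then renormalize per output channel.

    weight: (out_ch, in_ch, kh, kw) convolution kernel.
    style:  (in_ch,) per-input-channel scale produced from the latent code.
    """
    # Modulation: scale every input channel of the kernel by its style.
    w = weight * style[None, :, None, None]
    # Demodulation: divide each output channel's kernel by its L2 norm,
    # so outputs have unit expected standard deviation when the inputs
    # are i.i.d. with unit variance.
    norm = np.sqrt(np.sum(w ** 2, axis=(1, 2, 3), keepdims=True) + eps)
    return w / norm

rng = np.random.default_rng(0)
weight = rng.standard_normal((4, 3, 3, 3))
style = np.array([0.5, 2.0, 1.5])

w_demod = modulate_demodulate(weight, style)
per_output_norm = np.sqrt(np.sum(w_demod ** 2, axis=(1, 2, 3)))
print(per_output_norm)
```

Because the normalization is folded into the weights, it relies on a statistical assumption about the incoming activations and never inspects the actual feature map contents, which is what distinguishes it from the instance normalization it replaces.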

PDF Preview

1912.04958
