Authors
Phillip Isola, Jun-Yan Zhu, Tinghui Zhou, Alexei A. Efros
Berkeley AI Research (BAIR) Laboratory, UC Berkeley
Summary
Many problems in image processing, graphics, and vision involve translating an input image into a corresponding output image. These problems are often treated with application-specific algorithms, even though the setting is always the same: map pixels to pixels. Conditional adversarial nets are a general-purpose solution that appears to work well on a wide variety of these problems.
Abstract
We investigate conditional adversarial networks as a general-purpose solution to image-to-image translation problems. These networks not only learn the mapping from input image to output image, but also learn a loss function to train this mapping. This makes it possible to apply the same generic approach to problems that traditionally would require very different loss formulations. We demonstrate that this approach is effective at synthesizing photos from label maps, reconstructing objects from edge maps, and colorizing images, among other tasks. Indeed, since the release of the pix2pix software associated with this paper, a large number of internet users (many of them artists) have posted their own experiments with our system, further demonstrating its wide applicability and ease of adoption without the need for parameter tweaking. As a community, we no longer hand-engineer our mapping functions, and this work suggests we can achieve reasonable results without hand-engineering our loss functions either.
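For reference, the "learned loss" the abstract refers to is the conditional GAN objective, which the paper combines with an L1 reconstruction term (λ = 100 in the paper's experiments). Here x is the input image, y the target image, and z the noise:

```latex
\mathcal{L}_{cGAN}(G, D) = \mathbb{E}_{x,y}\big[\log D(x, y)\big]
                         + \mathbb{E}_{x,z}\big[\log\big(1 - D(x, G(x, z))\big)\big]

\mathcal{L}_{L1}(G) = \mathbb{E}_{x,y,z}\big[\lVert y - G(x, z)\rVert_1\big]

G^{*} = \arg\min_{G} \max_{D} \; \mathcal{L}_{cGAN}(G, D) + \lambda\, \mathcal{L}_{L1}(G)
```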
Contribution
- Our primary contribution is to demonstrate that, on a wide variety of problems, conditional GANs produce reasonable results.
- Our second contribution is to present a simple framework sufficient to achieve good results, and to analyze the effects of several important architectural choices (such as the PatchGAN discriminator; see the sketch after this list).
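One of the paper's key architectural choices is the 70×70 PatchGAN: a discriminator that classifies overlapping image patches rather than the whole image. Below is a minimal PyTorch sketch; the layer widths follow the paper's C64-C128-C256-C512 spec, but details such as padding and the norm layer are illustrative rather than an exact reproduction of the released code.

```python
import torch.nn as nn

class PatchDiscriminator(nn.Module):
    """Sketch of a 70x70-receptive-field PatchGAN discriminator.

    Input: the {input, output} image pair concatenated on the channel
    axis (hence 3 + 3 = 6 input channels by default). Output: a grid of
    logits, one per image patch, rather than a single scalar.
    """

    def __init__(self, in_channels=6, ndf=64):
        super().__init__()

        def block(c_in, c_out, stride=2, norm=True):
            # 4x4 conv -> (optional) norm -> LeakyReLU, as in the paper
            layers = [nn.Conv2d(c_in, c_out, kernel_size=4,
                                stride=stride, padding=1)]
            if norm:
                layers.append(nn.BatchNorm2d(c_out))
            layers.append(nn.LeakyReLU(0.2, inplace=True))
            return layers

        self.net = nn.Sequential(
            *block(in_channels, ndf, norm=False),  # C64 (no norm on 1st layer)
            *block(ndf, ndf * 2),                  # C128
            *block(ndf * 2, ndf * 4),              # C256
            *block(ndf * 4, ndf * 8, stride=1),    # C512
            nn.Conv2d(ndf * 8, 1, 4, 1, 1),        # 1-channel patch logits
        )

    def forward(self, x):
        return self.net(x)
```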
Related Works
Structured losses for image modeling; Conditional GANs
Overview
Training a conditional GAN to map edges→photo. The discriminator, D, learns to classify between fake (synthesized by the generator) and real {edge, photo} tuples. The generator, G, learns to fool the discriminator. Unlike an unconditional GAN, both the generator and discriminator observe the input edge map.
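A minimal sketch of one such training step, assuming PyTorch modules `G` and `D` with optimizers already constructed (names here are illustrative). The key detail from the overview is that D never sees a photo alone: it scores the {edge, photo} pair, concatenated on the channel axis. The released pix2pix code additionally adds the L1 term to the generator loss, omitted here for brevity.

```python
import torch
import torch.nn as nn

bce = nn.BCEWithLogitsLoss()

def train_step(G, D, opt_G, opt_D, edges, photos):
    """One conditional-GAN update on a batch of {edge, photo} pairs."""
    fake = G(edges)

    # --- Discriminator: real {edge, photo} pairs -> 1, fake pairs -> 0.
    opt_D.zero_grad()
    real_logits = D(torch.cat([edges, photos], dim=1))
    fake_logits = D(torch.cat([edges, fake.detach()], dim=1))
    loss_D = (bce(real_logits, torch.ones_like(real_logits)) +
              bce(fake_logits, torch.zeros_like(fake_logits)))
    loss_D.backward()
    opt_D.step()

    # --- Generator: try to make D label fake {edge, photo} pairs as real.
    opt_G.zero_grad()
    fake_logits = D(torch.cat([edges, fake], dim=1))
    loss_G = bce(fake_logits, torch.ones_like(fake_logits))
    loss_G.backward()
    opt_G.step()

    return loss_D.item(), loss_G.item()
```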