Authors
Yuheng Li, Krishna Kumar Singh, Utkarsh Ojha, Yong Jae Lee
University of California, Davis
Abstract
We present MixNMatch, a conditional generative model that learns to disentangle and encode background, object pose, shape, and texture from real images with minimal supervision, for mix-and-match image generation. We build upon FineGAN, an unconditional generative model, to learn the desired disentanglement and image generator, and leverage adversarial joint image-code distribution matching to learn the latent factor encoders. MixNMatch requires bounding boxes during training to model background, but no other supervision. Through extensive experiments, we demonstrate MixNMatch's ability to accurately disentangle, encode, and combine multiple factors for mix-and-match image generation, including sketch2color, cartoon2img, and img2gif applications. Our code, models, and demo can be found at https://github.com/Yuheng-Li/MixNMatch.
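To make the training idea concrete, below is a minimal PyTorch sketch of the adversarial joint image-code distribution matching used to learn the latent factor encoders. It follows the ALI/BiGAN-style formulation this objective is based on; the network definitions, dimensions, and names (`E`, `G`, `D`, `D_IMG`, `D_CODE`) are toy placeholders, not the paper's actual architecture, which inherits FineGAN's hierarchical generator.

```python
import torch
import torch.nn as nn

D_IMG, D_CODE = 64 * 64 * 3, 32  # toy sizes, chosen for illustration only

# Stand-in networks; MixNMatch uses one such encoder per factor
# (background, pose, shape, texture) on top of a FineGAN generator.
E = nn.Sequential(nn.Linear(D_IMG, D_CODE))              # image -> latent code
G = nn.Sequential(nn.Linear(D_CODE, D_IMG), nn.Tanh())   # latent code -> image
D = nn.Sequential(nn.Linear(D_IMG + D_CODE, 1))          # critic on (image, code) pairs

bce = nn.BCEWithLogitsLoss()

def discriminator_loss(x_real, z_prior):
    """D learns to tell apart (real image, encoded code) pairs
    from (generated image, sampled code) pairs."""
    with torch.no_grad():              # E and G are held fixed on this step
        z_enc = E(x_real)
        x_fake = G(z_prior)
    d_real = D(torch.cat([x_real, z_enc], dim=1))
    d_fake = D(torch.cat([x_fake, z_prior], dim=1))
    return bce(d_real, torch.ones_like(d_real)) + \
           bce(d_fake, torch.zeros_like(d_fake))

def encoder_generator_loss(x_real, z_prior):
    """E and G are trained to fool D (labels flipped), which drives the
    two joint image-code distributions to match; step only E/G params."""
    d_real = D(torch.cat([x_real, E(x_real)], dim=1))
    d_fake = D(torch.cat([G(z_prior), z_prior], dim=1))
    return bce(d_real, torch.zeros_like(d_real)) + \
           bce(d_fake, torch.ones_like(d_fake))
```

When the two joint distributions match, the encoder's output on real images is distributed like the prior code the generator was trained with, which is what lets encoded real images plug directly into the generator.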
Contributions
- We introduce MixNMatch, a conditional generative model that learns to disentangle and encode background, object pose, shape, and texture factors from real images with minimal human supervision. This gives MixNMatch fine-grained control over image generation, where each factor can be controlled independently. MixNMatch can take as input either real reference images, sampled latent codes, or a mix of both (see the inference sketch after this list).
- Through various qualitative and quantitative evaluations, we demonstrate MixNMatch's ability to accurately disentangle, encode, and combine multiple factors for mix-and-match image generation. Furthermore, we show that MixNMatch's learned disentangled representation leads to state-of-the-art fine-grained object category clustering results on real images.
- We demonstrate a number of interesting applications of MixNMatch, including sketch2color, cartoon2img, and img2gif.
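As referenced in the first bullet, the sketch below illustrates the mix-and-match inference interface: each factor is encoded from its own reference image and the four codes are recombined by the generator. The function name, the `encoders` dict keys, and the generator signature are hypothetical placeholders, not the released code's API.

```python
import torch

def mix_and_match(generator, encoders, img_bg, img_pose, img_shape, img_tex):
    """Generate one image whose background comes from img_bg, object pose
    from img_pose, shape from img_shape, and texture from img_tex.
    `encoders` is assumed to hold one trained encoder per factor."""
    b = encoders["background"](img_bg)   # background code
    z = encoders["pose"](img_pose)       # pose code
    p = encoders["shape"](img_shape)     # shape code
    c = encoders["texture"](img_tex)     # texture code
    return generator(b, z, p, c)

# Any factor can instead be sampled from its prior rather than encoded,
# e.g. a random texture with everything else taken from reference images:
#   c = torch.randn(1, code_dim)   # code_dim is a placeholder
```

Because every factor passes through its own code, any subset can be swapped for prior samples or for codes from a different image, which is what drives the listed applications, e.g. img2gif re-encodes only the pose code frame by frame while the other three codes stay fixed.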
Related Work
Conditional image generation; Disentangled representation learning
Comparisons
Simple-GAN, InfoGAN, LR-GAN, StackGANv2, FineGAN