Authors
Hyunsu Kim, Gayoung Lee, Yunjey Choi, Jin-Hwa Kim, Jun-Yan Zhu
NAVER AI Lab; SNU AIIS; CMU
Summary
Image blending is challenging when the original and reference images are not aligned. Existing 2D-based methods [50] struggle to synthesize realistic results due to 3D object pose differences between the foreground and the background. In contrast, we propose a 3D-aware blending method that aligns and composes unaligned images without manual effort.
Abstract
Image blending aims to combine multiple images seamlessly. It remains challenging for existing 2D-based methods, especially when input images are misaligned due to differences in 3D camera poses and object shapes. To tackle these issues, we propose a 3D-aware blending method using generative Neural Radiance Fields (NeRF), including two key components: 3D-aware alignment and 3D-aware blending. For 3D-aware alignment, we first estimate the camera pose of the reference image with respect to generative NeRFs and then perform 3D local alignment for each part. To further leverage 3D information of the generative NeRF, we propose 3D-aware blending that directly blends images on the NeRF's latent representation space, rather than raw pixel space. Collectively, our method outperforms existing 2D baselines, as validated by extensive quantitative and qualitative evaluations with FFHQ and AFHQ-Cat.
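To make the alignment step concrete, below is a minimal PyTorch sketch of estimating the reference image's camera pose against a generative NeRF by gradient descent. The generator interface G(w, pose), the two-parameter pose, and the plain L2 objective are illustrative assumptions rather than the paper's implementation, which additionally performs 3D local alignment for each part.

```python
# Hedged sketch: camera-pose estimation against a generative NeRF.
# Assumes a pretrained generator callable as G(w, pose) -> (H, W, 3) image;
# the (yaw, pitch) pose parameterization and L2 loss are illustrative only.
import torch

def estimate_pose(G, w, reference, steps=200, lr=1e-2):
    """Optimize a camera pose so the NeRF's rendering matches the reference."""
    pose = torch.zeros(2, requires_grad=True)  # [yaw, pitch], hypothetical
    opt = torch.optim.Adam([pose], lr=lr)
    for _ in range(steps):
        rendered = G(w, pose)                  # render from the current pose
        loss = ((rendered - reference) ** 2).mean()
        opt.zero_grad()
        loss.backward()
        opt.step()
    return pose.detach()
```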
Related Works
Image blending; 3D-aware generative models; 3D-aware image editing
Comparisons
Poisson Blending, Latent Composition, StyleGAN3, StyleMapGAN, SDEdit
Overview
We employ a density-blending loss (L_density) on the volume density in 3D NeRF space, as well as an image-blending loss (L_image) in 2D image space. Green rays pass through the interior of the mask (m) and red rays pass through the exterior of the mask (1 − m). L_image and L_density are used to optimize the latent code w_edit to generate the well-blended image I_edit.
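A minimal PyTorch sketch of this objective follows, assuming a generator that exposes a 2D rendering G.render(w) and per-ray volume densities G.densities(w); the foreground/background targets, the loss weighting lam, and the plain L2 terms are hypothetical stand-ins rather than the paper's exact formulation.

```python
# Hedged sketch: optimizing w_edit with an image-blending loss (L_image)
# and a density-blending loss (L_density). G.render / G.densities and the
# foreground/background targets are hypothetical stand-ins.
import torch

def blending_loss(G, w_edit, m, I_fg, I_bg, sigma_fg, sigma_bg, lam=1.0):
    # m: (H, W) mask in [0, 1]; rays through the mask interior match the
    # foreground targets, rays through the exterior (1 - m) the background.
    I_edit = G.render(w_edit)                      # (H, W, 3) image
    L_image = (m[..., None] * (I_edit - I_fg) ** 2).mean() \
            + ((1 - m)[..., None] * (I_edit - I_bg) ** 2).mean()
    sigma_edit = G.densities(w_edit)               # (H, W, S) samples per ray
    L_density = (m[..., None] * (sigma_edit - sigma_fg) ** 2).mean() \
              + ((1 - m)[..., None] * (sigma_edit - sigma_bg) ** 2).mean()
    return L_image + lam * L_density

def blend(G, w_init, m, I_fg, I_bg, sigma_fg, sigma_bg, steps=500, lr=1e-2):
    """Optimize the latent code w_edit to produce the blended image I_edit."""
    w_edit = w_init.clone().requires_grad_(True)
    opt = torch.optim.Adam([w_edit], lr=lr)
    for _ in range(steps):
        loss = blending_loss(G, w_edit, m, I_fg, I_bg, sigma_fg, sigma_bg)
        opt.zero_grad()
        loss.backward()
        opt.step()
    return G.render(w_edit.detach())               # the well-blended I_edit
```

Blending the volume densities as well as the pixels is what ties the mask to the 3D geometry along each ray, rather than only to 2D appearance.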