Oral
Visual Anagrams: Generating Multi-View Optical Illusions with Diffusion Models
Daniel Geng · Inbum Park · Andrew Owens
Summit Flex Hall AB Oral #3
We consider the problem of synthesizing multi-view optical illusions: images that change appearance upon a transformation, such as a flip. We present a conceptually simple, zero-shot method to do so based on diffusion. At every diffusion step, we estimate the noise from different views of a noisy image, combine the noise estimates, and perform a step of the reverse diffusion process. A theoretical analysis shows that this method works precisely for views that can be written as orthogonal transformations, of which permutations are a subset. This leads to the idea of a visual anagram, which includes images that change appearance upon a rotation or a flip, but also upon more exotic pixel permutations such as a jigsaw rearrangement. We provide both qualitative and quantitative results demonstrating the effectiveness and flexibility of our method.
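The per-step procedure described in the abstract can be illustrated with a minimal NumPy sketch. Several things here are assumptions made for illustration and are not stated in the abstract: each estimate is mapped back through the inverse of its view before combining, the combination is a plain average, and `toy_noise_predictor` is a hypothetical stand-in for a text-conditioned diffusion model. The helper `is_orthogonal_view` checks numerically whether a view acts as an orthogonal matrix on flattened pixels, the condition named in the theoretical analysis.

```python
import numpy as np


def identity(x):
    return x


def flip_v(x):
    # Vertical flip: a pixel permutation that is its own inverse.
    return np.flipud(x)


# Each entry is a (view, inverse_view) pair; entry i is paired with prompt i.
VIEWS = [(identity, identity), (flip_v, flip_v)]


def toy_noise_predictor(x_view, t, prompt_id):
    # Hypothetical placeholder for a conditioned denoiser. A real model
    # would return its noise estimate for the given image, timestep, prompt.
    return 0.1 * (prompt_id + 1) * x_view


def combined_noise_estimate(x_t, t, views, predictor):
    # Estimate noise in each view, map each estimate back to the
    # un-transformed frame (assumption), then average (assumption).
    estimates = []
    for prompt_id, (view, inverse) in enumerate(views):
        eps = predictor(view(x_t), t, prompt_id)
        estimates.append(inverse(eps))
    return np.mean(estimates, axis=0)


def is_orthogonal_view(view, shape, tol=1e-10):
    # Build the matrix A of a linear view by applying it to every basis
    # image, then test whether A^T A equals the identity.
    n = int(np.prod(shape))
    basis = np.eye(n).reshape(n, *shape)
    A = np.stack([view(b).ravel() for b in basis], axis=1)
    return np.allclose(A.T @ A, np.eye(n), atol=tol)
```

In a full sampler, the output of `combined_noise_estimate` would be handed to the reverse diffusion update of whichever sampler is in use. A flip passes the orthogonality check, while a view that rescales pixel values (for example `lambda x: 2 * x`) does not, since it would change the statistics of the noise the model expects to see.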