Stable Diffusion
arXiv:2406.18790 MUMU: Bootstrapping Multimodal Image Generation from Text-to-Image Data
B) Related
C) References