Will latent diffusion improve on DALL-E2 text conditioning? #42

lucidrains (Maintainer) · 2022-05-01T17:45:52Z

Is working in latent space better?

Yet another potential paper up for grabs :)

lucidrains · 2022-05-02T19:20:38Z

The other idea would be to try a CLIP with fine-grained interactions between text and image tokens (FILIP, https://arxiv.org/abs/2111.07783).

x-clip already offers this through the use_all_token_embeds setting: https://github.com/lucidrains/x-clip
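
For anyone unfamiliar with the fine-grained idea, here is a rough pure-Python sketch of token-wise max similarity as I understand it from the paper linked above. It is not the x-clip implementation; the function names are made up for illustration, and padding-token masking is left out.

```python
import math


def l2_normalize(vec):
    # Scale a vector to unit length so dot products become cosine similarities.
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec]


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def fine_grained_similarity(image_tokens, text_tokens):
    """Token-wise max similarity between one image and one text.

    image_tokens: list of per-patch embedding vectors
    text_tokens:  list of per-word embedding vectors
    Hypothetical helper for illustration; no masking of padding tokens.
    """
    image_tokens = [l2_normalize(t) for t in image_tokens]
    text_tokens = [l2_normalize(t) for t in text_tokens]

    # sim[i][j] = cosine similarity of image token i and text token j
    sim = [[dot(i, t) for t in text_tokens] for i in image_tokens]

    # Each image token picks its best-matching text token, then average.
    image_to_text = sum(max(row) for row in sim) / len(image_tokens)
    # Each text token picks its best-matching image token, then average.
    text_to_image = sum(max(col) for col in zip(*sim)) / len(text_tokens)
    return image_to_text, text_to_image


# Toy example: two image patches, one text token aligned with the first patch.
i2t, t2i = fine_grained_similarity([[1.0, 0.0], [0.0, 1.0]], [[1.0, 0.0]])
print(i2t, t2i)  # 0.5 1.0
```

The contrast with vanilla CLIP is that the similarity comes from all token embeddings rather than a single pooled embedding per modality, which is presumably why the setting is named the way it is.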

1 reply

🤔 Actually, need to make sure the code in the repository is compatible with that setting.