Submitted by hapliniste t3_yck1sx in MachineLearning
With the success of diffusion models in image generation, I was wondering if doing the same but with text embeddings would make sense.
The idea would be to diffuse the embeddings so they end up slightly off in vector space and position, then learn to correct them, with the same iterative refinement process over multiple passes.
Would that make any sense? I don't think I've heard of research in this area.
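A minimal sketch of what the question describes, assuming standard Gaussian diffusion applied to token embeddings: add scheduled noise to the embedding vectors and train a small network to predict that noise, which can then be applied iteratively at sampling time. All names, dimensions, and the tiny denoiser architecture here are illustrative, not from any particular paper.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)
vocab, dim, seq_len, steps = 100, 32, 8, 50

# Linear noise schedule; alpha_bar[t] is the cumulative signal fraction kept at step t.
betas = torch.linspace(1e-4, 0.05, steps)
alpha_bar = torch.cumprod(1.0 - betas, dim=0)

embed = nn.Embedding(vocab, dim)
# Toy denoiser: takes a noisy embedding plus a scalar timestep feature,
# predicts the noise that was added (epsilon-prediction objective).
denoiser = nn.Sequential(nn.Linear(dim + 1, 64), nn.ReLU(), nn.Linear(64, dim))

def q_sample(x0, t):
    """Forward process: x_t = sqrt(abar_t)*x0 + sqrt(1 - abar_t)*eps."""
    ab = alpha_bar[t].view(-1, 1, 1)
    eps = torch.randn_like(x0)
    return ab.sqrt() * x0 + (1 - ab).sqrt() * eps, eps

opt = torch.optim.Adam(
    list(embed.parameters()) + list(denoiser.parameters()), lr=1e-3
)
tokens = torch.randint(0, vocab, (4, seq_len))  # a dummy batch of token ids

for _ in range(10):  # a few training steps, just to show the loop
    x0 = embed(tokens)
    t = torch.randint(0, steps, (4,))          # random timestep per example
    xt, eps = q_sample(x0, t)                  # corrupt the embeddings
    t_feat = (t.float() / steps).view(-1, 1, 1).expand(-1, seq_len, 1)
    eps_pred = denoiser(torch.cat([xt, t_feat], dim=-1))
    loss = ((eps_pred - eps) ** 2).mean()      # learn to predict the noise
    opt.zero_grad(); loss.backward(); opt.step()
```

At sampling time one would start from pure noise and repeatedly apply the denoiser over the timestep schedule, then round the final vectors back to the nearest token embeddings.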
fastglow t1_itmotdk wrote
It's been applied to text-to-speech: https://arxiv.org/abs/2104.01409