Make-A-Video by Meta AI

by P.

29th Sep 2022

This is a linkpost for https://makeavideo.studio/

Meta AI (Facebook) created a text-to-video model by taking a diffusion text-to-image model, adding temporal convolutional and attention layers, and fine-tuning it on video data (without text). They also use spatial and temporal super-resolution networks. This shows, to the surprise of no one who was paying attention, that our existing, mostly homogeneous architectures can easily be extended to understand, to some extent, the structure of everyday reality. It's not the first text-to-video model, but it's much better than what came before.
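
To make the "adding temporal layers" idea a bit more concrete, here is a toy sketch of my own, not Meta's code. It assumes a per-frame image layer (the hypothetical `spatial_layer`, here just a pixel scaling) followed by a new 1D convolution along the time axis (`temporal_conv`). Starting the temporal kernel at the identity, which is an assumption of this sketch, means the extended layer initially behaves exactly like the image layer applied frame by frame, and fine-tuning on video can then teach it to mix information across frames. All names, shapes, and numbers are made up for illustration.

```python
from typing import List

Frame = List[float]   # one flattened image (toy stand-in for a feature map)
Video = List[Frame]   # frames ordered in time


def spatial_layer(frame: Frame) -> Frame:
    # Hypothetical stand-in for a pretrained image-model layer:
    # it only looks at one frame and knows nothing about time.
    return [2.0 * p for p in frame]


def temporal_conv(video: Video, kernel: List[float]) -> Video:
    # 1D convolution along the time axis, applied independently to each
    # pixel position, with zero padding at the first and last frame.
    assert len(kernel) == 3
    n = len(video)
    out: Video = []
    for t in range(n):
        mixed: Frame = []
        for i in range(len(video[t])):
            prev = video[t - 1][i] if t > 0 else 0.0
            nxt = video[t + 1][i] if t < n - 1 else 0.0
            mixed.append(kernel[0] * prev + kernel[1] * video[t][i] + kernel[2] * nxt)
        out.append(mixed)
    return out


def spatial_then_temporal(video: Video, kernel: List[float]) -> Video:
    # Image layer on every frame, then the new temporal layer on top.
    per_frame = [spatial_layer(f) for f in video]
    return temporal_conv(per_frame, kernel)


IDENTITY = [0.0, 1.0, 0.0]  # assumed initialisation: passes each frame through unchanged

video = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]
same = spatial_then_temporal(video, IDENTITY)
smoothed = spatial_then_temporal(video, [0.25, 0.5, 0.25])
```

With the identity kernel, `same` equals the image layer applied to each frame separately; with the second kernel, each output frame becomes a blend of itself and its neighbours in time.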

[-]P.3y41

Emad from Stability AI (the people behind Stable Diffusion) says that they will make a model better than this.

[-]Raemon3y20

I think this'd be better with a description of what it is and how it's relevant. (Linkposts generally benefit from that, and in this case it almost looks like spam if you're not paying attention)

[-]P.3y21

And here we have another one: https://phenaki.video/

[-]P.3y20

And a 3D one, which works by optimizing a differentiable volumetric representation using 2D diffusion: https://dreamfusionpaper.github.io/
