This is a linkpost for https://openai.com/blog/dall-e/
My own take: Cool, not super surprising given GPT-3 and Image GPT. I look forward to seeing what a bigger version of this would do, so that we can get a sense of how much it improves with scale. I'm especially interested in its performance on Raven's Progressive Matrices.
I wonder if something like this could be paired with AI Dungeon? If they do release an image generator model for public or private use, I think it would be fun to see an image accompany the last line(s) of the text generated for the story so far.
Then more complex AI-generated games wouldn't be too far away either.
Taking a sentence output by AI Dungeon and feeding it into DALL-E is totally possible (if and when the DALL-E source code becomes available). I'm not sure how much money it would cost. DALL-E has about 7% of the parameters of the largest GPT-3 model, though I doubt AI Dungeon uses the largest model. Generating an entire image with DALL-E means predicting 1024 tokens/codewords, whereas generating text costs at most 1 token per letter. All in all, it seems financially plausible. I think it would be fun to see the results too.
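To make the cost comparison above a bit more concrete, here is a rough back-of-the-envelope sketch. It assumes the publicly reported parameter counts (12 billion for DALL-E, 175 billion for the largest GPT-3) and the common rule of thumb that a transformer forward pass costs about 2 FLOPs per parameter per generated token. The function names are made up for illustration, and the estimate ignores the text prompt, the image decoder, and any sample reranking, so treat it as an order-of-magnitude guess rather than a real price.

```python
# Back-of-the-envelope comparison, under stated assumptions only.
DALLE_PARAMS = 12e9          # publicly reported DALL-E size
GPT3_LARGEST_PARAMS = 175e9  # publicly reported size of the largest GPT-3
IMAGE_TOKENS = 1024          # tokens/codewords per generated image


def param_ratio(small: float, large: float) -> float:
    """Fraction of the larger model's parameters that the smaller one has."""
    return small / large


def forward_flops(params: float, tokens: int) -> float:
    """Rule-of-thumb inference cost: ~2 FLOPs per parameter per token (assumption)."""
    return 2 * params * tokens


def image_cost_in_gpt3_tokens() -> float:
    """How many largest-GPT-3 tokens cost about as much as one DALL-E image."""
    image_cost = forward_flops(DALLE_PARAMS, IMAGE_TOKENS)
    one_text_token_cost = forward_flops(GPT3_LARGEST_PARAMS, 1)
    return image_cost / one_text_token_cost


if __name__ == "__main__":
    print(f"parameter ratio: {param_ratio(DALLE_PARAMS, GPT3_LARGEST_PARAMS):.1%}")
    print(f"one image ~ {image_cost_in_gpt3_tokens():.0f} largest-GPT-3 tokens")
```

Under these assumptions, one image comes out at roughly the compute of 70 tokens from the largest GPT-3, which is consistent with the guess that pairing the two is financially plausible.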
What seems tricky to m...