Some suggestions of things to try:
It makes me wonder what 22 bucks in a glass of COVID looks like. It also matches pretty well with my experience with dreams, where written words and letters are always fake even though I "know" what they stand for.
More importantly, what does it produce when you ask it to draw the future (maybe in the style of the 70s)?
Okay, seriously, this is a great way to explore how "common sense" differs between humans and this AI and highlight the risks, visually and viscerally, of relying on a technology that is fundamentally alien to humanity. Images are innocuous, but what happens when you apply AI to other objectives?
[22 bucks in a glass of covid] gets back an error message that the request violated guidelines.
[the future in the style of the 70's] is probably too vague to end up being awesome, but trying it... guess I can't put it in a comment. I'll add a request section to the post.
In general all writing I've seen is bad. I think this is less likely to be about safety, and more that it's hard to learn language by looking at a lot of images. However, since DE2 is trained on text, it clearly knows a lot about language at some level -- I would expect there's plenty of data to put out coherent text. Instead it outputs nonsense, focusing on getting the fonts and the background right.
It's definitely possible to get a diffusion model to write the text from a prompt into an image. I made a model that does this late last year. (blogpost...
It would be very interesting to see how much it understands space, for instance by making it draw maps. Perhaps "A map of New York City, with Central Park highlighted"? (I'm not sure if this is specific enough, but I fear that adding too many details will push Dall-E to join together various images.)
Some suggestions that seem like they might make cool or interesting images:
Before the diamondoid zeppelins clustering in the sky completely blotted out the sun, one came low enough for the message on its flank to be seen: "Tlon, Uqbar, Stlalm Anit".
Maybe the failure with people is caused by whatever they tweaked to avoid 'generating realistic faces and known persons'?
What about:
Buffalos in Buffalo buffaloing buffalo
Fire dousing water in the style of the Mona Lisa
Two rubiks cubes solving each other
Chebyshev's inequality
"An animal looking curiously in the mirror, but the reflection is a different kind of animal; in digital style."
"A cat looking curiously in the mirror, but the reflection is a different kind of animal; in digital style."
"A cat looking curiously in the mirror, but the reflection is a dog; in digital style."
Curious to see how it handles modified-reflection and lack-of-specificity.
How about "the words 'hello world!' written on a piece of paper"? Or you could substitute "on a computer screen" for "on a piece of paper", or you could just leave out the writing medium entirely. I'm curious if it can handle simple words if asked specifically for them.
It's weird that it turns Riemann into Reynam, which seems like a mistake someone would make if you told them the prompt in person.
In general a lot of the words kind of seem like it understands phonetics? Like "Synngy" is kinda phonetically close to "singularity".
It's kinda like it's generating its own gibberish language that could fool you into thinking someone was talking about the subject if you weren't paying attention while they talked... Or something.
I put myself on the waiting list for DALL-E. Meanwhile, here are a few things I'd ask it to depict, mainly trying to stress it to see how much it can do with little:
A drawing of a muchness.
The inside of a neutron.
A Ticktockman saying "Repent!" to a Harlequin.
A self-portrait of Escher making a self-portrait.
Mornington Crescent. (A while back I tried this on wombo.art, and it did quite a good job, although not on the level of draughtsmanship of DALL-E.)
This is really awesome, I'm at a loss for words. Can you try some of these?
- Division by zero
- Aliens in the style of pixel art
- Earth after climate change
- A garden in the style of watercolor line art
i have some ideas
something to cheer me up
a cube entering the 4th dimension
a realistic painting of the world cut in half, with one half being heaven and the other hell
a unknown backrooms level
a thing no one has ever seen before
a large spider wreaking havoc on new york city in the style of vincent van gogh
a flying city in the clouds of jupiter
the end of the universe
a impossible object with impossible colors
a satellite made by aliens
the most random thing possible
the mandelbrot set
complex mathematics
a monster with the head of a siren painted on a cave wall
a ph...
Awesome!
I'm surprised, however, that nobody seems to have tried the prompt:
"Do androids dream of electric sheep?"
Not even with DALL-E 1?!
P.S.: The picture for this article (also used for Dick's book) looked promising, but it seems it was a "mere" "weird" human who painted it?
https://www.fondazionesinapsi.it/orione/ma-gli-androidi-sognano-pecore-elettriche/ (it)
P.P.S.: At least one journalist (or more likely, her editor) had the same (again, pretty obvious) idea for an article title about AI, but even though it mentions DALL-E (1), they didn't think o...
Is there any way of reverse engineering from these pictures what existing images were used to generate them? Would be interesting to see how much similarity there is.
Some suggestions for testing the limits of abstract, or spatial reasoning:
the back of the letter 'E'
the back of the letter e
back of the number 42
the back of the last letter in the alphabet
underside of flat earth
the view of earth from 1,000,000,000 km away
view of earth from a million miles away
a view of bacteria using a billion times magnification
This is really interesting... could I request "new AI system that can create realistic images"? I'm curious to see how it handles self-reference
It's a good thing there are embedded safety precautions to prevent celebrities being spanked, but the resulting images would probably not be realistic or specific enough for those folks. This seems like a great way to generate original art for home display screens that reflects the owner's personal visions.
Suggestion.
Clown fish getting resuscitated
Clown fish getting CPR
Maybe replace 'Clown fish' with 'Nemo' to see if Dall-E 2 can detect the cultural reference of Nemo as an animated representation of a clown fish. Also, will Dall-E 2 recognize CPR as resuscitation?
Combining the styles of two artists might be interesting. Something like:
"A painting in the style of {Artist 1} and {Artist 2}", let's say Claude Monet and Piet Mondrian
Also I think "A Zebra in the style of Mark Rothko" could be funny (Or "A Zebra with stripes in the style of Mark Rothko").
can you try the following: "the full alphabet of the robigull language, with translation to english"?
I enjoy the concept of fine art containing out-of-place items.
"Still life with apples and sausages by Paul Cézanne" "The Hay Wain by John Constable with TIE fighters" "American Gothic by Grant Wood with a large eye in the window" "The Last Supper with a bar, featuring Robocop arm-wrestling Jesus" "Still life with apples, bread and TARDIS in the style of Van Gogh" "Apple iPhone advert in the style of Edvard Munch"
I've been wanting to access DALL-E 2 so that I can generate ideas such as these, and then do physical paintings as per the original artists. After all, what is original or derivative? 🤔
Here are my suggestions: An illustration of a floating white hand in the middle of a 8 ball with wings. A cube made out of M&M's in a digital art style. A hyperrealistic surrealistic photograph of a pear in a pear while in a mcdonalds in a light. (I wonder how this will go.) Last one, A tornado made out of fire enveloping a city.
2 robots engaged in an epic rap battle
and
A rabbit dressed like a Samurai in the style of a Japanese painting
Suggestions:
Thanks for sharing these images! Truly astounding stuff.
I have so many questions! Love the AI storytelling... the playing cards work if you incorporate the "dog ate my homework" thought into the depiction. I'm also thinking you are having way too much fun doing this... ;-0
Thanks for posting these.
It's odd that mentioning Dall-E by name in the prompt would be a content policy violation. Do you know if they've mentioned why?
If you're still taking suggestions:
A beautiful, detailed illustration by James Gurney of a steampunk cheetah robot stalking through the ruins of a post-singularity city. A painting of an ornate brass automaton shaped like a big cat. A 4K image of a robotic cheetah in a strange, high-tech landscape.
I think OpenAI mentioned that including the same information several times with different ph...
Suggestions:
I always thought that it's weird that AI struggles with text, just as in my dreams. Every time I open a book in a dream, it's jumbled and nonsensical and I can immediately tell I'm dreaming.
my requests:
-mars combined with earth
-saturn with adorable rhinos having a party on saturn's ring
-dogs and cats enjoying diamonds and golds falling out of the sky
-pencils dancing with erasers
-sad guinea pigs running around
-a deformed roll of toilet paper on top of a elephant
thanks!
Suggestion: "sangaku proving the Pythagoras' theorem". I wonder if it can do visual explanations.
I just want to see how far I can push it, can you try:
1)
A glossy black mega world maze underground surrounded by lava. White standing blackbucks guard the entrance gate. Large gold keys float throughout the maze. Around the maze is glossy black castles with many levels that wrap around. Ultra 4K realistic photography.
2)
Macro of a flask that is containing glowing blue liquid sitting on a small short white table, is spilling into a gold treasure chest sitting in a glossy white hall. A robot pirate wearing a red bandana with red ruble eyes has his hands in th...
How about <some prompt without specified style you already used> + ", drawn by state-of-the-art image-generating neural net"?
Ooh ooh - try a dangerous Cognitohazard Memetic SCP image that will make anyone who sees it immediately want to reshare it with everyone they know and obsessively discuss it
How does it fare with impossible things like a 'seven sided cube' or a 'circle with four sides' or a '12 dimensional tesseract' ?
Some suggestions:
Hi. Thanks for the great post! I have a question and a request. What is the longest, most complex prompt that Dall-E 2 has honoured in a reasonably complete way? For the request, since I am a fan of aviation movies, I'd be curious what it would give in response to “poster of a classical aviation movie”, or something like it. I expect that some of them would show human faces, but actual posters often featured planes dogfighting and the like.
Can't wait to get access. In the meantime would love to see:
I have a few requests.
-Ancient greek smartphone
-A soda can that looks like the Empire State
-Iron Man painted by Caravaggio
-Steampunk tetris
-A strawberry that looks like a piano
This is awesome! I feel weird asking you to plug prompts into the machine. I wonder how it does with logo design, something like “the logo for a new longtermist startup”? Not using for commercial purposes; just curious.
Also curious about some particular word play à la Mary Poppins: “a cat drawing the curtains”
Very interesting!
I'll add some suggestions:
Then, some D&D-related suggestions, because why not (I would be surprised enough if it recognized the correct creatures at all):
Can you try:
1)
A muscular Egyptian piranha plant in a white robe guarding the entrance to Mario pipe world 7. Behind him is a world of pipes, and the walls are made of tall pipes. Realistic 4K photo.
2)
A glossy black mega world maze underground surrounded by lava. White standing blackbucks guard the entrance gate. Large gold keys float throughout the maze. Around the maze is glossy black castles with many levels that wrap around. Ultra 4K realistic photography.
3)
High definition surrealist artwork of blue aliens with tall legs and bodies and big heads standin...
Woah that's so cool! I've been messing with VQGAN recently and was wondering if the prompts would affect DALLE-2 the same way they affect VQGAN, since they both use CLIP to select the best outputs. Here are some prompts I would love to see:
Watercolor painting of apples with arms and legs fighting (or dancing if it violates the policy)
A painting of a cat vampire drinking wine by greg rutkowski
Thanks for sharing! Can I please request the following:
'An outback Australian landscape with T-rex dinosaurs being chased by ducklings'
'The Buddha attaining enlightenment with galaxies entering his mind'
'An AI using a laptop computer to watch YouTube'
'The Tesseract from the movie Interstellar, with inverted colours'
'The aftermath of Global nuclear war'
I'm so curious! Thanks a lot!
Could you try this one? It's dark and demanding:
Pikachu walking up the stairs of a crisp glossy black castle next to the sea at night. At the top of the stairs is a red carpet. A black regenerative liquid being is jumping out of a green metal Mario pipe at the end of the carpet. Ultra 4K definition.
Wow that is amazing! Could I request "giant peaceful butterfly with rainbow trail and sunglasses invades new york in the morning" I know it's oddly specific but I am interested in seeing how the ai will handle all of these little details. Thanks!!
I got access to Dall·E 2 yesterday. Here are some pretty pictures!
My goal was to try to understand what things DE2 could do well, and what things it had trouble understanding or generating. My general hypothesis was that it would do a better job with things that are easy to find on the internet (cute animals, digital scifi things, famous art) and less well with more abstract or more unusual things.
Here's how it works: you put in a description of a picture, and it thinks for ~20 seconds and then produces 10 photos that are variations on that description. The diversity varies quite a bit depending on the prompt.
Let's see some puppies!
One thing to be aware of when you see amazing pictures that DE2 generates is that there is some cherry picking going on. It often takes a few prompts to find something awesome, so you might have looked at dozens of images or more.
Still, this is pretty great! Those are recognizably goldendoodle puppies, mostly in something approximating play position.
You can see that the proportions in the generated images are not quite right, and some of the detail is off if you look closely. For instance, the front legs are too long here, the face isn't quite right, and the ears are a bit weird.
Still, it's pretty amazing given that it generated this from scratch. Check out how realistic the grass looks. I also like that the background is blurred, though not quite in the way that a camera would do it -- the transition is too abrupt.
Ok but the point of this isn't that they have a great image generation transformer, though it's clearly that. The key thing is its magical ability to actually follow instructions or descriptions of images. Particularly interesting is compositionality -- can it combine concepts to generate something it's never seen before? Answer: yes!
The concept of "kitten" is pretty simple, though note that a kitten can be rendered in a ton of ways, from line drawings to cute art to photorealistic. Pop art is more complicated: it's a celebration of everyday images, and one of the most commonly known versions is Warhol's collection of repeated images in a grid with neon colors that vary per cell. And it mostly gets those things right.
What about weird things? You can put in any input and it'll do something.
None of those are Twitter-worthy, but with some trial and error you can get things that are interesting.
"Digital style" is one of the suggestions for getting better images.
X in Y style is fun; that's a lot of the images you see out in the world. Weirdly, it's pretty sensitive to the exact order you put things in.
Back to puppies, you get pretty different results depending on the placement of "surrealistic" even though the rephrasings seem semantically identical or at least very similar.
One place where DE2 clearly falls down is in generating people. I generated an image for [four people playing poker in a dark room, with the table brightly lit by an ornate chandelier], and people didn't look human -- more like the typical GAN-style images where you can see the concept but the details are all wrong.
Update: image removed because the guidelines specifically call out not sharing realistic human faces.
Anything involving people, small defined objects, and so on, looks much more like the previous systems in this area. You can tell that it has all the concepts, but can't translate them into something realistic.
This could be deliberate, for safety reasons -- realistic images of people are much more open to abuse than other things. Porn, deep fakes, violence, and so on are much more worrisome with people. They also mentioned that they scrubbed out lots of bad stuff from the training data; possibly one way they did that was removing most images with people.
Things look much better with animals, and better again with an artistic style.
The cards aren't right. Dice seem to be a lot easier.
People can also be pretty good if you don't see faces, though the hands are definitely not right.
Stlalm Anit is my new slogan.
In general all writing I've seen is bad. I think this is less likely to be about safety, and more that it's hard to learn language by looking at a lot of images. However, since DE2 is trained on text, it clearly knows a lot about language at some level -- I would expect there's plenty of data to put out coherent text. Instead it outputs nonsense, focusing on getting the fonts and the background right.
I definitely see serifs! I do not see sense.
Overall this is more powerful, flexible, and accurate than the previous best systems. It's still easy to find holes in it, but with some patience and willingness to iterate, you can make some amazing images.
In conclusion, generating a lot of images from a new state-of-the-art image generation system is fun. Thanks for reading. If there's interest, I can also explore in-painting. Here are a few more gratuitous pics!
Reader requests:
Is that more or less cool than the actual statue they built in Miami?
The concept of beauty, according to DE2, is mostly women putting on makeup, which I can't post due to restrictions on posting faces. These are really realistic, capturing ethnicity and expressing emotion, totally unlike the poker players from earlier. But there's this one pastoral scene, which is nice.
For this last one, I edited out some floating writing on the left and asked it to generate [a girl in a beautiful serene forest]. This one was also nice:
Seems kind of like generic anime and not so much Finnegans Wake.
What are those penguins on the bottom left doing?!?
This series suggests that DE2 gets reflections pretty well, but either doesn't understand what it means to have something else be the reflection, or the prior for a reflection reflecting the thing looking in the mirror is too hard for it to override.
Here's one where I edited out the cat in the mirror and changed the prompt to be about a dog, and it did something sensible.
It got it right twice out of 10 tries. That's good, right?
I tried to ask for Dall-E by name but that was a content policy violation.
It managed to get most of those elements in. Ultimately none of those is really satisfying though.
The good ones here had faces in them so I can't post them. I like how random this one is.
...is surprisingly calm and beautiful.
Boo!
A pen and some gibberish... is actually a pretty good metaphor for intellectual progress?
"A spaceship made of legos" is just more of the same.
It got the marching part. I guess DE2 hasn't ever played DnD.