Using GPT-3 To Generate Text Prompts for “AI” Generated Art

A beautiful room where a robot is standing with a paintbrush in front of a painted canvas. by John Berkey

Lately I’ve been playing with Disco Diffusion, a tool that allows you to generate images based on textual input. It’s a really cool tool that can create some pretty interesting results.

Disco Diffusion takes a text input and then uses a neural network to generate an image based on that input. The results can be pretty abstract, but sometimes they turn out really cool.

To get said cool results, you need good inputs and to tweak the settings a bit. The text inputs are called “prompts” and they can be anything from a single word to a full sentence. The settings allow you to control how the neural network generates the image, and there are a lot of options to play with.

If you’d like a well-made tutorial to get started with Disco Diffusion, the first video I saw on it, Creating Videos with Diffusion Models (Disco Diffusion 4.1), is very useful for getting a feel for the tool. For a deeper understanding, Zippy’s Disco Diffusion Cheatsheet v0.3 is an excellent beginner-friendly tutorial. Don’t be intimidated by its length: it’s a very pleasant read and not dry at all.

Anyway, as mentioned before, to get cool results you need to experiment with different settings and prompts, and typical images take 15-20 minutes to generate (at least with my settings).

After a while this can get a bit exhausting alongside work, because you want to keep generating images but have to switch context away from what you’re working on.

To take a bit of the load off, we can give GPT-3 or something similar (like Jurassic-1 from AI21 Studio) a few examples of prompts that have given good results, and let it come up with new ones.

A great tutorial that gave me some starting intuition for GPT-3 is How I used GPT-3 to hit Hacker News front page 5 times in 3 weeks.

In this article I’ll describe a few ways I use GPT-3 to generate prompts for Disco Diffusion and similar software, so that experimenting takes less effort.

I don’t have an optimal solution for this, but I can share the prompts for GPT-3 that I’m using. I didn’t try to perfect these prompts too much, and mostly use them for inspiration. Sometimes GPT-3 generates interesting prompts that I wouldn’t have thought of, and I may use just part of them. I post some results on Twitter @createdbyAI and IG @createdbyAI.art.

tl;dr AI News Section Preview

Sidenote: AI art tools are developing so fast it’s hard to keep up.

We set up a newsletter called tl;dr AI News.

In this newsletter we distill the information that’s most valuable to you into a quick read to save you time. We cover the latest news and tutorials in the AI art world on a daily basis, so that you can stay up-to-date with the latest developments.

Check tl;dr AI News

GPT-3 Prompt Insert Mode Examples

Q: What is CLIP+Diffusion AI Art Generation?
A: CLIP+Diffusion AI Art Generation is a method of creating art using artificial intelligence. This method involves using a neural network to generate images, and then using a diffusion process to create variations on those images.

Q: How should you write the text prompt for a art generation software to get ideal output for character art?
A: You should write a descriptive prompt that the computer can “understand”. A good way to look at it is like Alt (Alternative) text in HTML. It’s quite literal, descriptive, and to the point, so the machine doesn’t have to do so much guess work.

Q: Write a text prompt for a AI art generation software that would fit the art style of Kilian Eng
A: A scene of a [insert] by Kilian Eng

Although Complete Mode is the default GPT-3 mode, and it feels like it should be the first example, this is the main prompt I use because I feel it gives me a little bit more control to influence the resulting prompt.

My thought process here was that I wanted GPT-3 to have some context into what I want from a prompt for text-to-image generation.

I asked the questions and let it answer them, since it’s more knowledgeable and articulate than I am, then tweaked the answers a little, somewhat like I would with a human being.

I did this in Complete Mode available in GPT-3. Then I switched to Insert Mode.

Finally I asked it to generate a prompt in the style of a well known artist. This is because the artist in the prompt can heavily influence the result.

Other settings for GPT-3 are Temperature: 1 and a newline as the stop sequence, so it doesn’t go on writing excessively.

I’m using Insert Mode because I want it to generate scenarios based on the art style of well known artists in this case. This is because I’m hoping to give Disco Diffusion prompts with scenarios that play to the artist’s strengths. This is not necessarily a great idea, it’s just something I’m trying out. Also, GPT-3 doesn’t always give out scenarios fitting the artist’s style.
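To make the Insert Mode setup a bit more concrete, here is a minimal sketch of how the template could be split around the [insert] marker. The function names are made up for illustration, and the dictionary keys (prompt, suffix, temperature, stop) are only my assumption of how a completion request might be laid out. Nothing here calls a real API.

```python
INSERT_MARKER = "[insert]"


def build_insert_request(artist, context="", temperature=1.0, stop="\n"):
    """Split the Q/A template around [insert] into a prefix and a suffix.

    The returned keys are an assumed request layout, not a confirmed API.
    """
    template = (
        "Q: Write a text prompt for a AI art generation software "
        f"that would fit the art style of {artist}\n"
        f"A: A scene of a {INSERT_MARKER} by {artist}"
    )
    prefix, suffix = template.split(INSERT_MARKER, 1)
    return {
        "prompt": context + prefix,   # everything before the gap
        "suffix": suffix,             # everything after the gap
        "temperature": temperature,
        "stop": [stop],               # newline keeps the answer to one line
    }


def assemble_art_prompt(request, inserted_text):
    """Join prefix, generated text and suffix, then keep only the answer line."""
    full_text = request["prompt"] + inserted_text + request["suffix"]
    answer_line = full_text.splitlines()[-1]
    if answer_line.startswith("A: "):
        answer_line = answer_line[len("A: "):]
    return answer_line


request = build_insert_request("Kilian Eng")
print(assemble_art_prompt(request, "cyberpunk city"))
```

Keeping only the answer line also guards against pasting the question into the image generator by mistake.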

Here are some results:

Q: Write a text prompt for a AI art generation software that would fit the art style of Kilian Eng
A: A scene of a cyberpunk city, with bright lights and sharp geometric shapes. Think Blade Runner meets Tron. by Kilian Eng
Q: Write a text prompt for a AI art generation software that would fit the art style of Simon Stålenhag
A: When the last city fell, only the temple remained. A post-apocalyptic world by Simon Stålenhag.
Q: Write a text prompt for a AI art generation software that would fit the art style of John Blanche
A: A scene of a dark, dank swamp, with twisted trees and eerie glowing eyes staring from the shadows. by John Blanche
Q: Write a text prompt for a AI art generation software that would fit the art style of Thomas Kinkade
A: A house by a lake at sunset, with light shining through the windows and smoke coming from the chimney. by Thomas Kinkade

While generating images for this article, I accidentally gave Disco Diffusion the question as the prompt, instead of the answer (Write a text prompt for a AI art generation software that would fit the art style of Thomas Kinkade).

Write a text prompt for a AI art generation software that would fit the art style of Thomas Kinkade

This is the output image for the text I initially wanted to use. It’s a bit underwhelming and the other one is much better. It seems Disco Diffusion had a slightly different interpretation than we expected this time.

A house by a lake at sunset, with light shining through the windows and smoke coming from the chimney. by Thomas Kinkade
Q: Write a fantasy text prompt for a AI art generation software that would fit the art style of Michael Whelan
A: A terrifyingly huge dragon with multiple horns, spikes running down its back, and a long tail sweeps across the sky, blotting out the sun. Underneath, a lone figure stands defiantly, weapon drawn, preparing to face the creature. by Michael Whelan
Q: Write a fantasy text prompt for a AI art generation software that would fit the art style of Tyler Edlin
A: A magical kingdom where the sun never sets and the moon is always full. The trees are alive and the air is thick with magic. by Tyler Edlin

GPT-3 Prompt Complete Mode Examples

If you’ve ever used GPT-3, you’ve probably guessed how to use this mode. An important setting for me is the stop sequence, which is usually a newline.
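For anyone unsure what a stop sequence does, this small sketch imitates its effect on already generated text: everything from the first occurrence of the stop string onward is dropped. The function name is hypothetical, and in practice the service applies the stop sequence for you during generation.

```python
def truncate_at_stop(text, stop="\n"):
    """Return the text up to (not including) the first stop sequence."""
    index = text.find(stop)
    if index == -1:
        return text  # no stop sequence found, keep everything
    return text[:index]


raw_output = "A cyberpunk cityscape with bright neon lights.\nQ: Write another prompt"
print(truncate_at_stop(raw_output))
```

With a newline as the stop sequence, the model's answer stays on a single line instead of continuing with new questions of its own.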

GPT-3 Question/Answer Prompt

In the following example I’m using a prompt similar to the one in the example above. I tried to guide GPT-3 to the type of prompt I’m looking for, and then let it do its thing.

GPT 3 Complete Prompt

The result is quite underwhelming, but it happens.

A cyberpunk cityscape with bright neon lights and flying cars. The city is teeming with life, and the skies are filled with air traffic.

Sometimes I guide it a bit more in the direction I’d like. In this example I gave it “A cyberpunk” and, when it finished too soon, I gave it “The city is” and let it complete the rest.

GPT-3 Prompt 3

Result:

A cyberpunk cityscape with bright neon lights and flying cars. The city is teeming with life, and the skies are filled with air traffic.

GPT-3 “Previous Successful Prompts” Prompt

Another way is to give it a few examples of previously successful prompts, in hopes that it will generate something in the same style, and you can guide it in the direction you want.

Sometimes I guide it a little by starting the next line with something like just “A”.

GPT 3 Complete Mode Example 2
Credit to QmediaFX for the prompts.
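The “previous successful prompts” approach can be sketched as a tiny helper that puts one example per line and ends with a short seed for GPT-3 to continue. This is only an illustration: the function name is made up, and the examples below are stand-ins taken from the results earlier in this article, not the QmediaFX prompts from the screenshot.

```python
def build_few_shot_prompt(examples, seed="A"):
    """Put one example prompt per line and end with a seed to be completed."""
    lines = [example.strip() for example in examples if example.strip()]
    lines.append(seed)  # the model continues from here, e.g. from "A"
    return "\n".join(lines)


# Stand-in examples; swap in prompts that worked well for you.
previous_prompts = [
    "A scene of a cyberpunk city, with bright lights and sharp geometric shapes.",
    "A scene of a dark, dank swamp, with twisted trees and eerie glowing eyes staring from the shadows.",
]

print(build_few_shot_prompt(previous_prompts))
```

Because each example sits on its own line, a newline stop sequence makes GPT-3 return exactly one new prompt.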

Result:

A forgotten ruin in an overgrown jungle, inspired by H.P. Lovecraft's At the Mountains of Madness

Here’s the same thing but I added artists as modifiers, scaled the image to 1024×1024 and increased steps to 500.

The prompt used is “A forgotten ruin in an overgrown jungle, inspired by H.P. Lovecraft’s At the Mountains of Madness by Li Shuxing and Tyler Edlin”.

Mentioning well known artists can dramatically improve your output image. Increasing steps often makes images more detailed.
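Adding artists as modifiers is just string handling, so here is a minimal sketch of it. The helper name is hypothetical, and joining the names with “and” simply mirrors the prompt used above.

```python
def add_artist_modifiers(prompt, artists):
    """Append 'by <artist> and <artist>' to a prompt; no artists means no change."""
    if not artists:
        return prompt
    return f"{prompt.rstrip()} by {' and '.join(artists)}"


base_prompt = (
    "A forgotten ruin in an overgrown jungle, "
    "inspired by H.P. Lovecraft's At the Mountains of Madness"
)
print(add_artist_modifiers(base_prompt, ["Li Shuxing", "Tyler Edlin"]))
```

This makes it easy to try the same GPT-3 scenario with several artist combinations.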

A forgotten ruin in an overgrown jungle, inspired by H.P. Lovecraft’s At the Mountains of Madness by Li Shuxing and Tyler Edlin

You can find some great examples of prompts on Twitter by searching for #aiart or something similar, or on /r/DiscoDiffusion. Some people share their prompts and results.

Disco Diffusion Parameters Used in This Article

In case you’re using Disco Diffusion and aren’t getting results similar to mine, these are the settings I’m using:

eta: 0.9
steps: 250
clip_guidance: 15000
sat_scale: 2000
range_scale: 100
cut_ic_pow: 10
Models: ViTB32, ViTB16, RN101, RN50x4
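If you keep your settings in code, the list above could be written down roughly like this. Treat it as a sketch: the key names follow the shorthand used in this article and may not match the notebook’s variable names exactly, and “width_height” in the variant is an assumed name for the image size setting.

```python
# Settings as listed in this article (key names are the article's shorthand).
ARTICLE_SETTINGS = {
    "eta": 0.9,
    "steps": 250,
    "clip_guidance": 15000,
    "sat_scale": 2000,
    "range_scale": 100,
    "cut_ic_pow": 10,
    "models": ["ViTB32", "ViTB16", "RN101", "RN50x4"],
}


def with_overrides(base, **overrides):
    """Return a copy of the settings with some values replaced."""
    updated = dict(base)
    updated.update(overrides)
    return updated


# Variant for the 1024x1024, 500-step image; "width_height" is an assumed key.
HIGH_DETAIL = with_overrides(ARTICLE_SETTINGS, steps=500, width_height=[1024, 1024])

print(HIGH_DETAIL["steps"], HIGH_DETAIL["width_height"])
```

Keeping a base dictionary and small overrides makes it easier to remember which settings produced which image.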

Conclusion

Some of the results presented above are underwhelming. Others I’ve had in the past have been great. The main reason I use GPT-3 is for parts of the prompts it generates, because sometimes it comes up with things I wouldn’t have thought of.

That’s about it. I hope this is useful to anyone who wants to use GPT-3 to generate text prompts for generating images with machine learning software. Feel free to leave a comment if you have any feedback, questions, or issues, and I’ll get back to you as soon as possible.

3 Comments
Ziwang
1 year ago

A great article. This is very helpful. I will add this link to my personal page for future reference.

Ziwang
1 year ago
Reply to  EdXD

Many AI Art Generators are based on CLIP models, and the CLIP model is based on GPT-3 model. So these models can more easily understand the text prompts generated by GPT-3. So using GPT-3 to generate text prompt is a good shortcut.Thanks
