Stable Diffusion and DALL·E 2 are both AI models with official apps that can take a text prompt and generate a series of matching images. So which of these apps should you use? Let's dive in.
How do Stable Diffusion and DALL·E 2 work?
For image generation, Stable Diffusion and DALL·E 2 both rely on a process called diffusion. The image generator starts with a random field of noise and then edits it in a series of steps to match its interpretation of the prompt. By starting from a different patch of random noise each time, the models can create different results from the same prompt. It's a bit like looking up at a cloudy sky, finding a cloud that looks vaguely like a dog, and then being able to snap your fingers to keep making it more and more dog-like.
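If it helps to see the shape of that loop, here's a deliberately toy Python sketch. It is not how either model actually computes anything: `denoise_step` is a made-up stand-in for the trained neural network, and `prompt_embedding` is just a placeholder array standing in for "what the prompt means."

```python
import numpy as np

def denoise_step(image, prompt_embedding, step, total_steps):
    """Stand-in for the trained model: nudge the noisy image a little
    closer to the prompt's target at each step (purely illustrative)."""
    blend = 1.0 / (total_steps - step)
    return image * (1 - blend) + prompt_embedding * blend

def generate(prompt_embedding, seed, steps=50, size=(512, 512, 3)):
    rng = np.random.default_rng(seed)   # a different seed = different starting noise = a different image
    image = rng.normal(size=size)       # start from pure random noise
    for step in range(steps):
        image = denoise_step(image, prompt_embedding, step, steps)
    return image
```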
Even though both models have similar technical underpinnings, there are plenty of differences between them.
Stability AI (the makers of Stable Diffusion) and OpenAI (the makers of DALL·E 2) have different philosophical approaches to how these kinds of AI tools should work. They were also trained on different data sets, with different design and implementation decisions made along the way. So although you can use both to do the same thing, they can give you totally different results.
Here's the prompt I mentioned above in DreamStudio (Stable Diffusion):
And here it is in DALL·E 2:
Something else to keep in mind: DALL·E 2 is only available through OpenAI (or other services using its API). Stable Diffusion is actually a number of open source models. You can access it through Stability AI's DreamStudio app, but you can also download the latest version of Stable Diffusion, install it on your own computer, and even train it on your own data. (This is how many services like Lensa's AI avatars work.)
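As an example of what "download it and run it yourself" looks like in practice, here's a minimal sketch using Hugging Face's diffusers library. This is one of several ways to run it locally, and the checkpoint name is just one of the publicly available Stable Diffusion models; it assumes you have a CUDA-capable GPU.

```python
# pip install diffusers transformers torch
import torch
from diffusers import StableDiffusionPipeline

# Download a public Stable Diffusion checkpoint and move it to the GPU
pipe = StableDiffusionPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5",
    torch_dtype=torch.float16,
)
pipe = pipe.to("cuda")

# Generate one image from a text prompt and save it
image = pipe("a corgi astronaut floating above Earth, digital art").images[0]
image.save("corgi_astronaut.png")
```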
I'll dig into what this all means a little later, but to keep things simple, I'll mostly be comparing the models as they're accessed through their official web apps.
Stable Diffusion vs. DALL·E 2 at a glance
Stable Diffusion and DALL·E 2 are built using similar technologies, but they differ in a few important ways. Here's a short summary of things, but read on for the details.
|  | Stable Diffusion | DALL·E 2 |
|---|---|---|
| Quality | ⭐⭐⭐⭐⭐ Exceptional AI-generated images | ⭐⭐⭐⭐⭐ Exceptional AI-generated images |
| Ease of use | ⭐⭐⭐ Lots of options, but can get complicated | ⭐⭐⭐⭐⭐ Type a prompt, click a button |
| Power and control | ⭐⭐⭐⭐⭐ You still have to write a prompt, but you get a lot of control over the generative process | ⭐⭐ Very limited options beyond inpainting and outpainting |
Both make great AI-generated images
Let's get the big thing out of the way: both Stable Diffusion and DALL·E 2 are capable of producing incredible AI-generated images. I've had heaps of fun playing around with both models, and I've been shocked by how they've nailed certain prompts. I've also laughed quite hard at both their mess-ups. Really, neither model is objectively—or even subjectively—better than the other. At least not consistently.
If I were forced to highlight where the models can differ, I'd say that:

- By default, Stable Diffusion tends toward more realistic images, while DALL·E 2 can be more abstract.
- DALL·E 2 can sometimes produce better results from shorter prompts than Stable Diffusion does.
Though, again, the results you get really depend on what you ask for—and how much "prompt engineering" you're prepared to do.
DALL·E 2 is easier to use
DALL·E 2 is incredibly simple to use. Type out a prompt, hit Generate, and you'll get four results. It's like a fun toy.
That's not to say you can't dive deeper with DALL·E 2. You're able to upload your own images to use as a prompt to create more variations, or use the editor to inpaint (replace bits of the image with AI-generated elements) or outpaint (expand the image with AI-generated elements). It's just that a lot of the nuts and bolts are hidden away.
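The web editor handles inpainting interactively, but the same idea is also exposed through OpenAI's API: you send the original image, a mask whose transparent pixels mark the area to repaint, and a prompt describing what should go there. Here's a rough sketch using OpenAI's Python library; the file names are placeholders.

```python
from openai import OpenAI

client = OpenAI()  # expects OPENAI_API_KEY in your environment

# Inpainting: replace the transparent region of mask.png with AI-generated content
result = client.images.edit(
    image=open("living_room.png", "rb"),  # original image (placeholder file)
    mask=open("mask.png", "rb"),          # transparent pixels = area to repaint
    prompt="a sunlit window with potted plants",
    n=4,
    size="1024x1024",
)
print([item.url for item in result.data])
```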
Out of the box, Stable Diffusion is a little less user-friendly. Although you can type a prompt, hit Dream, and do all the same inpainting and outpainting, there are more options here that you can't help but wonder about.
For example: you can select a style (Enhance, Anime, Photographic, Digital Art, Comic Book, Fantasy Art, Analog Film, or Neon Punk). There are also two prompt boxes: one for regular prompts and another for negative prompts, the things you don't want to see in your images. And that's all before you consider the advanced options that allow you to set the prompt strength, the number of generation steps the model takes, what model is used, and even the seed it uses.
Of course, installing and training your own Stable Diffusion instance is an entirely different story.
Stable Diffusion is more powerful
For all its ease of use, DALL·E 2 doesn't give you a lot of options. You can generate images from a prompt, and…that's kind of it. If you don't like the results, you have to tweak the prompt and try again. Some of the other services that use DALL·E 2's API, like NightCafé, offer style options and an advanced prompt editor with suggested terms to use, but you're still just shaping the output with the text prompt.
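To illustrate, here's roughly the entire surface area you get through OpenAI's official API: a prompt, a number of images, and a size. A minimal sketch with OpenAI's Python library:

```python
from openai import OpenAI

client = OpenAI()  # expects OPENAI_API_KEY in your environment

# Prompt, count, and size are essentially all you control
result = client.images.generate(
    model="dall-e-2",
    prompt="a watercolor painting of a lighthouse in a storm",
    n=4,
    size="1024x1024",
)
for item in result.data:
    print(item.url)
```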
Stable Diffusion (in every iteration) gives you more options and control. As I mentioned above, you can set the number of steps, the initial seed, and the prompt strength, and you can make a negative prompt—all within the DreamStudio web app.
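And if you run Stable Diffusion yourself, those same knobs are plain function parameters. Here's a hedged sketch with the diffusers library (again, the checkpoint name is just one public option, and the prompt values are placeholders):

```python
import torch
from diffusers import StableDiffusionPipeline

pipe = StableDiffusionPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5", torch_dtype=torch.float16
).to("cuda")

# Fix the seed so the same settings reproduce the same image
generator = torch.Generator("cuda").manual_seed(1234)

image = pipe(
    prompt="a cozy cabin in a snowy forest at dusk, photorealistic",
    negative_prompt="blurry, low quality, text, watermark",  # what you don't want to see
    num_inference_steps=30,  # number of denoising steps
    guidance_scale=7.5,      # roughly "prompt strength"
    generator=generator,
).images[0]
image.save("cabin.png")
```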
And even in NightCafé, which also supports Stable Diffusion, you get more options than you do with DALL·E 2. In addition to being able to set styles and use the advanced prompt editor, you can control what seed is used and what sampling method is used by the algorithm, among other things.
Finally, if you want to build a generative AI that's custom-trained on specific data—such as your own face, logos, or anything else—you can do that with Stable Diffusion. This allows you to create an image generator that consistently produces a particular kind or style of image. The specifics of how you do this are far beyond the scope of this comparison, but the point is that this is something that Stable Diffusion is designed to do that isn't remotely possible with DALL·E 2.
Stable Diffusion wins on pricing
DALL·E 2's pricing is super simple. Each text prompt generates a set of four images and costs one credit. Credits cost $15 for 115, so that's ~$0.13/prompt or ~$0.0325/image. Each round of outpainting or inpainting also generates four options and costs one credit. (If you signed up to DALL·E 2 before April 6, 2023, there was a free trial, and you got 40 free credits every month. Unfortunately, that option is now gone.)
Stable Diffusion's pricing is a lot more complicated.
Let's assume you're accessing it through DreamStudio, not downloading Stable Diffusion and running it on your computer or accessing it through some other service that uses a custom-trained model. In that case, Stable Diffusion also uses a credits system, but it's nowhere near as neat as one credit, one prompt. Because you have so many options, the price changes with the size, number of steps, and number of images you want to generate. Say you want to generate four 512x512 pixel images with the latest model using 50 steps. That would cost 3.32 credits. If you wanted to use just 30 steps, it would only cost 2 credits. (You can always see the cost before you press Dream.)
You get 25 free credits when you sign up for DreamStudio, which is enough for ~30 images (or about seven text prompts) at the default settings. After that, it costs $10 for 1,000 credits. That's enough for more than 1,000 images, or ~300 text prompts, at the default settings.
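To make the per-image comparison concrete, here's the back-of-the-envelope math, using the prices above and the ~3.32-credit cost of a default four-image prompt:

```python
# DALL·E 2: $15 buys 115 credits; 1 credit = 1 prompt = 4 images
dalle_cost_per_prompt = 15 / 115                  # ~$0.13
dalle_cost_per_image = dalle_cost_per_prompt / 4  # ~$0.033

# Stable Diffusion (DreamStudio): $10 buys 1,000 credits;
# a default four-image prompt costs about 3.32 credits
sd_cost_per_prompt = (10 / 1000) * 3.32           # ~$0.033
sd_cost_per_image = sd_cost_per_prompt / 4        # ~$0.008

print(f"DALL·E 2: ${dalle_cost_per_image:.3f}/image")
print(f"Stable Diffusion: ${sd_cost_per_image:.3f}/image")
```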
So, if you ignore all the confusion and focus on the number of default images you can generate per dollar, Stable Diffusion takes it. And it has a free trial.
Commercial use is complicated for both
If you're planning to use Stable Diffusion or DALL·E 2 for commercial use, things get a bit complicated.
Commercial use is currently allowed by both, but the implications haven't been fully explored. In a ruling in February 2023, the U.S. Copyright Office decided that images created by Midjourney, another generative AI, can't be copyrighted. This means that anyone may be able to freely take any image you create and use it to do whatever they want—though this hasn't really been tested.
Purely from a license standpoint, Stable Diffusion has a slight edge. Its model has fewer guardrails—and even fewer if you train one yourself—so you can create more kinds of content. DALL·E 2 won't let you create a wide range of content, including images of public figures.
DALL·E 2 also adds a multi-colored watermark to the bottom-right corner of your images, though you are allowed to remove it.
DALL·E 2 vs. Stable Diffusion: Which should you use?
While DALL·E 2 is the biggest name in AI image generation, it might make sense to try Stable Diffusion first: it has a free trial, it's cheaper, it's more powerful, and it's got more permissive usage rights. If you go totally off the deep end, you can also use it to develop your own custom generative AI.
When DALL·E 2 had a great free trial, there was a lot to love about its simplicity. If OpenAI brings that back, then it makes sense for anyone who's just curious to see what AI image generators can do.
Either way, the decision doesn't really come down to the quality of the generated output but rather the overall user experience. Both apps can create awesome, hilarious, and downright bizarre images from the right prompt. And in the end, you might end up using a third-party app built on one of these two models, in which case, you might not even notice the difference.