Rage Against the (Art) Machine: A Deeper Look Into AI Art Generators
In the wake of the breakthrough of NFTs in the Contemporary Art market comes a new innovation in digital art.
In recent months, a plethora of art generators powered by artificial intelligence have hit the mainstream, and they are as fascinating as they are entertaining. I had the pleasure of using several of the most popular tools currently available in order to test the accuracy of the AI and user experience.
Read on below for more information and final thoughts.
Gaining popularity earlier this year, DALL-E Mini by craiyon.com is an AI model that generates images from any prompt.
It was by far the most rudimentary of the image generators I tried, and its results are a collection of bizarre and questionable interpretations of my thoughts.
How To Use DALL-E Mini
DALL-E Mini is one of the simpler AI image generators to use. It requires no email sign-up or login, and anyone can try it for free.
When accessing the site there is a text box to type in your desired prompt. After the model runs the text it will generate nine different images. Users also have the option to screenshot and download either the collection or individual images.
Cons of DALL-E Mini
I am going to begin with the cons: the model, being one of the first beta options among AI image generators, is incredibly slow. Some prompts take a minute or two to produce an image.
DALL-E Mini’s team also acknowledged in a disclosure that the model is biased. Because it learned from images found on the internet, some of the results may reinforce societal biases and stereotypes.
Another pitfall of the DALL-E Mini model is the image quality. It is bad. The images come out distorted and sometimes far from the prompt. The model has particular difficulty generating human faces and hands. These were common difficulties across the board, but DALL-E Mini was almost incapable of rendering a decent face.
Pros of DALL-E Mini
The pro side of DALL-E Mini is that despite all the faults, it is wildly entertaining, free, and easy to use.
My rating thus is a 5/10.
Wombo Dream is a free AI image generator designed to make it easy to create visually appealing AI creations. This model has a select number of built-in styles to choose from, along with a prompt to generate images. Wombo Dream also includes an easy built-in integration to mint your result as an NFT.
How To Use Wombo Dream
Wombo Dream is easy to use, and has both a browser version and a mobile app. As usual, there is a text box for typing in the prompt. The difference here is that you need to select one of the predetermined styles in which the image will be generated. There is also an option to upload an image as a reference for the final output.
The result is a single image presented in trading card style. What sets Wombo Dream apart from others is that a user can easily connect an NFT wallet and mint the end results as an NFT.
Pros of Wombo Dream
The pro of Wombo Dream is the integration of the already existing NFT market into the service. It is a unique opportunity for a user to try their luck at NFT artistry. It’s also a benefit that this is an entirely free service and does not require the use of credits. Wombo is best suited to evocative, moody creations rather than building an image for a specific vision or purpose.
Cons of Wombo Dream
Now onto the cons. I personally did not enjoy the limitations imposed by the slim selection of art styles. The presets left me underwhelmed by the resulting images.
Another con is that Wombo struggles enormously to create anything representational, and it often falters on complex prompts. Overall 6/10.
DALL-E 2, from OpenAI, is a far more robust model than DALL-E Mini, which is a separate project despite the similar name. Though the user experience is similar, DALL-E 2 produces much more refined images and offers more diverse interpretations of a prompt.
How To Use DALL-E 2
DALL-E 2 was released in April 2022 to a limited number of users, with the stated goal of reducing overcrowding on the site and not overworking the model.
Because of that, there is a waitlist to access the model. Once a user is granted access, it’s a straightforward process to create an account and start writing prompts.
During the first month of use, a user receives 50 free credits, followed by 15 free credits every subsequent month. Credits are spent each time the model generates images. Additional credits can be purchased at a rate of 115 credits for $15.
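To put those numbers in perspective, here is a minimal sketch of the arithmetic in Python. It assumes only the figures quoted above (50 free credits in the first month, 15 in each later month, 115 credits for $15). The constant and function names are made up for illustration, and the prices are a snapshot that may well have changed.

```python
# Figures quoted in this article; treat them as a snapshot, not current pricing.
PAID_CREDITS = 115        # credits in one purchased bundle
PAID_PRICE_USD = 15.00    # price of that bundle
FIRST_MONTH_FREE = 50     # free credits in the first month
LATER_MONTH_FREE = 15     # free credits in each subsequent month


def cost_per_credit() -> float:
    """Price of one purchased credit in US dollars."""
    return PAID_PRICE_USD / PAID_CREDITS


def free_credits(months: int) -> int:
    """Total free credits received over the first `months` months."""
    if months <= 0:
        return 0
    return FIRST_MONTH_FREE + LATER_MONTH_FREE * (months - 1)


print(f"Cost per purchased credit: ${cost_per_credit():.3f}")
print(f"Free credits over 12 months: {free_credits(12)}")
```

At these rates a purchased credit works out to roughly 13 cents, and a full year of free allowances adds up to 215 credits.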
At a much quicker speed than most of its competitors, the model will interpret a prompt and return with four images. These four are each a unique take on the prompt, providing a lot more diversity than other AI generators — both in terms of the variety of images and compositions as well as racial and gender diversity.
It’s important to note that the more specific the prompt is, the better the resulting images.
DALL-E 2 also has a tab so users can save their favorite image generations into a collection. The collection can be thought of as a personal gallery of sorts.
Cons of DALL-E 2
My main issue with DALL-E 2 is with rendering images with words. If you ask the model to produce a sign with a phrase or a poster, it will return jumbled-up letters or incoherent sentences.
It also carries the same issue with biases that DALL-E Mini had in its generations.
Pros of DALL-E 2
A pro of DALL-E 2 is that it generates painterly style images beautifully. Whether it is an art movement or referencing a specific artist, the model does a good job of translating it with the other portions of the prompt.
Along with this, DALL-E 2 deciphers prompts with ease and delivers results that are arguably spot on with what was asked of the AI.
My overall experience with DALL-E 2 was an 8/10.
Midjourney describes itself as “an independent research lab exploring new mediums of thought and expanding the imaginative powers of the human species.” The result of this project is one of the most beautiful AI image generators available today.
How To Use Midjourney
The user experience of this generator was relatively easy.
Once you go through the sign-up process and get on their Discord server, you’ll be randomly assigned to a newbie channel. There’s a help guide for beginners, but it was remarkably unhelpful. It took me several attempts to understand the mechanics of operating the bot.
However, after getting a handle on the rules, I started playing around with Midjourney.
In the server channel is a never-ending scroll of people’s imaginative dreamscapes and alternative universes. That feature is probably what I found most interesting — the ability to see other users’ prompts and their results.
For each prompt, the bot will generate four images. Below you can browse some of my favorites.
Pros of Midjourney
The quality of the images from the first generation is remarkable. Almost every image I saw, though they might have some distortion or missed elements of the prompts, was aesthetically beautiful to look at.
The bot also offers users the ability to either enhance or re-generate a specific image from the four, and unlike with DALL-E 2 and NightCafe, attempts and variations do not use up a limited pool of credits.
Cons of Midjourney
Although Midjourney as an AI art generator produces beautifully crafted images, there are some limitations that come with the bot.
For example, the images are limited in the variations provided from the initial generation. While DALL-E 2 can give a wide range of interpretations of one prompt, Midjourney’s panel of four options tends to look quite similar.
There is also an inherent bias in the image generations: because the algorithm draws on similar existing images, it picks up specific stereotypes and reinforces them in its results.
Finally, although the interaction of being able to see other channel members’ generations is a nice community-building asset, I did find it influenced my own creativity when coming up with prompts.
Overall, I would give Midjourney a 9/10.
NightCafe is another model used for generating AI art. Using NightCafe reminds me of playing with an AI-updated version of Microsoft Paint.
NightCafe has a free option with limited credits to begin and then the option to purchase more credits or a pro version on a monthly subscription basis.
How To Use NightCafe
NightCafe is very similar to Wombo Dream in that the user adds a prompt and then selects from a number of preset styles. However, there is also an additional toggle between an artistic and a coherent algorithm.
These two algorithms will produce differently stylized versions of the image. Artistic will create an image with beautiful textures that could be more original and appealing to the eye. Coherent, on the other hand, will compose an image that adheres more closely to the prompt.
The model will produce only one image, but you have the option to evolve your creation, either into a new image or even into a video short.
Pros of NightCafe
Although my previous qualm with Wombo was the limited options of artistic styles, it did not bother me as much in NightCafe because of the options to play with the algorithms. I think this gives more diversity and creative freedom to the model, enabling it to produce more artistically interesting creations.
Another pro is the ability to integrate and change the style of an already existing image and the transformation of the creations into video shorts. That was just so fun, in my opinion. NightCafe itself offers more options to experiment.
Cons of NightCafe
The cons of NightCafe are fairly similar to those of all the other models. The AI is still not the best at producing human faces, and it struggles with certain prompts and with producing a clear image. NightCafe does not give the best results for specific prompts, but it offers the opportunity to delight and surprise.
I don’t think many images from NightCafe will be as good as Midjourney’s, but it’s a close second. My final con is that they should provide more free credits when first trying the model.
Overall, I would give my experience with NightCafe a 9/10.
The experience of using these models was, truthfully, more enjoyable than I had originally expected, and it gave me greater insight into where we are in terms of AI art and what it can become.
The images that I saw on Midjourney really stood as art to me rather than just images, while DALL-E 2 is better at representing a specific concept from a prompt. Unlike DALL-E 2, Midjourney, NightCafe, and Wombo feel more like a mutual creative exchange between the prompt and the AI, although Wombo is by far the most rudimentary of the three. And in the end, NightCafe is becoming my new favorite pastime.
AI-generated art is an exciting field posing enormous legal, artistic, and philosophical questions. Experimenting with these tools was ultimately very fun and rewarding, despite their limitations. (The fear of deepfakes coming from these may still be overrated: none of them were great at creating a highly realistic image of a real person, especially compared to a human skilled in old-fashioned Photoshop.) The potential for what AI can do is expanding every day, and it is up to us to keep it as a tool benefiting our creative instincts, rather than limiting the prospects of working with human artists.