Last Updated | Changes |
9/15/2023 | First version published |
3/4/2024 | Updates! |
7/20/2024 | Huge Update! |
1/24/2025 | Content refresh |
Generative AI & Stable Diffusion
What is “Generative AI,” and how does Stable Diffusion fit into it? You might have heard the term Generative AI in the media – it’s huge right now; it’s on the news, it’s in the app stores, Elon Musk is Tweeting about it – it’s beginning to pervade our lives! In this guide we’ll explore what Generative AI is, and how YOU can start making Generative AI Art!
Generative AI refers to the use of machine learning algorithms to generate new data that is similar to the data fed into it. This technology has been used in a variety of applications, including art, music, and text generation. The goal of generative AI is to allow machines to create something new and unique, rather than simply replicating existing data.
- Stable Diffusion is one example of generative AI that has gained popularity in the art world, allowing artists to create unique and complex art pieces by entering text “prompts”.
- ChatGPT is another example of generative AI – a language model that can generate human-like text. It is capable of completing sentences, paragraphs, and even entire articles, given a short prompt. This technology is being used in a variety of applications, including chatbots, content creation, and even computer programming. I used it to write this paragraph in ~1 second. It’s a great general-purpose tool. Similar tools include Google’s Gemini and Microsoft’s Copilot.
- GitHub Copilot is an example of a generative AI language model built for a specific purpose – to assist with programming and coding.
This guide will specifically cover generating image/art content with Stable Diffusion, but will touch on other Generative AI art services.
The Background
In mid-2022, the art world was taken by storm with the launch of several AI-powered art services, including Midjourney, DALL-E, and Stable Diffusion. These services and tools utilize cutting-edge machine learning technology to create unique and innovative art that challenges traditional forms and blurs the lines between human and machine creation.
The impact of AI art on the industry has already been significant. Many artists and enthusiasts are exploring the possibilities of this new medium, while many fear the repercussions for established artists’ careers.
Many art portfolio websites have developed new policies that prohibit the display of AI-generated work.
Some websites require artists to disclose if their work was created using AI, and others have even implemented software that can detect and block AI-generated art.
Civitai.com entered the scene in November 2022 as a place for AI Creators to share their work and discover new resources.
The Companies
There are many big players in the AI art world – here are a few names you’ll often see mentioned:
- Civitai.com – Civitai – that’s us! Founded in November 2022, Civitai has rapidly become the most popular platform for sharing and discovering AI-generated art models and resources, but we’re more than that! We’re a social hub with a vibrant community of AI Creators!
- OpenAI – A research laboratory with both for-profit and non-profit subsidiaries, focusing on the development of AI in an open and responsible manner. Founded by technology investors (including Peter Thiel and Elon Musk) in 2015, OpenAI has created some highly advanced generative AI models, such as GPT-3 and GPT-4, which are highly regarded for their language processing and generation abilities.
- Stability AI – The brainchild of ex-CEO Emad Mostaque, Stability AI is focused on the creation of AI tools, models, and resources. Stability AI is behind the 2022 releases of the Stable Diffusion 1.4/1.5 and Stable Diffusion 2.0 text-to-image models, the later Stable Diffusion 3.5 (released in 2024), and a number of other technologies.
- RunwayML – One of the companies behind the original Stable Diffusion model, RunwayML now provides a platform for artists to use machine learning tools in intuitive ways without any coding experience.
- Black Forest Labs (BFL) – Founded by former Stability AI engineers in August 2024, BFL shook up the AI art community with the release of their groundbreaking “Flux” suite of text-to-image models. Positioned as direct competitors to Stable Diffusion, these models quickly gained attention for their exceptional quality and fidelity, setting a new standard in the space. Check out our Flux Quickstart Guide here!
Controversies
There are already a number of lawsuits challenging various aspects of the technology. Microsoft, GitHub and OpenAI are currently facing a class-action lawsuit, while Midjourney and Stability AI are facing a lawsuit alleging they infringed upon the rights of artists in the creation of their products.
Whatever the outcome, Generative AI is here to stay.
How Does Stable Diffusion Work?
That is an incredibly complex topic, and we’ll only touch on it briefly here, at a very high level:
(Forward) Diffusion is the process of slowly adding random pixels (noise) to an image until it no longer resembles the original image, and is 100% noise – we’ve diffused, or diluted, the original image.
By reversing that process, we can reproduce something similar to the original image. There is obviously a lot more going on, but that’s the general idea: we input text, the “model” processes that text, then starts from a fully “diffused” (pure noise) image and removes the noise step by step, guided by the text, until an appropriate output image emerges.
Simple!
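For the curious, the noising idea can be sketched in a few lines of Python. This is a toy illustration only, not Stable Diffusion's actual code: the five-value "image", the linear noise schedule, and the function names `make_alpha_bars`, `add_noise`, and `remove_noise` are all assumptions made for the example, and the blending formula is the one commonly used in diffusion models generally. A real model also has to learn to predict the noise; here we cheat and hand the exact noise back, just to show that knowing the noise is enough to undo it.

```python
import math
import random


def make_alpha_bars(num_steps, beta_start=1e-4, beta_end=0.02):
    """Fraction of the original signal (squared) left after each step.

    Uses a simple linear schedule for the per-step noise amount (beta).
    """
    alpha_bars = []
    running = 1.0
    for t in range(num_steps):
        beta = beta_start + (beta_end - beta_start) * t / (num_steps - 1)
        running *= 1.0 - beta
        alpha_bars.append(running)
    return alpha_bars


def add_noise(pixels, alpha_bar, rng):
    """Forward diffusion: blend the pixels with random Gaussian noise."""
    keep = math.sqrt(alpha_bar)
    noise_scale = math.sqrt(1.0 - alpha_bar)
    noise = [rng.gauss(0.0, 1.0) for _ in pixels]
    noisy = [keep * p + noise_scale * n for p, n in zip(pixels, noise)]
    return noisy, noise


def remove_noise(noisy, predicted_noise, alpha_bar):
    """Reverse step: subtract the (predicted) noise and rescale."""
    keep = math.sqrt(alpha_bar)
    noise_scale = math.sqrt(1.0 - alpha_bar)
    return [(x - noise_scale * n) / keep for x, n in zip(noisy, predicted_noise)]


rng = random.Random(0)
image = [-1.0, -0.5, 0.0, 0.5, 1.0]  # toy "image": five pixel values
alpha_bars = make_alpha_bars(1000)

# Halfway through, the image is partly noise; with the exact noise we can undo it.
noisy, noise = add_noise(image, alpha_bars[500], rng)
recovered = remove_noise(noisy, noise, alpha_bars[500])

# By the last step almost none of the original image is left.
signal_kept_at_end = math.sqrt(alpha_bars[-1])
print(f"signal remaining at final step: {signal_kept_at_end:.4f}")
```

In a real system the `noise` passed to `remove_noise` would come from a trained neural network, steered by your text prompt, rather than being handed back from the forward step.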
How Can I Start Making Generative AI Art?
There are a number of tools to generate AI art images, some more involved and complex to set up than others. The easiest method is to use a web-based image generation service, where the code and hardware requirements are taken care of for you, but there’s often a fee involved.
Alternatively, if you have the required hardware (ideally an NVIDIA graphics card), you can create images locally, on your own PC, with no restriction, using Stable Diffusion, and a number of other models.
When we talk about Stable Diffusion, we’re talking about the underlying mathematical/neural network framework which actually generates the images. We need some way to interface with that framework in a user-friendly way – that’s where the following tools come in:
I don’t have a PC or a Graphics Card (GPU)! How can I make AI Generated Art!?
Don’t panic! Civitai.com has one of the best on-site Image Generator services around, with access to many advanced features found in the most popular local interfaces.
Check out some of our user-generated images, made on-site at Civitai.com!
Unlike all the other web-based services out there, we have direct access to the largest repository of Models and additional resources with which to build your images. Getting started with the Civitai Generator is extremely simple: we have a detailed guide to walk you through the process, and a Discord Community of almost 100,000 AI art enthusiasts, eager to share their image prompting knowledge, tips and tricks!
To run on your own PC – Local Interfaces
This guide is extremely high level and won’t get into the deep technical aspects of installing (or using) any of these applications, but if you’d like to run Stable Diffusion (or one of the number of competing image generation technologies) on your own PC (a local install) there are many options!
Note that to get the most out of any local installation of Stable Diffusion you need an NVIDIA graphics card.
Images can be generated using your computer’s CPU alone, or on some AMD graphics cards, but the time it will take to generate a single image will be considerable.
- Automatic1111’s WebUI (Complexity factor 4/5) – WebUI is the most commonly used Interface for Stable Diffusion. It is moderately complex, and has a wide range of plugins and extensions to extend the experience. There’s a great deal of community support available if you have problems.
- ComfyUI (Complexity factor 5/5) – ComfyUI provides an exceedingly complex workflow/node-based workspace which requires in-depth knowledge of the Stable Diffusion image generation process to make work. Definitely not a beginner interface, but extremely powerful for the experienced user, with workflows for local image and video generation.
- Cmdr2’s Easy Diffusion (Complexity factor 2/5) – A great option for those starting out with a local install. Easy Diffusion has a 1-click installer for Windows, and a popular Discord server full of extremely knowledgeable people to help you get up and running. The interface itself is limited in what it can do, compared to the other Interfaces, but it remains the easiest way to get started making your own images, locally.
- Fooocus (Complexity factor 2/5) – A very popular interface for creating Stable Diffusion images based on the SDXL model. It’s practically a one-click install on Windows and produces absolutely beautiful images. It does need a moderately powerful GPU (graphics card). Read our Fooocus Quickstart Guide here.
- InvokeAI (Complexity factor 3/5) – A popular open-source text-to-image and image-to-image interface with powerful tools. It’s not yet as full-featured as Automatic1111’s WebUI, but it’s getting close.
- SD.Next (Complexity factor 4/5) – Also known as “Vlad” (after author, Vladmandic), SD.Next started as a “fork” of Automatic1111’s WebUI, but has diverged considerably and has a wide range of advanced features.
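Before picking one of the interfaces above, it can help to check whether your PC actually has an NVIDIA graphics card that the drivers can see. The snippet below is a rough sketch, assuming the `nvidia-smi` utility (which ships with NVIDIA's drivers) is installed and on your PATH; the helper name `find_nvidia_gpus` is made up for this example. If the tool is missing, the script simply reports that nothing was found, which does not prove you have no GPU.

```python
import shutil
import subprocess


def find_nvidia_gpus():
    """Return GPU names reported by nvidia-smi, or [] if it is unavailable."""
    exe = shutil.which("nvidia-smi")
    if exe is None:
        return []  # tool not installed or not on PATH
    try:
        result = subprocess.run(
            [exe, "--query-gpu=name", "--format=csv,noheader"],
            capture_output=True,
            text=True,
            timeout=10,
            check=True,
        )
    except (subprocess.SubprocessError, OSError):
        return []  # tool exists but failed, e.g. driver problem
    return [line.strip() for line in result.stdout.splitlines() if line.strip()]


gpus = find_nvidia_gpus()
if gpus:
    print("NVIDIA GPU(s) found:", ", ".join(gpus))
else:
    print("No NVIDIA GPU detected; expect slow generation or use a web service.")
```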
To run on your own Mac – Local Interfaces
Mac owners can run Automatic1111’s WebUI and InvokeAI, as well as some popular, lightweight, and super simple Mac-specific interfaces:
- DiffusionBee (Complexity factor 1/5) – DiffusionBee is an extremely lightweight MacOS interface for Stable Diffusion. It allows for basic image generation, but has a very small feature-set, to keep it as simple as possible.
- Draw Things App (Complexity factor 1/5) – Draw Things is a popular and highly rated MacOS App. I don’t know much about it, but from anecdotal evidence it seems to have some good features!
I now have an interface! What are “models”?
Checkpoints, also known as “weights” or “models”, are the brains which produce our images. Each model can produce a different style of image, or a particular theme or subject. Some are “multi-use” and can produce a mix of portrait, realistic, and anime (for example), and others are more focused, only capable of rendering one particular style or subject.
Models come in two file types. It’s important to know the distinction if running a local Stable Diffusion interface, as there are security implications.
Pickletensor (.ckpt extension) models may contain malicious code that executes when the file is loaded. Many websites, including Civitai, have “pickle scanners” which attempt to scan for malicious content. However, it’s safer to download Safetensor (.safetensors extension) models when available. This file type stores only tensor data and metadata, not executable code, so it is far safer to download. As of January 2025, .ckpt (“Pickletensor”) files have largely ceased to be produced or widely shared, with almost all models having transitioned to the .safetensors format.
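To see why the distinction matters, here is a small, harmless Python sketch. The first half shows that simply loading a pickle can run code chosen by whoever made the file (here it only evaluates `6 * 7`). The second half builds and reads a tiny file laid out the way we understand the Safetensors format to work: an 8-byte length, a JSON header, then raw numbers. The `Payload` class, the `weight` tensor name, and the values are invented for illustration, and real model files are of course loaded through proper libraries rather than by hand.

```python
import json
import pickle
import struct


# --- Part 1: a pickle can run code when it is loaded -----------------------
class Payload:
    def __reduce__(self):
        # Tells pickle: "to rebuild me, call eval('6 * 7')".
        # A malicious file could name any function and arguments here.
        return (eval, ("6 * 7",))


pickled = pickle.dumps(Payload())
unpickled = pickle.loads(pickled)  # the eval call happens on this line
print("pickle.loads returned:", unpickled)

# --- Part 2: a Safetensors-style file is just a header plus raw numbers ----
header = {"weight": {"dtype": "F32", "shape": [2], "data_offsets": [0, 8]}}
header_bytes = json.dumps(header).encode("utf-8")
tensor_bytes = struct.pack("<2f", 1.0, 2.0)  # two 32-bit floats
blob = struct.pack("<Q", len(header_bytes)) + header_bytes + tensor_bytes

# Reading it back needs only length arithmetic and a JSON parser.
header_len = struct.unpack("<Q", blob[:8])[0]
parsed = json.loads(blob[8 : 8 + header_len])
data = blob[8 + header_len :]
start, end = parsed["weight"]["data_offsets"]
values = struct.unpack("<2f", data[start:end])
print("tensor names:", list(parsed), "values:", values)
```

Nothing in the second half ever asks Python to call a function named inside the file, which is the key difference from pickle.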
Note that if using a Generation Service you will only be able to use the models they provide. Some services provide access to some of the most popular models while others use their own custom models; it depends on the service.
Along with models there are many other files which can extend and enhance the images generated by the models, including LoRA, Textual Inversion, and Hypernetworks. We’ll look at those in a more in-depth guide.
Watch our video, below, for a walkthrough of these core concepts!
Where do I get models?
Most Stable Diffusion interfaces come with the default Stable Diffusion models, SD1.4 and/or SD1.5, and SDXL. These are the base models from which most other custom models are derived, and they can produce good images with the right prompting.
Custom models, models “trained” with new images to produce stunning styles and specific content, or “merged” models – models created from the combination of others, can be downloaded from Civitai.com!
You are here! Civitai is the leading model repository for Stable Diffusion checkpoints, and other related Generative AI tools. There are tens of thousands of models to choose from, across many categories; something for everyone!
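As an aside on the “merged” models mentioned above: one common way to combine two models is a simple weighted average of their numbers, parameter by parameter. The sketch below shows that idea on two tiny made-up “models” stored as plain Python dictionaries. The function name `merge_weights`, the parameter names, and the values are all hypothetical, and real merging tools offer more methods than this one.

```python
def merge_weights(model_a, model_b, ratio):
    """Weighted-sum merge: (1 - ratio) * A + ratio * B for every parameter."""
    if model_a.keys() != model_b.keys():
        raise ValueError("models must have the same parameter names")
    merged = {}
    for name in model_a:
        a_vals, b_vals = model_a[name], model_b[name]
        if len(a_vals) != len(b_vals):
            raise ValueError(f"size mismatch for {name}")
        merged[name] = [
            (1.0 - ratio) * a + ratio * b for a, b in zip(a_vals, b_vals)
        ]
    return merged


# Two toy "models" with identical structure but different values.
style_model = {"layer1.weight": [0.0, 2.0], "layer1.bias": [1.0]}
subject_model = {"layer1.weight": [4.0, 2.0], "layer1.bias": [3.0]}

# ratio=0.25 keeps 75% of the first model and takes 25% of the second.
blend = merge_weights(style_model, subject_model, ratio=0.25)
print(blend)
```

A ratio of 0 returns the first model unchanged and a ratio of 1 returns the second, so the ratio acts like a slider between the two.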
Other Generative AI Services?
Generative AI is a huge field, with many applications. Some of the most popular and interesting tools include:
- ChatGPT – Mentioned above, OpenAI’s ChatGPT is what’s known as an LLM (Large Language Model), designed to provide conversational responses to input text, understand and answer questions, provide recommendations, generate content, and more. It can solve problems, write code – it’s extremely useful, and free (with limitations). The first local models for ChatGPT-like LLMs are now appearing, which can be used on your own PC without restriction – see our Quickstart Guide to Large Language Models to get started!
- Magnific.ai – An out of this world “Upscaling” tool (upscaling is the process of taking a low quality/resolution image and enlarging or enhancing it to a larger size) which has wowed the internet recently with the quality of its output images.
Check out our Guide to Upscaling with Stable Diffusion to learn how to Upscale your images on Civitai!
- OpenAI’s Sora – Sora is one of the many recently released generative AI models that can create realistic and imaginative video scenes from text instructions (also known as txt2video).