Last Updated | Changes |
---|---|
11/6/2024 | First Version |
11/14/2024 | Mochi – now on-site at Civitai.com! |
Update! Mochi Available on Civitai.com!
That’s right! We’ve just pushed our first text2vid and img2vid tools to Civitai.com’s Generator, and Mochi is one of them! Check out the full Guide to Video in the Civitai Generator for details!
Mochi 1?
Mochi 1 (preview), by Genmo, is an open-source, state-of-the-art video generation model with high-fidelity motion and strong prompt adherence.
This model dramatically closes the gap between closed and open video generation systems, and it’s released under the permissive Apache 2.0 license.
Even better, we can now run it locally on mid-tier consumer GPUs! A huge improvement over launch, when it required three H100 GPUs to output results!
The model can currently output videos in 480p, but an HD model is slated to appear later this year.
Output Examples
Mochi in Civitai’s Generator
It’s here! Check out our Guide to Video in the Civitai Generator!
Local Generation with Mochi
Required Files
The official Mochi weights are available on Civitai, along with the Mochi VAE and text encoders. You may already have one of the required text encoders from a previous SD 3/3.5 installation.
Note that local video generation with these Mochi models is, at the time of writing, only available via ComfyUI:
Model | Download Location | Download Source |
---|---|---|
Mochi 1 Preview BF16 | models/diffusion_models | Civitai |
Mochi 1 FP8 Scaled | models/diffusion_models | Civitai |
Mochi VAE | models/vae | Civitai |
T5XXL FP16 | models/clip | Civitai |
T5XXL FP8 e4m3fn Scaled | models/clip | Civitai |
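Assuming a standard ComfyUI checkout, the table above maps onto a directory layout like the following. The filenames shown are illustrative and may differ from the actual Civitai downloads:

```shell
# Sketch of where the downloaded files land inside a ComfyUI install.
# Directory names follow the table above; filenames are examples only.
COMFY="ComfyUI"

mkdir -p "$COMFY/models/diffusion_models" \
         "$COMFY/models/vae" \
         "$COMFY/models/clip"

# After downloading, the tree should look roughly like:
#   ComfyUI/models/diffusion_models/mochi_preview_bf16.safetensors
#   ComfyUI/models/vae/mochi_vae.safetensors
#   ComfyUI/models/clip/t5xxl_fp16.safetensors
```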
ComfyUI
ComfyUI added native Mochi support in early November, allowing anyone with a consumer GPU to generate locally. 24 GB+ of VRAM is recommended, but we’ve seen reports of Mochi running on 12 GB VRAM systems with some third-party nodes and wrappers!
Getting started in ComfyUI is as simple as:
- Update ComfyUI to the latest version, which includes Mochi support
- Download the weights (see above)
- Download a text encoder (see above)
- Download the VAE (see above)
- Download and load the example workflow (see below), plug in the models, and start generating!
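The steps above can be sketched as shell commands for a git-based ComfyUI install. The paths and the exact update mechanism are assumptions; portable builds use their own updater:

```shell
# Commands are echoed rather than executed so this sketch is safe to
# run anywhere; drop the echos to run the steps for real.
COMFY_DIR="ComfyUI"   # adjust to your own install location

echo "cd $COMFY_DIR"
echo "git pull                         # update to a version with native Mochi support"
echo "pip install -r requirements.txt  # pick up any new dependencies"
echo "python main.py                   # launch, then load the sample workflow in the browser UI"
```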
Sample Workflow
Low VRAM Options
If you have a GPU with less than 24 GB of VRAM, you can try the following options to get running, but be warned: generation times may be significant!
- Switch to the Mochi 1 FP8 Scaled weights
- Switch to the T5XXL FP8 e4m3fn Scaled text encoder
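To see why the FP8 variants help, here’s a rough back-of-envelope estimate. It assumes Mochi 1’s transformer is about 10B parameters and ignores activations, the VAE, and the text encoder, so treat the numbers as approximations rather than a full memory footprint:

```python
# Approximate GPU memory for the model weights alone, based on an
# assumed ~10B-parameter transformer. Activations, the VAE, and the
# text encoder all add more on top of this.
params = 10e9

bf16_gb = params * 2 / 1e9  # BF16 stores 2 bytes per weight
fp8_gb = params * 1 / 1e9   # FP8 stores 1 byte per weight

print(f"BF16 weights: ~{bf16_gb:.0f} GB")  # ~20 GB
print(f"FP8 weights:  ~{fp8_gb:.0f} GB")   # ~10 GB
```

Halving the weight storage is what brings the model within reach of 12 GB cards, once the text encoder is also swapped to its FP8 variant.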
If you’re still having trouble running on your hardware, check out ComfyUI-MochiWrapper by creator Kijai, which offers significant speed boosts, at the expense of some quality, via quantized GGUF models and custom nodes:
Kijai Model | Download Source |
---|---|
Mochi 1 preview GGUF Q4_0 V1 | HuggingFace |
Mochi 1 preview GGUF Q4_0 V2 | HuggingFace |
Mochi 1 preview GGUF Q8_0 | HuggingFace |
Mochi 1 preview BF16 VAE Decoder | HuggingFace |
Mochi 1 preview BF16 VAE Encoder | HuggingFace |
Mochi 1 preview FP32 VAE Encoder | HuggingFace |
Limitations
- Pricing! The pricing of Mochi in the Civitai Generator is subject to change as we discuss options with our Partners. We’re aware that the cost for Mochi generation is high, and that’s something we’re hoping to address.
- Complexity! While we now have native ComfyUI support for local generation, this is still a complex model to get running offline. The information above should cover the basics of getting started, but some further reading and experimentation will be required to get the most out of Mochi in an offline environment!
I need more help!
If you’re experiencing issues generating video with the Civitai Generator and a solution isn’t mentioned on this page, please reach out to our Support team at [email protected]. If you’re having trouble setting up Mochi for local generation, please join the Civitai Discord and seek assistance in the #ai-help channel!