🧨 Generate a Diffusers inference code snippet tailored to your machine

Enter a Hugging Face Hub repo_id and your system specs to generate inference code. The tool uses Gemini to produce a snippet matched to your settings. It is based on sayakpaul/auto-diffusers-docs.

Gemini Model

Select the Gemini model used to generate the analysis.

Options

  • Calculate using 32-bit precision instead of 16-bit.
  • Consider 8-bit/4-bit quantization.
  • Model is compatible with torch.compile.
  • Model and hardware support FP8 precision.
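
Roughly speaking, these toggles map onto familiar Diffusers patterns. The sketch below shows how the precision and torch.compile options typically translate; it is a minimal illustration assuming a CUDA GPU and a recent Diffusers install, and the SDXL repo_id is just a placeholder, not output of this tool:

```python
import torch
from diffusers import DiffusionPipeline

# Placeholder checkpoint for illustration; substitute your own repo_id.
repo_id = "stabilityai/stable-diffusion-xl-base-1.0"

# 16-bit is the usual default; fp32 roughly doubles memory use but is the
# safest numerically (the "32-bit precision" option).
dtype = torch.float16

pipe = DiffusionPipeline.from_pretrained(repo_id, torch_dtype=dtype)
pipe.to("cuda")

# torch.compile only helps when the model is compatible with it and the GPU
# is recent enough (see the tips below); the first call pays a compile cost.
pipe.unet = torch.compile(pipe.unet, mode="reduce-overhead", fullgraph=True)

image = pipe("an astronaut riding a horse on the moon").images[0]
image.save("out.png")
```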

  • Try switching the model from Flash to Pro if the results are poor.
  • Be as specific as possible about your local machine.
  • As a rule of thumb, GPUs from the RTX 4090 onward are generally good candidates for torch.compile().
  • To leverage FP8, the GPU needs a compute capability of at least 8.9 (see the check after this list).
  • Check out the Diffusers documentation on optimization for further options.
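
For the FP8 point, a quick local check is possible with plain PyTorch (a minimal sketch, assuming a CUDA build of PyTorch is installed):

```python
import torch

# FP8 needs compute capability >= 8.9 (Ada Lovelace, e.g. RTX 4090, or Hopper).
major, minor = torch.cuda.get_device_capability()
print(f"Compute capability: {major}.{minor}")
print("FP8 supported:", (major, minor) >= (8, 9))
```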

Generated Code

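For illustration only, here is the kind of snippet the tool might return when the quantization option is enabled. The repo_id, 4-bit settings, and prompt below are placeholder assumptions rather than actual tool output, and this example additionally requires bitsandbytes:

```python
import torch
from diffusers import BitsAndBytesConfig, FluxPipeline, FluxTransformer2DModel

repo_id = "black-forest-labs/FLUX.1-dev"

# Load the heaviest component (the transformer) in 4-bit NF4 to reduce VRAM.
quant_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
)
transformer = FluxTransformer2DModel.from_pretrained(
    repo_id,
    subfolder="transformer",
    quantization_config=quant_config,
    torch_dtype=torch.bfloat16,
)

pipe = FluxPipeline.from_pretrained(
    repo_id, transformer=transformer, torch_dtype=torch.bfloat16
)
# Offload idle components to CPU to keep peak VRAM low.
pipe.enable_model_cpu_offload()

image = pipe(
    "a tiny astronaut hatching from an egg on the moon",
    num_inference_steps=28,
    guidance_scale=3.5,
).images[0]
image.save("flux.png")
```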


⛔️ Disclaimer: Large Language Models (LLMs) can make mistakes. The information provided is an estimate and should be verified. Always test the model on your target hardware to confirm actual memory requirements.