🧨 Generate a Diffusers inference code snippet tailored to your machine

Enter a Hugging Face Hub repo_id and your system specs to get a tailored inference snippet. This tool uses Gemini to generate the code based on your settings. It is based on sayakpaul/auto-diffusers-docs.

Gemini Model

Select the Gemini model used to generate the analysis.

Disable BF16 (Use FP32)

Calculate using 32-bit precision (FP32) instead of 16-bit (BF16).
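
As a rough illustration of what this toggles, here is how the dtype choice typically shows up in Diffusers code. The repo_id and flag name below are placeholders, not the tool's actual output:

```python
import torch
from diffusers import DiffusionPipeline

# Placeholder values standing in for the tool's inputs.
repo_id = "stabilityai/stable-diffusion-xl-base-1.0"
disable_bf16 = False  # the "Disable BF16 (Use FP32)" checkbox

# FP32 roughly doubles weight memory relative to BF16, but sidesteps
# half-precision issues on GPUs with poor BF16 support.
dtype = torch.float32 if disable_bf16 else torch.bfloat16

pipe = DiffusionPipeline.from_pretrained(repo_id, torch_dtype=dtype).to("cuda")
```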

Allow Lossy Quantization

Consider 8-bit/4-bit quantization to reduce memory use at some cost in output quality.
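
For reference, quantization in Diffusers is applied through a quantization config. The sketch below loosely follows the bitsandbytes NF4 example from the Diffusers docs; the repo_id is just an example, and the `bitsandbytes` package must be installed:

```python
import torch
from diffusers import BitsAndBytesConfig, SD3Transformer2DModel, StableDiffusion3Pipeline

repo_id = "stabilityai/stable-diffusion-3.5-large"  # example model

# Quantize the transformer (the largest component) to 4-bit NF4.
nf4_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
)
transformer = SD3Transformer2DModel.from_pretrained(
    repo_id,
    subfolder="transformer",
    quantization_config=nf4_config,
    torch_dtype=torch.bfloat16,
)
pipe = StableDiffusion3Pipeline.from_pretrained(
    repo_id, transformer=transformer, torch_dtype=torch.bfloat16
)
pipe.enable_model_cpu_offload()
```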

torch.compile() friendly

The model is compatible with torch.compile().
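
As a point of reference, enabling torch.compile() on a pipeline usually looks like the minimal sketch below, shown with an example SDXL checkpoint; transformer-based pipelines compile pipe.transformer instead of pipe.unet:

```python
import torch
from diffusers import DiffusionPipeline

pipe = DiffusionPipeline.from_pretrained(
    "stabilityai/stable-diffusion-xl-base-1.0",  # example model
    torch_dtype=torch.float16,
).to("cuda")

# Compile the denoiser once; the first call pays the compilation cost,
# later calls reuse the compiled graph and run noticeably faster.
pipe.unet = torch.compile(pipe.unet, mode="reduce-overhead", fullgraph=True)

image = pipe("an astronaut riding a horse on the moon").images[0]
```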

fp8 friendly

The model and hardware support FP8 precision.
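
One FP8 technique available in recent Diffusers releases is layerwise casting, which stores weights in FP8 and upcasts each layer at compute time. A minimal sketch, with FLUX.1-dev as a placeholder repo_id:

```python
import torch
from diffusers import DiffusionPipeline

pipe = DiffusionPipeline.from_pretrained(
    "black-forest-labs/FLUX.1-dev",  # example model
    torch_dtype=torch.bfloat16,
).to("cuda")

# Keep the transformer's weights in FP8 storage and upcast each layer
# to BF16 just in time for compute, roughly halving weight memory.
pipe.transformer.enable_layerwise_casting(
    storage_dtype=torch.float8_e4m3fn,
    compute_dtype=torch.bfloat16,
)
```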

Examples (Click to try)
Each example pre-fills these inputs: Hugging Face Repo ID, Gemini Model, Disable BF16 (Use FP32), Allow Lossy Quantization, Free System RAM (GB), Free GPU VRAM (GB), torch.compile() friendly, and fp8 friendly.

  • Try changing the model from Flash to Pro if the results are bad.
  • Be as specific as possible about your local machine.
  • As a rule of thumb, GPUs from the RTX 4090 onward are generally good for using torch.compile().
  • To leverage FP8, the GPU needs a compute capability of at least 8.9 (see the check after this list).
  • Check out the Diffusers documentation on optimization.
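
A quick way to check your GPU's compute capability with PyTorch:

```python
import torch

# Returns (major, minor); FP8 support generally requires (8, 9) or
# higher, e.g. RTX 4090 (8.9) or H100 (9.0).
major, minor = torch.cuda.get_device_capability()
print(f"Compute capability {major}.{minor}; FP8-capable: {(major, minor) >= (8, 9)}")
```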

Generated Code

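For illustration only, here is the kind of snippet the tool might produce for a machine with roughly 8 GB of free VRAM; the model, prompt, and settings are placeholders, not real output:

```python
import torch
from diffusers import DiffusionPipeline

pipe = DiffusionPipeline.from_pretrained(
    "stabilityai/stable-diffusion-xl-base-1.0",  # placeholder model
    torch_dtype=torch.float16,
)
# Offload submodules to system RAM and move each to the GPU only while
# it runs, keeping peak VRAM usage well under 8 GB.
pipe.enable_model_cpu_offload()

image = pipe("a cozy cabin in a snowy forest at golden hour").images[0]
image.save("result.png")
```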

⛔️ Disclaimer: Large Language Models (LLMs) can make mistakes. The information provided is an estimate and should be verified. Always test the model on your target hardware to confirm actual memory requirements.