
🎬 Wan2.1 Distilled Models

⚡ High-Performance Video Generation with 4-Step Inference

Distillation-accelerated versions of Wan2.1: dramatically faster inference while maintaining exceptional quality





🌟 What's Special?

⚡ Ultra-Fast Generation

  • 4-step inference (vs traditional 50+ steps)
  • Up to 2x faster than ComfyUI
  • Real-time video generation capability
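As a rough sanity check on the step count alone (assuming runtime is dominated by the per-step denoiser forward pass, and ignoring quantization and framework overhead):

```python
# Rough speedup estimate from step distillation.
# Assumes runtime is dominated by the denoiser forward pass;
# real-world gains also depend on quantization and the framework.
baseline_steps = 50   # typical non-distilled sampler
distilled_steps = 4   # this model

speedup = baseline_steps / distilled_steps
print(f"~{speedup:.1f}x fewer denoising steps")  # ~12.5x fewer denoising steps
```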

🎯 Flexible Options

  • Multiple resolutions (480P/720P)
  • Various precision formats (BF16/FP8/INT8)
  • I2V and T2V support

💾 Memory Efficient

  • FP8/INT8: ~50% size reduction
  • CPU offload support
  • Optimized for consumer GPUs
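The ~50% figure follows directly from bytes per parameter. A back-of-the-envelope sketch for a 14B-parameter checkpoint (illustrative arithmetic only; real files also store quantization scales and keep some tensors in higher precision, which is why they land at ~15-17 GB rather than 14 GB):

```python
# Estimate checkpoint size from parameter count and element width.
# Illustrative arithmetic; real files add quantization scale factors
# and a few higher-precision tensors.
PARAMS = 14e9  # 14B-parameter DiT

def checkpoint_gb(bytes_per_param: float, params: float = PARAMS) -> float:
    return params * bytes_per_param / 1e9

print(f"BF16 (2 bytes/param): ~{checkpoint_gb(2):.0f} GB")
print(f"FP8/INT8 (1 byte/param): ~{checkpoint_gb(1):.0f} GB")
```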

🔧 Easy Integration

  • Compatible with LightX2V framework
  • ComfyUI support available
  • Simple configuration files

📦 Model Catalog

🎥 Model Types

🖼️ Image-to-Video (I2V)

Transform still images into dynamic videos

  • 📺 480P Resolution
  • 🎬 720P Resolution

📝 Text-to-Video (T2V)

Generate videos from text descriptions

  • 🚀 14B Parameters
  • 🎨 High-quality synthesis

🎯 Precision Variants

| Precision | Model Identifier | Model Size | Framework | Quality vs Speed |
|-----------|------------------|------------|-----------|------------------|
| 🏆 BF16 | `lightx2v_4step` | ~28-32 GB | LightX2V | ⭐⭐⭐⭐⭐ Highest quality |
| FP8 | `scaled_fp8_e4m3_lightx2v_4step` | ~15-17 GB | LightX2V | ⭐⭐⭐⭐ Excellent balance |
| 🎯 INT8 | `int8_lightx2v_4step` | ~15-17 GB | LightX2V | ⭐⭐⭐⭐ Fast & efficient |
| 🔷 FP8 ComfyUI | `scaled_fp8_e4m3_lightx2v_4step_comfyui` | ~15-17 GB | ComfyUI | ⭐⭐⭐ ComfyUI ready |

📝 Naming Convention

```shell
# Pattern: wan2.1_{task}_{resolution}_{precision}.safetensors

# Examples:
wan2.1_i2v_720p_lightx2v_4step.safetensors                         # 720P I2V - BF16
wan2.1_i2v_720p_scaled_fp8_e4m3_lightx2v_4step.safetensors         # 720P I2V - FP8
wan2.1_i2v_480p_int8_lightx2v_4step.safetensors                    # 480P I2V - INT8
wan2.1_t2v_14b_scaled_fp8_e4m3_lightx2v_4step_comfyui.safetensors  # T2V - FP8 ComfyUI
```
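For scripting over a model directory, the pattern is easy to split programmatically. A minimal sketch (the regex and field names below are my own for illustration, not an official schema):

```python
import re

# Split a checkpoint filename into task / resolution-or-size / variant
# parts, following the naming pattern above. Regex and field names are
# illustrative, not part of the release.
NAME_RE = re.compile(
    r"^wan2\.1_(?P<task>i2v|t2v)_(?P<size>480p|720p|14b)_(?P<variant>.+)\.safetensors$"
)

def parse_checkpoint_name(filename: str) -> dict:
    match = NAME_RE.match(filename)
    if match is None:
        raise ValueError(f"unrecognized checkpoint name: {filename}")
    return match.groupdict()

info = parse_checkpoint_name("wan2.1_i2v_720p_scaled_fp8_e4m3_lightx2v_4step.safetensors")
print(info)  # {'task': 'i2v', 'size': '720p', 'variant': 'scaled_fp8_e4m3_lightx2v_4step'}
```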

💡 Explore all models: Browse Full Model Collection →

🚀 Usage

LightX2V is a high-performance inference framework optimized for these models. It runs approximately 2x faster than ComfyUI and offers better quantization accuracy; we highly recommend it.

Quick Start

  1. Download the model (720P I2V FP8 example)

```shell
huggingface-cli download lightx2v/Wan2.1-Distill-Models \
    --local-dir ./models/wan2.1_i2v_720p \
    --include "wan2.1_i2v_720p_scaled_fp8_e4m3_lightx2v_4step.safetensors"
```
  2. Clone the LightX2V repository

```shell
git clone https://github.com/ModelTC/LightX2V.git
cd LightX2V
```
  3. Install dependencies

```shell
pip install -r requirements.txt
```

Alternatively, refer to the Quick Start Documentation to run with Docker.

  4. Select and modify the configuration file

Choose the appropriate configuration based on your GPU memory:

  • For 80GB+ GPUs (A100/H100)
  • For 24GB+ GPUs (RTX 4090)

  5. Run inference

```shell
cd scripts
bash wan/run_wan_i2v_distill_4step_cfg.sh
```

Documentation

Performance Advantages

  • ⚡ Fast: Approximately 2x faster than ComfyUI
  • 🎯 Optimized: Deeply optimized for distilled models
  • 💾 Memory Efficient: Supports CPU offload and other memory optimization techniques
  • 🛠️ Flexible: Supports multiple quantization formats and configuration options

Community

⚠️ Important Notes

  1. Additional Components: These models contain only the DiT weights. You also need:

    • T5 text encoder
    • CLIP vision encoder
    • VAE encoder/decoder
    • Tokenizers

    Refer to LightX2V Documentation for how to organize the complete model directory.
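Before launching inference, it can help to verify the assembled directory. A tiny sketch (the component names listed here are hypothetical placeholders; the LightX2V documentation defines the actual expected layout):

```python
from pathlib import Path

# List which required components are missing from a model directory.
# The names below are hypothetical placeholders; see the LightX2V
# documentation for the real directory layout.
REQUIRED = ["dit.safetensors", "t5", "clip", "vae"]

def missing_components(model_dir: str) -> list[str]:
    root = Path(model_dir)
    return [name for name in REQUIRED if not (root / name).exists()]

print(missing_components("./models/wan2.1_i2v_720p"))
```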

If you find this project helpful, please give us a ⭐ on GitHub