An open-source implementation for training LoRA (Low-Rank Adaptation) layers for Qwen/Qwen-Image, Qwen/Qwen-Image-Edit, and FLUX.1-dev models by FlyMy.AI.
Agentic Infra for GenAI. FlyMy.AI is a B2B infrastructure for building and running GenAI Media agents.
🔗 Useful Links:
🚧 Under Development: We are actively working on improving the code and adding test coverage. The project is in the refinement stage but ready for use.
📋 Development Plans:
Requirements:
Clone the repository and navigate into it:
git clone https://github.com/FlyMyAI/flymyai-lora-trainer
cd flymyai-lora-trainer
Install required packages:
pip install -r requirements.txt
Install the latest diffusers from GitHub:
pip install git+https://github.com/huggingface/diffusers
Download pre-trained LoRA weights (optional):
# Qwen LoRA weights
git clone https://huggingface.co/flymy-ai/qwen-image-realism-lora
# FLUX LoRA weights
git clone https://huggingface.co/flymy-ai/flux-dev-anne-hathaway-lora
# Or download specific files
wget https://huggingface.co/flymy-ai/qwen-image-realism-lora/resolve/main/flymy_realism.safetensors
wget https://huggingface.co/flymy-ai/flux-dev-anne-hathaway-lora/resolve/main/pytorch_lora_weights.safetensors
The training data should follow the same format for both Qwen and FLUX models, where each image has a corresponding text file with the same name:
dataset/
├── img1.png
├── img1.txt
├── img2.jpg
├── img2.txt
├── img3.png
├── img3.txt
└── ...
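The image/caption pairing rule above can be checked with a few lines of Python. This is a minimal sketch, independent of any tooling in the repository; the function name `find_unpaired` and the set of accepted image extensions are assumptions made for illustration.

```python
from pathlib import Path

# Assumed image suffixes; the examples in this README use .png and .jpg.
IMAGE_EXTENSIONS = {".png", ".jpg", ".jpeg"}


def find_unpaired(dataset_dir):
    """Return (images_without_caption, captions_without_image) as sorted file-name lists."""
    root = Path(dataset_dir)
    # Index files by stem so img1.png and img1.txt land under the same key.
    images = {p.stem: p.name for p in root.iterdir() if p.suffix.lower() in IMAGE_EXTENSIONS}
    captions = {p.stem: p.name for p in root.iterdir() if p.suffix.lower() == ".txt"}
    missing_captions = sorted(images[s] for s in images.keys() - captions.keys())
    orphan_captions = sorted(captions[s] for s in captions.keys() - images.keys())
    return missing_captions, orphan_captions
```

Two empty lists mean every image has a caption file with the same base name and no caption is left without an image.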
For control-based image editing, the dataset should be organized with separate directories for target images/captions and control images:
dataset/
├── images/           # Target images and their captions
│   ├── image_001.jpg
│   ├── image_001.txt
│   ├── image_002.jpg
│   ├── image_002.txt
│   └── ...
└── control/          # Control images
    ├── image_001.jpg
    ├── image_002.jpg
    └── ...
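For the control layout, each target image needs a control image to go with it. The sketch below assumes the two are matched by base file name, as the example tree suggests; the helper `pair_with_control` is hypothetical and is not part of the training scripts.

```python
from pathlib import Path

# Assumed image suffixes; the example tree only shows .jpg files.
IMAGE_EXTENSIONS = {".png", ".jpg", ".jpeg"}


def pair_with_control(images_dir, control_dir):
    """Map each target image name to its control image name; raise if any is missing."""
    control_by_stem = {
        p.stem: p.name
        for p in Path(control_dir).iterdir()
        if p.suffix.lower() in IMAGE_EXTENSIONS
    }
    pairs, missing = {}, []
    for target in sorted(Path(images_dir).iterdir()):
        if target.suffix.lower() not in IMAGE_EXTENSIONS:
            continue  # skip caption (.txt) files
        if target.stem in control_by_stem:
            pairs[target.name] = control_by_stem[target.stem]
        else:
            missing.append(target.name)
    if missing:
        raise FileNotFoundError(f"No control image for: {missing}")
    return pairs
```

Calling `pair_with_control("dataset/images", "dataset/control")` before training surfaces missing control images early instead of partway through a run.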
my_training_data/
├── portrait_001.png
├── portrait_001.txt
├── landscape_042.jpg
├── landscape_042.txt
├── abstract_design.png
├── abstract_design.txt
├── style_reference.jpg
└── style_reference.txt
For FLUX character training (portrait_001.txt):
ohwx woman, professional headshot, studio lighting, elegant pose, looking at camera
For Qwen landscape training (landscape_042.txt):
Mountain landscape at sunset, dramatic clouds, golden hour lighting, wide angle view
For FLUX portrait training (abstract_design.txt):
ohwx woman, modern portrait style, soft lighting, artistic composition
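Caption files like the ones above can also be written programmatically. The following is a rough sketch, not a utility shipped with the repository: `write_caption` is a hypothetical helper, and prepending the trigger phrase followed by a comma is an assumption based on the FLUX examples.

```python
from pathlib import Path


def write_caption(image_path, caption, trigger=None):
    """Write `caption` to a .txt file next to the image, prepending `trigger` if absent."""
    text = caption.strip()
    if trigger and trigger not in text:
        text = f"{trigger}, {text}"
    # Same base name as the image, with a .txt suffix (portrait_001.png -> portrait_001.txt).
    caption_path = Path(image_path).with_suffix(".txt")
    caption_path.write_text(text, encoding="utf-8")
    return caption_path
```

For example, `write_caption("my_training_data/portrait_001.png", "professional headshot, studio lighting", trigger="ohwx woman")` writes the caption to `my_training_data/portrait_001.txt`.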
You can verify your data structure using the included validation utility:
python utils/validate_dataset.py --path path/to/your/dataset
This will check that:
To begin training with your configuration file (e.g., train_lora_4090.yaml), run:
accelerate launch train_4090.py --config ./train_configs/train_lora_4090.yaml

To begin training with your configuration file (e.g., train_lora.yaml), run:
accelerate launch train.py --config ./train_configs/train_lora.yaml
Make sure train_lora.yaml is correctly set up with paths to your dataset, model, output directory, and other parameters.
To begin training with your configuration file (e.g., train_full_qwen_image.yaml), run:
accelerate launch train_full_qwen_image.py --config ./train_configs/train_full_qwen_image.yaml
Make sure train_full_qwen_image.yaml is correctly set up with paths to your dataset, model, output directory, and other parameters.
The proposed method was tested on an NVIDIA H200 GPU environment.
After training, you can load your trained model from the checkpoint directory for inference.
Simple Example:
from diffusers import QwenImagePipeline, QwenImageTransformer2DModel, AutoencoderKLQwenImage
import torch
from omegaconf import OmegaConf
import os


def load_trained_model(checkpoint_path):
    """Load trained model from checkpoint"""
    print(f"Loading trained model from: {checkpoint_path}")

    # Load config to get original model path
    config_path = os.path.join(checkpoint_path, "config.yaml")
    config = OmegaConf.load(config_path)
    original_model_path = config.pretrained_model_name_or_path

    # Load trained transformer
    transformer_path = os.path.join(checkpoint_path, "transformer")
    transformer = QwenImageTransformer2DModel.from_pretrained(
        transformer_path,
        torch_dtype=torch.bfloat16,
        low_cpu_mem_usage=True
    )
    transformer.to("cuda")
    transformer.eval()

    # Load VAE from original model
    vae = AutoencoderKLQwenImage.from_pretrained(
        original_model_path,
        subfolder="vae",
        torch_dtype=torch.bfloat16
    )
    vae.to("cuda")
    vae.eval()

    # Create pipeline
    pipe = QwenImagePipeline.from_pretrained(
        original_model_path,
        transformer=transformer,
        vae=vae,
        torch_dtype=torch.bfloat16
    )
    pipe.to("cuda")
    print("Model loaded successfully!")
    return pipe


# Usage
checkpoint_path = "/path/to/your/checkpoint"
pipe = load_trained_model(checkpoint_path)

# Generate image
prompt = "A beautiful landscape with mountains and lake"
image = pipe(
    prompt=prompt,
    width=768,
    height=768,
    num_inference_steps=30,
    true_cfg_scale=5,
    generator=torch.Generator(device="cuda").manual_seed(42)
)

# Save result
output_image = image.images[0]
output_image.save("generated_image.png")
Complete Example Script:
python inference_trained_model_gpu_optimized.py
Checkpoint Structure:
The trained model is saved in the following structure:
checkpoint/
├── config.yaml                    # Training configuration
└── transformer/                   # Trained transformer weights
    ├── config.json
    ├── diffusion_pytorch_model.safetensors.index.json
    ├── diffusion_pytorch_model-00001-of-00005.safetensors
    └── ... (multiple shard files)
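Before loading a checkpoint, it can help to confirm that the directory matches the layout above. This is a small sketch under the assumption that the layout shown is complete; `check_checkpoint` is a hypothetical helper and does not inspect the contents of any file.

```python
from pathlib import Path


def check_checkpoint(checkpoint_dir):
    """Return a list of problems found in a checkpoint directory (empty if none)."""
    root = Path(checkpoint_dir)
    transformer = root / "transformer"
    problems = []
    if not (root / "config.yaml").is_file():
        problems.append("missing config.yaml")
    if not (transformer / "config.json").is_file():
        problems.append("missing transformer/config.json")
    # At least one weight file (sharded or not) should be present.
    has_weights = transformer.is_dir() and any(transformer.glob("*.safetensors"))
    if not has_weights:
        problems.append("no .safetensors files in transformer/")
    return problems
```

An empty list means the expected files are in place; anything else names what is missing.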
For control-based image editing training, use the specialized training script:
accelerate launch train_qwen_edit_lora.py --config ./train_configs/train_lora_qwen_edit.yaml
The configuration file train_lora_qwen_edit.yaml should include:
img_dir: Path to target images and captions directory (e.g., ./extracted_dataset/train/images)
control_dir: Path to control images directory (e.g., ./extracted_dataset/train/control)

To begin training with your configuration file (e.g., train_full_qwen_edit.yaml), run:
accelerate launch train_full_qwen_edit.py --config ./train_configs/train_full_qwen_edit.yaml
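The `img_dir` and `control_dir` settings named above must point at existing directories. The sketch below checks this on an already-parsed configuration mapping; it assumes both keys sit in the same mapping (where they are nested in the real YAML file is not stated here), and `missing_dirs` is a hypothetical helper.

```python
from pathlib import Path

# The two directory keys named in this README.
REQUIRED_DIR_KEYS = ("img_dir", "control_dir")


def missing_dirs(config):
    """Return the required keys whose value is absent or is not an existing directory."""
    return [
        key
        for key in REQUIRED_DIR_KEYS
        if not config.get(key) or not Path(config[key]).is_dir()
    ]
```

For example, `missing_dirs({"img_dir": "./extracted_dataset/train/images"})` always reports `control_dir`, and also reports `img_dir` if that path does not exist.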
from diffusers import DiffusionPipeline
import torch

model_name = "Qwen/Qwen-Image"

# Load the pipeline
if torch.cuda.is_available():
    torch_dtype = torch.bfloat16
    device = "cuda"
else:
    torch_dtype = torch.float32
    device = "cpu"

pipe = DiffusionPipeline.from_pretrained(model_name, torch_dtype=torch_dtype)
pipe = pipe.to(device)
from diffusers import QwenImageEditPipeline
import torch
from PIL import Image
# Load the pipeline
pipeline = QwenImageEditPipeline.from_pretrained("Qwen/Qwen-Image-Edit")
pipeline.to(torch.bfloat16)
pipeline.to("cuda")
For Qwen-Image:
# Load LoRA weights
pipe.load_lora_weights('flymy-ai/qwen-image-realism-lora', adapter_name="lora")
For Qwen-Image-Edit:
# Load trained LoRA weights
pipeline.load_lora_weights("/path/to/your/trained/lora/pytorch_lora_weights.safetensors")
You can find LoRA weights here
No trigger word required
prompt = '''Super Realism portrait of a teenager woman of African descent, serene calmness, arms crossed, illuminated by dramatic studio lighting, sunlit park in the background, adorned with delicate jewelry, three-quarter view, sun-kissed skin with natural imperfections, loose shoulder-length curls, slightly squinting eyes, environmental street portrait with text "FLYMY AI" on t-shirt.'''
negative_prompt = " "
# The pipeline returns an output object; the generated images are in its .images list
image = pipe(
    prompt=prompt,
    negative_prompt=negative_prompt,
    width=1024,
    height=1024,
    num_inference_steps=50,
    true_cfg_scale=5,
    generator=torch.Generator(device="cuda").manual_seed(346346)
)

# Display the image (in Jupyter or save to file)
image.images[0].show()
# or
image.images[0].save("output.png")
# Load input image
image = Image.open("/path/to/your/input/image.jpg").convert("RGB")
# Define editing prompt
prompt = "Make a shot in the same scene of the person moving further away from the camera, keeping the camera steady to maintain focus on the central subject, gradually zooming out to capture more of the surrounding environment as the figure becomes less detailed in the distance."
# Generate edited image
inputs = {
    "image": image,
    "prompt": prompt,
    "generator": torch.manual_seed(0),
    "true_cfg_scale": 4.0,
    "negative_prompt": " ",
    "num_inference_steps": 50,
}

with torch.inference_mode():
    output = pipeline(**inputs)
    output_image = output.images[0]
    output_image.save("edited_image.png")

Input Image:

Prompt: "Make a shot in the same scene of the left hand securing the edge of the cutting board while the right hand tilts it, causing the chopped tomatoes to slide off into the pan, camera angle shifts slightly to the left to center more on the pan."
Output without LoRA:

Output with LoRA:

FLUX.1-dev is a powerful text-to-image model that excels at generating high-quality portraits and character images. Our LoRA training implementation allows you to fine-tune FLUX for specific characters or styles.
To begin FLUX LoRA training with your configuration file, run:
accelerate launch train_flux_lora.py --config ./train_configs/train_flux_config.yaml
Make sure train_flux_config.yaml is correctly set up with paths to your dataset, model, output directory, and other parameters.
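The training commands in this README all follow the same `accelerate launch <script> --config <file>` pattern. As a convenience sketch, the script/config pairs can be collected in one place; the pairs are copied from the commands in this README, while the dictionary keys and the `launch_command` helper are made-up names for illustration.

```python
import shlex

# Script/config pairs as they appear in the commands in this README.
# The keys are hypothetical labels used only in this sketch.
TRAINING_RUNS = {
    "qwen_lora": ("train.py", "train_lora.yaml"),
    "qwen_lora_4090": ("train_4090.py", "train_lora_4090.yaml"),
    "qwen_full": ("train_full_qwen_image.py", "train_full_qwen_image.yaml"),
    "qwen_edit_lora": ("train_qwen_edit_lora.py", "train_lora_qwen_edit.yaml"),
    "qwen_edit_full": ("train_full_qwen_edit.py", "train_full_qwen_edit.yaml"),
    "flux_lora": ("train_flux_lora.py", "train_flux_config.yaml"),
}


def launch_command(run, config_dir="./train_configs"):
    """Build the `accelerate launch` argument list for one of the runs above."""
    script, config = TRAINING_RUNS[run]
    return ["accelerate", "launch", script, "--config", f"{config_dir}/{config}"]


# Print the shell form of the FLUX LoRA command.
print(shlex.join(launch_command("flux_lora")))
```

The returned list can be passed directly to `subprocess.run` from the repository root.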
from diffusers import DiffusionPipeline
import torch

model_name = "black-forest-labs/FLUX.1-dev"

# Load the pipeline
if torch.cuda.is_available():
    torch_dtype = torch.bfloat16
    device = "cuda"
else:
    torch_dtype = torch.float32
    device = "cpu"

pipe = DiffusionPipeline.from_pretrained(model_name, torch_dtype=torch_dtype)
pipe = pipe.to(device)
# Load LoRA weights
pipe.load_lora_weights('flymy-ai/flux-dev-anne-hathaway-lora', adapter_name="lora")
You can find our pre-trained FLUX LoRA weights here
Trigger word required: "ohwx woman"
prompt = '''Portrait of ohwx woman, professional headshot, studio lighting, elegant pose, looking at camera, soft shadows, high quality, detailed facial features, cinematic lighting, 85mm lens, shallow depth of field'''
negative_prompt = "blurry, low quality, distorted, bad anatomy"
image = pipe(
    prompt=prompt,
    negative_prompt=negative_prompt,
    width=1024,
    height=1024,
    num_inference_steps=30,
    guidance_scale=3.5,
    generator=torch.Generator(device="cuda").manual_seed(346346)
)
# Display the image (in Jupyter or save to file)
image.images[0].show()
# or
image.images[0].save("output.png")

Below are examples of images generated with our FLUX Anne Hathaway LoRA model:
Prompt: "ohwx woman portrait selfie"

Prompt: "ohwx woman perfectly symmetrical young female face close-up, presented with double exposure overlay blending nature textures like leaves and water"

Prompt: "ohwx woman Macro photography style close-up of female face with light makeup, focused on eyes and lips, illuminated by golden hour sunlight for warm tones"

Prompt: "Close-up of ohwx woman in brown knitted turtleneck sweater. Sitting with big black and white panda, hugging it, looking at camera"

Want to train your own FLUX LoRA model? Use our online training platform:
🚀 Train Your Own FLUX LoRA on FlyMy.AI
Features:
We provide ready-to-use ComfyUI workflows that work with both our Qwen and FLUX trained LoRA models. Follow these steps to set up and use the workflows:
Download the latest ComfyUI:
Install ComfyUI:
Download model weights:
For Qwen-Image:
For FLUX.1-dev:
Place model weights in ComfyUI:
ComfyUI/models/
Download our pre-trained LoRA weights:
.safetensors files
Place LoRA weights in ComfyUI:
ComfyUI/models/loras/
Load the workflow:
qwen_image_lora_example.json
flux_anne_hathaway_lora_example.json

The ComfyUI workflows provide a user-friendly interface for generating images with our trained LoRA models without needing to write Python code.

If you have questions or suggestions, join our community:
⭐ Don't forget to star the repository if you like it!