👋 Hi!
I'm an undergraduate researcher at Zhejiang University working on generative models, currently focusing on unified multimodal models.
I'm now a research intern at BAIR.
🤗 Diffusers: State-of-the-art diffusion models for image, video, and audio generation in PyTorch.
[NeurIPS 2025] Image editing is worth a single LoRA! 0.1% training data for fantastic image editing! Surpasses GPT-4o in ID persistence. MoE checkpoint released! Only 4 GB of VRAM is needed to run!
Official repo of the paper "Reconstruction Alignment Improves Unified Multimodal Models". Unlocking the Massive Zero-shot Potential in Unified Multimodal Models through Self-supervised Learning.