If you're using DoRA for high-rank fine-tuning on limited GPU memory, these optimizations make it practical: they cut peak memory usage by up to 7 GB and roughly double throughput, without changing the model's outputs.
DoRA (Weight-Decomposed Low-Rank Adaptation) is a fine-tuning method that adapts model weights by decomposing them into a learned magnitude and a low-rank directional update. The catch is that a naive forward pass materializes the full dense adapted weight matrix in order to normalize it, which consumes substantial GPU memory.
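To make the memory problem concrete, here is a minimal sketch of a naive DoRA forward pass. The function name and the row-wise normalization convention (magnitude per output channel, as in common implementations) are illustrative assumptions, not DoRA's reference code; the key point is that `W0 + B @ A` materializes a full dense matrix the size of the base weight.

```python
import torch

def naive_dora_forward(x, W0, A, B, m):
    """Hypothetical naive DoRA forward pass (illustrative sketch).

    x:  (batch, in)  input activations
    W0: (out, in)    frozen base weight
    A:  (r, in)      low-rank adapter factors
    B:  (out, r)
    m:  (out,)       learned magnitude, one scalar per output channel
    """
    # The memory-hungry step: W0 + B @ A is a full dense (out, in)
    # matrix, as large as the base weight itself.
    W = W0 + B @ A
    # Normalize each row of the combined direction matrix...
    norm = W.norm(p=2, dim=1, keepdim=True)          # (out, 1)
    # ...then rescale by the learned per-channel magnitudes.
    W_adapted = m.unsqueeze(1) * (W / norm)          # (out, in)
    return x @ W_adapted.T
```

For a 4096x4096 layer in fp16, the dense `W0 + B @ A` intermediate alone costs ~32 MB per linear layer per forward pass (before autograd saves it for backward), which is exactly the materialization the optimizations above avoid.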