If you're using DoRA for high-rank fine-tuning on limited GPU memory, these optimizations make it practical: they cut peak memory usage by up to 7 GB and roughly double throughput, without changing the model's outputs.
DoRA (Weight-Decomposed Low-Rank Adaptation) is a fine-tuning method that adapts model weights by decomposing them into a learned magnitude and a low-rank directional update. The catch is that a naive forward pass materializes the full dense adapted weight matrix in order to normalize it, which consumes substantial GPU memory.
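To make the memory problem concrete, here is a minimal sketch of a naive DoRA forward pass. The function name and the row-wise normalization convention (magnitude per output channel, as in common implementations) are illustrative assumptions, not DoRA's reference code; the key point is that `W0 + B @ A` materializes a full dense matrix the size of the base weight.

```python
import torch

def naive_dora_forward(x, W0, A, B, m):
    """Hypothetical naive DoRA forward pass (illustrative sketch).

    x:  (batch, in)  input activations
    W0: (out, in)    frozen base weight
    A:  (r, in)      low-rank adapter factors
    B:  (out, r)
    m:  (out,)       learned magnitude, one scalar per output channel
    """
    # The memory-hungry step: W0 + B @ A is a full dense (out, in)
    # matrix, as large as the base weight itself.
    W = W0 + B @ A
    # Normalize each row of the combined direction matrix...
    norm = W.norm(p=2, dim=1, keepdim=True)          # (out, 1)
    # ...then rescale by the learned per-channel magnitudes.
    W_adapted = m.unsqueeze(1) * (W / norm)          # (out, in)
    return x @ W_adapted.T
```

For a 4096x4096 layer in fp16, the dense `W0 + B @ A` intermediate alone costs ~32 MB per linear layer per forward pass (before autograd saves it for backward), which is exactly the materialization the optimizations above avoid.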