What is LoRA?

Large language models (like ChatGPT) have billions of trainable parameters. LoRA freezes the pre-trained weights and adapts the model through a much smaller set of trainable parameters.

Markus Pope

Large language models (like ChatGPT) have billions of trainable parameters. LoRA fine-tunes large language models by freezing the original weights and introducing a small set of trainable parameters, enabling efficient adaptation to specific tasks.1

Reducing the number of trainable parameters lowers the computation and memory costs of fine-tuning. Other model adaptation techniques cut these costs at the expense of added inference latency, making model responses slower. LoRA instead applies low-rank updates to the weights of selected layers without modifying the full weight matrices, and the learned update can be merged back into the frozen weights before deployment, so it adds no inference latency.
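As a minimal sketch of this idea (illustrative shapes and names, using NumPy rather than a deep learning framework): the frozen weight matrix W is never touched, and the update is the product of two small matrices B and A whose inner dimension is the rank r.

```python
import numpy as np

rng = np.random.default_rng(0)

d_out, d_in, r = 64, 64, 4            # r << d: the low-rank bottleneck
alpha = 8                             # LoRA scaling hyperparameter

W = rng.normal(size=(d_out, d_in))        # frozen pre-trained weight
A = rng.normal(size=(r, d_in)) * 0.01     # trainable, small random init
B = np.zeros((d_out, r))                  # trainable, zero init => update starts at 0

def forward(x):
    # Base path plus the low-rank update; W itself is never modified.
    delta = (alpha / r) * (B @ A)
    return x @ (W + delta).T

x = rng.normal(size=(2, d_in))
y0 = forward(x)
# Because B is initialised to zero, the adapted model initially
# reproduces the frozen base model exactly.
assert np.allclose(y0, x @ W.T)
```

During training only A and B receive gradients; for a 64×64 layer that is 2·64·4 = 512 trainable values instead of 4,096.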

Multiple LoRA adapters can be trained for different purposes and swapped in and out of a model to change its specialization. Beyond natural language processing, LoRA's approach to efficient fine-tuning holds potential for applications in fields like computer vision and graph neural networks. Because LoRA's updates are low rank compared with a full fine-tune, the technique may struggle to capture very specific tasks and patterns.
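The adapter-swapping idea can be sketched as follows (a toy example with hypothetical task names and random matrices standing in for trained adapters): one frozen base weight is shared, and only the small (B, A) pair changes per task.

```python
import numpy as np

rng = np.random.default_rng(1)
d, r = 32, 4
W = rng.normal(size=(d, d))                  # frozen base weights, shared by all tasks

# Two independently "trained" adapters (random here, purely for illustration).
adapters = {
    "summarize": (rng.normal(size=(d, r)), rng.normal(size=(r, d))),
    "translate": (rng.normal(size=(d, r)), rng.normal(size=(r, d))),
}

def forward(x, task):
    B, A = adapters[task]        # swap in the task-specific low-rank pair
    return x @ (W + B @ A).T     # base stays frozen; only B and A differ per task

x = rng.normal(size=(1, d))
y_sum = forward(x, "summarize")
y_tr = forward(x, "translate")
assert not np.allclose(y_sum, y_tr)   # different adapters, different behaviour
```

Storing a new specialization therefore costs only the 2·d·r adapter parameters, not a full copy of W.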

[Figure: low-rank-adaptation-of-llm.png — low-rank adaptation of an LLM]

The major downside of full fine-tuning is that each fine-tuned model contains as many parameters as the original model, so every adapted copy is as large as the base model itself.
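To put numbers on this (illustrative sizes, not taken from the article): for a single d×d weight matrix, full fine-tuning updates d² parameters, while a rank-r LoRA pair trains only 2·d·r.

```python
d, r = 4096, 8                  # hypothetical hidden size and LoRA rank
full = d * d                    # parameters updated by full fine-tuning
lora = 2 * d * r                # parameters in the B (d x r) and A (r x d) pair
print(full, lora, full // lora) # 16777216 65536 256
```

For these sizes the LoRA pair is 256 times smaller per matrix, which is why many task-specific adapters can be stored and shipped alongside one frozen base model.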

Related Methods
Pre-LoRA (before 2021): adapter layers ("Adapters"), which add inference latency, and prompt-based fine-tuning such as prefix tuning, which is difficult to optimize.
Post-LoRA: IA³ (2022), OFT and BOFT (2023).


1 Hu, E. J., Shen, Y., Wallis, P., Allen-Zhu, Z., Li, Y., Wang, S., Wang, L., & Chen, W. (2021). LoRA: Low-rank adaptation of large language models. arXiv.
https://arxiv.org/abs/2106.09685
