LoRA (Low-Rank Adaptation)
Also known as: LoRA / Low-Rank Adaptation
A parameter-efficient fine-tuning method that freezes the original model weights and learns only small low-rank adapter matrices, drastically cutting memory and compute requirements.
Overview
Proposed by Microsoft researchers in 2021 (Hu et al.), LoRA approximates the weight update to a layer as the product of two low-rank matrices, so only ~0.1-1% of a model's total parameters are trained. The original weights stay frozen, which means multiple LoRA adapters can be hot-swapped on the same base model.
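The idea above can be sketched in a few lines of NumPy. This is a minimal illustration, not a real training loop: the function name `lora_forward` and all dimensions are our own choices, and the zero initialization of B (so the adapter starts as a no-op) follows the original paper.

```python
import numpy as np

rng = np.random.default_rng(0)
d_in, d_out, r, alpha = 16, 8, 2, 4   # toy dimensions; r is the LoRA rank

W = rng.standard_normal((d_out, d_in))     # frozen base weight, never updated
A = rng.standard_normal((r, d_in)) * 0.01  # trainable down-projection
B = np.zeros((d_out, r))                   # trainable up-projection, zero init

def lora_forward(x):
    # y = W x + (alpha / r) * B (A x): frozen output plus low-rank correction
    return W @ x + (alpha / r) * (B @ (A @ x))

x = rng.standard_normal(d_in)
# With B initialized to zero, the adapted model matches the frozen model exactly.
assert np.allclose(lora_forward(x), W @ x)

# Trainable parameters: r*(d_in + d_out) for LoRA vs d_in*d_out for full fine-tuning
print(r * (d_in + d_out), "LoRA params vs", d_in * d_out, "full")
```

Note the parameter count scales linearly with the rank r, which is why small ranks (4-64 in practice) keep adapters tiny.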
Practical benefits
GPU memory requirements are a fraction of full fine-tuning: a 7B-class model can be LoRA-tuned on a consumer RTX 3090 (24 GB), and with 4-bit quantization of the base model (QLoRA), considerably larger models become trainable on a single GPU. Adapter files are small (tens to hundreds of MB), making per-customer custom adapters practical to manage in production.