Capability · Framework — fine-tuning
Liger Kernel
Liger Kernel replaces default PyTorch layer implementations with fused Triton kernels that are drop-in compatible with Hugging Face Transformers and Axolotl. Teams report roughly 20-30% throughput gains and large activation-memory reductions that let them fit longer sequences on the same GPUs.
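To see why fusing ops helps, here is a toy pure-Python sketch (not Liger code) of a SwiGLU-style activation: the unfused version materialises an intermediate buffer at every step, while the fused version reads each element once and writes the result once, which is the memory-traffic saving fused Triton kernels exploit on GPU.

```python
import math

def swiglu_unfused(gate, up):
    # Step 1: sigmoid(gate) -> first temporary buffer
    sig = [1.0 / (1.0 + math.exp(-g)) for g in gate]
    # Step 2: gate * sigmoid(gate) (SiLU) -> second temporary buffer
    silu = [g * s for g, s in zip(gate, sig)]
    # Step 3: multiply by the up-projection -> output
    return [a * u for a, u in zip(silu, up)]

def swiglu_fused(gate, up):
    # One pass, no intermediate buffers: same arithmetic, less memory traffic.
    return [g * (1.0 / (1.0 + math.exp(-g))) * u for g, u in zip(gate, up)]
```

Both functions return identical values; only the number of intermediate allocations differs.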
Framework facts
- Category
- fine-tuning
- Language
- Python / Triton
- License
- BSD-2-Clause
- Repository
- https://github.com/linkedin/Liger-Kernel
Install
pip install liger-kernel
Quickstart
from liger_kernel.transformers import apply_liger_kernel_to_llama
from transformers import AutoModelForCausalLM, AutoTokenizer
apply_liger_kernel_to_llama()  # patches RMSNorm, RoPE, SwiGLU, etc.; call before loading the model
model = AutoModelForCausalLM.from_pretrained('meta-llama/Llama-3.1-8B')
# continue training as usual with HF Trainer / TRL / Axolotl
Alternatives
- Unsloth — similar memory-efficient fused kernels for fine-tuning
- FlashAttention-3 — fused attention kernels
- xFormers — memory-efficient attention and transformer building blocks
- Apex — NVIDIA's fused optimisers and layers
Frequently asked questions
Does Liger need code changes?
No. One call to apply_liger_kernel_to_* patches the target model class. The rest of your Hugging Face or Axolotl training script is unchanged.
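The mechanism behind that one call can be sketched in plain Python (toy stand-in classes, not Liger's actual internals): patching swaps a method on the model class itself, so every instance created afterwards picks up the fused implementation without any change to the training script.

```python
class RMSNorm:
    # Stand-in for a Transformers module class.
    def forward(self, x):
        return "eager rmsnorm"

def fused_forward(self, x):
    # Stand-in for a fused Triton kernel.
    return "fused rmsnorm"

def apply_patch():
    # One call rewires the class attribute; no per-instance changes needed.
    RMSNorm.forward = fused_forward

apply_patch()
layer = RMSNorm()       # instances created after patching use the fused op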
Which architectures are supported?
Llama, Mistral, Mixtral, Gemma, Qwen, Phi, and a growing list. Check the README for the current matrix.
Sources
- Liger Kernel — GitHub — accessed 2026-04-20
- Liger Kernel — paper — accessed 2026-04-20