DistilKitPlus is an open-source toolkit designed for high-performance knowledge distillation, specifically tailored for Large Language Models (LLMs). It enables you to create smaller, faster, and more efficient models that retain the capabilities of larger teacher models.
Inspired by arcee-ai/DistillKit, this project provides an easy-to-use framework with a particular focus on supporting offline distillation (using pre-computed teacher logits) and Parameter-Efficient Fine-Tuning (PEFT) methods like LoRA, making it suitable for environments with limited computational resources.
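To make the PEFT angle concrete, here is a minimal sketch of how a student model could be loaded in 4-bit via bitsandbytes and wrapped with LoRA adapters via peft. It follows standard Hugging Face transformers/peft conventions; the checkpoint name, rank, and target modules are placeholder assumptions, not the toolkit's defaults.

```python
# Illustrative sketch (not DistilKitPlus's own code): 4-bit student + LoRA adapters.
import torch
from transformers import AutoModelForCausalLM, BitsAndBytesConfig
from peft import LoraConfig, get_peft_model, prepare_model_for_kbit_training

bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,                      # quantize weights to 4-bit (bitsandbytes)
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
)

student = AutoModelForCausalLM.from_pretrained(
    "meta-llama/Llama-3.2-1B",              # placeholder student checkpoint
    quantization_config=bnb_config,
    device_map="auto",
)
student = prepare_model_for_kbit_training(student)

lora_config = LoraConfig(
    r=16,                                   # adapter rank (illustrative value)
    lora_alpha=32,
    target_modules=["q_proj", "v_proj"],    # which projections receive adapters
    task_type="CAUSAL_LM",
)

student = get_peft_model(student, lora_config)
student.print_trainable_parameters()        # only the LoRA weights are trainable
```

With this setup only the small LoRA matrices are updated during distillation, which is what keeps memory requirements low on constrained hardware.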
Key features include:

- Integration with the `peft` library for efficient fine-tuning techniques like LoRA.
- Quantization with `bitsandbytes` for reduced memory footprint and potentially faster execution.
- `accelerate` and `deepspeed` support for scaling training across multiple GPUs or nodes.
- Multiple distillation loss functions, including forward KL divergence (`fkl`), Universal Logit Distillation (`uld`), and Multi-Level Optimal Transport (`multi-ot`); see the sketch after this list.
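As a rough illustration of the offline setup, the following sketch shows how a forward-KL-style distillation loss could be computed against teacher logits loaded from disk. This is an assumption about how such a loss typically looks, not the toolkit's actual `fkl` implementation; the file name and temperature are placeholders.

```python
# Illustrative forward-KL distillation loss against pre-computed teacher logits.
# Not DistilKitPlus's actual loss code; shapes and names are assumptions.
import torch
import torch.nn.functional as F

def forward_kl_loss(student_logits: torch.Tensor,
                    teacher_logits: torch.Tensor,
                    temperature: float = 2.0) -> torch.Tensor:
    """KL(teacher || student) over the vocabulary, 'batchmean' reduction.

    Both tensors have shape [batch, seq_len, vocab_size]. The teacher logits
    are loaded from disk (pre-computed offline), so no teacher forward pass
    is needed at training time.
    """
    t_log_probs = F.log_softmax(teacher_logits / temperature, dim=-1)
    s_log_probs = F.log_softmax(student_logits / temperature, dim=-1)
    kl = F.kl_div(s_log_probs, t_log_probs, log_target=True, reduction="batchmean")
    # Scaling by T^2 keeps gradient magnitudes comparable across temperatures.
    return kl * (temperature ** 2)

# Usage sketch:
# teacher_logits = torch.load("teacher_logits.pt")          # pre-computed offline
# loss = forward_kl_loss(student(input_ids).logits, teacher_logits)
```

Because the teacher never has to be held in memory during training, this offline pattern is what allows large teachers to distill into small students on modest hardware.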