SpinQuant: LLM Quantization with Learnable Rotation Matrices
Author丨Tech Beast
Editor丨Jishi Platform

Jishi Introduction

SpinQuant learns rotation matrices that optimize the accuracy of the quantized network, quantizing weights, activations, and the KV cache to 4 bits. On the LLaMA-2 7B model, SpinQuant reduces the accuracy gap on zero-shot inference tasks to only 2.9 points compared to the full-precision …
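To make the role of the rotation concrete, here is a minimal pure-Python sketch of the underlying idea: multiplying the weights by an orthogonal matrix R and the activations by its transpose leaves the layer output unchanged, while spreading an activation outlier across channels before 4-bit quantization. Everything in it is an assumption chosen for illustration, not SpinQuant's implementation: the fixed normalized Hadamard matrix stands in for a learned rotation, the symmetric per-tensor quantizer is a generic one, and the toy values of `x` and `W` are hand-picked.

```python
import math

def matvec(M, v):
    """Multiply matrix M (list of rows) by vector v."""
    return [sum(m * x for m, x in zip(row, v)) for row in M]

def matmul(A, B):
    """Multiply matrix A by matrix B."""
    cols = list(zip(*B))
    return [[sum(a * b for a, b in zip(row, col)) for col in cols] for row in A]

def transpose(M):
    return [list(col) for col in zip(*M)]

def hadamard(n):
    """Normalized Sylvester-Hadamard matrix (n a power of two); it is orthogonal."""
    H = [[1.0]]
    while len(H) < n:
        H = [row + row for row in H] + [row + [-v for v in row] for row in H]
    scale = 1.0 / math.sqrt(n)
    return [[v * scale for v in row] for row in H]

def quantize_4bit(v):
    """Symmetric per-tensor fake quantization to signed 4-bit levels [-8, 7]."""
    max_abs = max(abs(x) for x in v)
    scale = max_abs / 7.0 if max_abs > 0 else 1.0
    return [max(-8, min(7, round(x / scale))) * scale for x in v]

def l2_error(a, b):
    return math.sqrt(sum((p - q) ** 2 for p, q in zip(a, b)))

# Assumption: a fixed Hadamard rotation, used here in place of a learned one.
R = hadamard(4)

# Hypothetical activation vector with one outlier channel, and a toy weight matrix.
x = [1.0, -0.8, 0.6, 8.0]
W = [[0.5, -1.0, 0.25, 2.0],
     [1.5, 0.0, -0.5, 1.0]]

# Rotate activations by R^T and fold R into the weights: (W R)(R^T x) = W x.
x_rot = matvec(transpose(R), x)
W_rot = matmul(W, R)

y = matvec(W, x)
y_rot = matvec(W_rot, x_rot)

# Quantization error of the activations before and after rotation.
err_plain = l2_error(x, quantize_4bit(x))
err_rot = l2_error(x_rot, quantize_4bit(x_rot))

print("output unchanged:", all(math.isclose(a, b, abs_tol=1e-9) for a, b in zip(y, y_rot)))
print("max |x| before/after rotation:", max(map(abs, x)), max(map(abs, x_rot)))
print("4-bit error before/after rotation:", err_plain, err_rot)
```

In this hand-picked example the rotated activations have a smaller maximum magnitude and a smaller 4-bit quantization error, while the layer output is identical up to floating-point rounding. That outcome is not guaranteed for an arbitrary input or an arbitrary rotation.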