Lightning Attention-2: Unlimited Sequence Length, Constant Computational Cost, Higher Modeling Accuracy
Lightning Attention-2 is a new linear attention mechanism that keeps the training and inference cost of long sequences on par with that of a 1K-token sequence. Sequence-length limits in large language models severely restrict their applications in artificial intelligence, such as multi-turn dialogue and long-text understanding.
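To see why linear attention can have per-token cost independent of sequence length, here is a minimal NumPy sketch of the generic causal linear-attention recurrence that Lightning Attention-2 builds on. This is an illustrative sketch, not the paper's implementation: the function name, shapes, and the absence of any feature map or normalization are assumptions for brevity.

```python
import numpy as np

def linear_attention_recurrent(Q, K, V):
    """Causal linear attention computed as a recurrence (illustrative sketch).

    The running state S = sum_j k_j v_j^T is a fixed (d, d_v) matrix,
    so each step costs O(d * d_v) regardless of how long the sequence is --
    the property that lets cost stay constant per token.
    """
    n, d = Q.shape
    S = np.zeros((d, V.shape[1]))    # running KV state; size does not grow with n
    out = np.empty((n, V.shape[1]))
    for t in range(n):
        S += np.outer(K[t], V[t])    # accumulate k_t v_t^T into the state
        out[t] = Q[t] @ S            # o_t = q_t^T * S attends to all past tokens
    return out

# Example: per-token work is identical for n = 1_000 and n = 100_000.
Q = np.random.randn(8, 4); K = np.random.randn(8, 4); V = np.random.randn(8, 4)
print(linear_attention_recurrent(Q, K, V).shape)  # (8, 4)
```

A naive loop like this is memory-bound on real hardware; Lightning Attention-2's contribution is making this recurrence fast in practice, which the full article describes.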