Is the Transformer Indispensable? The Latest Review of the State Space Model (SSM)
In the post-deep-learning era, the Transformer architecture has demonstrated powerful performance in pre-trained large models and across a wide range of downstream tasks. However, its substantial computational demands have deterred many researchers. To reduce the complexity of attention, considerable effort has gone into designing more efficient alternatives. Among these, the State Space Model (SSM) has attracted particular attention.
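As background (a standard formulation in this literature, not spelled out in the excerpt above), an SSM maps an input sequence $x(t)$ to an output $y(t)$ through a latent state $h(t)$:

$$h'(t) = \mathbf{A}\,h(t) + \mathbf{B}\,x(t), \qquad y(t) = \mathbf{C}\,h(t) + \mathbf{D}\,x(t)$$

Once discretized, this recurrence can be evaluated in time linear in the sequence length, which is the key contrast with the quadratic cost of self-attention that motivates this line of work.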