First Mamba+Transformer Multimodal Large Model
Source: Algorithm Advancement. This article is approximately 4,100 words, with a suggested reading time of 8 minutes. LongLLaVA performs excellently in long-context multimodal understanding. The authors of this article come from The Chinese University of Hong Kong, Shenzhen, and the Shenzhen Big Data Research Institute. The first authors are PhD student Wang Xidong and …