Conformer: A Hybrid CNN-Transformer Model for Improved Feature Representation
0 Introduction

In Convolutional Neural Networks (CNNs), convolution operations excel at extracting local features but are limited in capturing global feature representations. In Vision Transformers, cascaded self-attention modules can capture long-range feature dependencies but tend to overlook local feature details. …
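The contrast between local convolution and global self-attention can be illustrated with a toy sketch. This is not the Conformer architecture: it assumes a 1D signal of scalar tokens, a fixed 3-tap kernel, and a single attention head with identity query/key/value projections, all chosen only for illustration. It perturbs one distant input and checks which output at position 0 reacts.

```python
import math

def conv1d(x, kernel):
    """Same-size 1D convolution (cross-correlation) with zero padding."""
    k = len(kernel) // 2
    padded = [0.0] * k + list(x) + [0.0] * k
    return [sum(w * padded[i + j] for j, w in enumerate(kernel))
            for i in range(len(x))]

def self_attention(x):
    """Single-head self-attention on scalar tokens.

    Assumption for illustration: Q, K and V projections are the identity.
    """
    out = []
    for q in x:
        scores = [q * k for k in x]           # dot-product scores against every token
        m = max(scores)                       # subtract max for a stable softmax
        weights = [math.exp(s - m) for s in scores]
        total = sum(weights)
        out.append(sum(w * v for w, v in zip(weights, x)) / total)
    return out

x = [1.0, 0.5, -0.3, 0.8, 0.2, -0.6, 0.4, 0.9]
x_far = list(x)
x_far[-1] += 1.0                              # perturb only the last position

kernel = [0.25, 0.5, 0.25]                    # hypothetical smoothing kernel

# Change in the output at position 0 caused by the distant perturbation.
conv_delta = abs(conv1d(x, kernel)[0] - conv1d(x_far, kernel)[0])
attn_delta = abs(self_attention(x)[0] - self_attention(x_far)[0])

print("conv output at position 0 changed:", conv_delta > 1e-12)
print("attention output at position 0 changed:", attn_delta > 1e-12)
```

In this sketch the convolution output at position 0 depends only on its 3-element neighborhood, so the distant change does not reach it, while the attention output at position 0 mixes every token and therefore shifts.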