Discussion on Absolute, Relative, and Rotational Position Encoding in Transformers

Reprinted from Zhihu, author: Yao Yuan. Link: https://zhuanlan.zhihu.com/p/17311602488

1. Introduction

The attention mechanism in the Transformer [1] effectively models the correlations between tokens and has brought significant performance improvements on many tasks. However, the attention mechanism itself has no notion of token position: it treats the input as an unordered set, so some form of position encoding is needed for the model to perceive word order.
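To make the motivating claim concrete, here is a minimal sketch (not part of the original article) showing that plain self-attention is permutation-equivariant: permuting the input tokens merely permutes the output rows, so without position encoding the model cannot distinguish different orderings of the same tokens.

```python
# Minimal demonstration that self-attention without position encoding
# is permutation-equivariant. All names and shapes here are illustrative.
import torch
import torch.nn.functional as F

torch.manual_seed(0)
n, d = 5, 8                       # 5 tokens, model dimension d
x = torch.randn(n, d)             # token embeddings, no position info
Wq, Wk, Wv = (torch.randn(d, d) for _ in range(3))

def self_attention(x):
    q, k, v = x @ Wq, x @ Wk, x @ Wv
    scores = q @ k.T / d ** 0.5   # scaled dot-product attention
    return F.softmax(scores, dim=-1) @ v

perm = torch.randperm(n)
out = self_attention(x)           # original order
out_perm = self_attention(x[perm])  # shuffled token order

# The shuffled input produces the same rows, just reordered,
# i.e. attention alone sees a "bag of tokens":
print(torch.allclose(out[perm], out_perm, atol=1e-6))  # True
```

Absolute, relative, and rotary position encodings are three different ways of breaking this symmetry, which is what the rest of the article discusses.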