Understanding and Improving Transformer From a Multi-Particle Dynamic System Point of View

Publication
arXiv:1906.02762