引用Merrill, W. and Sabharwal, A. The parallelism tradeoff: Limitations of log-precision transformers. Transactions of the Association for Computational Linguistics.Feng, Guhao, et al. "Towards revealing the mystery behind chain of thought: a theoretical perspective." Advances in Neural Information Processing Systems 36 (2024).