Accelerating Large Language Models through Partially Linear Feed-Forward Network
Published in EuroSys '26, 2025
Recommended citation: Hu G, Wang Z, Wei J, et al. Accelerating Large Language Models through Partially Linear Feed-Forward Network[J]. arXiv preprint arXiv:2501.10054, 2025.
