Accelerating Large Language Models through Partially Linear Feed-Forward Network

Published in EuroSys '26, 2025

Recommended citation: Hu G, Wang Z, Wei J, et al. Accelerating Large Language Models through Partially Linear Feed-Forward Network[J]. arXiv preprint arXiv:2501.10054, 2025.