Publications

You can also find my articles on my Google Scholar profile.

Conference Papers


Accelerating Large Language Models through Partially Linear Feed-Forward Network

Published in EuroSys 26, 2025

This paper presents a novel approach to accelerating large language models with partially linear feed-forward networks, achieving significant speedup while maintaining model performance.

Recommended citation: Hu G, Wang Z, Wei J, et al. Accelerating Large Language Models through Partially Linear Feed-Forward Network[J]. arXiv preprint arXiv:2501.10054, 2025.