A Survey: Collaborative Hardware and Software Design in the Era of Large Language Models

Abstract

The rapid development of large language models (LLMs) has significantly transformed the field of artificial intelligence: these models demonstrate remarkable capabilities in natural language processing and are moving toward multi-modal functionality. They are increasingly integrated into diverse applications, impacting both research and industry. However, their development and deployment present substantial challenges, including the need for extensive computational resources, high energy consumption, and complex software optimizations. Unlike traditional deep learning systems, LLMs require distinct optimization strategies for training and inference that focus on system-level efficiency. This paper surveys hardware and software co-design approaches specifically tailored to the unique characteristics and constraints of LLMs, analyzing their challenges and impacts on hardware and algorithm research and exploring algorithm optimization, hardware design, and system-level innovations. It aims to provide a comprehensive understanding of the trade-offs and considerations in LLM-centric computing systems, guiding future advancements in AI. Finally, we summarize existing efforts in this space and outline future directions toward realizing production-grade co-design methodologies for the next generation of large language models and AI systems.

 

Methodology
[Figure: Survey on Hardware and Software Co-design for LLMs]

Results
[Figure: Summary of Accelerators for LLM Inference]

Citation

@misc{guo2024surveycollaborativehardwaresoftware,
      title={A Survey: Collaborative Hardware and Software Design in the Era of Large Language Models},
      author={Cong Guo and Feng Cheng and Zhixu Du and James Kiessling and Jonathan Ku and Shiyu Li and Ziru Li and Mingyuan Ma and Tergel Molom-Ochir and Benjamin Morris and Haoxuan Shan and Jingwei Sun and Yitu Wang and Chiyue Wei and Xueying Wu and Yuhao Wu and Hao Frank Yang and Jingyang Zhang and Junyao Zhang and Qilin Zheng and Guanglei Zhou and Hai Li and Yiran Chen},
      year={2024},
      eprint={2410.07265},
      archivePrefix={arXiv},
      primaryClass={cs.AR},
      url={https://arxiv.org/abs/2410.07265},
}