Introduction

The project achieved breakthroughs in key technologies such as distributed training and inference of TB-level machine learning models, and large-scale application deployment. The Angel machine learning platform, independently developed in China from underlying hardware to critical software technologies, has significantly advanced the development of physical industries and the digital economy. It has enhanced societal efficiency and generated substantial economic benefits.

Key Technologies of the Angel Platform Overcoming Challenges in Ultra-Large-Scale Machine Learning

The Angel machine learning platform addresses the challenges of distributed training and inference for terabyte-scale models, as well as the difficulties of application deployment. It has achieved breakthroughs in three key areas: framework performance, network interconnection, and platform scalability. The platform has developed a high-performance distributed framework, introducing technologies such as unified management of GPU memory and system memory, operator optimization, and parallel computing. Its training performance is 2.6 times that of industry standards. By leveraging shared GPU memory and performance optimizations, inference throughput has been increased, with inference performance reaching 2.3 times that of industry standards. These advancements significantly enhance the efficiency of model training and inference. Moreover, through the development of end-to-end hardware solutions and efficient software algorithms, a large-scale RDMA high-speed network comprising tens of thousands of GPUs has been constructed. This network provides high-speed interconnection for thousands of computing nodes, improving communication performance by 30% while reducing costs by 70%, which effectively meets the computational power demands of large model training and inference. Additionally, the platform achieves linear scalability for a single task running on more than ten thousand GPU cards. It reduces communication overhead by 80%, achieves a multi-machine multi-GPU acceleration ratio of 99%, reaches a GPU utilization rate of 62%, which surpasses industry standards, and ensures a task stability rate of 99.5% for large models.
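To make the "multi-machine multi-GPU acceleration ratio" concrete, the following is a minimal sketch that assumes the figure refers to the common scaling-efficiency definition: measured throughput on N GPUs divided by N times the single-GPU throughput. The text above does not define the metric, so this reading is an assumption, and the function name and throughput numbers are hypothetical illustrations rather than Angel measurements.

```python
def scaling_efficiency(throughput_n: float, throughput_1: float, n_gpus: int) -> float:
    """Return the fraction of ideal linear speedup achieved on n_gpus devices.

    Assumed definition (not taken from the source text):
        efficiency = throughput_n / (n_gpus * throughput_1)
    """
    if n_gpus <= 0 or throughput_1 <= 0:
        raise ValueError("n_gpus and throughput_1 must be positive")
    return throughput_n / (n_gpus * throughput_1)


# Hypothetical numbers, chosen only to illustrate the arithmetic.
single_gpu = 100.0   # samples per second on one GPU
eight_gpus = 792.0   # samples per second on eight GPUs

efficiency = scaling_efficiency(eight_gpus, single_gpu, 8)
print(f"{efficiency:.0%}")  # prints "99%"
```

Under this reading, a ratio of 99% would mean that adding GPUs loses only about 1% of the ideal linear speedup to communication and synchronization.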

Driving Industrial Digitalization and Promoting the Development of the Digital Economy

The Angel machine learning platform developed by Tencent has been widely applied across various sectors. In terms of supporting industrial development, the Angel platform is delivered through Tencent Cloud to serve ecosystem partners such as Guangzhou Metro and CATL, aiding numerous industries in their digital transformation. Regarding the promotion of digital economy development, Tencent's next-generation advertising system based on the Angel platform has already been employed by leading enterprises such as JD.com, Vipshop, and Alibaba. In enhancing societal efficiency, the Angel platform supports Tencent Meeting in providing a more efficient and seamless intelligent online conferencing experience. During the pandemic, it helped companies resume work and production, with global online conference participation reaching record levels. In the realm of public welfare, the platform leverages its graph network model to effectively combat illicit activities and empowers cultural public welfare initiatives with AI. It also contributes to advanced research areas like drug development and embodied intelligence. Over the past three years, this project has generated direct revenue amounting to 18.238 billion RMB. It not only drives industrial upgrades but also significantly enhances societal operational efficiency, creating substantial economic and social benefits.

Achievements Recognized Internationally, Leading Technological Advancement and Driving Industry Innovation

The project has yielded significant results in advancing technology and leading industry innovation. Its technological achievements have frequently topped authoritative international rankings, winning over 20 international competition championships and resulting in the publication of 74 international academic papers. Angel is China's first top-tier AI open-source project to graduate from the Linux Foundation, earning the title of 'Most Popular Chinese Open Source Software' in 2019. The platform has attracted numerous external developers and enterprise users, promoting widespread technological application and industry collaboration. Furthermore, Tencent's self-developed full-stack Hunyuan large model, built on the Angel platform, is among the first from leading Chinese AI research enterprises to adopt an advanced Mixture of Experts (MoE) architecture and be deployed into production at a scale exceeding one trillion parameters. Third-party evaluations indicate that Tencent's Hunyuan ranks in the top tier of LLMs in China, with overall performance surpassing the international average for LLMs.

Project Achievements Recognized by Industry Experts and Authoritative Institutions

The project achievement titled "Key Technologies and Applications of the Angel Machine Learning Platform for Large-scale Data" won the first prize in Scientific and Technological Progress awarded by the China Institute of Electronics. Additionally, "Elastic Cloud Network for One Billion Users" won the first prize in Scientific and Technological Progress awarded by Shenzhen. The Angel platform has also supported the construction of a new-generation artificial intelligence open innovation platform for Chinese medical imaging. Tencent, serving as the deputy leader of China's Artificial Intelligence Standardization Group, has participated in drafting nine national, industry, and IEEE standards.


The World Internet Conference (WIC) was established as an international organization on July 12, 2022, headquartered in Beijing, China. It was jointly initiated by Global System for Mobile Communication Association (GSMA), National Computer Network Emergency Response Technical Team/Coordination Center of China (CNCERT), China Internet Network Information Center (CNNIC), Alibaba Group, Tencent, and Zhijiang Lab.