China's AI Chips: Huawei Ascend vs NVIDIA in a 30B RMB Market
China's AI chip market reached approximately 30 billion RMB in 2025, driven by massive demand for training and inference hardware from China's generative AI boom. Huawei's Ascend 910C became the primary domestic alternative to NVIDIA's A100/H100, with Chinese tech giants including Baidu, Alibaba, Tencent, and ByteDance all deploying Ascend clusters. US export controls on advanced AI chips forced Chinese companies to stockpile NVIDIA GPUs while accelerating domestic alternatives. CAMBRICON's MLU370 and Moore Threads' MTT S4000 provide inference solutions, though training capability still lags NVIDIA by 2-3 years. China's AI chip self-sufficiency rate reached approximately 25% for inference but only 5% for large-scale training.
TL;DR
China AI chip market 30B RMB. Huawei Ascend 910C is the main NVIDIA alternative. Domestic self-sufficiency: 25% inference, 5% training. Chinese tech giants all deploying Ascend clusters. US sanctions accelerated domestic development but created training gap.
Key Insights
Huawei Ascend 910C
Huawei's Ascend 910C offers the closest domestic performance to NVIDIA A100 for AI training workloads. Chinese cloud providers (Huawei Cloud, Baidu AI Cloud) deployed over 100,000 Ascend 910C chips in 2025. Performance for large language model training reaches approximately 80% of A100 efficiency with software optimization. Huawei's CANN software stack matured significantly, supporting PyTorch and MindSpore frameworks.
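Ascend's PyTorch support is delivered as a plugin, `torch_npu`, rather than being built into upstream PyTorch. A minimal device-selection sketch, assuming only that `torch_npu` is importable on a properly configured Ascend host (with a CPU fallback elsewhere), might look like:

```python
import importlib.util

def pick_device() -> str:
    """Return an Ascend NPU device string when Huawei's torch_npu
    plugin is installed, otherwise fall back to CPU.

    "npu:0" follows torch_npu's device-naming convention; on machines
    without the Ascend CANN toolkit the plugin is absent, so code that
    branches on this stays portable.
    """
    if importlib.util.find_spec("torch_npu") is not None:
        return "npu:0"
    return "cpu"

print(pick_device())
```

The same pattern is how much of the "adapter" support described above works in practice: the model code stays ordinary PyTorch, and only the device string changes.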
US Sanctions Impact
US export controls have barred NVIDIA's A100/H100, and later the L40S, from China since October 2022. Chinese companies stockpiled approximately 100,000 high-end NVIDIA GPUs before and after the sanctions took effect. NVIDIA created China-specific H20 and L20 chips that comply with the export rules at reduced performance. Paradoxically, the sanctions accelerated Chinese domestic chip development by creating guaranteed demand.
CAMBRICON MLU Series
CAMBRICON's MLU370 achieved mass deployment in Chinese data centers for AI inference workloads, particularly for Baidu's ERNIE and other LLMs. The company shipped over 500,000 AI accelerator chips cumulatively. CAMBRICON's next-generation MLU590 targets training-level performance but faces software ecosystem challenges compared to NVIDIA CUDA.
Inference Market Growth
China achieved approximately 25% self-sufficiency in AI inference chips, with domestic options from Huawei (Ascend 310), CAMBRICON (MLU), and startup Enflame (S20). Inference chips are less demanding than training chips, making domestic substitution more feasible. Baidu reported running 70% of inference workloads for its ERNIE Bot service on domestic chips.
Side-by-Side Comparison
| AI Chip | Company | Process Node | FP16 Performance | Application | Availability |
|---|---|---|---|---|---|
| Ascend 910C | Huawei | 7nm (SMIC) | ~310 TFLOPS | Training | Mass production |
| Ascend 310P | Huawei | 12nm | ~70 TFLOPS | Inference | Mass production |
| MLU370 | CAMBRICON | 12nm | ~256 TFLOPS | Training/Inference | Mass production |
| MLU590 | CAMBRICON | 7nm | ~400 TFLOPS | Training | Sampling |
| MTT S4000 | Moore Threads | 7nm | ~200 TFLOPS | Inference | Limited |
| BI-V150 | Biren Tech | 7nm | ~300 TFLOPS | Training | Limited |
| Enflame S50 | Enflame | 7nm | ~280 TFLOPS | Training | Sampling |
| NVIDIA H20 | NVIDIA | 4nm | ~148 TFLOPS | Training (limited) | Available (China-only) |
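To put the peak numbers in context, a quick back-of-envelope script can normalize each chip against the NVIDIA A100, whose published dense FP16 tensor-core peak is ~312 TFLOPS (the chip figures below are the approximate values from the table):

```python
# Approximate peak FP16 throughput in TFLOPS, from the table above.
A100_TFLOPS = 312  # NVIDIA A100 dense FP16 tensor-core peak (published spec)

chips = {
    "Ascend 910C": 310,
    "MLU590": 400,
    "BI-V150": 300,
    "Enflame S50": 280,
    "MTT S4000": 200,
    "NVIDIA H20": 148,
}

for name, tflops in sorted(chips.items(), key=lambda kv: -kv[1]):
    print(f"{name:12s} {tflops:4d} TFLOPS = {tflops / A100_TFLOPS:5.0%} of A100")
```

Note that peak TFLOPS is not delivered training throughput: the article's ~80% efficiency figure for the 910C reflects software maturity and interconnect, not just silicon, which is why a chip with near-identical peak numbers can still train more slowly in practice.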
Frequently Asked Questions
Can Huawei Ascend chips fully replace NVIDIA GPUs for AI training?

Not yet, though the gap is narrowing for specific workloads:

- Performance gap: the Ascend 910C achieves approximately 80% of NVIDIA A100 performance on optimized workloads, but trails the H100 by a wider margin of approximately 50%.
- Software ecosystem: this remains the biggest challenge. NVIDIA's CUDA has a 15+ year developer-ecosystem head start, while Huawei's CANN and MindSpore are relatively new with a smaller developer community.
- Framework support: Ascend supports PyTorch and TensorFlow through adapters, with some performance overhead; native MindSpore achieves the best performance but requires code migration.
- Cluster scaling: Huawei has demonstrated 10,000+ chip clusters comparable to NVIDIA's DGX SuperPOD for Baidu's ERNIE training, but debugging and monitoring tools lag behind NVIDIA's.
- Practical deployment: major Chinese companies have shifted 40-70% of new AI workloads to Ascend, while critical training runs still rely on stockpiled NVIDIA GPUs.

Most industry estimates suggest Huawei will reach rough training parity by 2027-2028.
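The practical cost of the efficiency gap can be sketched with the standard compute estimate for dense transformer training, FLOPs ≈ 6·N·D (N parameters, D training tokens). The scenario below is purely illustrative: model size, token count, cluster size, and the 45% utilization baseline are assumptions, not measured values from the article; only the ~80% relative-efficiency factor comes from the text above.

```python
def training_days(params: float, tokens: float, n_chips: int,
                  peak_tflops: float, utilization: float) -> float:
    """Rough wall-clock estimate for a dense-transformer training run.

    Uses the common FLOPs ~= 6 * params * tokens approximation;
    `utilization` is the assumed fraction of peak throughput achieved.
    """
    total_flops = 6 * params * tokens
    flops_per_sec = n_chips * peak_tflops * 1e12 * utilization
    return total_flops / flops_per_sec / 86_400  # seconds -> days

# Illustrative scenario (assumed): 70B params, 2T tokens, 8,192 chips.
a100 = training_days(70e9, 2e12, 8192, 312, 0.45)
ascend = training_days(70e9, 2e12, 8192, 310, 0.45 * 0.8)  # ~80% of A100 efficiency
print(f"A100 cluster:   {a100:5.1f} days")
print(f"Ascend cluster: {ascend:5.1f} days")
```

Since peak TFLOPS are nearly identical, the entire schedule difference comes from the utilization factor, which is why the article's software-ecosystem points dominate the hardware comparison.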