At the AI Action Summit 2025 in Paris, Chinese AI company DeepSeek showcased its remarkable ability to thrive despite stringent US chip export restrictions. With limited access to cutting-edge AI chips like NVIDIA's H100, DeepSeek has pioneered innovative techniques to maximize efficiency and maintain high performance.
DeepSeek's success hinges on optimizing operations under hardware constraints. By minimizing wasted computation and keeping its GPUs busy, the company extracts as much useful work as possible from the hardware it has. This approach is exemplified in its Mixture of Experts (MoE) strategy, which divides the model into specialized experts and activates only those relevant to each input. Because only a fraction of the model's parameters run for any given input, this method significantly reduces computation compared with traditional dense models, in which every parameter is used every time.
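The routing idea behind MoE can be illustrated with a minimal sketch. It is not DeepSeek's implementation: the four toy experts, the fixed gate scores, and the function names `route_top_k` and `moe_forward` are all hypothetical, and a real model would compute gate scores with a learned router.

```python
import math


def softmax(scores):
    """Convert raw gate scores into probabilities that sum to 1."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def route_top_k(gate_scores, k):
    """Return (expert_index, weight) pairs for the k highest-scoring experts."""
    probs = softmax(gate_scores)
    top = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)[:k]
    norm = sum(probs[i] for i in top)  # renormalize over the chosen experts
    return [(i, probs[i] / norm) for i in top]


def moe_forward(x, experts, gate_scores, k=2):
    """Combine the outputs of the selected experts; the others are never called."""
    out = 0.0
    for idx, weight in route_top_k(gate_scores, k):
        out += weight * experts[idx](x)
    return out


# Hypothetical toy experts: each one simply scales its input.
experts = [lambda x, s=s: s * x for s in (1.0, 2.0, 3.0, 4.0)]
gate_scores = [0.1, 2.0, -1.0, 1.5]  # assumed router output for one input

selected = route_top_k(gate_scores, k=2)
y = moe_forward(3.0, experts, gate_scores, k=2)
print("selected experts:", [i for i, _ in selected])
print("output:", y)
```

With `k=2`, only two of the four experts run for this input, which is where the compute saving comes from. The saving grows as the number of experts increases while `k` stays small.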
Another breakthrough is DeepSeekMLA (Multi-head Latent Attention), a technique that cuts memory usage by compressing the key-value cache, the stored context the model consults when generating each new token, into a compact latent representation. This allows the model to handle long contexts with far less memory without sacrificing performance. Additionally, DeepSeek employs precision optimization, storing parameters in FP8 instead of higher-precision formats like BF16 or FP32. FP8 uses one byte per value, versus two for BF16 and four for FP32, which reduces memory requirements while largely maintaining accuracy, akin to replacing high-resolution images with detailed sketches.
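The memory savings described above come down to simple arithmetic, sketched below. The parameter count, head count, head dimension, and latent dimension are hypothetical values chosen for illustration, not DeepSeek's published configuration. The cache comparison also assumes that a single latent vector per token replaces the full per-head keys and values.

```python
# Storage size of one value in each numeric format.
BYTES_PER_PARAM = {"FP32": 4, "BF16": 2, "FP8": 1}


def weight_memory_gib(num_params, fmt):
    """Memory needed to store num_params weights in the given format, in GiB."""
    return num_params * BYTES_PER_PARAM[fmt] / 2**30


def full_kv_values_per_token(n_heads, head_dim):
    """Values cached per token per layer by standard multi-head attention."""
    return 2 * n_heads * head_dim  # one key and one value vector per head


def latent_kv_values_per_token(latent_dim):
    """Values cached per token per layer if only a latent vector is stored."""
    return latent_dim


NUM_PARAMS = 7_000_000_000  # hypothetical model size
for fmt in ("FP32", "BF16", "FP8"):
    print(f"{fmt}: {weight_memory_gib(NUM_PARAMS, fmt):.1f} GiB")

# Hypothetical attention shape: 32 heads of dimension 128, latent size 512.
full = full_kv_values_per_token(n_heads=32, head_dim=128)
latent = latent_kv_values_per_token(latent_dim=512)
print(f"cache values per token: {full} full vs {latent} latent "
      f"({full // latent}x smaller)")
```

Moving from BF16 to FP8 halves weight storage, and moving from FP32 quarters it. The cache ratio depends entirely on the dimensions chosen.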
Faced with the limitations of NVIDIA's H800 GPUs, the export-compliant variant of the H100, DeepSeek engineers took a bold step: rather than relying solely on standard CUDA code, they wrote performance-critical parts in PTX (Parallel Thread Execution), NVIDIA's low-level intermediate instruction set. This gave them more granular control over how GPU work is scheduled, enhancing performance despite the H800's reduced cross-GPU communication bandwidth. The approach demonstrates that even with restricted hardware, high efficiency in AI training is achievable.
DeepSeek's advancements not only highlight the resilience of Chinese tech companies under export controls but also signal a potential shift in the global AI industry's landscape. As more companies explore alternatives to traditional GPU ecosystems, the future of AI development may see increased diversification and competition.
Reference(s):
Catalyst DeepSeek: The innovation behind its cost efficiency
cgtn.com