DeepSeek's R1: A New Contender in AI
An artificial intelligence lab based in Hangzhou, east China's Zhejiang Province, has set Silicon Valley abuzz with the release of its state-of-the-art model. Trained at a fraction of the cost of mainstream models like OpenAI's ChatGPT, DeepSeek's R1 model represents a significant breakthrough in AI development.
Founded by hedge fund manager Liang Wenfeng, DeepSeek unveiled its R1 model last Monday, accompanied by a detailed paper outlining how to train a model through large-scale reinforcement learning (RL) without relying on supervised fine-tuning (SFT) as a preliminary step.
Within days, DeepSeek's app soared to the top of the iPhone free app charts in both China and the United States, surpassing the once-dominant ChatGPT. The rapid ascent of DeepSeek has ignited a heated debate in Silicon Valley about whether better-resourced U.S. AI companies, including Meta and OpenAI, can maintain their technological advantage.
Challenges to U.S. Technological Dominance
Many AI experts online have described the breakthrough as undermining the United States' efforts to curb China's high-tech ambitions. DeepSeek's ability to achieve such results with limited resources challenges the notion that only well-funded Western companies can lead in AI innovation.
"DeepSeek's approach demonstrates that innovation isn't solely dependent on massive funding," said an unnamed AI researcher. "It's about novel ideas and efficient methodologies."
Liang Wenfeng: The Man Behind the Innovation
Meanwhile, Liang has become a focal point of discussion in China. Last week, he was invited to a symposium in Beijing, where Chinese Premier Li Qiang sought opinions and suggestions from experts, entrepreneurs, and representatives across various sectors, including education, science, culture, health, and sports, on a draft government work report.
Liang graduated from Zhejiang University with a degree in artificial intelligence. In 2016, he co-founded the quantitative hedge fund High-Flyer, which quickly gained recognition for its innovative use of AI-driven trading strategies. By 2021, High-Flyer had fully integrated AI into its operations, using machine learning models to predict market trends and make data-driven investment decisions.
DeepSeek's Commitment to 'Long-Termism'
In May 2023, Liang founded DeepSeek, aiming to advance the field of artificial general intelligence (AGI). Unlike traditional for-profit ventures, DeepSeek was envisioned as a platform for long-term, fundamental research, where curiosity-driven exploration could drive meaningful advancements in AI.
For Liang, DeepSeek is more like a side project or hobby, driven by deep curiosity and a commitment to foundational research. He acknowledges that basic research often yields low immediate returns on investment, yet he is captivated by the challenge of exploring complex fields like finance and the potential of AGI.
Liang's focus is on understanding the essence of human intelligence and the processes that underlie it, believing that such exploration is crucial despite the lack of immediate commercial incentives. He has remained low-profile, granting interviews only to Anyong, a sub-brand of China's commercial tech media 36Kr, in 2023 and 2024.
A Glimpse into the Future
The emergence of DeepSeek and its R1 model signals a shift in the global AI landscape. As startups like DeepSeek demonstrate significant advancements with fewer resources, the competition in AI innovation is intensifying.
Whether DeepSeek's approach will redefine the methodologies in AI development remains to be seen. However, the global AI community is watching closely as Liang Wenfeng and his team continue to challenge existing paradigms and push the boundaries of what's possible.
Reference(s):
Behind China's rising AI startup DeepSeek: Who is Liang Wenfeng?
cgtn.com