Today, DeepSeek is one of the only major AI companies in China that does not depend on funding from tech giants such as Baidu, Alibaba, and ByteDance.
A young group of geniuses eager to prove themselves
According to Liang, when he put together DeepSeek’s research team, he wasn’t looking for experienced engineers to build a consumer product. Instead, he focused on doctoral students at China’s top universities, including Peking University and Tsinghua University. Although many had been published in top journals and won awards at international academic conferences, they had no industry experience, according to Chinese technology publication QbitAI.
“Our core technical positions are primarily filled by people who graduated this year or in the past year or two,” Liang told 36Kr in 2023. The hiring strategy helped foster a collaborative culture in which researchers were free to use ample computing resources to pursue unorthodox research projects. This is a very different approach from that of established internet companies in China, where teams often compete for resources. (A recent example: ByteDance accused a former intern, a recipient of a prestigious academic award, of sabotaging his colleagues’ work in order to hoard more computing resources for his team.)
Liang said students are better suited for high-investment, low-profit research. “Most people, when they are young, can be completely dedicated to a mission without utilitarian considerations,” he explained. His pitch to future hires is that DeepSeek was created to “solve the world’s toughest questions.”
The fact that these young researchers are almost entirely educated in China adds to their motivation, experts say. “This younger generation also embodies a sense of patriotism, especially as they navigate US restrictions and choke points in critical hardware and software technologies,” Zhang explains. “Their determination to overcome these barriers reflects not only their personal ambitions but also a broader commitment to advancing China’s position as a global innovation leader.”
Innovation born from crisis
In October 2022, the US government began putting together export controls that severely restricted Chinese AI companies’ access to cutting-edge chips like Nvidia’s H100. This move presented a problem for DeepSeek. The company had started with a stockpile of 10,000 A100s, but it needed more to compete with companies like OpenAI and Meta. “The problem we are facing has never been funding, but export control of advanced chips,” Liang told 36Kr in a second interview in 2024.
DeepSeek had to come up with more efficient ways to train its models. “They optimized the model architecture using a series of engineering tricks: custom communication schemes between chips, reducing the size of the fields to save memory, and innovative use of the mix-of-models approach,” says Wendy Chang, an analyst at the Mercator Institute for China Studies. “Many of these approaches are not new ideas, but successfully combining them to create state-of-the-art models is an amazing feat.”
DeepSeek has also made significant progress on Multi-head Latent Attention (MLA) and Mixture-of-Experts, two technical designs that make its models more cost-effective by requiring fewer computing resources to train. In fact, DeepSeek’s latest model is so efficient that it required one-tenth the computing power of Meta’s comparable Llama 3.1 model to train, according to research institute Epoch AI.
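For readers curious why a Mixture-of-Experts design can save computing power, the toy Python sketch below illustrates the general idea: a router scores every expert, but only the top few are actually run for each input. This is a simplified illustration and not DeepSeek’s implementation. The function names, the eight scalar “experts,” and the gate scores are all hypothetical, and the sketch does not cover MLA.

```python
import math

def top_k_route(gate_scores, k):
    """Return (expert_index, weight) pairs for the k highest-scoring experts."""
    top = sorted(range(len(gate_scores)),
                 key=lambda i: gate_scores[i], reverse=True)[:k]
    # Softmax over the selected scores only, so the weights sum to 1.
    peak = max(gate_scores[i] for i in top)
    exps = [math.exp(gate_scores[i] - peak) for i in top]
    total = sum(exps)
    return [(i, e / total) for i, e in zip(top, exps)]

def moe_layer(x, experts, gate_scores, k=2):
    """Run only the k selected experts and blend their outputs."""
    output, calls = 0.0, 0
    for idx, weight in top_k_route(gate_scores, k):
        output += weight * experts[idx](x)
        calls += 1  # count how many experts were actually evaluated
    return output, calls

# Eight toy "experts" (hypothetical); each stands in for a feed-forward block.
experts = [lambda x, scale=scale: scale * x for scale in range(1, 9)]

# Hypothetical router scores for a single input token.
gate_scores = [0.1, 2.0, 0.3, 0.0, 1.5, -1.0, 0.2, 0.4]

output, calls = moe_layer(3.0, experts, gate_scores, k=2)
print(f"experts evaluated: {calls} of {len(experts)}")
```

In this sketch only two of the eight experts do any work for the input, which is the source of the savings: the model can hold many parameters while spending compute on a small share of them per token.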
DeepSeek’s willingness to share these innovations with the public has generated considerable goodwill within the global AI research community. For many Chinese AI companies, developing open source models is the only way to catch up with their Western counterparts, because it attracts more users and contributors. “They demonstrated that cutting-edge models can be built using less, though still a lot of, money, and that the current norms of model building leave plenty of room for optimization,” says Chang. “We’re going to see a lot more attempts in this direction going forward.”
This news could spell trouble for current US export controls, which focus on creating bottlenecks in computing resources. “Existing estimates of how much AI computing power China has, and what they can accomplish with it, could be upended,” says Zhang.