The controversy surrounding the training costs of DeepSeek’s R1 model rattled the markets, but it appears a lot of misinformation was involved, since the real figures are surprising.
DeepSeek’s Training Costs Are Said To Be Drastically Higher Than The Reported “$5 Million” Figure; It Has Access To High-End Hardware
The research firm SemiAnalysis has thoroughly examined DeepSeek’s training costs, refuting the claim that R1 is so efficient that hardware from NVIDIA and other companies is no longer needed. Before we dive into the actual hardware used by DeepSeek, let’s take a look at what the industry first perceived. It was claimed that DeepSeek spent just “$5 million” on its R1 model, which is on par with OpenAI’s o1, and this triggered a panic in the financial markets. Now that the dust has settled, let’s take a look at the actual figures.
For those unaware, DeepSeek was said to be a side project of the Chinese hedge fund High-Flyer, and the report by SemiAnalysis says that it purchased 10,000 units of NVIDIA’s A100 back in 2021, when export restrictions weren’t that extreme. DeepSeek then evolved into a separate entity after its parent company, High-Flyer, decided to hand off the workload, and that’s when things really took off. With that, it started accumulating computing resources, which we’ll discuss next.

The report says that DeepSeek has approximately 10,000 of NVIDIA’s “China-specific” H800 AI GPUs and 10,000 of the higher-end H100 AI chips. Moreover, the company has invested in NVIDIA’s H20 AI accelerators, and it has a “pool” of resources that is shared between DeepSeek and High-Flyer for “trading, inference, training, and research”. This translates into approximately $1.6 billion in CapEx for DeepSeek, with operating costs rumored to be around $944 million. The figures are hundreds of times higher than what the initial estimates from the markets suggested.
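
As a rough sanity check on the scale of that gap, the reported figures can be compared directly against the “$5 million” number. The sketch below is only back-of-the-envelope arithmetic on the rounded estimates quoted in this article. The variable and function names are illustrative, and CapEx and operating costs are not the same kind of expense as a single training run, so the multiples show scale rather than a like-for-like comparison.

```python
# Figures as quoted in the article (USD). All are reported estimates,
# not audited numbers, and the names below are illustrative only.
REPORTED_TRAINING_COST = 5_000_000   # the widely cited "$5 million" figure
CAPEX = 1_600_000_000                # approximate CapEx per the SemiAnalysis report
OPERATING_COST = 944_000_000         # rumored operating costs


def multiple_of_reported(amount: float, baseline: float = REPORTED_TRAINING_COST) -> float:
    """Return how many times larger `amount` is than the baseline figure."""
    return amount / baseline


capex_multiple = multiple_of_reported(CAPEX)
total_multiple = multiple_of_reported(CAPEX + OPERATING_COST)

print(f"CapEx alone: {capex_multiple:.0f}x the reported figure")
print(f"CapEx plus operating costs: {total_multiple:.0f}x the reported figure")
```

On these rounded inputs, CapEx alone comes out at 320 times the “$5 million” figure, and CapEx plus operating costs at roughly 509 times.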

For clarification, the initial figure is said to be a “specific part” of the training costs, likely the part associated with running the final model. The one thing DeepSeek excelled at was tapping local talent through recruitment events at prestigious local universities, with salaries of more than $1.3 million for certain employees. The “misreported” financial figures acted as a catalyst in last week’s black swan event, but the brains behind DeepSeek’s R1 model were indeed capable of coming up with an effective solution to compete with the likes of OpenAI.
You should definitely check out SemiAnalysis’s extensive breakdown of DeepSeek’s AI model, because the details are intriguing.