DeepSeek, a Chinese artificial intelligence company, has released DeepSeek V3, one of the most capable open-source AI models currently available. The model ships under a permissive license that lets developers download, modify, and deploy it for most applications, including commercial ones.
DeepSeek V3 handles a range of text-based tasks, including coding, translation, and writing documents or emails from a descriptive prompt. According to DeepSeek’s internal testing, the model outperforms both open-source alternatives and proprietary models, including Meta’s Llama 3.1 405B, OpenAI’s GPT-4o, and Alibaba’s Qwen 2.5 72B, particularly on programming challenges hosted on Codeforces. It also leads on Aider Polyglot, a benchmark that assesses whether a model can write new code that integrates with an existing codebase.
The model has 671 billion parameters (685 billion as listed on Hugging Face) and was trained on a dataset of 14.8 trillion tokens. For perspective, that makes DeepSeek V3 roughly 1.66 times the size of Llama 3.1 405B, which has 405 billion parameters. A higher parameter count typically brings better performance, but it also demands more compute: DeepSeek V3 needs high-end GPU hardware to run well.
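To make the size comparison concrete, here is a minimal Python sketch of the arithmetic. The parameter counts come from the figures above. The bytes-per-parameter values (1 for FP8, 2 for BF16) are assumptions for illustration, not figures from DeepSeek. The estimate covers only the memory needed to store the weights, ignoring activations and other runtime overhead.

```python
# Parameter counts as reported in the article.
DEEPSEEK_V3_PARAMS = 671e9
LLAMA_31_405B_PARAMS = 405e9

# Assumed storage cost per parameter (illustrative, not from the article).
BYTES_PER_PARAM = {"fp8": 1, "bf16": 2}


def size_ratio(larger: float, smaller: float) -> float:
    """Return how many times bigger `larger` is than `smaller`."""
    return larger / smaller


def weight_memory_gb(num_params: float, bytes_per_param: int) -> float:
    """Rough memory needed to hold the weights alone, in decimal gigabytes."""
    return num_params * bytes_per_param / 1e9


if __name__ == "__main__":
    ratio = size_ratio(DEEPSEEK_V3_PARAMS, LLAMA_31_405B_PARAMS)
    print(f"DeepSeek V3 is about {ratio:.2f}x the size of Llama 3.1 405B")
    for precision, nbytes in BYTES_PER_PARAM.items():
        gb = weight_memory_gb(DEEPSEEK_V3_PARAMS, nbytes)
        print(f"Weights at {precision}: about {gb:,.0f} GB")
```

Even under the more favorable one-byte assumption, the weights alone run to hundreds of gigabytes, far more than a single GPU holds. That is why a model of this size needs several high-end GPUs to run.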
Notably, DeepSeek trained the model in about two months on Nvidia H800 GPUs, hardware that Chinese companies are now restricted from buying under U.S. Department of Commerce rules. The company says it spent only $5.5 million training DeepSeek V3, far less than the cost of developing competing models such as OpenAI’s GPT-4.
DeepSeek V3 has limitations, however. Its responses on political topics are constrained in line with China’s internet governance policies. For example, it declines to answer questions about politically sensitive subjects such as Tiananmen Square, consistent with Chinese regulations that require model output to reflect “core socialist values.”
DeepSeek is backed by High-Flyer Capital Management, a Chinese quantitative investment firm, and has the stated goal of developing “superintelligent” AI. Despite its hardware constraints and political limits, DeepSeek V3 marks a significant milestone in open-source AI development, offering a capable and economical alternative to the current industry leaders.