
DeepSeek V3: An Open-Source AI Model Challenges Industry Leaders

Read Time: 2 minutes


DeepSeek’s new AI model appears to be one of the best ‘open’ challengers yet.

DeepSeek, a Chinese artificial intelligence company, has launched DeepSeek V3, establishing it as one of the most capable open-source AI models currently available. The model ships under a permissive license that lets developers download, modify, and deploy it freely, including for commercial use.
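Because the weights are published on Hugging Face, experimenting with the model can look roughly like the sketch below. This is a minimal illustration, assuming the standard Hugging Face transformers API and the deepseek-ai/DeepSeek-V3 repository; running the full model locally requires a multi-GPU server, so treat it as a starting point rather than a turnkey setup.

```python
# Minimal sketch: loading DeepSeek V3 from Hugging Face with transformers.
# Assumes the "deepseek-ai/DeepSeek-V3" repo and enough GPU memory to shard
# the full model across devices (in practice, a multi-GPU server).
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "deepseek-ai/DeepSeek-V3"

tokenizer = AutoTokenizer.from_pretrained(model_id, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    trust_remote_code=True,  # allow the repo's custom model code to load
    device_map="auto",       # shard layers across available GPUs
    torch_dtype="auto",      # use the checkpoint's native precision
)

prompt = "Write a short, polite email declining a meeting invitation."
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=200)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```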

DeepSeek V3 handles a wide range of text-based tasks, including coding, translation, and writing essays or emails from detailed prompts. DeepSeek's internal benchmarks indicate that the model outperforms both openly available models, such as Meta's Llama 3.1 405B and Alibaba's Qwen 2.5 72B, and closed models like OpenAI's GPT-4o, particularly on Codeforces programming challenges. It also leads on Aider Polyglot, a benchmark that tests whether a model can generate new code that integrates cleanly with existing codebases.

The model was trained on a dataset of 14.8 trillion tokens and has 671 billion parameters (685 billion as published on Hugging Face). For perspective, that is roughly 1.6 times the size of Llama 3.1 405B and its 405 billion parameters. More parameters generally mean stronger performance, but they also demand more compute: running DeepSeek V3 at full speed requires a bank of high-end GPUs.
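As a quick back-of-the-envelope check, dividing the two parameter counts quoted above reproduces the roughly 1.6x figure:

```python
# Rough size comparison using the parameter counts quoted above (in billions).
deepseek_v3_params = 671   # DeepSeek V3
llama_31_params = 405      # Llama 3.1 405B

ratio = deepseek_v3_params / llama_31_params
print(f"DeepSeek V3 is about {ratio:.2f}x the size of Llama 3.1 405B")  # ~1.66x
```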

Remarkably, DeepSeek completed the model's training in about two months using Nvidia H800 GPUs, hardware now subject to U.S. Department of Commerce restrictions for Chinese buyers. The company reports spending only $5.5 million on DeepSeek V3's training, significantly less than what competitors are believed to have spent on models like OpenAI's GPT-4.

Nevertheless, DeepSeek V3 has clear constraints. Its responses reflect China's internet governance policies: for example, it declines to discuss politically sensitive subjects such as Tiananmen Square, in keeping with regulations that require models to embody "core socialist values."

DeepSeek, backed by the Chinese quantitative investment firm High-Flyer Capital Management, says its long-term goal is to build "superintelligent" AI. Despite its operational restrictions and political guardrails, DeepSeek V3 marks a significant milestone in open-source AI development, offering a capable and economical alternative to the current industry frontrunners.
