The Computing Power Shift in the Heat Wave of DeepSeek
Recently, DeepSeek has made waves in the tech world, rapidly climbing to the top of global trending charts as its user numbers soar. The model has quickly become one of the world's most prominent AI systems, with a significant impact both in China and internationally, particularly in the United States and in capital markets. The recent seminar hosted by the Computing Power Network Committee is therefore timely and highly relevant.
One of the most compelling aspects of this development is the emergence of a leading figure in China's AI landscape. So, what has DeepSeek done right? The answer lies in its ability to deliver strong performance while balancing speed, efficiency, and cost under resource constraints. With a clear direction, DeepSeek has pushed optimization to its limits. Its core technologies, such as knowledge distillation, mixture of experts (MoE), unsupervised reinforcement learning, multi-head latent attention, and mixed-precision computation, are not groundbreaking in themselves; rather, they represent a sophisticated application of existing algorithms and software tools. What sets DeepSeek apart is its relentless pursuit of engineering excellence and methodological refinement. In essence, the model has greatly improved training and decoding efficiency without compromising performance, achieving a roughly tenfold improvement in a short period.
When evaluating DeepSeek's capabilities, it becomes clear that it rivals the leading international AI models while offering a cost advantage. In many ways, it serves as China's "AI atomic bomb." If we liken the global AI landscape to a series of explosive breakthroughs, the first "atomic bomb" was ChatGPT, which debuted on November 30, 2022. DeepSeek is the second, released on January 20, 2025: its functionality closely mirrors that of the leading models from the United States, specifically ChatGPT-4, and even approaches the capabilities of ChatGPT-5. Other models in both domestic and international markets may be considered less impactful by comparison.
The essential value of DeepSeek cannot be overstated: it disrupts the existing American monopoly on AI technology, bolsters China’s competitiveness in the global technology arena, and enables a large-scale spread of AI knowledge within the country.
This upheaval in the domestic landscape affects the valuations of several leading AI firms, whose business models must evolve beyond merely selling tokens.
One of DeepSeek's greatest contributions is making AI more user-friendly and accessible. It sets off a positive feedback loop in which enhanced capabilities increase demand for AI, which in turn drives down costs. As a result, large-scale applications of large models are now within reach. However, it is crucial to keep a realistic perspective amid the excitement. First, while DeepSeek offers remarkable optimization, it is not a true technological revolution; it embodies cumulative progress rather than a leap from zero to one. Second, the landscape is still characterized by the United States' supremacy, with China firmly in second position. This underlying dynamic should not be obscured by fleeting claims of an overwhelming victory for either side. Third, the established laws of technology, including Moore's Law and the scaling laws, remain unchanged.
DeepSeek exemplifies a distinctly Chinese approach that balances rapid innovation with careful resource management, in stark contrast to the more laissez-faire attitude of the United States towards large model development. However, it would be a mistake to presume that DeepSeek has a settled business model, as it struggles to accommodate a user base of hundreds of millions. In addition, the open-source versus closed-source controversy it has stirred will likely have lasting repercussions.
Turning to the ramifications for the computing power industry, one key observation emerges: total demand for computing resources will not decrease but will continue to grow. The history of technological innovation shows that when costs fall, demand often rises by more than enough to offset the savings, a phenomenon known as the Jevons paradox.
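The Jevons argument can be made concrete with a toy model. The sketch below assumes a constant-elasticity demand curve, which is an illustrative assumption and not something stated at the seminar; the function names and the elasticity values are hypothetical.

```python
def compute_demand(price: float, elasticity: float, scale: float = 1.0) -> float:
    """Units of compute demanded at a given unit price.

    Assumes constant-elasticity demand: quantity = scale * price ** (-elasticity).
    """
    return scale * price ** (-elasticity)


def total_spend(price: float, elasticity: float, scale: float = 1.0) -> float:
    """Total spending on compute: unit price times quantity demanded."""
    return price * compute_demand(price, elasticity, scale)


# Compare a baseline price of 1.0 with a tenfold cost reduction (price 0.1)
# under an elastic (1.5) and an inelastic (0.5) demand assumption.
for elasticity in (1.5, 0.5):
    before = total_spend(1.0, elasticity)
    after = total_spend(0.1, elasticity)
    trend = "rises" if after > before else "falls"
    print(f"elasticity={elasticity}: spend {before:.2f} -> {after:.2f} ({trend})")
```

In this model the quantity of compute demanded always rises when the price falls, but total spending rises only when the elasticity exceeds 1. The claim that cheaper models will increase overall demand for computing power is, in effect, a bet that AI demand is that elastic.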
In evaluating the need for intelligent computing power, we must recognize AI as a foundational technology that will reshape industry standards across the board and substantially raise demand.
Major financial commitments have already been made, with the $500 billion Stargate AI infrastructure project announced in the US and Europe dedicating €200 billion to bolster its computing capabilities. As these initiatives unfold, US restrictions on technology exports are likely to intensify, with significant consequences for China's computing power market. Currently, the United States produces around 5 million GPUs annually, with approximately 80% remaining stateside and only about 10% reaching China. Given the tightening sanctions, China's share is expected to shrink further, widening an already noticeable gap in capabilities.
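As a rough check on these figures, the quoted shares translate into unit counts as follows. This is a back-of-the-envelope sketch that uses only numbers quoted in this article (5 million GPUs a year, the 80% and 10% shares, and the 100,000-card cluster size discussed later); the function names are hypothetical, and the cluster count ignores differences between GPU models.

```python
ANNUAL_GPU_OUTPUT = 5_000_000  # approximate annual figure quoted in the article
SHARE_STAYING_IN_US = 0.80     # approximate share quoted in the article
SHARE_REACHING_CHINA = 0.10    # approximate share quoted in the article
CLUSTER_SIZE = 100_000         # cards per cluster, the size the article calls for


def gpus_for_share(total: int, share: float) -> int:
    """Number of GPUs corresponding to a fractional share of annual output."""
    return round(total * share)


def clusters_supported(gpus: int, cluster_size: int) -> int:
    """Whole clusters of the given size that a GPU count could fill."""
    return gpus // cluster_size


us_gpus = gpus_for_share(ANNUAL_GPU_OUTPUT, SHARE_STAYING_IN_US)
china_gpus = gpus_for_share(ANNUAL_GPU_OUTPUT, SHARE_REACHING_CHINA)

print(f"GPUs staying in the US per year: {us_gpus:,}")
print(f"GPUs reaching China per year:    {china_gpus:,}")
print(f"100,000-card clusters that supply could fill: "
      f"{clusters_supported(china_gpus, CLUSTER_SIZE)}")
```

On these numbers, a single 100,000-card cluster would absorb about a fifth of the GPUs reaching China in a year, which illustrates why any further reduction in supply matters.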
As we look to the future, it becomes essential to distinguish between pre-training, fine-tuning, and inference, and to examine each separately, because they involve distinct supply chains and ecosystems. Breakthroughs in AGI and ASI will inevitably require access to premier supercomputing resources, which makes a sovereign foundational model in China non-negotiable, as Zhang Yunquan aptly noted during a recent discussion. China's innovation pathway must not rely on foreign developments; the nation must forge its own path forward. To achieve this, building a 100,000-card supercomputing cluster is a pressing necessity.
Retreating from the commitment to build a 100,000-card supercomputing cluster in the wake of DeepSeek's emergence would be a short-sighted reaction that could hinder China's long-term goals. Only with such resources can a true foundational model be realized. The nation should stay committed to concentrating its efforts on major projects rather than being swayed by temporary disruptions.
DeepSeek's remarkable advances in large model capabilities and its drastic cost reductions are poised to unleash an explosive wave of industry applications. Consequently, our Computing Power Network Committee should actively seek out impactful applications at scale.
We must explore whether large models can produce breakout applications with the kind of popularity achieved by WeChat or Xiaohongshu.
From an investment standpoint, emerging sectors ripe for breakthroughs should have three primary characteristics: first, they must present complex challenges that stretch beyond human cognitive capacity; second, applying large models must offer a clear path to improved efficiency and cost savings, ideally doubling returns; and third, the entities in these sectors must have the financial resources and ability to pay. Based on these criteria, sectors such as finance, healthcare, autonomous driving, and robotics appear particularly promising.
Furthermore, it seems prudent to focus on intelligent agents as the vehicle for breakthroughs across industry applications. In fine-tuning industry models, usability and localization remain critical barriers.
Lastly, our newly founded Computing Power Freedom platform has emerged as a pivotal transaction service hub, ably supported by the Pengcheng Laboratory and the Computing Power Network Committee. The platform has fully integrated DeepSeek, enabling direct use, including through apps, across configurations from 7 billion to 671 billion parameters, while also exploring the use of domestic heterogeneous computing power systems.
At this juncture, heightened attention should be directed towards the specific demands set out by several government departments in collaboration with state-owned enterprises. The full 671-billion-parameter DeepSeek model, paired with the domestic Ascend 910B computing power system, can meet the demands of complex scenarios while fulfilling innovation and entrepreneurship objectives. Our team is working diligently, and we expect to deliver initial versions within a couple of months.
We invite our peers to consider together the ideal applications of DeepSeek within China, alongside our Computing Power Network Committee's obligations to provide strategic insights for the government and to voice the industry's needs.