Last week, a Chinese AI company called DeepSeek triggered a shock wave of excitement and panic in the artificial-intelligence community after it released its new open-source R1 reasoning model, which by most accounts gives industry leader OpenAI’s o1 a run for its money while requiring far less money and computing power. Over the weekend, DeepSeek’s ChatGPT competitor became the most-downloaded app in Apple’s App Store, and on Monday, the disruption hit the stock market. An AI tech sell-off wiped out nearly $600 billion of AI chip-maker Nvidia’s market value, which CNBC says is the biggest one-day loss for a company in U.S. history. The drama got so intense that it even drew President Trump’s attention. Elsewhere, investors, experts, and analysts are still struggling to understand the implications of this shock to the global AI system. Below is some of the commentary thus far.
.
It’s “AI’s Sputnik moment”
That was famed tech investor Marc Andreessen’s widely repeated take on DeepSeek over the weekend.
Put another way, as explained by Fortune’s Christiaan Hetzner:
China’s artificial intelligence breakthrough is shaking the foundation of the West’s dominance in this technological arms race, conjuring comparisons to one of the USSR’s greatest achievements. … Not only does it suggest the Communist Party–controlled country has caught up to the United States, but it may also be on the cusp of eclipsing it. R1’s more cost-efficient AI training and inference risks also call into question the thesis underpinning sky-high valuations for most Magnificent Seven stocks. …
The release of OpenAI’s ChatGPT in November 2022 took the world by storm, launching a hype in AI stocks only comparable to the dotcom era. Investors have bid up the prices of companies like Nvidia, Microsoft, and Tesla on the expectation that a select few will be in a position to rake in billions upon billions of dollars in AI-related profits. Whether it was Microsoft partner OpenAI’s GPT, Alphabet’s Gemini, or Claude designed by Amazon-backed Anthropic, it was considered unquestionable that a mere handful of companies had the kind of financial and technological resources to compete in this space. They would then monetize their advantage by charging customers to use their proprietary, closed-source AI models. … Yet R1 suggests that the thesis may be wrong.
At Bloomberg, Joe Weisenthal wonders if this kind of thinking will further elevate the already amorphous AGI as a national security goal:
The arrival of such a powerful model coming out of China has the potential to cement the narrative that the race to achieve “AGI” (whatever that means) is a national project on par with the development of the nuclear bomb. Nevermind the fact that nobody can really articulate what AGI is, or why it’s so important, or what achieving AGI will mean strategically. Everyone is just accepting the notion that there is a global race happening, and that we can’t lose it. This was of course already emerging, even before the DeepSeek hype. And it’s also why Trump and a handful of tech companies are talking about a half a trillion dollar investment in a new entity called Stargate. While the stocks of all these companies may be down today, you can imagine a lot of public money flowing in their direction in the years to come in order to ‘win’ this race.
Weisenthal also warns against these types of too-cute narratives, in general:
I’m really not a fan of all the “just so” stories that people are telling about how capitalism and markets work in China and the US. The public narrative is that DeepSeek was created as a side project of a Chinese quant firm. And the allure of this story is that the Chinese system doesn’t reward financialization as much, so the creators had to build something substantive. Maybe that story is even true, but I think people should be on guard for narratives that are too satisfying. I think the popular stories about how American manufacturing giants (like Boeing and Intel) stumbled because they were too shareholder-focused are a little unsatisfying as well.
.
It’s a bias-confirming Rorschach test for AI observers
In his must-read explainer on the DeepSeek drama, Platformer’s Casey Newton surveys the various reactions to the AI company’s achievement:
As news of DeepSeek’s achievement spread over the weekend, it became a kind of Rorschach test. While everyone is impressed that DeepSeek built the best open-weights model available for a fraction of the money that its rivals did, opinions about its long-term significance are all over the map.
To many prominent voices in AI, DeepSeek seems to have confirmed what they already believed. To AI skeptics, who believe that AI costs are so high that they will never be recouped, DeepSeek’s success is evidence of Silicon Valley waste and hubris. To AI bulls, who think America needs to build artificial general intelligence before anyone else as a matter of national security, DeepSeek is a dire warning to move faster. And to AI safety researchers, who have long feared that framing AI as a race would increase the risk of out-of-control AI systems doing catastrophic harm, DeepSeek is the nightmare that they have been waiting for.
Whatever the truth is won’t be known for some time. Reading the coverage over the past few days, and talking with folks who work in the industry, I’m convinced that DeepSeek is a huge story deserving of our ongoing attention. At the same time, I’m not sure that the emergence of a powerful, low-cost Chinese AI model changes the dynamics of competition quite as much as some observers are saying.
.
It’s important, but we’re also seeing a premature freak-out
That’s the gist of Emanuel Maiberg’s relatively chill take at 404 Media:
The main reason people are excited/scared/throwing up right now is that DeepSeek was developed and released under America’s export restrictions that prevent Chinese companies from getting the latest and most powerful Nvidia chips. As Wired explained, DeepSeek was spun out from High-Flyer, a Chinese hedge fund that originally acquired GPUs to analyze financial data, before it invested its money and resources in developing AI. That a new player in this space was able to build an AI model without access to the latest and greatest Nvidia chips (though people in China have found ways to obtain them despite restriction), using new, more efficient reinforcement learning strategies, has undermined the idea that companies like Nvidia or OpenAI have built a “moat” around their companies that will secure their lead in the AI race forever, and, by extension, has undermined the notion of American AI world supremacy. It also at least raises the possibility that a Chinese company has found a better, more efficient, and cheaper way to train AI models than any American company has discovered thus far.
As others have pointed out, it’s hard to say exactly what DeepSeek actually spent to make its model without trusting it blindly. The true cost may be hidden in ways we don’t understand, and is definitely benefiting by building on top of the very expensive research (primarily from American companies) that came before it. But if AI companies can build competitive models at a fraction of the cost on a comparatively tiny number of lesser GPUs, then much of Nvidia’s value and the billions of dollars AI companies are burning on training suddenly seems excessive and wasteful (even to AI boosters), hence the stock tumbling.
Does this mean Nvidia, OpenAI, and other AI companies are doomed? Again, this is not financial advice but the market appears to be spasming based on vibes, and definitely before we have a great understanding of DeepSeek’s impact. The most obvious rebuttal from Nvidia bag holders in this situation is that DeepSeek’s newfound efficiencies will only benefit AI incumbents. If these new methods give DeepSeek great results with limited compute, the same methods will give OpenAI and other, more well-resourced AI companies even greater results on their huge training clusters, and it is possible that American companies will adapt to these new methods very quickly. Even if scaling laws really have hit the ceiling and giant training clusters don’t need to be that giant, there’s no reason I can see why other companies can’t be competitive under this new paradigm. We should also probably hope that this is the case since it could lower the environmental impact of AI.
Maiberg also notes that “this type of leapfrogging seems totally normal, and we have seen variations of it over the last couple of years”:
People love to prematurely dance on OpenAI’s grave whenever a new and shiny model is released. Meta’s Llama, France’s Mistral, and Anthropic’s Claude have all seemed like they’re getting ahead at one point or another and are favored by different users for different uses, only for another model to be released by OpenAI or another company that leapfrogs the hot new technology and makes them seem old.
Casey Newton made a similar point:
Everyone basically already assumed that all of this was going to happen. By “all of this,” I mean that (1) open-source companies would reverse-engineer everything the big labs are doing and (2) that costs for AI training and inference would decline dramatically over time. …
Anyone who has sent the same query to ChatGPT, Claude, and Gemini on the same day has known for more than a year that you can get basically as good an answer from any of them. And anyone who has used Llama has known for more than a year that the open-weights version that arrives later is only slightly worse.
Right now a lot of investors are catching up to these basic facts at the same time, and stock prices are falling accordingly. But it’s not clear to me that much of it was really news to the AI labs and tech platforms.
.
It’s a win for almost everyone (and an opportunity for the U.S. to be less anti-competitive)
In his big FAQ on the DeepSeek news, Stratechery’s Ben Thompson explains why he thinks the innovation benefits everybody — as well as why he thinks it underlines the shortsightedness of restricting China’s access to U.S. chips:
I think that DeepSeek has provided a massive gift to nearly everyone. The biggest winners are consumers and businesses who can anticipate a future of effectively-free AI products and services. Jevons Paradox will rule the day in the long run, and everyone who uses AI will be the biggest winners. Another set of winners are the big consumer tech companies. A world of free AI is a world where product and distribution matters most, and those companies already won that game; The End of the Beginning was right. China is also a big winner, in ways that I suspect will only become apparent over time. Not only does the country have access to DeepSeek, but I suspect that DeepSeek’s relative success to America’s leading AI labs will result in a further unleashing of Chinese innovation as they realize they can compete.
That leaves America, and a choice we have to make. We could, for very logical reasons, double down on defensive measures, like massively expanding the chip ban and imposing a permission-based regulatory regime on chips and semiconductor equipment that mirrors the E.U.’s approach to tech; alternatively, we could realize that we have real competition, and actually give ourself permission to compete. Stop wringing our hands, stop campaigning for regulations — indeed, go the other way, and cut out all of the cruft in our companies that has nothing to do with winning. If we choose to compete we can still win, and, if we do, we will have a Chinese company to thank.
.
It doesn’t mean AI chip export controls are a failure or should be abandoned
Lennart Heim and Sihao Huang argue that it’s too early to gauge the full impact of the chip export controls targeting China, and that nobody should be surprised by compute efficiency gains like this:
The reality of increasing compute efficiency means AI capabilities will inevitably diffuse. Controls alone aren’t enough: they must be paired with actions to strengthen societal resilience and defense: creating institutions to identify, assess, and address AI risks and building robust defenses against potentially harmful AI applications from adversaries. However, we should also recognize that export controls already impact Chinese AI development and could have even stronger effects in the future.[8] While AI capabilities will likely diffuse regardless of controls—and it will always be difficult for export controls or other “capability interventions” to completely prevent proliferation—they remain important for maintaining our technological advantages. Controls buy valuable time, but need to be complemented with policies that ensure democracies stay in the lead and are resilient to adversaries.
ChinaTalk’s Jordan Schneider spoke with former OpenAI policy researcher Miles Brundage about DeepSeek’s achievements. Brundage emphasized that DeepSeek did more with less out of necessity rather than preference, and that U.S. AI firms still have the advantage:
[I]f it’s better to have fewer chips, then why don’t we just take away all the American companies’ chips? Clearly there’s a logical problem there.
Certainly there’s a lot you can do to squeeze more intelligence juice out of chips, and DeepSeek was forced through necessity to find some of those techniques maybe faster than American companies might have. But that doesn’t mean they wouldn’t benefit from having much more. That doesn’t mean they are able to immediately jump from o1 to o3 or o5 the way OpenAI was able to do, because they have a much larger fleet of chips.
People are reading too much into the fact that this is an early step of a new paradigm, rather than the end of the paradigm. These are the first reasoning models that work. This is the first demonstration of reinforcement learning in order to induce reasoning that works, but that doesn’t mean it’s the end of the road. I think everyone would much prefer to have more compute for training, running more experiments, sampling from a model more times, and doing kind of fancy ways of building agents that, you know, correct each other and debate things and vote on the right answer. So there are all sorts of ways of turning compute into better performance, and American companies are currently in a better position to do that because of their greater volume and quantity of chips.
.
It could be an “extinction-level event” for some VCs
At Axios, Dan Primack suggests that DeepSeek “could be an extinction-level event for venture capital firms that went all-in on foundational model companies”:
Particularly if those companies haven’t yet productized with wide distribution. The quantums of capital are just so much more than anything VC has ever before disbursed, based on what might be a suddenly-stale thesis. If nanotech and web3 were venture industry grenades, this could be a nuclear bomb. Investors I spoke to over the weekend aren’t panicking, but they’re clearly concerned. Particularly that they could be taken so off-guard. Don’t be surprised if some deals in process get paused.
.
It might just be the AI bubble bursting a bit
At The Wall Street Journal, James Mackintosh wonders, “How much was the selloff about fundamentals, and how much about sentiment?”
The moves in prices appear to show investors focused on fundamental issues of how DeepSeek’s approach will lead to lower power use and less demand for chips and data centers.
Yet, it is hard to believe that prices had run up so much purely on the back of smart investors plugging growth estimates into their spreadsheets and valuing the resulting cash flows. A lot of what’s been going on is similar to when investors discovered the internet. They have grasped that AI is A Big Deal, but can’t yet see exactly how or when it will make money.
In a sentiment-driven market, it is even harder to work out what happens next. I thought the market was overly frothy in mid-December, because prices seemed too far detached from reality. The trouble is that sentiment is hard to predict: Investors can always become even more excited about something, but sentiment becomes more vulnerable to a setback the more enthusiastic investors are. DeepSeek may be just that setback.
He adds:
More competition will make it hard for Big Tech to make the oligopoly-like profit margins that investors hope for. If the companies can’t make fat profits, it will be even harder to justify their high valuations. These valuations, remember, rely on the assumption that AI tools will be both widely used and highly profitable, but even the experts have little explanation of how the business model will work. It will also be harder to explain why they are sinking so much money into AI data centers.