Nvidia and the AI scaling wall, and what founder mode really is
Welcome to Cautious Optimism, a newsletter on tech, business, and power.
📈 Trending Up: Snowflake, up 25% after earnings … wrapping yourself in a flag … quantum computing … self-driving in China … chances of nuclear war … Lighthouse, now a unicorn … EV boats …
📉 Trending Down: Palo Alto Networks, down 4% after earnings … revenue at Baidu … robotaxi costs … Chinese manufacturing … jobless claims …
Key details from Nvidia’s earnings report
Yesterday, after the bell, Nvidia reported its fiscal 2025 third quarter. That’s the three months ending October 27, 2024, in case you wondered.
Nvidia beat revenue expectations ($35.1 billion versus $33.2 billion projected) and earnings-per-share expectations ($0.81 versus $0.74 projected). And with guidance strong enough to keep investors happy for the rest of the year, we can dig into Nvidia’s earnings call to get a better grip on the slightly-further-out future.
To get into the details, let’s start with Nvidia CFO Colette Kress from the company’s earnings call. The first thing we need to know is how much progress the chip company has made on Blackwell, its replacement for, and improvement on, the Hopper series of GPUs that has been the gold standard for a minute now (transcript source):
Total revenue [in Q4 F2025] is expected to be $37.5 billion, plus or minus 2%, which incorporates continued demand for Hopper architecture and the initial ramp of our Blackwell products. While demand greatly exceeds supply, we are on track to exceed our previous Blackwell revenue estimate of several billion dollars as our visibility into supply continues to increase. […]
Blackwell demand is staggering, and we are racing to scale supply to meet the incredible demand customers are placing on us.
Customers are gearing up to deploy Blackwell at scale. Oracle announced the world's first Zettascale AI cloud computing clusters that can scale to over 131,000 Blackwell GPUs to help enterprises train and deploy some of the most demanding next-generation AI models. Yesterday, Microsoft announced they will be the first CSP to offer, in private preview, Blackwell-based cloud instances powered by NVIDIA GB200 and Quantum InfiniBand. Last week, Blackwell made its debut on the most recent round of MLPerf training results, sweeping the per GPU benchmarks and delivering a 2.2x leap in performance over Hopper.
To sum up: production is ramping, demand is huge and better than expected, and major cloud providers are salivating at the chance to get the new, better chips into their mega-scale compute clusters.
But honestly, that’s about as surprising as rain ruining my morning coffee run. Let’s dig in a bit more. Here’s Kress on Nvidia’s robotics-sourced revenues (transcript source):
Industrial AI and robotics are accelerating. This is triggered by breakthroughs in physical AI, foundation models that understand the physical world. Just like NVIDIA NeMo for enterprise AI agents, we built NVIDIA Omniverse for developers to build, train, and operate industrial AI and robotics.
Figure recently got more of its robots working in a BMW factory, so there may be real juice in the robotics game for Nvidia. If so, the industry could unlock another mass-scale sales channel for AI-ready hardware.
Nvidia is not placing just a few bets, however. It’s also scaling massively in India, for example. Here’s Kress again (transcript source):
India's leading CSPs, including Tata Communications and Yotta Data Services, are building AI factories for tens of thousands of NVIDIA GPUs. By year-end, they will have boosted NVIDIA GPU deployments in the country by nearly 10x.
And Nvidia is making bank off selling Ethernet networking gear for datacenters, a business that scaled by 3x in the last year.
What mattered most, however, from the earnings call was how Nvidia CEO Jensen Huang answered a question about AI model improvements. Is there a wall? Have foundation model companies run into it (transcript source)?
Analyst: I guess just a question for you on the debate around whether scaling for large language models have stalled. […] How are you helping your customers as they work through these issues? And then obviously, part of the context here is we're discussing clusters that have yet to benefit from Blackwell. So, is this driving even greater demand for Blackwell?
Huang: Our foundation model pretraining scaling is intact, and it's continuing. As you know, this is an empirical law, not a fundamental physical law. But the evidence is that it continues to scale. What we're learning, however, is that it's not enough, that we've now discovered two other ways to scale.
One is post-training scaling. Of course, the first generation of post-training was reinforcement learning human feedback, but now we have reinforcement learning AI feedback, and all forms of synthetic data generated data that assists in post-training scaling. And one of the biggest events and one of the most exciting developments is Strawberry, ChatGPT o1, OpenAI's o1, which does inference time scaling, what is called test time scaling. The longer it thinks, the better and higher-quality answer it produces.
And it considers approaches like chain of thought and multi-path planning and all kinds of techniques necessary to reflect and so on and so forth. And it's -- intuitively, it's a little bit like us doing thinking in our head before we answer your question. And so, we now have three ways of scaling, and we're seeing all three ways of scaling. And as a result of that, the demand for our infrastructure is really great.
You see now that at the tail end of the last generation of foundation models were at about 100,000 Hoppers. The next generation starts at 100,000 Blackwells. And so, that kind of gives you a sense of where the industry is moving with respect to pretraining scaling, post-training scaling, and then now very importantly, inference time scaling. And so, the demand is really great for all of those reasons.
But remember, simultaneously, we're seeing inference really starting to scale up for our company. We are the largest inference platform in the world today because our installed base is so large. And everything that was trained on Amperes and Hoppers inferences incredibly on Amperes and Hoppers. And as we move to Blackwells for training foundation models, it leaves behind it a large installed base of extraordinary infrastructure for inference.
And so, we're seeing inference demand go up. We're seeing inference time scaling go up. We see the number of AI native companies continue to grow. And of course, we're starting to see enterprise adoption of agentic AI really is the latest rage.
And so, we're seeing a lot of demand coming from a lot of different places.
That’s bullish enough for me to not worry, for at least another year, about whether we’re going to get regular updates to foundation models, and regular improvements, too. Further out than that? Billie said it best: The world’s a little blurry.
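To make Huang’s ‘test-time scaling’ point a little more concrete, here’s a minimal, illustrative sketch of one common flavor of it, best-of-n sampling, where spending more inference compute means generating and scoring more candidate answers. The generate_candidate and score_candidate functions are hypothetical stand-ins for a model call and a verifier; nothing here is Nvidia’s or OpenAI’s actual implementation.

```python
# Toy sketch of "test-time scaling" as best-of-n sampling.
# generate_candidate() and score_candidate() are hypothetical stand-ins
# for a model call and a verifier/reward model, not real APIs.
import random


def generate_candidate(prompt: str) -> str:
    """Stand-in for sampling one chain-of-thought answer from a model."""
    return f"candidate answer #{random.randint(0, 999)} to: {prompt}"


def score_candidate(answer: str) -> float:
    """Stand-in for a verifier or reward model scoring an answer."""
    return random.random()


def answer_with_test_time_scaling(prompt: str, n_samples: int = 8) -> str:
    """Spend more inference-time compute (more samples) to pick a better answer."""
    candidates = [generate_candidate(prompt) for _ in range(n_samples)]
    return max(candidates, key=score_candidate)


if __name__ == "__main__":
    # Doubling n_samples roughly doubles inference compute per query,
    # which is why "the longer it thinks" is bullish for GPU demand.
    print(answer_with_test_time_scaling("Why is the sky blue?", n_samples=8))
```

The point of the toy: every extra candidate is another pass through the model at answer time, which is why better answers translate directly into more demand for inference hardware.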
Founder mode, from a CEO in the room where it happened
I recently had Spenser Skates, co-founder and CEO of Amplitude, and James Evans, co-founder of CommandAI, on the show. Amplitude recently purchased CommandAI, and as I am familiar with both the purchaser (digital product analytics) and its CEO, I wanted to learn more about the deal.
But while I learned a lot about dealmaking in the present climate, I instead want to share a portion of the chat that gets at the heart of what ‘founder mode’ is. After Paul Graham’s essay went up in the wake of a talk that Airbnb’s Brian Chesky gave for Y Combinator, it seemed that founder mode meant everything, to everybody. Nailing down more precisely the ‘magic’ thereof has been a little tricky.
Enter Skates, discussing his company’s purchase of CommandAI, and the following back and forth on what founder mode is, and is not (transcript slightly tidied for readability):
Skates: I think a lot about ‘how do we get great product and engineering leaders here at Amplitude to own bigger parts of the stack and of the platform and of the future things that we're building.’
Alex: That's very interesting, because there was a meme — and I say that very playfully — in the startup world in the last six months about ‘founder mode’: instead of federating out responsibility to individual leaders at a company who were subsidiary to the CEO — the founder — you constrain that authority and decision-making, to some degree, inside the singular executive. But it sounds, Spenser, like you're comfortable with having some distributed leadership, and perhaps more than the founder mode meme said was correct?
Skates: I want to give a little more [nuance]. I was actually at Brian's talk, and it was crazy. He had basically just come out the other side of a religious experience and to hear that directly — the level of passion and intensity and religiousness that he had about what it meant to run a great organization[?]
He had this great thing where it's like, ‘yeah, it takes 12 years to make a good CEO.’ Because he felt he was a bad one for the first 12 years of Airbnb.
[Founder mode is] actually not either of the scenarios you talked about, Alex. It's not delegation and it's not ‘hey, I'm deciding all the stuff myself.’ The point Brian made was that you are in the details of what is [the] critical path for your company.
And so for Airbnb, that's a great product experience. For Amplitude, that is great product as well as ‘how do you sell this product,’ and you can't delegate those. What I'm looking for is fantastic leaders who I can partner with in those details.
And so as an example, one of the things I've had for the last year is [that] I do a two hour a week product review of ‘how do we make the [Amplitude] platform easier to use?’
What I'm looking for in that meeting is ‘who are folks that I can go back and forth with as peers to say, hey, what does a great experience look like?’ And it's not just me saying, ‘hey, do this, do this, do this, do this.’ It's saying, ‘okay, hey, here's what I think.’ And then they'll say, ‘hey, here's what I think.’
And it's that back and forth that leads to [a] combined great outcome. So it's not a delegation, but it's also not a micromanagement. It's about ‘I want to get people at Amplitude that I can have that interaction with.’ And that's what we're always at a deficit for.
Alex: Okay, that makes a lot of sense to me.
Skates: The key point [on] founder mode is [that] you have to be in the details for whatever is important [to] your company. You cannot delegate that. That's not to say you don't work with other people. You absolutely do. And that's not to say you micromanage them, but you're working with peers in the trenches with them.
Alex: That’s the opposite of what Boeing did.
And there you have it. Craft, in other words. You have to be in the details of the craft. Notably, that’s why a ‘software tastes like chicken’ approach probably never builds something truly great.
Bonus: Startup watch 🏎️
Cyera: Cyera just raised $300 million. Again. I spoke with the company back in August, but had no idea at the time that the data security startup was going to raise another nine figures so soon after its last investment. Cyera is now worth $3 billion, and I’m trying to get the CEO back on the podcast to press him on why more capital, and why now.
xAI: The WSJ reports that Musk’s AI company, xAI, is telling people it closed a new, $5 billion round that pushes its valuation up to $50 billion. Even more, the Journal says that xAI “recently told investors its revenue has reached $100 million on an annualized basis.”
A $100 million run rate is impressive, but I wonder what fraction of that sum might come from Grok fees paid by X. As X and xAI share some investors, I doubt that there’s much hiding of the puck going on. Still, I don’t think it’s been sorted out precisely who will become an xAI customer over, say, an Anthropic or OpenAI customer.
That said, some people burn grudges for long-lasting fuel. And since Musk is still incensed with OpenAI, I presume that he’ll run xAI all the way to dominance or failure. He certainly has the capital.