Data wars, fighting VCs, and how Meta is disrupting the AI model market
Welcome to Cautious Optimism, a newsletter on tech, business, and power. Modestly upbeat.
📈 Trending Up: Nipah infections … stimulus … Revolut … stablecoins … cheaper batteries … genAI funding … censored models … Mistral … US GDP growth …
📉 Trending Down: Southwest’s ability to be different … Carta’s ability to hold onto execs … stocks … useful apologies … the planet …
🤔 What Else?
The war for AI data is heating up: It appears that Google now has quasi-exclusive access to crawling Reddit, a critical source of search results and AI training data. Microsoft confirmed that it cannot crawl Reddit after the social media company changed its robots.txt file. Other non-Google search engines have similar problems.
Reddit says the change is “not at all related to our recent partnership with Google,” SearchEngineLand reports, with the company adding that it has been “in discussions with multiple search engines” but has been “unable to reach agreements with all of them, since some are unable or unwilling to make enforceable promises regarding their use of Reddit content, including their use for AI.”
Put differently: rather than Google buying exclusive access, other search engines are simply unwilling to pay for access to Reddit. Yet.
Elsewhere, complaints about Anthropic hammering websites with traffic to hoover up their information are bouncing around the Internet. It's not a shock, then, that Cloudflare has new anti-AI scraping tools in the market, and TollBit and Human Native are building businesses around AI data licensing.
I will never leave Twitter
To summarize: Startup investor David Sacks argued that President Biden’s decision to not seek reelection after folks said he was too old to seek reelection was, in fact, a coup. And more of a coup than the insurrection on January 6 several years ago, in which an armed mob broke its way into the Capitol and tried to overturn an election by force.
Rippling CEO Parker Conrad then weighed in that Sacks knows a lot about coups because, in his view, Sacks executed one to depose Conrad from the CEO role at Zenefits, where Sacks had been brought in as COO to clean up a regulatory mess before becoming CEO himself. Sacks then fired back that Conrad is a low-ethics clown emoji. (He also called him a “whiny little bitch” in a later tweet.)
But wait! There’s more! Paul Graham of Y Combinator fame then jumped into the mix and threatened to air dirty laundry, calling Sacks non-founder-friendly and evil. Sacks responded with insinuations that Graham is an antisemite.
People forget that while the venture capital world wants to present itself as a sober collection of intelligent folks trying to back the future, it’s also full of humans with egos and decades of doing business with and against one another.
Meta’s new AI model could break for-profit hearts
Daggers. That’s what The Information dropped this week. Two of them:
That’s an impressive double-click of reporting on two of the hottest companies in tech today. And an indication that even more capital will be needed to keep their engines running. Recall how wild it felt when Microsoft put up $10 billion for OpenAI? Little did we know that the figure would wind up being too small.
The news that leading foundation model companies are torching hundreds of millions of dollars every month could not have come at a worse time. Earlier this week Meta announced a new quasi-open-source AI model in its Llama collection. The model, Llama 3.1 405B, is very good, and you can download it and use it for free.
The debate about open-source AI versus closed-source AI may be decided by market forces instead of regulation. How much more money would you put into Anthropic or OpenAI if a major tech company were doing very similar work, potentially better, and offering it up for free?
Another billion? Another ten billion? Building foundation genAI models is incredibly expensive, and Meta can afford to subsidize its own work. That means that anyone who wants to pursue a for-profit version of what Meta is cooking up will need either a much better model on offer or business services surrounding their models that make them worth the cost compared to free offerings. That’s a tough way to win a market.
I wouldn’t write Anthropic or OpenAI off, but their spend and results, contrasted with what you can get in the market for free today, are something to chew on when we consider their future.