The entire comparison hinges on people only making simple factual searches ("what is the capital of USA") on both search engines and LLMs. I'd say that's far enough from the standard use case of both sets of APIs that the comparison is entirely meaningless.
- If I'm using a search engine, I want to search the web. Yes these engines are increasingly providing answers rather than just search results, but that's a UI/product feature rather than an API one. If I'm paying Google $$ for access to their index, I'm interested in the index.
- If I'm using an LLM, it is for parsing large amounts of input data, image recognition, complex analysis, deep thinking/reasoning, coding. All of these result in significantly more token usage than a 2-line "the answer to your question is xyz" response.
The author is basically saying – a Honda Civic is cheap because it costs about the same per pound as Honeycrisp apples.
I think the issue is that the classical search engine model has increasingly become less useful.
There are fewer experts using search engines. Normal people treat search engines less like an index search and more like a person. Asking an old-school search engine "What is the capital of USA" is actually not quite right, because the "what is" is probably superfluous, and you're counting on finding some sort of educational website with the answer. In fact, phrasing it as "the capital of the USA is" is probably a better fit for a search engine, since that's the sort of sentence that would contain what you want to know.
Also with the plague of "SEO", there's a million sites trying to convince Google that their site is relevant even when it's not.
So LLMs are increasingly more and more relevant at informally phrased queries that don't actually contain relevant key words, and they're also much more useful in that they bypass a lot of pointless verbiage, spam, ads and requests to subscribe.
Most search engines will parse the query sentence much more intelligently than that. It's not literally matching every word and hasn't for decades. I just tried a handful of popular search engines, they all return the appropriate responses and links.
They're not that literal anymore, of course, but they still don't compare to an LLM. In the end it's still mostly searching for keywords, even with a few tweaks here and there, and the ability to answer vague questions mostly works by finding forums and Reddit posts where people asked that specific question and hopefully got an answer.
When you're asking a standard question like the capital of whatever, that works great.
When you have one of those weird issues, it often lands you in a thread somewhere in the Ubuntu forums where people tried to help this person, nothing worked, and the thread died 3 years ago.
Just the fact that LLMs can translate between languages already adds an amazing amount of usefulness that search engines can't have. There seems to be a fair amount of obscure technical info that's only available in Russian for some reason.
Meanwhile, I'm increasingly frustrated by my inability to find a service where I can search for keywords I want and keywords I don't want, then reliably check the offered links: ctrl-F for the wanted keywords and find them, ctrl-F for the unwanted keywords and fail to find them. Oh, and apparently I can completely forget about search that cares about non-alphanumeric characters at all.
This is a great point. I'll add that search engines are also unclear about what kind of output they give. As you point out, search engines accept both questions and key words as queries. Arguably you'd want completely different searches/answers for those. Moreover, search engines no longer just output web sites with the key words but also give an "AI overview" in an attempt to keep you on their site, which is contrary to what search engines have traditionally done. Previously search engines were something you pass through but they now try to position themselves as destinations instead.
I'd argue that search engines should stick to just outputting relevant websites and let LLMs give you an overview. Both technologies are complementary and fulfill different roles.
> The entire comparison hinges on people only making simple factual searches ... on both search engines and LLMs.
I disagree, but I can see why someone might say this, because the article's author writes:
> So let's compare LLMs to web search. I'm choosing search as the comparison since it's in the same vicinity and since it's something everyone uses and nobody pays for, not because I'm suggesting that ungrounded generative AI is a good substitute for search.
Still, the article's analysis of "is an LLM API subsidized or not?" does not _rely_ on a comparison with search engines. The fundamental analysis is straightforward: comparing {price versus cost} per unit (of something). The goal is to figure out the marginal gain/loss per unit. For an LLM, the unit is often a token or an API call.
Summary: the comparison against search engine costs is not required to assess whether an LLM API is subsidized or not.
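That unit-economics check can be sketched in a few lines. The figures below are hypothetical placeholders for illustration, not numbers from the article:

```python
# Back-of-envelope unit-economics check: is the API priced above
# its marginal cost? All numbers here are made up for illustration.

def margin_per_unit(price_per_unit: float, cost_per_unit: float) -> float:
    """Marginal gain (positive) or loss (negative) per unit served."""
    return price_per_unit - cost_per_unit

# Example: price and cost expressed per million output tokens.
price = 10.00  # what the provider charges per 1M tokens (hypothetical)
cost = 4.00    # estimated inference cost per 1M tokens (hypothetical)

margin = margin_per_unit(price, cost)
print(f"margin per 1M tokens: ${margin:.2f}")
```

A positive margin means the API isn't subsidized at the margin, even if the company as a whole loses money on training, salaries, and free tiers.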
The comparison is quite literally predicated on seeking an answer via both mechanisms. And the simple truth is that for an enormous percentage of users, that is indeed precisely how they use both search engines and LLMs: They want an answer to a question, maybe with some follow-up links so if that isn't satisfactory they can use heuristics to dig deeper.
Which is precisely why Google started adding their AI "answers". The web has kind of become a cancer -- the sites that game SEO the most seem to have the trashiest, most user-hostile behaviour, so search became unpleasant for most -- so Google just replaces the outbound visit conceptually.
>The entire comparison hinges on people only making simple factual searches
You have a point but no it doesn't. The article already kind of addresses it, but Open AI had a pretty low loss in 2024 for the volume of usage they get. 5B seems like a lot until you realize chatgpt.com alone even in 2024 was one of the most visited sites on the planet each month with the vast majority of those visits being entirely free users (no ads, nothing). Open AI in December last year said chatgpt had over a billion messages per day.
So even if you look at what people do with the service as a whole in general, inference really doesn't seem that costly.
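A back-of-envelope calculation makes the scale argument concrete. Taking the comment's own figures (~$5B annual loss, ~1B messages/day) and, very conservatively, attributing the *entire* loss to inference (it also covers training, salaries, etc.):

```python
# Upper bound on per-message inference cost, using the figures cited
# in the comment above. Attributing the whole loss to inference
# overstates the true per-message cost.

annual_loss = 5e9          # dollars (per the comment)
messages_per_day = 1e9     # per the comment
messages_per_year = messages_per_day * 365

upper_bound_cost_per_message = annual_loss / messages_per_year
print(f"<= ${upper_bound_cost_per_message:.4f} per message")
```

With these assumptions the bound comes out to roughly a cent and a half per message, which is why ad-supported models start to look plausible.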
I'll definitely buy that argument for OpenAI, but then why are Anthropic/XAI etc losing money? They don't have the same generous free tiers as OpenAI and yet they keep raising absurd amounts of money.
I mean, I would still expect them to currently lose money? Their tiers aren't as generous but they're still free free (i.e., no revenue generation whatsoever; Google search is free but they're still generating revenue per user via ads and such).
I think the author's point isn't that inference is so cheap that they can be profitable without changing anything, but that inference is now cheap enough for, say, ads (however that might be implemented for an LLM provider) to be a viable business model. It's an important distinction, because a lot of people still think LLMs are so expensive that subscriptions are the only way profit can be made.
> Their tiers aren't as generous but they're still free free
Certainly Claude's free tier is not generous, I basically ended up subscribing the first day I used it.
But, assuming that the losses are from the free tier, it's odd to me that Anthropic wouldn't be showing some kind of cash generation at this point.
Granted, training is super expensive and they're hiring loads of people ahead of revenue, but if they were unit-cost profitable, one would have expected this to be leaked during one of (the many) funding rounds they've engaged in.
I'm mostly unconvinced by the author's analysis because of the above, but it's certainly food for thought, shifting my prior that LLM modelling and service provision is a bad business.
> If I'm using an LLM, it is for parsing large amounts of input data, image recognition, complex analysis, deep thinking/reasoning, coding. All of these result in significantly more token usage than a 2-line "the answer to your question is xyz" response.
Correct, but you're also not the median user. You're a power user.
>If I'm using a search engine, I want to search the web. Yes these engines are increasingly providing answers rather than just search results, but that's a UI/product feature rather than an API one.
This is a great point, let's hold onto that.
>If I'm using an LLM, it is for parsing large amounts of input data, image recognition, complex analysis, deep thinking/reasoning, coding.
Strongly disagree. Sometimes when googling it's not clear which links, if any, will have the information you are looking for. And of course, you don't know if this will be the case before searching.
First, you can just use an LLM to cut out a lot of the fat in search results. It gives you a direct answer and even a link.
But let's assume they couldn't source their claims. Even still, sometimes it's quicker to search a positive "fact" instead of an open-ended question/topic.
In this case if you want a direct source showing something you can query an LLM, get the confidently-maybe-correct response, then search that "fact" in Google to validate.
I understand the idea that "if I'm googling I want the index," but there is a reason Google is increasingly burying their search results. People increasingly do _not_ want the index, because it's increasingly not helpful. Ultimately it is there to surface information you are looking for.
Anecdotally, I'm a paying user and do a lot of super basic queries. What is this bug, rewrite this drivel into an email to my HOA, turn me into a gnome, what is the worst state and why is it West Virginia.
This would probably increase 10x if one of the providers sold a family plan and my kids got paid access.
Most of my heavy lifting is work related and goes through my employer's pockets.
Careful there: Once the machine turns you into a gnome, the price to turn back is quite hefty. A friend of mine gave up an eye, I only lost my most cherished memory. And most people ask the wrong question entirely and are never heard from again.