Guess what? AI is expensive.

11,064 Views | 139 Replies | Last: 19 min ago by JB99
infinity ag
JB99 said:

Here's what happened at my company.

- rolled out Claude Code. Set up a Slack channel with a bot. Request an API key and you get an email with a key and a MOP. It's billed completely usage based, but they initially had no mechanism to do charge backs.

- initially no one has a token budget. 90% of people don't know what claude code is or how to use it.

- over a few months a percent of people figure it out and token usage explodes.

- charge backs implemented, token budgets implemented

Imo it is very useful. It is making sw dev super cheap. However, the number of people building stuff is exploding and there's no judgement being applied to whether project x is worth pursuing. So reining it in a bit is the right move. Give people enough that they can learn it and discover the potential. Then have a process for approving larger token budgets for legit projects that are vetted.

Definitely being rolled out haphazardly, but companies are figuring it out. Also open weight models will catch up which will make it super cheap.


Ultimately nothing will be cheaper or free. It will just take some time to get there.


One possible example.
  • Let's say personnel costs are $1 Million.
  • Company gets AI which costs $250k and fires 75% of the people. Costs are just $500k now. CEO gets a bonus for cutting costs.
  • AI providers like Anthropic, Google etc are actually the ones who are taking the cost hit and once clients are addicted, they jack up the prices from $250k to $750k. Just a few providers in the market, so clients have no choice but to pay.
  • Costs are now $1M ($250k personnel + $750k AI). Back to what it was earlier, with a quarter of the staff.
  • The company fires more people to cut costs. Quality suffers, CEO gets fired.
  • Customers leave, company goes bankrupt.
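
The arithmetic in this example can be checked with a quick sketch. The dollar figures are the ones given in the bullets; the function name `total_cost` and the stage labels are only illustrative, and it assumes the remaining 25% of staff cost 25% of the original payroll.

```python
# Quick check of the scenario's totals, using the dollar figures from the bullets.
# Assumption (not stated in the post): payroll scales linearly with headcount.

def total_cost(personnel: int, ai: int) -> int:
    """Annual spend is payroll plus the AI bill."""
    return personnel + ai

BASELINE_PAYROLL = 1_000_000
payroll_after_layoffs = BASELINE_PAYROLL * (100 - 75) // 100  # 75% of staff let go

stages = {
    "before AI": total_cost(BASELINE_PAYROLL, 0),
    "after layoffs, AI at $250k": total_cost(payroll_after_layoffs, 250_000),
    "after price hike, AI at $750k": total_cost(payroll_after_layoffs, 750_000),
}

for label, cost in stages.items():
    print(f"{label}: ${cost:,}")
```

With these inputs the three stages come to $1,000,000, $500,000 and $1,000,000, so the price hike wipes out the saving and leaves the company with the original bill and a quarter of the staff.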
In other situations, if the company is lucky, AI gets paid the same as what people used to be paid, so no cost savings. Quality is also iffy as you need people to make sure AI is doing the work correctly.

We will reach these situations in some time. Will take a few years and in the meantime the clueless folks will jump up and down claiming AI will take over the world.
hph6203
AG
AI is already getting cheaper.
infinity ag
hph6203 said:

AI is already getting cheaper.


My friend. Cheaper but for how long? Remember that everyone has to make money here else the entire thing collapses. AI companies are propped up by investors who will soon want to see profits, not just revenues. They are being patient for now because they think there will be a reward but clients won't be able to pay for what the investors are expecting. That is when the crash will come.

As someone above said, you can make a billion dollars by selling $20 for $2, but how long will that survive?
Windy City Ag
AG
Quote:

My friend. Cheaper but for how long? Remember that everyone has to make money here else the entire thing collapses. AI companies are propped up by investors who will soon want to see profits, not just revenues. They are being patient for now because they think there will be a reward but clients won't be able to pay for what the investors are expecting. That is when the crash will come


Cue the greatest Russ Hanneman line in Silicon Valley.

Deputy Travis Junior
Can you share your evidence that customers are pulling back? I haven't seen anything on that - just more reports on blockbuster revenue growth.
Cromagnum
AG
I hope all the corporations that laid off a bunch of employees in favor of "AI synergies" and "AI cost-savings" get completely publicly dragged with receipts.

https://www.houstonpublicmedia.org/articles/news/business/2026/01/30/542113/dow-layoffs-houston-jobs-ai/
500,000ags
AG
Makes sense, but at scale, that's really just a shift in CAPEX spend. We aren't discussing what the smart technical play is that an engineer will do, we are talking about what will happen at scale with large and SMB enterprise. I saw the new NVIDIA laptops, maybe that's a partial solution. But, to me, I haven't seen anything that says we can now shift to compute locally and 70% of enterprise users / non-technical users go that route. If economics are undeniable, agreed, the market will get there, but I'd have to see evidence of that first.
YouBet
AG
Cromagnum said:

I hope all the corporations that laid off a bunch of employees in favor of "AI synergies" and "AI cost-savings" get completely publicly dragged with receipts.

https://www.houstonpublicmedia.org/articles/news/business/2026/01/30/542113/dow-layoffs-houston-jobs-ai/


While I've always harped on here about AI taking away jobs...and it has...we are absolutely seeing some companies simply use it for air cover to cut dead wood and pursue the new emphasis on profitability that became popular again post COVID. Pre-COVID corporate America was all about revenue growth.
hph6203
AG
Anthropic is reportedly set to make an operating profit this quarter. They are not selling $20 of tokens for $2. That is a misunderstanding of their finances derived from a population of people that think CNBC is a good source of information.

What these model companies are doing is buying compute, building a model, shipping that model, and delivering it with a portion of the compute they trained it on. Buying more compute, building a better/more expensive model. The prior model, if left to its own operation, is a profitable endeavor. The new model expense is where the losses come from. That pattern is going to roughly continue until the models asymptotically approach the limits of capability given existing data sets. Eventually it will reach what we call AGI of digital intelligence, i.e. capable of doing basically every job that involves a computer. That will consume a substantial, but not total, proportion of existing jobs. The market will expand and the net effect is labor shortage, not job shortage.

The model companies will then focus most of their resources on physical world observation through machines and robots, generating their own data in an ever expanding quantity.

On the open weight side the models are being reduced in parameters so that they can run locally on lesser hardware with some incremental quality loss, but still usable on high end hardware ($5,000+). That $5,000 hardware will cost $2,000 in less than 5 years. That $5,000 hardware can produce $40,000 worth of tokens annually from models that were at the frontier 12-18 months ago. And that hardware is likely to have a 5 year utility, because model size for equivalent quality is falling.
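
Taking those figures at face value, the payback math looks roughly like this. It is a back-of-the-envelope sketch only: the three inputs are the post's claims rather than measurements, it ignores electricity, utilization and resale value, and the variable names are made up for illustration.

```python
# Back-of-the-envelope payback on local inference hardware.
# All three inputs are the figures claimed in the post, not measured values.
HARDWARE_COST = 5_000        # dollars, high-end local machine
ANNUAL_TOKEN_VALUE = 40_000  # dollars of tokens produced per year
USEFUL_LIFE_YEARS = 5

# Months of output needed to cover the purchase price.
payback_months = HARDWARE_COST / ANNUAL_TOKEN_VALUE * 12

# Token value over the whole useful life, and per dollar of hardware.
lifetime_token_value = ANNUAL_TOKEN_VALUE * USEFUL_LIFE_YEARS
value_per_hardware_dollar = lifetime_token_value / HARDWARE_COST

print(f"payback: {payback_months} months")
print(f"lifetime token value: ${lifetime_token_value:,}")
print(f"token value per hardware dollar: {value_per_hardware_dollar}")
```

If the $40,000 figure holds, the box pays for itself in about a month and a half; the whole argument rests on that input, so it is the number worth challenging.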

AI is getting cheaper. It will continue to get cheaper. Consumer hardware has gotten better. It will continue to get better, and despite recent price increases, prices will eventually fall as production begins to match the new elevated demand, normalizing or falling relative to past capabilities. It is the nature of technology.
hph6203
AG
It's not a now thing. It's an eventuality thing. RAM does not cost the current price paid for it; the current price is a consequence of shortage relative to demand. That will likely persist for some time, but in 5 years you could envision a stabilization/return to previous pricing trends where that Nvidia laptop with 128GB of RAM is running a quantized version of a leading model released 2-3 years from now, and sells for $2500 instead of whatever ungodly ($6000?) price it's going to fetch in the current market. People are already implementing that strategy with their individual agents.

AI is not currently at a state of maturity that you'd even want every employee implementing it into their job, but in 5 years the infrastructure around it will undoubtedly support that kind of use.
JB99
AG
hph6203 said:

It's not a now thing. It's an eventuality thing. RAM does not cost the current price paid for it; the current price is a consequence of shortage relative to demand. That will likely persist for some time, but in 5 years you could envision a stabilization/return to previous pricing trends where that Nvidia laptop with 128GB of RAM is running a quantized version of a leading model released 2-3 years from now, and sells for $2500 instead of whatever ungodly ($6000?) price it's going to fetch in the current market. People are already implementing that strategy with their individual agents.

AI is not currently at a state of maturity that you'd even want every employee implementing it into their job, but in 5 years the infrastructure around it will undoubtedly support that kind of use.


Agreed, but not 5 years. More like 1 year. It's actively happening in some departments, mostly IT and engineering related, but it's expanding. The bottleneck is not so much the models; it's the governance and the plumbing, so you can manage the agents and they have access to all your data and systems so they aren't isolated sandboxes.
infinity ag
hph6203 said:

Anthropic is reportedly set to make an operating profit this quarter. They are not selling $20 of tokens for $2. That is a misunderstanding of their finances derived from a population of people that think CNBC is a good source of information.

What these model companies are doing is buying compute, building a model, shipping that model, and delivering it with a portion of the compute they trained it on. Buying more compute, building a better/more expensive model. The prior model, if left to its own operation, is a profitable endeavor. The new model expense is where the losses come from. That pattern is going to roughly continue until the models asymptotically approach the limits of capability given existing data sets. Eventually it will reach what we call AGI of digital intelligence, i.e. capable of doing basically every job that involves a computer. That will consume a substantial, but not total, proportion of existing jobs. The market will expand and the net effect is labor shortage, not job shortage.

The model companies will then focus most of their resources on physical world observation through machines and robots, generating their own data in an ever expanding quantity.

On the open weight side the models are being reduced in parameters so that they can run locally on lesser hardware with some incremental quality loss, but still usable on high end hardware ($5,000+). That $5,000 hardware will cost $2,000 in less than 5 years. That $5,000 hardware can produce $40,000 worth of tokens annually from models that were at the frontier 12-18 months ago. And that hardware is likely to have a 5 year utility, because model size for equivalent quality is falling.

AI is getting cheaper. It will continue to get cheaper. Consumer hardware has gotten better. It will continue to get better, and despite recent price increases, prices will eventually fall as production begins to match the new elevated demand, normalizing or falling relative to past capabilities. It is the nature of technology.


OK, let's see when it really and actually happens in a sustainable way. Everyone is watching.
infinity ag
Cromagnum said:

I hope all the corporations that laid off a bunch of employees in favor of "AI synergies" and "AI cost-savings" get completely publicly dragged with receipts.

https://www.houstonpublicmedia.org/articles/news/business/2026/01/30/542113/dow-layoffs-houston-jobs-ai/


I am right there wif ya, buddy!

Tar and feather them.

How nice, everyone is singing my tune these days. Took a lot of work and had to deal with a lot of insults, but folks are seeing the truth for themselves.
hph6203
AG
You wouldn't recognize it, because you're so balls deep into the idea you're smarter than everyone around you and CEOs are the dumbest of those in the crowd.
500,000ags
AG
We'll see. I think local agents sounds more promising than the crap vertical now, but I'll need to see more.

Even if that story is right, that tanks Anthropic, unless they change monetization. If every 10x engineer is suddenly running open source on their own hardware, what am I missing?
JB99
AG
500,000ags said:

We'll see. I think local agents sounds more promising than the crap vertical now, but I'll need to see more.

Even if that story is right, that tanks Anthropic, unless they change monetization. If every 10x engineer is suddenly running open source on their own hardware, what am I missing?


They shift to other areas. They can charge for the harnesses, the orchestration, governance, observability. Just the overall wrapper that makes it usable. For example, Anthropic has Claude Cowork, which comes with a set of plug-ins catered to knowledge work. They came out with Claude Design, which is more focused on visual design. Google is coming out with an autonomous agent interface called Spark. I don't think the private models go away, but they become specialized. For example, you give the private model a goal. It then creates a workflow with sub agents that use local models, scripts, HITL, etc. The private model builds and orchestrates the workflow.
Logos Stick
No one is offering open source. Open weight is not open source. And open weight is an answer to address privacy and control, not compute and energy.
500,000ags
AG
Super muddy economics. Anthropic hasn't raised $125Bn to hand off to open source - even strategically IMO. If I'm being honest, I hear an engineer optimizing his own workflow. Not nearly what will happen at scale. Just my two cents.
500,000ags
AG
Open weight, open source, literally doesn't matter. Anthropic wants usage. Their investors want usage. Hyperscalers want usage. Blackstone, Google, etc. are not spending $Tns for anything but usage.
hph6203
AG
The size of demand.
WestHoustonAg79
So I just want to make sure I hear you correctly…..

You're amazing at pasting snippets of news articles and just saying "see… this is truth".

Your opinion is that AI compute and overall usage will not get cheaper. Is that correct?
infinity ag
WestHoustonAg79 said:

So I just want to make sure I hear you correctly…..

You're amazing at pasting snippets of news articles and just saying "see… this is truth".

Your opinion is that AI compute and overall usage will not get cheaper. Is that correct?


It depends.
AI companies like Meta, Google etc have sunk zillions into AI infra and smart people to build "AI". That cost needs to be recouped. There are only a few AI companies that create something worthwhile, so they have a lot of power there. You either pay up or go AI-less. So they are going to give people deep discounts to get them addicted. Once they are stuck, they are going to start gouging. The clients are then stuck, they better pay up or go out of business as they would have fired all employees to please their investors.

Now as an individual, I am addicted to ChatGPT myself, I don't use Google anymore.

So I am not sure it is going to get that much cheaper, they are going to need to get their costs back so why would they reduce prices?
500,000ags
AG
Cheaper as in cost per query or overall costs? I have 100% zero issue going on record that companies with high adoption will have overall much higher costs than today. $ out the door.
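
One way to read the distinction: spend is price per unit times volume, so the unit price can fall while the bill rises. A toy sketch, with numbers that are entirely made up to show the shape of the argument (the function name and both scenarios are hypothetical):

```python
# Toy illustration: unit price falls, total spend still rises.
# Every number here is invented for the example, not taken from any vendor.

def monthly_spend(price_per_million_tokens: int, millions_of_tokens: int) -> int:
    """Total monthly bill in dollars: unit price times volume."""
    return price_per_million_tokens * millions_of_tokens

today = monthly_spend(price_per_million_tokens=10, millions_of_tokens=100)
later = monthly_spend(price_per_million_tokens=1, millions_of_tokens=5_000)

print(f"today: ${today:,}/month at $10 per million tokens")
print(f"later: ${later:,}/month at $1 per million tokens")
print(f"unit price fell 10x, total spend rose {later // today}x")
```

Whether real adoption grows faster than prices fall is exactly what is in dispute; the sketch only shows the two claims are not contradictory.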
hph6203
AG
That is not how they maximize profits.
YouBet
AG
It will for sure be interesting to see where the pricing models land on this.
Logos Stick
infinity ag said:

WestHoustonAg79 said:

So I just want to make sure I hear you correctly…..

You're amazing at pasting snippets of news articles and just saying "see… this is truth".

Your opinion is that AI compute and overall usage will not get cheaper. Is that correct?


It depends.
AI companies like Meta, Google etc have sunk zillions into AI infra and smart people to build "AI". That cost needs to be recouped. There are only a few AI companies that create something worthwhile, so they have a lot of power there. You either pay up or go AI-less. So they are going to give people deep discounts to get them addicted. Once they are stuck, they are going to start gouging. The clients are then stuck, they better pay up or go out of business as they would have fired all employees to please their investors.

Now as an individual, I am addicted to ChatGPT myself, I don't use Google anymore.

So I am not sure it is going to get that much cheaper, they are going to need to get their costs back so why would they reduce prices?


You've obviously never been through previous waves of cognitive labor displacement. When ERP systems replaced human labor, for example, the ERP vendors didn't react with "Ah Ha! Now we'll gouge them because they depend on us for critical information processing that they no longer have humans to do". That's not how it works!!

Every major technological shift is a do or die implementation decision by every company. I remember the founder/CEO of a previous software company I worked for being asked "what business are we in?" He said "the survival business. If companies don't implement our software, they won't survive". They had little choice in the matter, but that doesn't mean we were ever in a position to simply charge whatever we wanted, as you're claiming. The same goes for every other company.

Also, you claim AI can't replace human cognitive labor, but then say it does because the companies fire a bunch of folks - meaning AI can do the work, and do it so well that without it, the company would go bankrupt. LoL
JB99
AG
infinity ag said:

WestHoustonAg79 said:

So I just want to make sure I hear you correctly…..

You're amazing at pasting snippets of news articles and just saying "see… this is truth".

Your opinion is that AI compute and overall usage will not get cheaper. Is that correct?


It depends.
AI companies like Meta, Google etc have sunk zillions into AI infra and smart people to build "AI". That cost needs to be recouped. There are only a few AI companies that create something worthwhile, so they have a lot of power there. You either pay up or go AI-less. So they are going to give people deep discounts to get them addicted. Once they are stuck, they are going to start gouging. The clients are then stuck, they better pay up or go out of business as they would have fired all employees to please their investors.

Now as an individual, I am addicted to ChatGPT myself, I don't use Google anymore.

So I am not sure it is going to get that much cheaper, they are going to need to get their costs back so why would they reduce prices?


You don't want your addicts to OD and go out of business. They'll charge just enough.
JB99
AG
500,000ags said:

Cheaper as in cost per query or overall costs? I have 100% zero issue going on record that companies with high adoption will have overall much higher costs than today. $ out the door.


I'm not following this logic.
ErnestEndeavor
hph6203 said:

Anthropic is reportedly set to make an operating profit this quarter. They are not selling $20 of tokens for $2. That is a misunderstanding of their finances derived from a population of people that think CNBC is a good source of information.


CNBC is one of many incessantly hyping them. Most AI providers have switched to API billing precisely because the subscription fees were not even close to covering actual token costs. For Claude it was something like $1300 of tokens given out per $100 subscription. Entirely unsustainable.

https://she-llac.com/claude-limits?ref=wheresyoured.at

Quote:


What these model companies are doing is buying compute, building a model, shipping that model, and delivering it with a portion of the compute they trained it on. Buying more compute, building a better/more expensive model. The prior model, if left to its own operation, is a profitable endeavor. The new model expense is where the losses come from. That pattern is going to roughly continue until the models asymptotically approach the limits of capability given existing data sets.


Already there. Scaling is largely dead. xAI threw 10x the amount of compute at their latest Grok model for only modest improvement in benchmarks over the prior version. The jumps we saw from GPT-2 to GPT-3 and from GPT-3 to GPT-4 were not repeated for GPT-5. In fact, GPT-5 was delayed because they barely got any improvement out of several rounds of training and ended up releasing it as 4.5. When they finally did release GPT-5, Sam Altman dropped all pretenses about it being AGI, which is what he was bragging about back in 2024. It was only a modest improvement over 4.5. The scaling is no longer scaling. They already have all the data from the internet.

Almost every bit of improvement we've seen over the last year or so of model releases has been in the wrappers, the code that actually calls and harnesses the LLM.

Quote:


Eventually it will reach what we call AGI of digital intelligence, i.e. capable of doing basically every job that involves a computer. That will consume a substantial, but not total, proportion of existing jobs. The market will expand and the net effect is labor shortage, not job shortage.


They don't have the technology to do this. LLMs are not capable of understanding why they are doing anything. That's just baked into the design. Reasoned decision making without a clear right or wrong answer is impossible for an LLM. Training an LLM more won't work. Jobs involve reasoned decision making. In order to accomplish this they would need a completely different type of AI.

Quote:


The model companies will then focus most of their resources on physical world observation through machines and robots, generating their own data in an ever expanding quantity.


There are several companies now working on different types of AI that are specialized using local world models. If you are looking for something approaching AGI for a specific use case this is where it would likely come from.

Using LLMs to train other LLMs is a recipe for devolution. LLMs don't know a right answer from a wrong one. New LLMs would be training on the hallucinated data created by the prior one.

Quote:


On the open weight side the models are being reduced in parameters so that they can run locally on lesser hardware with some incremental quality loss, but still usable on high end hardware ($5,000+). That $5,000 hardware will cost $2,000 in less than 5 years. That $5,000 hardware can produce $40,000 worth of tokens annually from models that were at the frontier 12-18 months ago. And that hardware is likely to have a 5 year utility, because model size for equivalent quality is falling.


Which is a disaster for the big AI companies because anything that can run locally they can't bill for.

Quote:


AI is getting cheaper. It will continue to get cheaper. Consumer hardware has gotten better. It will continue to get better, and despite recent price increases, prices will eventually fall as production begins to match the new elevated demand, normalizing or falling relative to past capabilities. It is the nature of technology.


OpenAI, Anthropic, SpaceX and the others investing heavily in AI will have to generate ungodly amounts of revenue to cover the insane capex. They aren't going to be able to do that by lowering prices.
ErnestEndeavor
BTW I say that not to poop on the use cases of generative AI and LLMs. They can be very useful at certain tasks and have worth. I'm just not seeing the trillions in valuations or path to profitability.
YouBet
AG
Quote:

They already have all the data from the internet.


Not so fast, my friend. I've been holding back on a few posts that are going to cause evolutionary leaps in intelligence for these AI models. We are talking true Turing level intelligence. So, let's hold off for a bit on your proclamation.

Hint: it's not this post...you will know it when you see it. Will likely be on the East Texas board when I release these posts.
infinity ag
Here's an AI joke!
Quote:


Aaron Cannon
@AaronLCannon
I started talking to my co-founder like Claude


500,000ags
AG
You can't say frontier AI developers thrive when CAPEX and off balance sheet compute spend are at almost unprecedented scale, while also saying things in the system get materially cheaper very quickly for end users. Will things get cheaper per use? Yes. Will total spend also grow exponentially to try and make product and profit? Also yes, or by definition, frontier developers will not survive. The "operating profit" that Anthropic reported has to be taken with a grain of salt, unlike other tech companies, because operating leverage has not been proven IMO. If tokens are cheaper, then adoption by current users for new use cases, or by new users, has to explode, or it's not enough revenue to cover the CAPEX spend in the system.

The current vertical is incredibly in flux and if these $1Tn IPOs don't work out, oh boy. Even if they do work, we are talking material capital reallocation since we have potentially 3 companies coming to IPO at almost $3Tn. That is a shock to the system that likely is a headwind on other stocks even.

Unit economics, CAPEX forecasts, depreciation, delays, and a successful IPO are all much more crucial to the health of the vertical over the next few years than relying on an unbridled demand story. Which I also think isn't necessarily true even.
hph6203
AG
1) Saying the average Claude user is using $1300 in tokens on a $100 subscription is like saying the average net worth in the U.S. is $1 million. Pushing consumption to the appropriate channel is not raising prices. Just totally skipped over the operating profit this quarter part.

2) The claimed improvements of Mythos are not incremental. We will see if it lives up to claims.

3) The common theme of people with respect to LLMs is an overestimation of what people do when they think, and especially an overestimation of what the average person thinks when they do their job. Most jobs do not require reasoned thinking to the level you think they do, and yes LLMs can reason.

An entire bank is automatable by the current capabilities of LLMs with the appropriate guardrails. As are mortgage companies. Customer service for most companies too. There's a lot that is already

4) Generate their own data does not mean LLM output feeding into LLM training. It means companies are going to do what Tesla is doing for driving. Sticking sensors in everything, accumulating data, creating models. Humanoid robots being one of them.

5) No, it's not. It's a guardrail to behaving in the way you (and Infinity) claim they will behave. Linux exists. It is an incredibly popular operating system in the broad sense, but in the narrow sense of personal computers it is damn near irrelevant compared to MacOS, iOS, Android and Windows. Why?

6) Look around you. At every piece of technology. Is it more or less expensive than it was when it was introduced? It is a near universal that it is wildly, wildly cheaper. Unfathomably cheaper actually. That is the nature of technology. That is happening with LLMs as well. Incredibly fast. LLMs will get cheaper.

Look at the top revenue companies in the world. How many of them are providers targeting big money spenders? The answer is functionally none. Raising prices is not how these model makers become the largest companies in the world. They get there by being ubiquitous. Crucial to life.

Anthropic/OpenAI do not need immediate profitability to be successful and profitable businesses long term. As they continue to do what they're doing the money spent on training the next model levels off, and they end up with a backlog of massive compute that allows them to lower prices (for equivalent capability) and increase consumption by varying capabilities of the model.

Hopper will get retired to inference (appears to be already), Blackwell will get retired to inference near the end of Rubin, eventually Rubin will get retired to inference when next gen architectures become the most efficient training hardware. Proportion dedicated to serving tokens rather than training models will heavily favor serving over training over time. That's how equivalent capability pricing falls, i.e. you get Opus 4.8 equivalency at a fraction of the cost.

That cycle will explode in scale when the friction of terrestrial data centers is removed by larger scale operations from chip fabs and systematized deployments into orbit for compute.
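
Point 1 is, on one reading, about skew: a few very heavy users can pull a mean far above what a typical subscriber consumes. A minimal illustration with invented numbers (not Anthropic data), chosen so the mean lands on $1,300:

```python
from statistics import mean, median

# Hypothetical monthly token consumption, in dollars, for ten subscribers.
# Nine light users and one very heavy one; the values are made up for illustration.
monthly_usage = [60] * 9 + [12_460]

average_usage = mean(monthly_usage)    # pulled up by the single heavy user
typical_usage = median(monthly_usage)  # what the middle subscriber consumes

print(f"mean:   ${average_usage:,.0f}")
print(f"median: ${typical_usage:,.0f}")
```

Here the mean is $1,300 while the median is $60, so "average usage" and "what most users cost" can be very different numbers. Which one the quoted $1,300 figure represents is not something this sketch can settle.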
JB99
AG
That's a good point about Linux vs. Windows. Consumers will keep using the private models. It's easy and packaged for them. Enterprises will opt for the open weight because they have the know-how to implement it.
 