hmm, I will register the prediction: test-time scaling probably means the models will be open source, not closed source
it shifts the costs from capex to opex, which is not beneficial for the closed source labs. meta & others are likely going to continue to release open source models in order to commoditize their complement (or think they're commoditizing their complement). if training scaled, then you could imagine a point at which meta throws up their hands and says "even this capex is too much for us". But if there's a cap at which the expense switches from capex to opex, then the os models will probably catch up relatively quickly and meta will let the market race to the bottom on the price of test-time inference
for the moment, oai / etc have a pretty good window here, and they probably add productization around it to continue to be the market leader for quite a while
there's a belief i've heard that this will put us into fast take-off, and somehow this will translate into the big labs capturing all the value. but this doesn't really make that much sense to me: in the world in which capex is fixed and opex scales, fast take-off means you have less time to capture value, not more. you don't have any compounding advantages if you're running away but can't reinvest the money into capex (such that other people can't catch up).
it seems to me like we are in the "ai is everywhere" scenario now, where lots and lots of open source or cheap models will exist, but be run for longer
this is a big update for me. i assumed that scaling laws in training would continue for a very long time (maybe they still will?) and that inference would not have scaling laws, and thus the game was "make it eventually cost so much money that no one would dare compete with you".
but no, we seem to be in a different timeline
tbh, i am not like "pro" open source models exactly, and I'm actually quite pro OAI; they were (and are) who I assumed would win
but if i were going to bet, i would bet that there's a good, test-time scaling model within a year, and most of the tokens in the world shifts to
Future of training also belongs to open-source because of the rate of algorithmic improvements in the AI community - 50% reduction in cost every 8 months. This means that
1. Even a huge capex investment only gives a modest head-start
2. Maintaining closed-source edge remains
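To put rough numbers on the 50%-every-8-months figure above, here is a minimal sketch. It assumes the cost to reach a fixed capability level halves smoothly every 8 months; the function names `relative_cost` and `catch_up_months` are illustrative, not from any library.

```python
import math


def relative_cost(months: float, halving_months: float = 8.0) -> float:
    """Fraction of today's cost needed to reach the same capability after `months`.

    Assumes a smooth exponential decline with the given halving time.
    """
    return 0.5 ** (months / halving_months)


def catch_up_months(budget_ratio: float, halving_months: float = 8.0) -> float:
    """Months until a follower spending 1/budget_ratio of the leader's
    budget can afford what the leader built today (same assumption)."""
    return halving_months * math.log2(budget_ratio)


if __name__ == "__main__":
    for ratio in (2, 8, 10, 100):
        print(f"{ratio}x spend advantage -> ~{catch_up_months(ratio):.1f} months of lead")
```

Under that assumption, an 8x spending advantage buys about 24 months of lead, and a 10x advantage only about 27.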
Is your intuition that algorithmic improvements will increase? Mine was “eventually all the juice that can be squeezed will be squeezed and all that will remain is more compute / data”
It is also the case that we've seen that Google and OpenAI can price their fairly performant models on par with 8B models from vendors like Together AI. So there might also be a large infra advantage for closed labs that can run these models at scale. So while technically you
Scaling pre-training compute and scaling inference compute are almost certainly complementary.
Larger models will take fewer steps to reach a solution (making them cheaper for difficult questions), but will also be able to make use of longer CoTs (making them much stronger)
If o3 was trained using synthetic data, then you still have a large upfront cost to train the next generation. It also unlocks completely new demand for compute because you can just pay (exponentially) more for (marginally) better performance, which will be very good for Google
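One way to picture "exponentially more for marginally better" is a log-linear curve, where each 10x of inference compute adds a fixed number of points. This is a sketch under that assumption only; the baseline of 50 and the 5 points per 10x are made-up numbers, not measurements of o3 or any real model.

```python
import math


def score(compute: float, base: float = 50.0, gain_per_10x: float = 5.0) -> float:
    """Hypothetical benchmark score as a function of relative inference compute.

    compute=1 is the baseline budget; every 10x adds `gain_per_10x` points.
    """
    return base + gain_per_10x * math.log10(compute)


def compute_needed(target: float, base: float = 50.0, gain_per_10x: float = 5.0) -> float:
    """Inverse of `score`: relative compute required to hit `target` points."""
    return 10 ** ((target - base) / gain_per_10x)


if __name__ == "__main__":
    for target in (50, 55, 60, 70):
        print(f"score {target}: {compute_needed(target):,.0f}x baseline compute")
```

With these illustrative parameters, going from 50 to 60 points costs 100x the compute, and going from 60 to 70 costs another 100x on top of that.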
Model training will still be valuable for specialized models, even if slowdown happens for the frontier models.
I usually follow the money: if I see the big labs optimizing for inference (i.e., investing in ASICs over GPUs), then I'll believe they changed strategies.
Right now
Inference-as-labor (rather than model-as-factory) actually makes a lot of intuitive sense when you think about what AGI could be replacing. And it’s probably healthier for society that value ends up accruing to many makers of agents, vs. a couple makers of models.
imagine a world where everyone has their own personalized ai agent. that's where this is headed
do you know what the communication model is like for the o-* family? Large latencies between compute nodes are bad for pre-training, but i wonder if they are ok for test time
I also feel that CoT as the universal interface (as opposed to model-specific latent reasoning), looser latency requirements, and search parallelism mean that it will be easier to distribute inference among many many devices.
If time to reason becomes dominant, I would imagine that distributed gpu infrastructure and gpus at the edge will be the market’s response. If AI is everywhere, then GPUs will be everywhere. This will lead to even greater pressure on sovereign AI.
Single colocation and mega
One of the most important features of test-time scaling is its ability to generate synthetic data for training better models, which turns it from opex back to capex.
Despite a greater proportion of the cost moving to variable cost, OpenAI will just shift the cost onto the customer rather than go open source.
Even for users who have the option of an open source model, convenience is above all. Hence, per traffic, platform > API > self-hosted.
On the training side, these CoT models likely require an abundance of expensive multi-step labels generated by humans and additional compute to think across multiple cycles. I don't think efforts to scale these models and the accompanying costs will end any time soon.
Wouldn't bigger pretrained models run inference more efficiently?
It sounds like you’re implying that open sourcing llama is not in Meta’s interest? I’m really curious about this. Could you please expand on this?
Interesting take, makes sense. Marginal cost will be a big deal. OpenAI has a lot of compute, but I guarantee they don't have the cheapest. Also, if you own the whole stack, scaling will kill you.
Interesting take. The shift to test-time compute costs could actually accelerate open source adoption, especially as cloud providers optimize for inference.