Submitted by Tea_Pearce t3_10aq9id in MachineLearning
mugbrushteeth t1_j45xihj wrote
Reply to comment by ml-research in [D] Bitter lesson 2.0? by Tea_Pearce
One dark outlook on this is that compute costs fall very slowly (or not at all), so the large models become something only the rich can run. Then, using the capital they earn from those models, they reinvest and accelerate development toward even larger models, which become inaccessible to most people.
dimsycamore t1_j46jj4p wrote
Already happening unfortunately
anonsuperanon t1_j47g6e3 wrote
Literally just the history of all technology, which suggests saturation given enough time.
currentscurrents t1_j4702g0 wrote
Compute is going to get cheaper over time though. My phone today has the FLOPs of a supercomputer from 1999.
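As a very rough back-of-envelope (both peak-FLOPS figures below are approximate assumptions, not benchmarks):

```python
# Back-of-envelope: peak FLOPS of a late-1990s supercomputer vs a modern phone.
# Both figures are rough assumptions, not measured numbers.
ASCI_RED_1999_FLOPS = 2.4e12   # ~2-3 TFLOPS peak after its 1999 upgrade (assumed)
PHONE_GPU_FLOPS = 2.0e12       # ~2 TFLOPS fp32 for a recent flagship phone GPU (assumed)

print(f"phone / 1999 supercomputer: ~{PHONE_GPU_FLOPS / ASCI_RED_1999_FLOPS:.1f}x")
```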
Also if LLMs become the next big thing you can expect GPU manufacturers to include more VRAM and more hardware acceleration directed at them.
RandomCandor t1_j47bx4j wrote
To me, all that means is that lay people will always be a generation behind what the rich can afford to run.
currentscurrents t1_j48csbo wrote
If it is true that performance scales infinitely with compute power - and I kinda hope it is, since that would make superhuman AI achievable - datacenters will always be smarter than PCs.
That said, I'm not sure that it does scale infinitely. You need not just more compute but also more data, and there's only so much data out there. GPT-4 reportedly won't be any bigger than GPT-3 because even terabytes of scraped internet data isn't enough to train a larger model.
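For a rough sense of the data bottleneck, here's a hedged sketch assuming the Chinchilla-style rule of thumb of ~20 training tokens per parameter and a very rough guess of ~10 trillion usable tokens of scraped text (both numbers are assumptions on my part):

```python
# Why data, not just compute, caps model size.
# Assumptions: ~20 training tokens per parameter (Chinchilla-style rule of thumb)
# and ~10 trillion usable tokens of scraped text (order of magnitude only).
TOKENS_PER_PARAM = 20
AVAILABLE_TOKENS = 10e12

for params in (175e9, 500e9, 1e12, 5e12):
    needed = params * TOKENS_PER_PARAM
    status = "feasible" if needed <= AVAILABLE_TOKENS else "data-limited"
    print(f"{params / 1e9:>5.0f}B params -> ~{needed / 1e12:.0f}T tokens needed ({status})")
```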
BarockMoebelSecond t1_j48mepq wrote
Which is and has been the status quo for the entire history of computing; I don't see how that's a new development?
currentscurrents t1_j490rvn wrote
It's meaningful right now because there's a threshold where LLMs become awesome, but getting there requires expensive specialized GPUs.
I'm hoping in a few years consumer GPUs will have 80GB of VRAM or whatever and we'll be able to run them locally. While datacenters will still have more compute, it won't matter as much since there's a limit where larger models would require more training data than exists.
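Rough arithmetic on what a card like that could hold for inference (a sketch only; the bytes-per-parameter figures are assumptions and activation/KV-cache overhead is ignored):

```python
# How many parameters fit in a given amount of VRAM for inference.
# Ignores activations and KV-cache; bytes-per-parameter values are assumptions.
def max_params(vram_gb: float, bytes_per_param: float) -> float:
    return vram_gb * 1e9 / bytes_per_param

for vram in (24, 80):
    fp16 = max_params(vram, 2.0)   # 16-bit weights
    int4 = max_params(vram, 0.5)   # 4-bit quantized weights
    print(f"{vram} GB -> ~{fp16 / 1e9:.0f}B params at fp16, ~{int4 / 1e9:.0f}B at 4-bit")
```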
Playful_Ad_7555 t1_j49k8p2 wrote
Silicon computing is already very close to its limit based on foreseeable technology. The exponential explosion in computing power and available data from 2000-2020 isn't going to be replicated.
Opposite-Platypus-99 t1_j4ahpg6 wrote
Now, can you confirm you can run arbitrary software on your phone?
bloc97 t1_j49ft0g wrote
My bet is on "mortal computers" (a term coined by Hinton). Our current methods for training deep nets are extremely inefficient: CPUs and GPUs basically have to load data, process it, then save it back to memory. We could eliminate this bandwidth limitation by printing what is essentially a very large differentiable memory cell, with hardware connections inside representing the connections between neurons, which would allow us to do inference or backprop in a single step.
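A hedged sketch of the bandwidth limitation being described, using illustrative (assumed) hardware numbers for a single matrix-vector product:

```python
# The load/process/store bottleneck for one matrix-vector product (batch size 1),
# roughly what a single dense layer does during LLM inference.
# Hardware numbers are illustrative assumptions, not real specs.
d = 12288                  # hypothetical hidden size
weight_bytes = d * d * 2   # fp16 weights streamed from memory
flops = 2 * d * d          # one multiply-add per weight

BANDWIDTH = 2e12           # assumed ~2 TB/s of memory bandwidth
PEAK_FLOPS = 300e12        # assumed ~300 TFLOPS of fp16 compute

t_mem = weight_bytes / BANDWIDTH
t_compute = flops / PEAK_FLOPS
print(f"memory time ~{t_mem * 1e6:.0f} us vs compute time ~{t_compute * 1e6:.0f} us")
# Moving the weights dominates by two orders of magnitude; hard-wiring them into
# the substrate (the "mortal computer" idea) removes that traffic entirely.
```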
gdiamos t1_j4a96pu wrote
Currently we have exascale computers, i.e. about 1e18 FLOPS at around 50e6 watts.
The power output of the sun is about 4e26 watts. That's 20 orders of magnitude on the table.
This paper claims that the energy cost of computation can theoretically be reduced by another 22 orders of magnitude: https://arxiv.org/pdf/quant-ph/9908043.pdf
So physics (as we currently understand it) seems to allow learning machines at least 42 orders of magnitude bigger (computationally) than current-generation foundation models, without leaving this solar system and without converting mass into energy...
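For concreteness, here's that arithmetic, taking the comment's own numbers at face value:

```python
import math

# The numbers quoted above, taken at face value.
current_watts = 50e6    # ~50 MW for an exascale (1e18 FLOPS) machine
sun_watts = 4e26        # total solar output
landauer_gap = 22       # orders of magnitude claimed in arXiv:quant-ph/9908043

power_headroom = math.log10(sun_watts / current_watts)
print(f"power headroom: ~{power_headroom:.0f} orders of magnitude")
print(f"total headroom: ~{power_headroom + landauer_gap:.0f} orders of magnitude")
# ~19 + 22 ≈ 41, roughly the 42 quoted above (which rounds the power headroom up to 20).
```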