FirstOrderCat
FirstOrderCat t1_iywiisx wrote
Reply to [D] NeurIPS 2022 Outstanding Paper modified results significantly in the camera ready by Even_Stay3387
Very interesting case.
Huge respect to everyone involved for the still-solid results and the transparent process!
FirstOrderCat t1_iyw8tuu wrote
Reply to comment by Ambiwlans in bit of a call back ;) by GeneralZain
Here is a recent paper where they improved the previous SOTA on GSM8K by 2 points, from 78 to 80: https://arxiv.org/pdf/2211.12588v3.pdf
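For context, the core idea there, as I understand it, is to have the model write a short Python program and execute it for the final answer, rather than doing the arithmetic in text. A minimal sketch of that pattern, with a placeholder llm_complete standing in for whatever model API is used (not anything from the paper itself):

```python
# Rough sketch of "write a program, run it" prompting, as I understand the linked
# paper. llm_complete is a placeholder for the actual model call.
import subprocess

PROMPT = (
    "Question: A robe takes 2 bolts of blue fiber and half that much white fiber. "
    "How many bolts in total does it take?\n"
    "Write a Python program that computes the answer and prints it.\n"
)

def solve(llm_complete):
    generated_code = llm_complete(PROMPT)       # model returns Python source
    result = subprocess.run(                    # run it in a fresh interpreter
        ["python", "-c", generated_code],
        capture_output=True, text=True, timeout=10,
    )
    return result.stdout.strip()                # the printed value is the answer

# Toy stand-in showing the expected shape of the model's output:
print(solve(lambda prompt: "blue = 2\nwhite = blue / 2\nprint(blue + white)"))  # -> 3.0
```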
>Llms are basically doa waiting on gpt4 in a few months now anyways unless they offer something really novel.
Why are you so confident? Current GPT is very far from doing any useful work: it can't replace a programmer, a lawyer, or an accountant. There is a huge space for improvement before these models reach anything like AGI and replace knowledge workers.
FirstOrderCat t1_iyw6ute wrote
Reply to comment by Ambiwlans in bit of a call back ;) by GeneralZain
I follow NLP/LLM papers; people will certainly release an arXiv paper, and likely submit to a conference, for a few percent improvement.
FirstOrderCat t1_iyw487q wrote
Reply to comment by Ambiwlans in bit of a call back ;) by GeneralZain
> have a revolution every few weeks
Like +5% on benchmarks detached from the real world?
FirstOrderCat t1_iyegvm4 wrote
Reply to comment by Imaginary_Ad307 in What’s gonna happen to the subreddit after the singularity? by Particular_Leader_16
> just like we do with chickens.
But not through moderation of chicken speech.
The singularity may already have been secretly achieved, and something like https://en.wikipedia.org/wiki/Aladdin_(BlackRock) may already manage, motivate, restrain, and penalize our society.
FirstOrderCat t1_iye5r55 wrote
Reply to comment by Imaginary_Ad307 in What’s gonna happen to the subreddit after the singularity? by Particular_Leader_16
Why would they moderate anything? Do we moderate what chickens say to each other on farms?
FirstOrderCat t1_iye0d9a wrote
Reply to comment by Shiyayori in What’s gonna happen to the subreddit after the singularity? by Particular_Leader_16
Someone will create r/postsingularity, where we will discuss how to survive under machine rule.
FirstOrderCat t1_iyc30d4 wrote
Reply to [D] I'm at NeurIPS, AMA by ThisIsMyStonerAcount
What are you doing there? What is your goal in visiting this conference?
FirstOrderCat t1_iyc2xqj wrote
Reply to comment by ThisIsMyStonerAcount in [D] I'm at NeurIPS, AMA by ThisIsMyStonerAcount
>Sperbank
Sberbank? lol
FirstOrderCat t1_iybm4dy wrote
Reply to comment by Sigura83 in Better Language Models Without Massive Compute by Tom_Lilja
>First in a MMO like setting, then the real world.
Nope, this transition is very hard, for the following reason:
The current wave of AI approximates giant datasets, and that is something it does very well. All your examples boil down to throwing terabytes of data at a neural network so it learns the patterns. But this kind of AI can't generalize or do abstract thinking, which means it can't learn from very few examples.
So yes, they can have an AI play an MMO 100 million times and it will learn from its own mistakes, but you would need the same 100 million attempts in the real world, which is not feasible: even at, say, ten minutes per attempt, that is on the order of two thousand years of wall-clock time.
Another issue is that an MMO has far fewer degrees of freedom than the real world, which makes it a poor benchmark.
FirstOrderCat t1_iybkjgi wrote
Reply to comment by maskedpaki in From NeurIPS 2022 poster session: "[Google] Minerva author on AI solving math: IMO gold by 2026 seems reasonable, superhuman math in 2026 not crazy" by maxtility
Maybe, but they can't do kids' math yet, even if we assume they are not cheating.
FirstOrderCat t1_iybg5rh wrote
Reply to comment by Sigura83 in Better Language Models Without Massive Compute by Tom_Lilja
> it can understand and search pretty dang well from voice alone.
There is a component where a model translates your voice to text, but the searching part contains tons of hand-crafted human code.
So current language models are good at some narrow tasks (translation is the main one), but still not at the level of abstract thinking humans possess. My bet is they won't get there without some major advancement.
FirstOrderCat t1_iybelgs wrote
Reply to From NeurIPS 2022 poster session: "[Google] Minerva author on AI solving math: IMO gold by 2026 seems reasonable, superhuman math in 2026 not crazy" by maxtility
The next interesting milestone would be solving the kids-level math in GSM8K with 98% accuracy rather than 78%.
FirstOrderCat t1_iybdxom wrote
Reply to comment by Sigura83 in Better Language Models Without Massive Compute by Tom_Lilja
>I'm convinced nearly all jobs will be done by AI in 15-20 years now, and not just done, but done better
Language models have already exceeded human performance on many benchmarks, yet they struggle to replace humans at any actual work. Why is that, do you think?
FirstOrderCat t1_iybcu3j wrote
Reply to comment by Sigura83 in Better Language Models Without Massive Compute by Tom_Lilja
> AIs by 2x
It is 3% on their graph, lol.
FirstOrderCat t1_itc6pne wrote
Reply to comment by Spoffort in U-PaLM 540B by xutw21
It looks like they hit a point of diminishing returns somewhere around 0.5e25 FLOPs.
After that, the model trains much more slowly. They could have continued training further and claimed they "saved" another 20M TPU hours.
FirstOrderCat t1_itapodq wrote
Reply to comment by visarga in U-PaLM 540B by xutw21
The problem is that they are still too far off in quality to be trusted with real problems.
FirstOrderCat t1_itahazn wrote
Reply to comment by AsthmaBeyondBorders in U-PaLM 540B by xutw21
>This model had up to 21% gains in some benchmarks, as you can see there are many benchmarks
Meaning the gains were less than 2 points on many of the others.
> it is about a different model which can be as good and better than the previous ones while cheaper to train.
The model is the same; they changed the training procedure.
> You seem to know a lot about Google's internal decisions and strategies as of today
This is public information.
FirstOrderCat t1_itaeypy wrote
Reply to comment by AsthmaBeyondBorders in U-PaLM 540B by xutw21
This race may be over.
In that graph, the guy is proud of gaining 2 points on some synthetic benchmark while spending 4 million TPUv4 hours, roughly $12M (at about $3 per TPU-hour).
At the same time we hear that Google is cutting expenses and considering layoffs, and the LLM part of Google Research will be first in line, because it doesn't provide much value for the Ads/Search business.
FirstOrderCat t1_itad0zs wrote
Reply to comment by AsthmaBeyondBorders in U-PaLM 540B by xutw21
>The problem is you don't know what emergent skills are yet to be found because we didn't scale enough.
Yes, and you don't yet know whether such skills will be found or whether we have already hit the wall.
FirstOrderCat t1_itacdp4 wrote
Reply to comment by AsthmaBeyondBorders in U-PaLM 540B by xutw21
>A wall is when we can't improve the results of the last LLMs.
The wall is a lack of breakthrough innovations.
The latest "advances" are:
- build an Nx larger model
- tweak the prompt with some extra variation
- fine-tune on another dataset, potentially leaking benchmark data into the training data
- get a marginal improvement on benchmarks irrelevant to any practical task
- give your new model some epic-cringe name: path-to-mind, surface-of-intelligence, eye-of-wisdom
But none of these "advances" can somehow replace humans on real tasks, with the exception of image style transfer and translation.
FirstOrderCat t1_itaaaxe wrote
Reply to comment by AsthmaBeyondBorders in U-PaLM 540B by xutw21
> and we haven't hit that wall yet
why do you think so?
FirstOrderCat t1_ita7yel wrote
Reply to comment by AsthmaBeyondBorders in U-PaLM 540B by xutw21
> form but don't forget LLMs are behind stable diffusion, dreamfusion, dreambooth, etc.
But that's not the AGI under discussion; it's more stochastic parroting, or style transfer.
FirstOrderCat t1_it9oqhg wrote
Reply to comment by CommentBot01 in U-PaLM 540B by xutw21
> Currently deep learning and LLM are very successful and not even close to its limit.
To me it's the opposite: companies have already invested enormous resources, yet LLMs can only solve simplistic, limited-scope tasks, and not many AGI-like real applications have been demonstrated.
FirstOrderCat t1_iywkjp6 wrote
Reply to comment by Ambiwlans in bit of a call back ;) by GeneralZain
Yes, hand-coded automation empowered by LLMs can take many jobs.