Submitted by Pro_RazE t3_11aunt2 in singularity
maskedpaki t1_j9ut4v4 wrote
For those wondering about the performance, 5-shot MMLU:

Chinchilla: 67.5
This new model: 68.9
Human expert baseline: 89.8
So it seems a smidge better than Chinchilla on 5-shot MMLU, which many consider to be the important AGI benchmark (it's one of the AGI conditions on Metaculus). Some nice work by Meta. For anyone unfamiliar with the setup, a sketch of what "5-shot MMLU" means in practice is below.
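A minimal sketch of how a 5-shot MMLU prompt is assembled, assuming the standard few-shot format (question, A-D choices, "Answer: X"). The example items and helper names here are placeholders, not real MMLU questions or any official evaluation code:

```python
# Placeholder in-context examples; real runs draw five solved
# (question, choices, answer) items from the MMLU dev split.
few_shot_examples = [
    ("What is 2 + 2?", ["3", "4", "5", "6"], "B"),
    # ...four more solved examples would go here...
]

def format_item(question, choices, answer=None):
    # Render one question in the standard multiple-choice layout.
    lines = [question]
    for letter, choice in zip("ABCD", choices):
        lines.append(f"{letter}. {choice}")
    lines.append(f"Answer: {answer}" if answer else "Answer:")
    return "\n".join(lines)

def build_prompt(test_question, test_choices):
    # Five solved examples, then the unsolved test question.
    shots = [format_item(q, c, a) for q, c, a in few_shot_examples]
    shots.append(format_item(test_question, test_choices))
    return "\n\n".join(shots)

print(build_prompt("Which planet is largest?",
                   ["Mars", "Jupiter", "Venus", "Mercury"]))
```

Accuracy over the full test set is the number quoted above; random chance on 4-way multiple choice is 25%, which puts the 67-69 range in context.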
MysteryInc152 t1_j9v4fru wrote
Flan-PaLM hits 75 on MMLU. Instruction finetuning/alignment and CoT (chain-of-thought) prompting would improve performance even further. A toy example of CoT prompting is below.
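A minimal sketch of the CoT idea: instead of prompting for the answer directly, the in-context example demonstrates step-by-step reasoning before the answer, nudging the model to do the same. The questions here are made-up illustrations:

```python
# Direct prompting: the example goes straight to the answer.
direct_prompt = (
    "Q: A farmer has 17 sheep and buys 5 more. How many sheep now?\n"
    "A: 22\n\n"
    "Q: A library has 120 books and lends out 45. How many remain?\n"
    "A:"
)

# Chain-of-thought prompting: the example reasons step by step,
# so the model tends to reason before committing to an answer.
cot_prompt = (
    "Q: A farmer has 17 sheep and buys 5 more. How many sheep now?\n"
    "A: The farmer starts with 17 sheep. Buying 5 more gives "
    "17 + 5 = 22. The answer is 22.\n\n"
    "Q: A library has 120 books and lends out 45. How many remain?\n"
    "A:"
)

print(cot_prompt)
```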
Tavrin t1_j9vl37m wrote
Flan-PaLM is 540B, so there's that
maskedpaki t1_j9z7sxs wrote
Yes! The really big breakthrough here is that it's on par with the original GPT-3 at only 7 billion parameters on a bunch of benchmarks I've seen.

That means it's gotten 25x more efficient in the last 3 years (the original GPT-3 is 175B, and 175B / 7B = 25).

I wonder how efficient these things can get. Like, are we going to see a model that's 280 million parameters that rivals the original GPT-3 in 2026, and an 11 million parameter one in 2029? Rough numbers below.
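The back-of-the-envelope version of that extrapolation, assuming the 25x-per-3-years rate simply holds (a big assumption, not a forecast):

```python
# Original GPT-3 (2020) is 175B params; LLaMA-7B (2023) reportedly
# matches it on several benchmarks, an observed ~25x shrink.
rate = 175e9 / 7e9  # = 25

params, year = 7e9, 2023
for _ in range(2):
    params /= rate
    year += 3
    print(f"{year}: ~{params / 1e6:.0f}M parameters for GPT-3-level quality")

# 2026: ~280M parameters for GPT-3-level quality
# 2029: ~11M parameters for GPT-3-level quality
```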
Baturinsky t1_j9x7xw7 wrote
It seems that it is close to SOTA among 60-70B models. The "only" big deal is that the smaller LLaMA models show results comparable to much bigger SOTA models.