maskedpaki t1_j9ut4v4 wrote

For those wondering about the performance, 5-shot scores on MMLU:

- Chinchilla: 67.5
- This new model: 68.9
- Human baseline: 89.8

So it seems a smidge better than Chinchilla on 5-shot MMLU, which many consider to be the important AGI benchmark (it's one of the AGI conditions on Metaculus).
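For anyone unfamiliar with the setup, here's a rough sketch of what 5-shot MMLU evaluation looks like. This is a toy stand-in, not the paper's actual harness: `model_logprob` is a made-up placeholder for a real model call, and the questions are illustrative.

```python
import random

DEMOS = [
    ("Q: What is 2 + 2? (A) 3 (B) 4 (C) 5 (D) 6", "B"),
    ("Q: Which planet is largest? (A) Mars (B) Venus (C) Jupiter (D) Earth", "C"),
    # ...three more solved examples from the same MMLU subject go here
]

def build_prompt(demos, question):
    """Concatenate the 5 solved demonstrations, then the unanswered question."""
    shots = [f"{q}\nAnswer: {a}" for q, a in demos]
    shots.append(f"{question}\nAnswer:")
    return "\n\n".join(shots)

def model_logprob(prompt, token):
    """Made-up placeholder: a real harness would return the model's
    log-probability of `token` as the next token after `prompt`."""
    return random.random()

def predict(question, demos=DEMOS, choices="ABCD"):
    """Pick the answer letter the model rates most likely; accuracy over
    all questions is the reported MMLU score."""
    prompt = build_prompt(demos, question)
    return max(choices, key=lambda c: model_logprob(prompt, c))

print(predict("Q: What is 3 * 3? (A) 6 (B) 8 (C) 9 (D) 12"))
```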

Some nice work by Meta.

30

MysteryInc152 t1_j9v4fru wrote

Flan-PaLM hits 75 on MMLU. Instruction finetuning/alignment and CoT (chain-of-thought prompting) would improve performance even further.
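For anyone unfamiliar, here's a toy illustration of what chain-of-thought prompting adds (the prompt strings are illustrative, loosely based on the well-known CoT example, not taken from any particular paper):

```python
# Plain few-shot prompting shows the answer directly; chain-of-thought
# prompting includes the worked reasoning before the answer, which
# tends to help models on multi-step questions.

direct_shot = (
    "Q: Roger has 5 balls and buys 2 cans of 3 balls each. How many balls?\n"
    "A: 11"
)

cot_shot = (
    "Q: Roger has 5 balls and buys 2 cans of 3 balls each. How many balls?\n"
    "A: Roger starts with 5 balls. 2 cans of 3 balls each is 6 more balls.\n"
    "5 + 6 = 11. The answer is 11."
)

print(cot_shot)
```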

6

Tavrin t1_j9vl37m wrote

Flan-PaLM is 540B parameters, so there's that.

10

maskedpaki t1_j9z7sxs wrote

Yes! The really big breakthrough here is that it's on par with the original GPT-3 at only 7 billion parameters on a bunch of benchmarks I've seen.

That means it's gotten roughly 25x more efficient in the last 3 years (GPT-3 is 175B parameters, and 175B / 7B = 25).

I wonder how efficient these things can get. Like, are we going to see a 280-million-parameter model that rivals the original GPT-3 in 2026, and an 11-million-parameter one in 2029? (See the quick extrapolation below.)
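A quick back-of-the-envelope sketch of that extrapolation (pure arithmetic, assuming the 25x-per-3-years trend simply continues, which is of course a big assumption):

```python
# GPT-3 (2020) needed 175B parameters; this model matches it at 7B in
# 2023, i.e. 175 / 7 = 25x fewer. Repeat that gain every three years:
params = 7e9  # the 7B model, our 2023 starting point
for year in (2023, 2026, 2029):
    print(f"{year}: {params / 1e6:>8,.1f}M parameters")
    params /= 25
# prints:
# 2023:  7,000.0M parameters
# 2026:    280.0M parameters
# 2029:     11.2M parameters
```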

3

Baturinsky t1_j9x7xw7 wrote

It seems that it's close to SOTA among 60-70B models. The "only" big deal is that the smaller LLaMA models show results comparable to much bigger SOTA models.

4