What. The. ***k. [less than 1B parameter model outperforms GPT 3.5 in science multiple choice questions]
Submitted by Destiny_Knight t3_118svv7 in singularity
You are wrong. It’s not experts. It’s randos on Mechanical Turk.
rip, they should've included expert performance as well then
You are setting the bar so that anything less than perfect is a failure.
By that standard, most humans would fail. And most experts are only going to be an expert in one field, not every field, so they would also fail by your standards.
Wtf are you talking about. It's a benchmark, it's to compare performance. I'm not setting any bar, and I'm not expecting it to beat human experts immediately.
Agreed. Stage one was "cogent", stage two was "as good as a human", stage three is "better than all humans". We have already passed stage 2 which could be called AGI. We will soon hit stage 3 which is ASI.
we are a million miles away from AGI.
hey buddy, you might want to check this link -> Dunning-Kruger effect
Is this implying that I don't know anything about AI or that the average person is not knowledgeable enough to be useful?
But then they wouldn’t be able to say that the AI beats them and it wouldn’t be as flashy of a publication. Don’t you know how academia works?
No. I haven't seen anyone talking about it because it beat humans; it was always about it beating GPT-3 with less than 1B parameters. Beating humans was just the cherry on top. The paper is "flashy" enough, and including experts wouldn't change that. Many papers do include expert performance as well, so it's not a stretch to expect it.
The human performance number is not from this paper; it is from the original ScienceQA paper. They are the ones that did the benchmarking.
Are you joking or serious?
Serious, read the paper.
My disappointment is unmeasurable and my day is ruined.
Really? So the time has come where a small-scale AI model being smarter than "ordinary" humans is not impressive.
Awe is so last December - impatience is the new mode. They teased us with the future, now we expect it ASAP!
It's not ordinary humans, it's people on Mechanical Turk who are paid to answer the questions as fast as possible and for as little money as possible. They are not motivated to actually think that hard.
That's prejudice. You don't know that.
No, it is economics: they make less money the longer they stop and think about it.