You are setting the bar so that anything less than perfect is failure.
By that standard, most humans would fail. And most experts are only going to be an expert in one field, not every field, so they would also fail by your standards.
Wtf are you talking about? It's a benchmark; it's there to compare performance. I'm not setting any bar, and I'm not expecting it to beat human experts immediately.
Agreed. Stage one was "cogent", stage two was "as good as a human", stage three is "better than all humans". We have already passed stage two, which could be called AGI. We will soon hit stage three, which is ASI.