BB4evaTB12 OP t1_izgu9jj wrote on December 9, 2022 at 12:25 AM
Reply to comment by Different_Fig4002 in 36% of HellaSwag benchmark contains errors [D] by BB4evaTB12
Totally! We may be thinking of the same example from the GoEmotions dataset, where they mislabeled "Yay, cold McDonald's. My favorite." as Love.