Introduction to Reinforcement Learning with Human Feedback [D]
Submitted by BB4evaTB12 t3_10a7qmi on January 12, 2023 at 7:07 PM in MachineLearning 6 comments 14

36% of HellaSwag benchmark contains errors [D]
Submitted by BB4evaTB12 t3_zff5mh on December 7, 2022 at 9:51 PM in MachineLearning 6 comments 33