[R] Getting GPT-3 quality with a model 1000x smaller via distillation plus Snorkel

Submitted by bradenjh on November 22, 2022 at 9:59 PM in r/MachineLearning · 23 points · 9 comments
visarga wrote on November 23, 2022 at 6:42 AM

> Has anyone else tried something similar?

Trying it right now, but instead of using GPT-3 I split the data cross-validation style and train an ensemble of models. Ensemble disagreement ≈ error rate.
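A minimal sketch of what that setup could look like: train one model per cross-validation-style fold, then use the pairwise disagreement rate among members on unheld-out data as a proxy for the error rate. The dataset, model choice, and disagreement metric below are illustrative assumptions, not the commenter's actual code.

```python
import numpy as np
from itertools import combinations
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import KFold

# Toy data standing in for a real labeled set plus an unlabeled pool.
X, y = make_classification(n_samples=2000, n_features=20, random_state=0)
X_train, y_train = X[:1500], y[:1500]
X_pool, y_pool = X[1500:], y[1500:]   # y_pool kept only to check the proxy

# Train one model per fold, each on a different (k-1)/k slice of the data,
# so the members see overlapping but distinct training sets.
ensemble = []
for train_idx, _ in KFold(n_splits=5, shuffle=True, random_state=0).split(X_train):
    ensemble.append(
        LogisticRegression(max_iter=1000).fit(X_train[train_idx], y_train[train_idx])
    )

# Pairwise disagreement rate across ensemble members on the unlabeled pool.
preds = np.stack([m.predict(X_pool) for m in ensemble])   # (n_models, n_examples)
disagreement = np.mean(
    [(preds[i] != preds[j]).mean() for i, j in combinations(range(len(ensemble)), 2)]
)

# Sanity check: compare the proxy against the true error of the majority vote.
majority = (preds.mean(axis=0) >= 0.5).astype(int)
true_error = (majority != y_pool).mean()
print(f"pairwise disagreement ~= {disagreement:.3f}, majority-vote error ~= {true_error:.3f}")
```

On data like this the two numbers track each other closely, which is the property the comment is leaning on: no labels are needed to compute the disagreement side.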