SatisfyingLatte t1_ittn52b wrote
Once all the useful representations in the training data have been extracted and learned. Beyond that point, increasing model size just overfits the training data. Only language tasks might be solvable by naively scaling current techniques.
ReasonablyBadass t1_ittryn7 wrote
Overfitting isn't really an issue anymore since the discovery of double descent/grokking: past the interpolation threshold, test error can start falling again as models get larger.
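Double descent is easy to reproduce in a toy setting. Below is a minimal sketch (not from this thread, just an illustrative random-feature regression) where test error peaks when the number of features roughly equals the number of training samples, then drops again in the overparameterized regime; all names and sizes are arbitrary toy choices:

```python
import numpy as np

# Toy double-descent demo: random ReLU features + minimum-norm
# least squares. Test error rises near the interpolation threshold
# (n_features ~ n_train) and falls again past it.
rng = np.random.default_rng(0)
n_train, n_test, d = 50, 500, 20

X_train = rng.normal(size=(n_train, d))
X_test = rng.normal(size=(n_test, d))
w_true = rng.normal(size=d)
y_train = X_train @ w_true + 0.5 * rng.normal(size=n_train)
y_test = X_test @ w_true

for n_features in [10, 25, 45, 50, 55, 100, 500, 2000]:
    # Random feature map shared by train and test.
    W = rng.normal(size=(d, n_features)) / np.sqrt(d)
    phi_train = np.maximum(X_train @ W, 0)
    phi_test = np.maximum(X_test @ W, 0)

    # pinv gives the least-squares fit when underparameterized and
    # the minimum-norm interpolating fit when overparameterized.
    beta = np.linalg.pinv(phi_train) @ y_train
    test_mse = np.mean((phi_test @ beta - y_test) ** 2)
    print(f"features={n_features:5d}  test MSE={test_mse:.3f}")
```

Running it, the printed test MSE should spike around `n_features ≈ 50` and then decrease again at 500 and 2000 features, which is the double-descent shape the comment is referring to.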