TheRealSerdra t1_iwo4w46 wrote
Reply to comment by ReasonablyBadass in [R] Will we run out of data? An analysis of the limits of scaling datasets in Machine Learning - Epoch AI, Pablo Villalobos et al - The trend of ever-growing ML models might slow down if data efficiency is not drastically improved! by Singularian2501
Technically, aren’t you always doing at least one epoch? You’re doing at least one pass through all of your data, even if that data is less than the amount you theoretically could have used.
ReasonablyBadass t1_iwoq0ug wrote
Not a complete one. GPT-3, I think, didn't complete its first pass-through.
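In practice, large-model training runs are usually stopped by a token or step budget, not by an epoch count, so a run can end partway through its first pass over the corpus. A minimal sketch of that loop; all numbers here are illustrative rather than GPT-3's actual configuration:

```python
# Hypothetical budgets: the training loop stops when the token budget is
# spent, even though the corpus holds more tokens than will ever be seen.
token_budget = 300_000_000_000    # tokens to train on (illustrative)
dataset_tokens = 500_000_000_000  # tokens available in the full corpus
tokens_per_step = 2_048 * 1_600   # sequence length * batch size (illustrative)

tokens_seen = 0
step = 0
while tokens_seen < token_budget:
    # batch = next(data_stream)  # stream the corpus in order, no repeats
    # loss = model(batch); loss.backward(); optimizer.step()
    tokens_seen += tokens_per_step
    step += 1

# Less than 1.0 here means the run never completed a single epoch.
print(f"steps: {step}, fraction of one epoch: {tokens_seen / dataset_tokens:.2f}")
```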
zzzthelastuser t1_iwpi7r5 wrote
You could argue GPT-3 was trained on a subset of the available training data, no?
Not completing the first pass-through means the remaining data could be considered not part of the training data.
ReasonablyBadass t1_iwplk0c wrote
Semantics. It didn't see any of its data more than once, and it had more available. Not one full epoch.
zzzthelastuser t1_iwpltkw wrote
Sure, but in theory my little Hello World network also had more data available on the internet.