Submitted by Vegetable-Skill-9700 t3_121agx4 in deeplearning
BrotherAmazing t1_jdn10wi wrote
I can’t speak directly to the question posed, but I have often observed people/groups that either:
- Overparametrize the model and then use regularization as needed to avoid overfitting, or
- Underparametrize a "baseline" prototype, then work their way up to a larger model until it meets some performance requirement on accuracy, etc.
Time and time again I have seen approach 2 lead to far smaller models that train and run much faster, and that sometimes yield better test-set results than approach 1, depending on the data available during training. I have, of course, seen approach 1 perform better at times, but if you have an accuracy requirement and ramp up model complexity in approach 2 until you meet or exceed it, you still meet your requirement and end up with a smaller model that is faster to train and run.
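To make approach 2 concrete, here is a minimal sketch in PyTorch of ramping up capacity until a validation accuracy requirement is met. The synthetic data, the list of hidden sizes, and the 90% target are all hypothetical placeholders, not anything from the original comment:

```python
# Sketch of approach 2: start small and grow the model until it meets an
# accuracy requirement on held-out data. Data, hidden sizes, and the 0.90
# target below are hypothetical placeholders.
import torch
import torch.nn as nn
from torch.utils.data import TensorDataset, DataLoader

torch.manual_seed(0)

# Synthetic stand-in for a real train/validation split.
X = torch.randn(2000, 20)
y = (X[:, :5].sum(dim=1) > 0).long()
train_ds = TensorDataset(X[:1500], y[:1500])
val_X, val_y = X[1500:], y[1500:]

def train_and_eval(hidden_size, epochs=20):
    """Train a small MLP with the given hidden width, return validation accuracy."""
    model = nn.Sequential(
        nn.Linear(20, hidden_size),
        nn.ReLU(),
        nn.Linear(hidden_size, 2),
    )
    opt = torch.optim.Adam(model.parameters(), lr=1e-3)
    loss_fn = nn.CrossEntropyLoss()
    loader = DataLoader(train_ds, batch_size=64, shuffle=True)
    for _ in range(epochs):
        for xb, yb in loader:
            opt.zero_grad()
            loss_fn(model(xb), yb).backward()
            opt.step()
    with torch.no_grad():
        acc = (model(val_X).argmax(dim=1) == val_y).float().mean().item()
    return model, acc

accuracy_requirement = 0.90  # hypothetical performance requirement
for hidden in [4, 8, 16, 32, 64, 128]:  # ramp up model complexity
    model, acc = train_and_eval(hidden)
    print(f"hidden={hidden:4d}  val_acc={acc:.3f}")
    if acc >= accuracy_requirement:
        break  # keep the smallest model that meets the requirement
```

Approach 1 would instead start from the largest width and lean on regularization (weight decay, dropout, early stopping) to control overfitting.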