
BrotherAmazing t1_jdn10wi wrote

I can’t speak directly to the question posed, but I have often observed people/groups that either:

  1. Overparametrize the model and then use regularization as needed to avoid overfitting

  2. Underparametrize a “baseline” prototype, then work their way up to a larger model until it meets some performance requirement on accuracy, etc.

Time and time again I have seen approach 2 lead to far smaller models that train and run much faster, and that sometimes yield better test-set results than approach 1, depending on the data available during training. I have, of course, seen approach 1 perform better at times, but if you have an accuracy requirement and ramp up model complexity in approach 2 until you meet or exceed it, you still satisfy the requirement and end up with a smaller model that is faster to train and run.
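
For concreteness, approach 2 might look something like the loop below in scikit-learn. This is just a minimal sketch: the synthetic dataset, the 0.95 accuracy target, and the capacity-doubling schedule are all placeholder assumptions, not anything the original comment specified.

```python
# Sketch of approach 2: start with an underparametrized baseline and
# grow capacity until a validation-accuracy requirement is met.
# Dataset, target accuracy, and doubling schedule are illustrative only.
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPClassifier

X, y = make_classification(n_samples=2000, n_features=20, random_state=0)
X_train, X_val, y_train, y_val = train_test_split(X, y, random_state=0)

required_acc = 0.95   # hypothetical performance requirement
hidden = 4            # small "baseline" prototype

while True:
    model = MLPClassifier(hidden_layer_sizes=(hidden,),
                          max_iter=500, random_state=0)
    model.fit(X_train, y_train)
    acc = model.score(X_val, y_val)
    print(f"hidden units: {hidden:4d}  val accuracy: {acc:.3f}")
    # Stop once the requirement is met (or at a sanity cap on size).
    if acc >= required_acc or hidden >= 1024:
        break
    hidden *= 2       # ramp up model complexity and retrain
```

The payoff is that the loop exits at the *smallest* capacity that meets the requirement, which is exactly why this route tends to produce models that are cheaper to train and run.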
