crrrr30
crrrr30 t1_iqta3tc wrote
Reply to comment by maxToTheJ in [D] Types of Machine Learning Papers by Lost-Parfait568
using a single Lego block WITH different optimizers, LR schedules, and augmentations…
crrrr30 t1_iqpciu1 wrote
Reply to [Discussion] If we had enough memory to always do full batch gradient descent, would we still need rmsprop/momentum/adam? by 029187
I feel like with that much memory available, testing scaling laws would be a better research direction than testing full-batch training.
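For context on the thread's question (whether momentum/Adam would still matter at full batch), here's a minimal illustrative sketch, not from the thread itself: even with exact full-batch gradients, plain gradient descent zig-zags on an ill-conditioned loss, while heavy-ball momentum converges much faster. The quadratic, step size, and momentum value below are arbitrary choices for illustration.

```python
import numpy as np

# Ill-conditioned quadratic: f(w) = 0.5 * w @ H @ w, condition number 100.
# Gradients here are exact ("full batch"), so any gap between the two runs
# comes from curvature, not gradient noise.
H = np.diag([100.0, 1.0])

def grad(w):
    return H @ w  # exact gradient of the quadratic

def run(momentum=0.0, lr=0.009, steps=200):
    w = np.array([1.0, 1.0])
    v = np.zeros_like(w)
    for _ in range(steps):
        v = momentum * v + grad(w)  # heavy-ball momentum buffer
        w = w - lr * v
    return 0.5 * w @ H @ w  # final loss value

loss_gd = run(momentum=0.0)   # plain full-batch GD
loss_mom = run(momentum=0.9)  # full-batch GD + momentum
print(loss_gd, loss_mom)
```

So even with infinite memory, adaptive/momentum methods would still earn their keep on badly conditioned losses.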
crrrr30 t1_iqpbup1 wrote
Reply to comment by National-Tennis-4528 in [D] Why is the machine learning community obsessed with the logistic distribution? by cthorrez
Stretched exponential as in linearly scaling the argument, like exp(k*x) for some constant k? Maybe exponential functions grow too fast, and there might be gradient explosion.
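The gradient-explosion worry can be shown numerically: the slope of exp(k*x) is k*exp(k*x), which grows without bound, while the logistic sigmoid's slope is capped at 0.25. A small sketch (the sample points and k = 1 are arbitrary):

```python
import numpy as np

k = 1.0
x = np.array([0.0, 2.0, 5.0, 10.0])

# d/dx exp(k*x) = k * exp(k*x): unbounded growth in x
exp_grad = k * np.exp(k * x)

# d/dx sigmoid(x) = sigmoid(x) * (1 - sigmoid(x)): at most 0.25 (at x = 0)
sig = 1.0 / (1.0 + np.exp(-x))
sig_grad = sig * (1.0 - sig)

print(exp_grad)  # blows up as x grows
print(sig_grad)  # stays bounded by 0.25
```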
crrrr30 t1_j1t7ybg wrote
Reply to comment by skn133229 in [D] Normalized images in UNET by skn133229
Perhaps also try random gamma or something like that? Contrast could be an issue.
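A random gamma augmentation can be sketched in a few lines, assuming the images are already normalized to [0, 1] (raising to a power doesn't behave sensibly for negative or zero-mean inputs); the (0.7, 1.5) range is an arbitrary illustrative choice:

```python
import numpy as np

rng = np.random.default_rng(0)

def random_gamma(img, low=0.7, high=1.5):
    # Sample one gamma per call; gamma < 1 brightens, gamma > 1 darkens.
    gamma = rng.uniform(low, high)
    return np.clip(img, 0.0, 1.0) ** gamma  # assumes img normalized to [0, 1]

img = rng.random((64, 64))  # stand-in for a normalized single-channel image
aug = random_gamma(img)
```

Libraries like albumentations ship a ready-made version of this transform if you'd rather not roll your own.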