TheWittyScreenName t1_iz9qnlv wrote

I don’t think anyone’s mentioned the Adam optimization paper yet. Nearly every deep learning model gets trained with it by default, without anyone even thinking about it, so I’d say it’s pretty foundational.
