xl0 t1_ixegyz5 wrote
Don't just "implement" the models - implement the training loop in "pure" PyTorch, including mixed precision, gradient accumulation and metrics. It's not super hard but gives much-needed insight into why higher-level frameworks (like fastai or lightning) do things the way they do them.
And then actually get the models to train and see if you can replicate at least some of the results in the paper. You can train on smaller datasets like Imagenette instead of ImageNet if you don't have the resources. If you can spend some money, vast.ai is good for relatively long-running tasks.
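For reference, here is a minimal sketch of what such a loop can look like. The tiny model and random tensors are placeholders, not a real setup (a real run would use something like a ResNet on Imagenette):

```python
import torch
from torch import nn
from torch.utils.data import DataLoader, TensorDataset

# Placeholder model and random data - swap in a real architecture and dataset.
device = "cuda"
model = nn.Sequential(nn.Flatten(), nn.Linear(3 * 32 * 32, 10)).to(device)
opt = torch.optim.AdamW(model.parameters(), lr=1e-3)
loss_fn = nn.CrossEntropyLoss()
scaler = torch.cuda.amp.GradScaler()  # scales the loss so fp16 grads don't underflow

ds = TensorDataset(torch.randn(256, 3, 32, 32), torch.randint(0, 10, (256,)))
loader = DataLoader(ds, batch_size=32, shuffle=True)

accum_steps = 4  # effective batch size = 32 * 4
correct = seen = 0

model.train()
opt.zero_grad(set_to_none=True)
for step, (x, y) in enumerate(loader):
    x, y = x.to(device), y.to(device)
    with torch.cuda.amp.autocast():              # forward pass in mixed precision
        logits = model(x)
        loss = loss_fn(logits, y) / accum_steps  # normalize: grads sum over accum steps
    scaler.scale(loss).backward()                # gradients accumulate until we step
    if (step + 1) % accum_steps == 0:
        scaler.step(opt)                         # unscales grads, skips step on inf/nan
        scaler.update()
        opt.zero_grad(set_to_none=True)
    # a simple metric, computed from the same forward pass
    correct += (logits.argmax(dim=1) == y).sum().item()
    seen += y.numel()

print(f"train accuracy: {correct / seen:.3f}")
```

Dividing the loss by `accum_steps` keeps the accumulated gradient equivalent to a single large-batch step - exactly the kind of detail the higher-level frameworks quietly handle for you.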
xl0 OP t1_ivj5slm wrote
Reply to comment by patrickkidger in [P] Lovely Tensors library by xl0
I started working on it. I'll make sure the repr works inside jit and parallel transforms before moving on to other things - a plain-JAX sketch of why jit is the tricky case is at the end of this comment.
https://github.com/xl0/lovely-jax
Please let me know if you have any thoughts, I'm very new to JAX.
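For context: inside a jitted function, arrays are abstract tracers that have a shape and dtype but no concrete data, so any summary repr has to degrade gracefully. A minimal sketch in plain JAX (no lovely-jax code involved):

```python
import jax
import jax.numpy as jnp

@jax.jit
def f(x):
    # Under jit, x is an abstract tracer: shape and dtype are known,
    # but there are no concrete values for a fancy repr to inspect.
    print(repr(x))  # runs once, at trace time
    return x * 2

f(jnp.arange(4.0))  # prints something like Traced<ShapedArray(float32[4])>...
```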
xl0 OP t1_ivd0rca wrote
Reply to comment by patrickkidger in [P] Lovely Tensors library by xl0
Haha, thank you! You are not the first person to mention JAX, so I guess I'll do a JAX version next. :)
I have a rough idea of what it is - as I understand it, JAX is more about transforming functions (something like the small example below). Do you have ideas about anything JAX-specific that should be included in the ndarray summary?
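"Transforming functions" roughly means this: each JAX transformation takes a Python function and returns a new function. A tiny illustrative sketch (the function, names, and shapes here are made up for the example):

```python
import jax
import jax.numpy as jnp

def loss(w, x):
    return jnp.sum((x @ w) ** 2)

grad_loss = jax.grad(loss)                        # new function: d(loss)/dw
fast_grad = jax.jit(grad_loss)                    # new function: compiled via XLA
batched_loss = jax.vmap(loss, in_axes=(None, 0))  # new function: mapped over a batch of x

w = jnp.ones(3)
xs = jnp.ones((8, 4, 3))           # a batch of 8 inputs, each of shape (4, 3)
print(fast_grad(w, xs[0]).shape)   # (3,)
print(batched_loss(w, xs).shape)   # (8,)
```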
Submitted by xl0 t3_ynheaf in MachineLearning
xl0 t1_ixekt7a wrote
Reply to comment by itsstylepoint in [D] What advanced models would you like to see implemented from scratch? by itsstylepoint
Cool - I had a glance at a couple of your videos. They are pretty good: the production quality is solid and the explanations are clear.
One suggestion - maybe you could use notebooks? I can't overstate the importance of being able to interact with the code and visualize the data bit by bit as you write it. It makes what's going on in the code much easier to follow and understand.