crrrr30 t1_iqpciu1 wrote

I feel like with that memory available, testing scaling laws is a better research direction than testing full-batch training.
