Submitted by madmax_br5 t3_10mbct5 in MachineLearning
PassingTumbleweed t1_j62bzdk wrote
Reply to comment by madmax_br5 in [D] Moving away from Unicode for more equal token representation across global languages? by madmax_br5
You can totally do that. There are tricks to reduce memory usage, too, such as the embedding factorization used in ALBERT, which splits the one big vocabulary-by-hidden-size embedding matrix into two much smaller ones.
The best part is, none of these options are precluded by Unicode. Unicode in fact has nothing to do with it!
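To make the memory point concrete, here is a rough sketch of the parameter arithmetic behind a factorized embedding. The function names are made up for illustration, and the sizes (30,000-token vocabulary, 128-dim embedding, 768-dim hidden layer) are just example values in the range ALBERT-base uses, so treat the numbers as a back-of-the-envelope estimate rather than anything authoritative.

```python
# Illustrative only: compares parameter counts for a plain embedding table
# versus a factorized one (vocab -> small embedding -> hidden projection).
# Function names and sizes are assumptions chosen for the example.

def full_embedding_params(vocab_size: int, hidden_size: int) -> int:
    """One V x H matrix mapping token ids straight to hidden-size vectors."""
    return vocab_size * hidden_size


def factorized_embedding_params(vocab_size: int, embed_size: int, hidden_size: int) -> int:
    """A V x E lookup table followed by an E x H projection matrix."""
    return vocab_size * embed_size + embed_size * hidden_size


if __name__ == "__main__":
    V, E, H = 30_000, 128, 768  # example sizes, not taken from the thread
    full = full_embedding_params(V, H)
    factorized = factorized_embedding_params(V, E, H)
    print(f"full:       {full:,} parameters")
    print(f"factorized: {factorized:,} parameters")
    print(f"ratio:      {full / factorized:.1f}x smaller")
```

The saving grows with vocabulary size, which is why this kind of trick matters if you want a much larger vocabulary to cover more languages: the per-token cost is E instead of H.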
madmax_br5 OP t1_j62d75y wrote
I get that now, thanks! Not an ML expert so this is very helpful!