limapedro t1_j13qfxr wrote
The cheaper option would be to run on 2 RTX 3060s! Each GPU costs around 300 USD, so you could buy two for 600ish! There's also a 16 GB A770 from Intel! To run a very large model you could split the weights into so-called blocks. I was able to test this myself in a simple Keras implementation, but the conversion code is hard to write, although I think I've seen something similar from HuggingFace! Roughly what that splitting looks like is sketched below.
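A minimal sketch of the idea in Keras, assuming two visible GPUs; the layer sizes and two-way split are made up for illustration, and whether TensorFlow honors the placement depends on your setup:

```python
# Sketch: split a model's layers ("blocks") across two GPUs with manual
# device placement, so each GPU only has to hold part of the weights.
import tensorflow as tf
from tensorflow import keras

inputs = keras.Input(shape=(1024,))

# First block of layers is placed on GPU 0.
with tf.device("/GPU:0"):
    x = keras.layers.Dense(4096, activation="relu")(inputs)
    x = keras.layers.Dense(4096, activation="relu")(x)

# Second block is placed on GPU 1; activations move between devices
# at this boundary, which is the main cost of this approach.
with tf.device("/GPU:1"):
    x = keras.layers.Dense(4096, activation="relu")(x)
    outputs = keras.layers.Dense(1024)(x)

model = keras.Model(inputs, outputs)
model.summary()
```

The tricky part (the "conversion code") is taking a pretrained checkpoint and mapping its weights onto a model rebuilt with these device boundaries.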
limapedro t1_j175nby wrote
Reply to comment by maizeq in [D] Running large language models on a home PC? by Zondartul
No, I haven't! Although in theory it should be really good. You could still run deep learning workloads through DirectML today, but a native implementation should be really fast because of its XMX cores, which are similar to NVIDIA's Tensor Cores.
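For example, with the torch-directml package (assuming it's installed via `pip install torch-directml`; tensor sizes here are arbitrary):

```python
# Sketch: run a PyTorch computation on a DirectML-capable GPU
# (e.g. an Intel Arc A770) instead of CUDA.
import torch
import torch_directml

dml = torch_directml.device()  # default DirectML device

x = torch.randn(1024, 1024).to(dml)
w = torch.randn(1024, 1024).to(dml)
y = x @ w  # matmul executes on the GPU via DirectML
print(y.shape)
```

This routes through DirectX 12 rather than a vendor-native stack, which is why a native backend targeting the XMX units should be faster.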