
Bewaretheicespiders t1_jdneo2a wrote

The cost of inference, in GPUs and thus electric power, of these LLMs is just too high. At 8.5 billion searches a day, replacing Google Search with GPT4 would consume an estimated 7 billion watt-hours. A day. Just for the power consumed by the GPUs.

You would need over 638 Hoover Dams just to power that.
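A back-of-the-envelope sketch of that arithmetic (the per-query figure is implied by the two estimates above, not independently sourced):

```python
# Sanity check on the figures above (8.5e9 searches/day and 7e9 Wh/day
# are the estimates from this comment, not measured data).
searches_per_day = 8.5e9
daily_energy_wh = 7e9

wh_per_query = daily_energy_wh / searches_per_day
print(f"implied energy per query: {wh_per_query:.2f} Wh")  # ~0.82 Wh
```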

4

vitalyc t1_jdo3iji wrote

So how are people running LLMs locally on laptops and phones? It seems the training costs are unimaginable, but you can optimize the models to run on consumer hardware.

0

Bewaretheicespiders t1_jdokdmg wrote

They aren't running GPT4 locally; it sends the request through an API.

GPT3 has 175 billion parameters; at float16 that's 326 GiB just for the parameters. That would fill most phones' storage, not to mention the 12 GB of RAM the most expensive phones have.
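The memory arithmetic, as a quick sketch:

```python
# Weight storage for GPT-3 at float16: 2 bytes per parameter.
params = 175e9
bytes_per_param = 2

size_gib = params * bytes_per_param / 2**30
print(f"~{size_gib:.0f} GiB just for the weights")  # ~326 GiB
```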

Then GPT4 is many times that...
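As for the question above about laptops and phones: what people run locally are much smaller open models, usually quantized. A minimal sketch of why that fits, assuming a hypothetical 7-billion-parameter model at 4 bits per weight (both figures are illustrative, not from this thread):

```python
# Why a small quantized model fits on consumer hardware.
# 7e9 parameters and 4-bit weights are illustrative assumptions.
params = 7e9
bits_per_param = 4

size_gib = params * bits_per_param / 8 / 2**30
print(f"~{size_gib:.1f} GiB of weights")  # ~3.3 GiB, fits in laptop RAM
```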

1