Submitted by Stiven_Crysis t3_11e2wui in technology
RuairiSpain t1_jac8up5 wrote
Yeah, but given the explosion in AI PR, that will change this year and next. Nvidia is about to have a windfall of GPU sales for ChatGPT-like training.
bamfalamfa t1_jac9pk6 wrote
nobody is going to be buying GPUs for flimsy AI text training
A-Delonix-Regia t1_jacethn wrote
Nearly no one will do that. There are millions of gamers, and what? A few tens of thousands of people who are interested enough in AI content generation to buy a new GPU?
RuairiSpain t1_jacukww wrote
ChatGPT has a 125 million vocabulary; to hold that in memory you'd need at least one 80GB Nvidia card, at $30,000 each. As AI models grow they'll need more RAM, and cloud is the cheapest way for companies to timeshare those costs.

It's not just training the models; it's also querying the models that needs those in-memory calculations. I'm not expecting gamers to buy these cards. But scale up the number of users going to query OpenAI, Bing x ChatGPT or Google x Bard, and all the other AI competitors, and there will be big demand for large-RAM GPUs.
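The memory math being argued here can be sketched in a few lines. This is a back-of-envelope estimate only; the parameter count (175 billion, a commonly cited GPT-3 figure) and the 16-bit storage width are illustrative assumptions, not published numbers for ChatGPT:

```python
import math

# Back-of-envelope GPU memory estimate for holding a large language
# model's weights in memory for inference. Parameter count and byte
# width are illustrative assumptions, not published figures.

def model_memory_gb(num_params: float, bytes_per_param: int) -> float:
    """GB of memory needed just for the weights."""
    return num_params * bytes_per_param / 1e9

# A hypothetical 175-billion-parameter model stored as 16-bit floats:
weights_gb = model_memory_gb(175e9, 2)

# How many 80 GB cards just to hold the weights (ignoring activations,
# KV caches, and all other runtime overhead):
cards = math.ceil(weights_gb / 80)

print(f"{weights_gb:.0f} GB of weights -> at least {cards} x 80 GB GPUs")
```

Even this simplified count lands well above a single 80GB card, which is the point: serving queries, not just training, keeps that much RAM occupied.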
nerd4code t1_jado5gs wrote
GPUs are, in general, way beyond overkill for NNs, which is what you're talking about. NNs can use the massive data-parallelism and linear-algebraic trickery offered by GPUs, but the data format you use tends to hit a sweet spot right around 8-bit floating point, and video cards tend to focus on 16-bit and wider, usually with the ability to do 32-/64-bit floating point and 32-/64-bit integers also; the units and busses for those will, at the very least, eat power. Newer Nvidia cards do have tensor units attached so they can do 8-bit stuff without un- and re-packing, but that's a comparatively tiny afterthought in the card's design, and, at least as far as I've seen, the tensor unit is usually shared between pairs of thread execution units.

What you'd really want is to focus on, say, 32-bit integer add/sub/dereference and 8-bit floating-point MACs in their own, non-shared units/lanes, and any special acceleration you can do for convolution will help some also. Which is why TPUs exist as a standalone thing.
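The narrow-format argument can be made concrete with a toy sketch: quantize weights and activations down to 8 bits, do the multiply-accumulates in integer arithmetic (the cheap hardware path), and dequantize only at the end. Symmetric linear quantization with hand-picked scales is assumed here purely for illustration:

```python
# Toy sketch of an 8-bit quantized multiply-accumulate, the kind of
# narrow-format arithmetic NN inference hardware is built around.
# Symmetric linear quantization; values and scales are illustrative.

def quantize(xs, scale):
    """Map floats into the signed 8-bit range [-127, 127] via a shared scale."""
    return [max(-127, min(127, round(x / scale))) for x in xs]

def qdot(qa, qb, scale_a, scale_b):
    """Integer dot product (wide accumulator), dequantized at the end."""
    acc = sum(x * y for x, y in zip(qa, qb))  # fits easily in 32 bits
    return acc * scale_a * scale_b

weights = [0.12, -0.53, 0.97, 0.08]
acts    = [1.50,  0.25, -0.75, 2.00]

sw, sa = 1.0 / 127, 2.0 / 127      # assumed ranges: |w| <= 1, |a| <= 2
qw, qa = quantize(weights, sw), quantize(acts, sa)

exact  = sum(w * a for w, a in zip(weights, acts))
approx = qdot(qw, qa, sw, sa)
print(exact, approx)   # close, despite only 8 bits per operand
```

The accumulator still wants to be wide (32-bit), but the multipliers only need 8-bit operands, which is exactly the unit mix the comment above is describing.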