blueSGL

blueSGL t1_jc5s56i wrote

Less than $100 to get this sort of performance out of a 7B parameter model and from the LLaMA paper they stopped training the 7B and 13B parameter models early.

Question is now just how much better can small models get. (lawyer/doctor/therapist in everyone's pocket, completely private?)

15

blueSGL t1_jb6h9jc wrote

>Seeing the large variance in the hardware cost/performance of current models, Id think the progression margin for software optimization alone is huge.

>I believe we already have the hardware required for one ASI.

Yep, how many computational "ah-ha" moment tricks are we away from running much better models on the same hardware.

Look at stable diffusion how the memory requirement fell through the floor. We already are seeing similar with LLaMA now getting into public hands (via links from pull requests on Facesbooks github lol) there are already tricks getting implemented in front ends for LLMs that allow for lower VRAM usage.

13

blueSGL t1_jaq2ray wrote

> compete at the national, or maybe even international level

speed of light hasn't changed. Networks get better throughput but latency remains.

For work where you need to have dexterity and reflexes locally piloted will be better. (though not everything will need that level of feedback)

1

blueSGL t1_ja6pgm2 wrote

Listening to Neel Nanda talk about how models form structures to solve common problems presenting in training, no wonder they are able to pick up on patterns better than humans, that's what they are designed for.

and I believe that training models with no intention of running them purely to see what if any hidden underlying structures humanity has collectively missed is called something like 'microscope AI '

7

blueSGL t1_ja1vmoa wrote

What about this, with the same prompt/model/seed/...'settings'... combination you can pull the same image out of an image model as someone else

I can easily see there be a time where people generate [music/tvshows/movies/etc] themselves but share the created media and have other people vote and rank it.

e.g. head over to a website that hosts ratings for... AI generated Simpsons episodes and share all the 'settings' needed to load into your own system to recreate it.

Then you can brows by popular generated content, circa whatever month you happen to be in, or by all time, or whatever other metrics you can think of.

Everyone has the capability to generate new stuff and then has the ability to share it. Good stuff gets popular and becomes zeitgeist-y for a while, bad stuff just exists.

7

blueSGL t1_ja00p4i wrote

I first saw this mentioned 9 days ago by Gwern in the comment here on LW

>"... a language model is a Turing-complete weird machine running programs written in natural language; when you do retrieval, you are not 'plugging updated facts into your AI', you are actually downloading random new unsigned blobs of code from the Internet (many written by adversaries) and casually executing them on your LM with full privileges. This does not end well."


This begs the question, how are you supposed to sanitize this input whilst still keeping them useful?

9

blueSGL t1_j9rf2n4 wrote

Now come on, be fair. You know that's not the point I'm making at all.

It's people working in ML research being unable to accurately predict technological advancements, not user numbers.

You might find this section of an interview with Ajeya Cotra (of biological anchors for forecasting AI timelines fame)

Starts at 29.14 https://youtu.be/pJSFuFRc4eU?t=1754

Where she talks about how several benchmarks were past early last year that surveys of ML workers had a median of 2026.
Also she casts doubt on people that are working in the field but are not working on specifically forecasting AGI/TAI directly as a source for useful information.

17

blueSGL t1_j9mdxby wrote

Again I think we are running up against a semantics issue.

What percentage of human activity would you need to class the thing as 'general'

Because some people argue anything "below 100%" != 'general' and thus 'narrow' by elimination.

Personally I think it's reasonable if you've loaded a system with all the ways ML works currently/all the published papers and task it with spitting out a more optimal system it just might do so. All without being able to do a lot of the things that would be classed as human level intelligence. There are whole swaths of data concerning human matters that it would not need to train on or that the system would in no way need to be middling-expert at.

6