Comments


Pro_RazE OP t1_j7nj2vi wrote

It's crazy that all this happened in a week. I have missed/not added hundreds of papers and probably missed a lot of other updates, so this isn't everything.

55

Sashinii t1_j7no4pu wrote

AI is already changing the world and we're not even at proto-AGI yet.

85

Virtafan69dude t1_j7of7pw wrote

Insane. I hope LaMDA comes with a "safe search off" equivalent.

18

grossexistence t1_j7ofvq1 wrote

At this rate, Proto-AGI will be here by the end of the year or the first half of 2024.

24

turnip_burrito t1_j7okin2 wrote

Probably a box.

Maybe painted black.

And able to understand enough concepts to write improved versions of some of its own code if we asked it to.

Maybe can write some new math proofs in a short and human readable way.

Maybe multimodal.

Large short term memory context window.

Able to update its model in real time for incoming new information.

Maybe running on more specialized hardware, or neuromorphic chips.

19

proteo73 t1_j7okv5z wrote

Ok i need to contact my Planet ...

4

Glad_Laugh_5656 t1_j7oo5q1 wrote

To play devil's advocate, you can compile a list of AI achievements this long every week (even if not as impressive as this one), and knowing that dampens the impressiveness of this list just a bit.

Not to mention the list is seriously inflated by headlines that aren't actually advancements.

Not to say that we didn't see progress last week (of course we did), but I kinda get the feeling you're making it seem bigger than it actually was.

7

squareOfTwo t1_j7ooiz6 wrote

Just no, the rate is still too damn slow for that. Most of the "progress" is just training on yet-unused data (human-written text for GPT, text-image pairs for the Stable Diffusions of this world, etc.). This will end soon if no high-quality data is left to train on. The end of "scale" is near.

11

enkae7317 t1_j7opwl8 wrote

Great list but curious if you have either the article link or some sort of sources for all of these.

3

Iunaml t1_j7oum8l wrote

2023? Trillion-parameter neural networks?

And we get a god damn JPEG of a bullet-point list upvoted here?

A bullet-point list that's literally the titles of the most upvoted threads of last week??

What kind of dystopia are we already in?

5

challengethegods t1_j7ovaa4 wrote

You got something against jpegs?
- Over 1 million researchers have used Deepmind's Alphafold Protein Structure Database
- Google AI releases the Flan T5 Language Model Collection
- Meta AI trained blind AI agents that can navigate similar to blind humans
- ChatGPT Plus announced for $20 per month with waitlist (US only for now)
- ChatGPT Users Topped 100 Million in January
- Microsoft announces Teams Premium powered by GPT-3.5
- Perplexity Ask (AI Search Engine) available as a Chrome extension
- Microsoft boosts Viva Sales with new GPT seller experience (integration)
- AudioLDM Text to Audio Generation available on Huggingface to use
- Meta releases a 30B param "OPT+IML" model fine-tuned on 2000 tasks
- Google AI Open Sourced Vizier: a scaled blackbox optimization system
- Dreamix: Video Diffusion Models are General Video Editors
- SceneDreamer: Generating 3D Scenes From 2D Image Collections
- SceneScape: Text-Driven Consistent Scene Generation
- RobustNeRF: Basically improves quality of NeRFs
- OpenAI's New Paper: A proof of concept for using AI-assisted human feedback to scale the supervision of ML systems
- Deepmind Paper: Accelerating Large Language Model Decoding with Speculative Sampling (2-2.5x speedup)
- Amazon AI: Multimodal-CoT outperforms GPT-3.5 by 16% (75.17% -> 91.68%) on ScienceQA and even surpasses human performance
- Sundar Pichai announced: LaMDA language model within "coming weeks and months"
- AutumnSynth synthesizes the source code of a 2D video game from seconds of play
- Nvidia Paper: Enabling Simulated Characters To Perform Scene Interaction Tasks In Natural/Lifelike Manner
- Poe, a ChatGPT-like bot launched from the creators of Quora. They are also making an API for it. Currently iOS only.
- Google invests $300 million in Anthropic AI (Done in 2022, reported now)
- BLIP-2 demo available on Huggingface: LLM that can understand images
- Humata.ai launched: Basically ChatGPT for your own files
- Bing + GPT integration images leaked
- Google's new Real-time tracking of wildfire boundaries using satellite imagery
- LAION AI introduces Open Assistant: Chatbot project that understands tasks, interacts with third-party systems, and retrieves information dynamically (open source)
- Apple CEO Tim Cook says AI will eventually ‘affect every product and service we have'
- Epic-Sounds: A Large-scale Dataset of Actions That Sound Released
- Announcing Stable Attribution - a tool which lets anyone find the human creators behind AI-generated images
- Presenting TEXTure, a novel method for text-guided generation, editing, and transfer of textures for 3D shapes
- Tune-A-Video available to use and also open sourced (turns AI-generated images into gifs or videos)
- Filechat.io now available - ChatGPT for your own data and no limits (with premium tier)
- BioGPT-Large by Microsoft now available on Huggingface to try
- Google announces Bard, powered by LaMDA, coming soon as an AI conversational service. It will be integrated with Search.
- Microsoft announces surprise event for tomorrow with Bing ChatGPT expected (Feb 7)
- Language Models Secretly Perform Gradient Descent as Meta-Optimizers Paper - In-context learning, the ability for LLMs to learn new abilities from examples in a prompt alone
- Apple to hold in-person ‘AI summit' event for employees at Steve Jobs Theater
- Seek AI introduces DeepCuts, the AI SQL app that lets you explore your Spotify data with natural language
- KickResume's AI Resume Builder can rewrite, format, and grade a resume
- Introducing Polymath: The open-source tool that converts any music library into a sample library with machine learning
- Microsoft & OpenAI: Bing and Edge + AI: a new way to search starts today
- some guy used his self-programming discord bot to grab this list from a jpeg
ftfy

12

zendonium t1_j7ovw92 wrote

But surely that's all it takes? The human brain is just a multimodal network that processes language, visual, audio, and a bunch of other stuff.

Pay 10,000 Kenyans $2 a day to get more training data on more senses and train more networks. We'll have narrow AGIs in almost all areas. Just needs putting together with some clever insight from some genius.

6

Pro_RazE OP t1_j7ox1lr wrote

Let's say you generate an image of a cat. CLIP can convert what is in the image into words and then use them to find similar images that were in the LAION dataset which Stable Diffusion uses. So if it was, let's say, an orange cat, it can find similar images of that cat that were used in the training. Without those original pictures, Stable Diffusion cannot generate pictures of an orange cat (poor example, I know lol). It is not always accurate, and the generated images are always different from the original ones. But one recent paper kinda proved that wrong (it very rarely happens).
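
If it helps, here's a minimal sketch of that matching idea: embed a generated image with CLIP and compare it against some candidate training images by cosine similarity. This assumes the Hugging Face transformers and Pillow packages; the model choice, file names, and candidate images are hypothetical placeholders, not the actual LAION pipeline.

```python
# Minimal sketch: CLIP image embeddings + cosine similarity to find which
# candidate training images are closest to a generated one. File names are
# hypothetical placeholders.
import torch
from PIL import Image
from transformers import CLIPModel, CLIPProcessor

model = CLIPModel.from_pretrained("openai/clip-vit-base-patch32")
processor = CLIPProcessor.from_pretrained("openai/clip-vit-base-patch32")

def embed(images):
    """Return L2-normalized CLIP image embeddings (one row per image)."""
    inputs = processor(images=images, return_tensors="pt")
    with torch.no_grad():
        feats = model.get_image_features(**inputs)
    return feats / feats.norm(dim=-1, keepdim=True)

generated = Image.open("generated_orange_cat.png")  # the generated image
candidates = [Image.open(p) for p in ["laion_cat_1.jpg", "laion_cat_2.jpg"]]  # candidate training images

sims = embed([generated]) @ embed(candidates).T  # cosine similarities, shape (1, num_candidates)
best = sims.argmax().item()
print(f"Closest candidate: index {best}, similarity {sims[0, best].item():.3f}")
```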

I hope this helps.

3

r0cket-b0i t1_j7ox3c3 wrote

Misleading. This is similar to tracking product progress by the number of features shipped. Any crypto startup still alive can give you a 20-item bullet list of things done in the past month, but how much of that actually changed anything....

Unfortunately, on this list only the AlphaFold item is significant.

−1

challengethegods t1_j7oxmgv wrote

Personally I think the jpeg is more useful than text, but I can just as easily convert the text to jpeg so realistically idgaf - it does make it easier to save/share as jpeg, but slightly harder to copy/paste specific lines from it for a search, as example. pro/con I guess, but also as a jpeg the entire list shows from any view, meaning the "look at this big list" aspect is clarified regardless if someone cares to read past the first few lines. However, a jpeg is not as easily indexed by crawlerbots, which might have some unintended effects down the line. On the other hand, a jpeg can have any background color and select its own font which allows its creator to have greater control over the way that it's viewed, but this could be seen as a downside for someone that does not agree with their artistic vision. That being said, a jpeg also has the benefit of...
[I can do this forever lol]

1

InitialCreature t1_j7p0tmx wrote

I dunno, being able to synthesize game code from a few seconds of play is huge. Someone who likes how your guns handle in your shooter can take that and tweak it for their own game. Are we trademarking game mechanics yet?

3

Cryptizard t1_j7p26uc wrote

If that was true then we could just train a model on all the AI research we have and get a “narrow AGI” that makes AI models. Singularity next week. Unfortunately, that is not how it is.

4

Baturinsky t1_j7p2ih7 wrote

Could you please do this as a text with references?

2

controltheweb t1_j7paze2 wrote

Image to Text:

AI Progress of February 2023, Week 1 (1 Feb - 7 Feb) by pro_raze

  1. Over 1 million researchers have used Deepmind's Alphafold Protein Structure Database
  2. Google AI releases the Flan T5 Language Model Collection
  3. Meta AI trained blind AI agents that can navigate similar to blind humans
  4. ChatGPT Plus announced for $20 per month with waitlist (US only for now)
  5. ChatGPT Users Topped 100 Million in January
  6. Microsoft announces Teams Premium powered by GPT-3.5
  7. Perplexity Ask (AI Search Engine) available as a Chrome extension
  8. Microsoft boosts Viva Sales with new GPT seller experience (integration)
  9. AudioLDM Text to Audio Generation available on Huggingface to use
  10. Meta releases a 30B param "OPT+IML" model fine-tuned on 2000 tasks
  11. Google AI Open Sourced Vizier: a scaled blackbox optimization system
  12. Dreamix: Video Diffusion Models are General Video Editors
  13. SceneDreamer: Generating 3D Scenes From 2D Image Collections
  14. SceneScape: Text-Driven Consistent Scene Generation
  15. RobustNeRF: Basically improves quality of NeRFs
  16. OpenAI's New Paper: A proof of concept for using AI-assisted human feedback to scale the supervision of ML systems
  17. Deepmind Paper: Accelerating Large Language Model Decoding with Speculative Sampling (2-2.5x speedup)
  18. Amazon AI: Multimodal-CoT outperforms GPT-3.5 by 16% (75.17% -> 91.68%) on ScienceQA and even surpasses human performance
  19. Sundar Pichai announced: LaMDA language model within "coming weeks and months"
  20. AutumnSynth synthesizes the source code of a 2D video game from seconds of play
  21. Nvidia Paper: Enabling Simulated Characters To Perform Scene Interaction Tasks In Natural/Lifelike Manner
  22. Poe, a ChatGPT-like bot launched from the creators of Quora. They are also making an API for it. Currently iOS only.
  23. Google invests $300 million in Anthropic AI (Done in 2022, reported now)
  24. BLIP-2 demo available on Huggingface: LLM that can understand images
  25. Humata.ai launched: Basically ChatGPT for your own files
  26. Bing + GPT integration images leaked
  27. Google's new Real-time tracking of wildfire boundaries using satellite imagery
  28. LAION AI introduces Open Assistant: Chatbot project that understands tasks, interacts with third-party systems, and retrieves information dynamically (open source)
  29. Apple CEO Tim Cook says AI will eventually 'affect every product and service we have'
  30. Epic-Sounds: A Large-scale Dataset of Actions That Sound Released
  31. Announcing Stable Attribution - a tool which lets anyone find the human creators behind AI-generated images
  32. Presenting TEXTure, a novel method for text-guided generation, editing, and transfer of textures for 3D shapes
  33. Tune-A-Video available to use and also open sourced (turns AI-generated images into gifs or videos)
  34. Filechat.io now available - ChatGPT for your own data and no limits (with premium tier)
  35. BioGPT-Large by Microsoft now available on Huggingface to try
  36. Google announces Bard, powered by LaMDA, coming soon as an AI conversational service. It will be integrated with Search.
  37. Microsoft announces surprise event for tomorrow with Bing ChatGPT expected (Feb 7)
  38. Language Models Secretly Perform Gradient Descent as Meta-Optimizers Paper - In-context learning, the ability for LLMs to learn new abilities from examples in a prompt alone
  39. Apple to hold in-person 'AI summit' event for employees at Steve Jobs Theater
  40. Seek AI introduces DeepCuts, the AI SQL app that lets you explore your Spotify data with natural language
  41. KickResume's AI Resume Builder can rewrite, format, and grade a resume
  42. Introducing Polymath: The open-source tool that converts any music library into a sample library with machine learning
  43. Microsoft & OpenAI Announce: Bing and Edge + AI: a new way to search starts today

15

easy_c_5 t1_j7pb4sx wrote

Apparently you haven't seen any of the uncountable JavaScript libraries released during that time, or the hundreds of startups tackling similar subjects, or the tens of thousands of research papers on distributed systems, animation, etc. (just because there are non-groundbreaking research papers in the list above too).

The list above is nothing groundbreaking, just copies upon copies of the same stuff we've had for quite a while + productizing it.

The real summary of the past months and days:

The good: AI is going public.

The bad: we still don't have any real clue on how to get to AGI.

The worse: AI is getting regulated and people are fighting back.

3

expelten t1_j7pdppo wrote

Yes, that's what caught my attention on this list. Consider also that this will be applied to any software someday, which means everything will be like open source... this would be extremely disruptive.

2

Iunaml t1_j7pexov wrote

Text is more easily searchable, editable, and accessible compared to a JPEG. JPEG images are not easily indexed by search engines, which can negatively affect their discoverability. Text can be easily copied, pasted, and edited, while with a JPEG the text cannot be edited and is limited in terms of accessibility. Additionally, a JPEG may not display correctly on all devices, whereas text can be viewed on any device with compatible software. These advantages make text a preferred format for sharing information over JPEG images.

2

crua9 t1_j7pkknd wrote

Just a heads up, this isn't easy to read and it might be best to copy and paste it into the post.

3

neo101b t1_j7pnq7l wrote

What we had a year ago compared to what we have now is simply amazing.

It feels like AI technology is growing faster than any other technology that's been developed; it's going faster than warp 10.

7

BadassGhost t1_j7pov2k wrote

2019 was GPT-2 which rocked the boat. 2020 was GPT-3 which sank the boat. Those were partially responsible for kicking off this whole scaling up of transformers

There was also LaMDA in 2021, and I'm sure many other big events in that period that I'm forgetting

5

BadassGhost t1_j7pprrd wrote

I feel like an unrestricted LLM-powered chatbot is pretty close to proto-AGI. OpenAI is basically lobotomizing ChatGPT to avoid headlines about it claiming sentience or emotions or making controversial statements, so it's not much to go off of.

We haven't been able to play with PaLM or any next-gen versions of it (Flan-PaLM and U-PaLM), but the benchmark comparisons between that and others seem enormous. If you build PaLM with an embedded dataset and cross-attention like Retro, I think that would probably be proto-AGI.

And then the next step from there to actual AGI would be making a multi-modal version of that, like Gato. The only missing ingredient there is getting the model to use one modality to inform the others, which they did not achieve with Gato but are supposedly actively working on.
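
For anyone curious, here's a heavily simplified sketch of the "embedded dataset" part of that idea. Retro itself uses chunked cross-attention inside the transformer; this sketch just retrieves the nearest passages with a sentence-transformers encoder and prepends them to the prompt, which is the crude retrieve-and-condition version of the same idea. The corpus, query, and model name are made-up placeholders.

```python
# Simplified retrieval-augmentation sketch (not Retro's actual architecture):
# embed a small corpus, retrieve the passages most similar to a query, and
# build a prompt that a base LLM would then complete.
from sentence_transformers import SentenceTransformer, util

encoder = SentenceTransformer("all-MiniLM-L6-v2")

corpus = [
    "PaLM is a large language model from Google.",
    "RETRO augments a language model with retrieval from a large text database.",
    "Gato is a multi-modal agent trained on many tasks and modalities.",
]
corpus_emb = encoder.encode(corpus, convert_to_tensor=True)

def retrieve(query: str, k: int = 2) -> list[str]:
    """Return the k corpus passages most similar to the query."""
    query_emb = encoder.encode(query, convert_to_tensor=True)
    hits = util.semantic_search(query_emb, corpus_emb, top_k=k)[0]
    return [corpus[hit["corpus_id"]] for hit in hits]

query = "How does retrieval help a language model?"
context = "\n".join(retrieve(query))
prompt = f"Context:\n{context}\n\nQuestion: {query}\nAnswer:"
print(prompt)  # this prompt would then be passed to the base model
```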

22

kaleNhearty t1_j7purfu wrote

Generative transformer models are not AGI, not even close. We're going to have to come up with some new methodology to handle multi-modality or maybe some kind of synthesis between several different models until we see some kind of Proto-AGI and that's decades away IMO.

2

visarga t1_j7q4313 wrote

If they make GPT-N much larger, it will take longer and cost more to train. Then we can only afford a few trials. Whether they are selected by humans or AI makes little difference. It's going to be a crapshoot anyway; nobody knows which experiment is gonna win. The slow experimentation loop is one reason not even AGI can speed things up every time.

2

dontpet t1_j7qtlhd wrote

I'm a casual reader on this sub with a systems engineering background and don't understand most of those headlines. I do understand a few and the implications of those are huge.

Let's hope humanity somehow manages to adapt to this.

Our current laws and governance structures are so slow they won't be able to do much. It will be like shooting at a bunny based on where it was a week ago.

2

p3opl3 t1_j7r3693 wrote

That's actually a fair point.. although those models had been invented way before 2019.. release date isn't development or discovery date right. It's like GPT4 ..that's already existed for well over 2 years now right..it's just not "ready" yet.

Stable Diffusion 3 is literally microsecond level response time now.. it's insane.

Honestly.. I think the big breakthroughs.. aren't going to be in AI.. they're going to be in UI/UX and how people are going to bootstrap these models for building something actually useful.

0

p3opl3 t1_j7r3sjq wrote

That's a fair point.. I mentioned in my other reply.. releases are an OK indicator of progress though. Technically GPT-3 was already well past the development stage before 2020...

aaaand DALL-E, I don't know how much of an advancement that is.. like, it's not surprising that a tiny startup releasing their first version of Stable Diffusion dominated the AI communities, just because it's open source. There were definitely important releases..

But this year has literally only seen 6 weeks, right? ... Some pretty big moves being made already.. it's exciting.

0

Sotamiro t1_j7s5clz wrote

One day there will be a list this big for a 1 hour period instead of 1 week

3