ilkamoi t1_j1ylxgy wrote on December 28, 2022 at 10:32 AM

#1,112,372

I'm surprised it hasn't happened yet.

lambolifeofficial OP t1_j1ymsc1 wrote on December 28, 2022 at 10:44 AM

#1,112,465

Yeah ripping off Google's open code and then keeping theirs closed. Not a good precedent for OpenAI. Still, I have a feeling open source will beat closed eventually.

[deleted] t1_j1ynw6l wrote on December 28, 2022 at 10:59 AM

#1,112,625

Replying to lambolifeofficial (#1,112,465)

[deleted]

ashareah t1_j1yszbn wrote on December 28, 2022 at 12:04 PM

#1,113,366

I mean fuck scientific and societal progress right.

[deleted] t1_j1z9toh wrote on December 28, 2022 at 2:46 PM

#1,116,311

[removed]

ebolathrowawayy t1_j1zpmyo wrote on December 28, 2022 at 4:38 PM

#1,119,474

Replying to lambolifeofficial (#1,112,465)

I don't see how unless open source figures out a way to distribute training across machines which afaik is incredibly inefficient/impossible right now. It seems that most progress is due to testing out ideas on $100mil worth of hardware iteratively. Oh and also having massive data on a scale that open source will never have access to.
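
For anyone unsure what "distributing training" involves, here is a rough toy sketch of synchronous data-parallel training, not how any real lab does it. The model (a single weight `w`), the data, and the names `grad` and `data_parallel_step` are all made up for illustration. The point is that every step needs the workers' gradients averaged, which on real hardware means network traffic between machines on every step:

```python
def grad(w, shard):
    """Gradient of mean squared error for y = w * x on one worker's shard."""
    return sum(2 * (w * x - y) * x for x, y in shard) / len(shard)

def data_parallel_step(w, shards, lr=0.01):
    """One synchronous step: each worker computes a local gradient, then all are averaged."""
    local_grads = [grad(w, s) for s in shards]  # would run in parallel on separate machines
    avg = sum(local_grads) / len(local_grads)   # the averaging step that requires communication
    return w - lr * avg

# Toy data with true slope 3, split across two equal-sized hypothetical "machines".
data = [(x, 3.0 * x) for x in range(1, 9)]
shards = [data[:4], data[4:]]

w = 0.0
for _ in range(200):
    w = data_parallel_step(w, shards)
```

With equal-sized shards the averaged gradient matches the full-batch gradient, so the math is the same as single-machine training; the cost is in moving gradients around, which grows with model size.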

ixfd64 t1_j1zqf4m wrote on December 28, 2022 at 4:43 PM

#1,119,598

Replying to lambolifeofficial (#1,112,465)

"Open"AI seems anything but.

4e_65_6f t1_j1zqmmi wrote on December 28, 2022 at 4:45 PM

#1,119,646

Yeah like I said in another post, under capitalism it's likely that some company seeks complete monopoly of the labor market before we can all have access to the benefits of AGI. There's no good reason to release your model if it's much better than the current competition if you're a company.

I think this hasn't happened yet because they don't have AGI yet. They'll likely keep things open to the public in case anyone figures out how to advance the research and releases it as an open source project, so they can copy again.

onyxengine t1_j1zsjhj wrote on December 28, 2022 at 4:57 PM

#1,120,023

Replying to ebolathrowawayy (#1,119,474)

Distributed training will get better

ThePlanckDiver t1_j1zz5ah wrote on December 28, 2022 at 5:41 PM

#1,121,220

Ah, yes, because thus far Google/DeepMind have released all of their advanced models such as LaMDA, PaLM, Imagen, Parti, Chinchilla, Gopher, Flamingo, Sparrow, etc. etc.

>[...] a shift towards secrecy and aggressive competition could significantly hinder the pace of innovation.

Or, you know, competition might lead to transforming (no pun intended) these research artifacts into useful products? Google's Code Red sounds like good news to me as an end-user.

What a nonsense article that seems written with the sole intent to shoehorn an ex-Googler's new startup into a post.

icest0 t1_j1zz5n0 wrote on December 28, 2022 at 5:41 PM

#1,121,222

Replying to lambolifeofficial (#1,112,465)

>I have a feeling open source will beat closed eventually.

No way that happens. If the open source is so good, just fork it and add your company's secret sauce.

visarga t1_j202im8 wrote on December 28, 2022 at 6:03 PM

#1,121,859

> Co-founder of Neeva

Ok, so direct competition for search is commenting on Google. Maybe they want to imply they also have a language model that is special and closed, and worthy of receiving investments.

I don't believe what he says, there are no signs of that happening. On the contrary, it would seem the head of the pack is just 6-12 months ahead. Everything trickles down pretty quickly. There are still many roadblocks to AGI and no lab is within striking distance.

We already have nice language models, now we need something else - validation systems. So we can use our language models without worrying they would catastrophically hallucinate or miss a trivial thing. We want to keep the useful 90% and drop the bad 10%. It is possible to integrate web search, knowledge bases and python code execution into the model to keep it from messing up. This is what I see ahead, not the end of open research.

blueSGL t1_j20lqfi wrote on December 28, 2022 at 8:09 PM

#1,125,224

Replying to ThePlanckDiver (#1,121,220)

Google is likely the best positioned for dataset creation.

Text
>google search/cache/AMP links/youtube subtitles

Image
>Image search thumbnails/reCAPTCHA/youtube

Video
>Youtube

They can afford to give away research because I doubt many can match them on sheer dataset scale alone.

footurist t1_j20xxwf wrote on December 28, 2022 at 9:29 PM

#1,127,390

Replying to visarga (#1,121,859)

I highly doubt this validation route would go nearly as smoothly as the path so far. I mean, the very root cause for GPT messing up so often and in such strange ways is that there's no real reasoning there, only a surprisingly well-working emulation of reasoning.

However, for validation this emulated reasoning won't nearly cut it. So you end up where you started: finding architectures that can actually reason, which of course nobody knows how to do...

If you were thinking about something like trying to match its responses to similar "actual" search results and then validating via comparison to that: what mechanism would you use? Because this seems to require actual reasoning as well.

treedmt t1_j218w1d wrote on December 28, 2022 at 10:45 PM

#1,129,267

Replying to blueSGL (#1,125,224)

This is an interesting point. Do you think the dataset that google has is high quality enough to actually train ai? In particular, search queries etc aren’t mapped to specific answers to be useful for supervised learning. Maybe I’m missing something?

treedmt t1_j21925t wrote on December 28, 2022 at 10:46 PM

#1,129,287

Replying to icest0 (#1,121,222)

Open source code could still win, if the secret sauce lies in a massive closed source dataset.

treedmt t1_j219ah9 wrote on December 28, 2022 at 10:48 PM

#1,129,326

Replying to visarga (#1,121,859)

Could better, larger datasets be a solution to the hallucination problem? See Chinchilla for example, but maybe even an order of magnitude bigger than that?

blueSGL t1_j21ay8x wrote on December 28, 2022 at 10:59 PM

#1,129,565

Replying to treedmt (#1,129,267)

LLMs, which work by predicting the statistically most likely next token, benefit from more data.

That along with the truism

"You always find things in the last place you look"

can be very powerful tools.

There will be some correlation between search term and result otherwise search would be pointless. That on a large enough scale can sift signal from noise, not only in terms of search results but in delta between individual search terms.

Smellz_Of_Elderberry t1_j21fww7 wrote on December 28, 2022 at 11:34 PM

#1,130,422

Replying to icest0 (#1,121,222)

There are models which require your new iteration to also be open source.

lambolifeofficial OP t1_j21hs0d wrote on December 28, 2022 at 11:47 PM

#1,130,719

Replying to icest0 (#1,121,222)

Elon open sources Tesla and SpaceX tech and yet they're doing better than their competitors

lambolifeofficial OP t1_j21m10i wrote on December 29, 2022 at 12:17 AM

#1,131,452

Replying to 4e_65_6f (#1,119,646)

Elon open-sources Tesla and SpaceX tech, yet those companies are doing better than others. "Patents are for the weak", he said. I just wish he would slap some sense into Sam Altman, like "stop being weak, Sam". They both co-founded the company.

4e_65_6f t1_j21o0zm wrote on December 29, 2022 at 12:32 AM

#1,131,810

Replying to lambolifeofficial (#1,131,452)

The wiki for OpenAI says GPT started when a researcher who isn't even an OpenAI contributor, a guy named Alec Radford, posted a paper to the OpenAI forums. If the wiki info is correct, it sounds like open discussion about the project is what got them there in the first place, because it doesn't look like he was even an employee.

lambolifeofficial OP t1_j22beim wrote on December 29, 2022 at 3:26 AM

#1,135,915

Replying to 4e_65_6f (#1,131,810)

You mean this guy? https://openai.com/blog/authors/alec/

4e_65_6f t1_j22dblk wrote on December 29, 2022 at 3:41 AM

#1,136,250

Replying to lambolifeofficial (#1,135,915)

Yeah that's the name credited on the wiki.

lambolifeofficial OP t1_j22e3s4 wrote on December 29, 2022 at 3:48 AM

#1,136,369

Replying to 4e_65_6f (#1,136,250)

do you know the link or where to find that wiki?

4e_65_6f t1_j22eyei wrote on December 29, 2022 at 3:54 AM

#1,136,515

Replying to lambolifeofficial (#1,136,369)

https://en.wikipedia.org/wiki/OpenAI

There you go. It's under the gpt section in the middle.

spottiesvirus t1_j22h1in wrote on December 29, 2022 at 4:12 AM

#1,136,869

Replying to lambolifeofficial (#1,130,719)

Tesla on self driving is far behind waymo and other competitors

SpaceX has no real competitors as the sector is basically kept afloat by nasa contracts and government subsidies

lambolifeofficial OP t1_j22hbyh wrote on December 29, 2022 at 4:14 AM

#1,136,911

Replying to 4e_65_6f (#1,136,515)

thanks bud

Artanthos t1_j22j7xj wrote on December 29, 2022 at 4:30 AM

#1,137,240

Replying to ashareah (#1,113,366)

Progress for who?

The shift will have winners and losers, just like any other competition.

Artanthos t1_j22jeyz wrote on December 29, 2022 at 4:32 AM

#1,137,280

Replying to spottiesvirus (#1,136,869)

It wasn’t a lack of effort on Bezos’s side to win those contracts.

Artanthos t1_j22jjeb wrote on December 29, 2022 at 4:33 AM

#1,137,301

Replying to onyxengine (#1,120,023)

Will it get better faster than the closed systems improve?

enilea t1_j23164y wrote on December 29, 2022 at 7:40 AM

#1,140,119

Replying to lambolifeofficial (#1,112,465)

Google doesn't release most of their models open source, it just releases some. OpenAI does the same, releases some projects but keeps the bigger ones closed source.

sentrux t1_j233d5z wrote on December 29, 2022 at 8:09 AM

#1,140,487

Imagine developing a power that will eventually be used against you. There are.. nations that do not care about patents or I.P.

I would be bummed if you put billions into research and development just to see someone else taking a run with it.

onyxengine t1_j23qkpz wrote on December 29, 2022 at 1:05 PM

#1,143,999

Replying to Artanthos (#1,137,301)

Eventually yes it will become way faster than “closed systems”, because it will be in the cloud on the best machines. Cloud hosting services are clearly incentivized to make distributed training for open source communities affordable and accessible.

Artanthos t1_j24dri4 wrote on December 29, 2022 at 4:05 PM

#1,148,671

Replying to onyxengine (#1,143,999)

The cloud? That is owned and operated by the companies developing the closed systems?

The best machines? While competing against companies with billions of dollars of dedicated funding?

You need to reevaluate your logic.

treedmt t1_j28o94z wrote on December 30, 2022 at 1:19 PM

#1,179,296

Replying to blueSGL (#1,129,565)

Surely there’s some trade off between qualitative vs quantitative data?

E.g., 50 billion high-quality QA pairs may beat 500 billion random Google queries as training data.

visarga t1_j2axzal wrote on December 30, 2022 at 10:31 PM

#1,199,156

Replying to treedmt (#1,129,326)

There are approaches that combine multiple stages of language modelling and retrieval. See the paper "Demonstrate-Search-Predict: Composing retrieval and language models for knowledge-intensive NLP".

This paper is very interesting. They don't create or fine-tune new models. Instead they create sophisticated pipelines of language models and retrieval models. They even publish a new library and show this way of working with LMs.

Practically, by combining retrieval with language modelling it is possible to verify against references. The ability to freely combine these transformations opens up the path to consistency verification. An LM could check itself for contradictions.
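
To make "verify against references" a bit more concrete, here is a very rough toy sketch. It is not the paper's library or API: the token-overlap retriever, the 0.9 threshold, and the names `retrieve`, `is_supported`, and `check` are all assumptions standing in for real retrieval and language models. The idea is only that a generated claim is kept when some retrieved passage backs it up:

```python
import re

def tokens(text):
    """Lowercase word tokens as a set; a crude stand-in for a real retriever's features."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))

def retrieve(query, corpus, k=2):
    """Return the k passages that share the most tokens with the query."""
    q = tokens(query)
    ranked = sorted(corpus, key=lambda p: len(q & tokens(p)), reverse=True)
    return ranked[:k]

def is_supported(claim, passages, threshold=0.9):
    """Accept a claim only if some retrieved passage covers nearly all of its tokens."""
    c = tokens(claim)
    if not c:
        return False
    best = max((len(c & tokens(p)) / len(c) for p in passages), default=0.0)
    return best >= threshold

# Hypothetical reference corpus; a real system would query a search index instead.
corpus = [
    "Chinchilla is a 70 billion parameter language model from DeepMind.",
    "PaLM is a 540 billion parameter language model from Google.",
]

def check(claim):
    """Retrieve references for a claim, then test whether they support it."""
    return is_supported(claim, retrieve(claim, corpus))

supported = check("Chinchilla is a language model from DeepMind.")
unsupported = check("Chinchilla is a language model from OpenAI.")
```

Token overlap is far too weak for real use (it cannot tell a negated sentence from the original), so in practice the support check would itself be a model, which is where the self-consistency idea comes in.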

ChatGPT Could End Open Research in Deep Learning, Says Ex-Google Employee
