From Yann LeCun, one of the central figures in the ML field, who's also the "Chief AI Scientist" at Meta
Submitted by starstruckmon t3_121kkcq in singularity
Reply to comment by Taenk in [Research] Alpaca 7B language model running on my Pixel 7 by simpleuserhere
🤷
Sometimes models just come out crap. Like BLOOM, which has almost the same number of parameters as GPT3 but is absolute garbage in any practical use case. Like a kid from two smart parents that turns out dumb. Just blind chance.
Or they could be wrong. 🤷
Reply to comment by yaosio in [P] The next generation of Stanford Alpaca by [deleted]
It's not about copyright
https://www.reddit.com/r/MachineLearning/comments/11v4h5z/-/jct0s11
Reply to comment by throwaway957280 in [P] The next generation of Stanford Alpaca by [deleted]
They are. It's less to do with copyright and more to do with the fact that you signed the T&C before using their system ( and then broke them ). It's similar to the LinkedIn data scraping case where the court ruled that it wasn't illegal for them to scrape ( nor did it violate copyright ), but they still got in trouble ( and had to settle ) because of violating the T&C.
One way around this is to have two parties, one generating and publishing the dataset ( doesn't violate the T&C ) and another independent party ( who didn't sign the T&C ) fine-tuning a model on the dataset.
Reply to comment by A1-Delta in [P] The next generation of Stanford Alpaca by [deleted]
Just publish the diff between the original model and the finetuned one. That's what a lot of people are doing to avoid any license issues.
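Roughly what that looks like ( a minimal sketch, assuming both checkpoints are plain PyTorch state dicts with float weights; the file names are made up ):

```python
import torch

# Hypothetical file names; both checkpoints have to share the same architecture.
base = torch.load("llama-7b.pt", map_location="cpu")
tuned = torch.load("alpaca-7b.pt", map_location="cpu")

# Publish only the per-parameter difference, never the original weights.
# ( Integer / quantized buffers would need separate handling. )
delta = {name: tuned[name] - base[name] for name in tuned}
torch.save(delta, "alpaca-7b-delta.pt")

# Anyone who already has the base weights can reconstruct the finetuned model.
restored = {name: base[name] + delta[name] for name in delta}
torch.save(restored, "alpaca-7b-restored.pt")
```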
Reply to [P] The next generation of Stanford Alpaca by [deleted]
There are already a couple of high-quality instruction datasets/compilations, like FLAN, that I think should also be mixed in.
Be sure to check the generated dataset for issues. Might require some cleanup like the original did.
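Mixing them in is straightforward with the HuggingFace `datasets` library. A rough sketch, assuming both sources have already been converted to the same instruction/input/output schema ( the file names are placeholders ):

```python
from datasets import load_dataset, interleave_datasets

# Placeholder file names; both files assumed to use the same columns.
generated = load_dataset("json", data_files="generated_instructions.json", split="train")
flan = load_dataset("json", data_files="flan_subset.json", split="train")

# Interleave with a bias toward the generated data; tune the probabilities.
mixed = interleave_datasets([generated, flan], probabilities=[0.7, 0.3], seed=42)

# Quick pass for obvious issues ( empty outputs, truncated responses, etc. ).
cleaned = mixed.filter(lambda ex: len(ex["output"].strip()) > 0)
```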
Reply to comment by Damitrix in [P] The next generation of Stanford Alpaca by [deleted]
He's using the actual API.
Reply to comment by Taenk in [Research] Alpaca 7B language model running on my Pixel 7 by simpleuserhere
I've heard from some experienced testers that the 33B model is shockingly bad compared to even the 13B one, despite what the benchmarks say. That we should either use the 65B one ( very good apparently ) or stick to 13B/7B. Not because of any technical reason, but because of the random luck/chance involved in training these models and the resulting quality.
I wonder if there's any truth to it. If you've tested it yourself, I'd love to hear what you thought.
Reply to comment by timedacorn369 in [Research] Alpaca 7B language model running on my Pixel 7 by simpleuserhere
You can see some benchmarks here
Reply to comment by SaifKhayoon in [R] We found nearly half a billion duplicated images on LAION-2B-en. by von-hust
No
Yes
Reply to comment by Spire_Citron in Likelihood of OpenAI moderation flagging a sentence containing negative adjectives about a demographic as 'Hateful'. by grungabunga
He's talking about the human preference data used for RLHF fine-tuning ( which is what turns GPT3 into ChatGPT ). It's not really that massive.
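For context, that preference data is mostly just ranked pairs of responses used to fit a reward model, which then guides the RL step. A toy sketch of the standard pairwise loss ( the numbers are made-up stand-ins for reward model scores ):

```python
import torch
import torch.nn.functional as F

# Stand-in scores a reward model would assign to the human-preferred ( "chosen" )
# and rejected response of each comparison pair.
reward_chosen = torch.tensor([1.2, 0.3, 2.1])
reward_rejected = torch.tensor([0.4, 0.9, 1.5])

# Pairwise ( Bradley-Terry style ) loss: push the chosen score above the rejected one.
loss = -F.logsigmoid(reward_chosen - reward_rejected).mean()
print(loss.item())
```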
Reply to comment by jturp-sc in [N] Microsoft announces new "next-generation" LLM, will be integrated with Bing and Edge by currentscurrents
Looks so dated..
Reply to comment by farmingvillein in [N] Google: An Important Next Step On Our AI Journey by EducationalCicada
Fair enough. I was speaking from a practical perspective, considering the types of questions that people typically ask search engines, not benchmarks.
Reply to comment by farmingvillein in [N] Google: An Important Next Step On Our AI Journey by EducationalCicada
It's not just better; wrong information from these models is pretty rare, unless the source it's retrieving from is also false. The LM basically just acts as a summarization tool.
I don't think it needs to be 100% resolved for it to be a viable replacement for a search engine.
Reply to comment by st8ic in [N] Google: An Important Next Step On Our AI Journey by EducationalCicada
Retrieval augmented models ( whether via architecture or prompt ) don't have that issue.
Even GPT3-API-based services like perplexity.ai, which do retrieval augmentation through just the prompt, don't spew wrong information all that much.
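Prompt-level retrieval augmentation is basically this ( everything here is a toy stand-in: the documents, the scoring, and the final API call is omitted ):

```python
# Toy sketch: fetch relevant snippets, stuff them into the prompt,
# and let the LM answer only from those sources.
documents = [
    "The Pixel 7 ships with Google's Tensor G2 chip.",
    "Alpaca 7B is a LLaMA model finetuned on instruction data.",
    "BLOOM is a 176B-parameter open multilingual language model.",
]

def retrieve(query: str, docs: list[str], k: int = 2) -> list[str]:
    # Naive word-overlap scoring; a real system would use BM25 or embeddings.
    q = set(query.lower().split())
    return sorted(docs, key=lambda d: len(q & set(d.lower().split())), reverse=True)[:k]

def build_prompt(query: str) -> str:
    context = "\n".join(f"- {d}" for d in retrieve(query, documents))
    return (
        "Answer the question using only the sources below. "
        "If they don't contain the answer, say so.\n"
        f"Sources:\n{context}\n\nQuestion: {query}\nAnswer:"
    )

# This string would then be sent to the GPT3 API.
print(build_prompt("What chip is in the Pixel 7?"))
```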
Reply to comment by Surur in What will happen to the Amish people when the singularity happens? by uswhole
Depends on which Amish faction/group we're talking about. It's not so much that they can't use technology, but more that they can't be connected to the outside/grid. So some groups even use solar panels and battery powered electrical equipment. They also have fridges that run on gas.
Reply to Possible first look at GPT-4 by tk854
Looks like perplexity.ai, but better ( maybe? )
Doesn't feel very GPT4. Seems like simple retrieval augmentation.
Reply to comment by Flaky_Preparation_50 in Former Instagram Co-Founders Launch AI-Powered Personalized News App, Artifact by Flaky_Preparation_50
It's not really the colours. It's the layout. Feels dated and blog-spammy. It breaks some fundamental rules of modern responsive website design, such as not putting borders on the sides, especially thick ones.
I don't think it's something you'll be able to fix in a second. Take some time with it.
Reply to comment by Flaky_Preparation_50 in Former Instagram Co-Founders Launch AI-Powered Personalized News App, Artifact by Flaky_Preparation_50
It's fine. I think the main issue is I'm on mobile right now. Desktop website is better.
Reply to comment by Flaky_Preparation_50 in Former Instagram Co-Founders Launch AI-Powered Personalized News App, Artifact by Flaky_Preparation_50
Good. The format is perfect. All that needs work is the visual design of the site, and automating this to work on a large scale ( I don't know how much of that is automated ).
Reply to comment by SuddenlyBANANAS in [R] Extracting Training Data from Diffusion Models by pm_me_your_pay_slips
>find exact duplicates of images which were not in the training data, you'd have a point
The process isn't exactly the same, but isn't this how all the diffusion-based editing techniques work?
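Something like the img2img pipeline, which starts from an arbitrary image, partially noises it, and denoises it toward a prompt. A rough sketch with `diffusers` ( the checkpoint name, file paths and strength are just examples; argument names can differ between library versions ):

```python
import torch
from PIL import Image
from diffusers import StableDiffusionImg2ImgPipeline

# An arbitrary image that was never in the training data.
init_image = Image.open("my_photo.png").convert("RGB")  # placeholder path

pipe = StableDiffusionImg2ImgPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5", torch_dtype=torch.float16
).to("cuda")

# Lower strength keeps more of the original image; higher strength rewrites more of it.
edited = pipe(
    prompt="the same scene, as an oil painting",
    image=init_image,
    strength=0.6,
    guidance_scale=7.5,
).images[0]
edited.save("edited.png")
```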
Reply to comment by pm_me_your_pay_slips in [R] Extracting Training Data from Diffusion Models by pm_me_your_pay_slips
From paper
>Our attack extracts images from Stable Diffusion most often when they have been duplicated at least k = 100 times
That's for the 100 number. The 10 is supposed to be the number of epochs, but I don't think it was trained on that many. More like 5 or so ( you can look at the model card; it's not easy to give an exact number ).
Reply to comment by IDoCodingStuffs in [R] Extracting Training Data from Diffusion Models by pm_me_your_pay_slips
They also manually annotated the top 1000 results, adding only 13 more images. The number you're replying to counted those.
Reply to comment by Carrasco_Santo in [P] OpenAssistant is now live on reddit (Open Source ChatGPT alternative) by pixiegirl417
It's unlikely that the main problem is the RLHF data and not the base model.