Recent comments in /f/MachineLearning
light24bulbs t1_jeeb4ag wrote
Reply to comment by ASlowDanceWithDeath in [P] Introducing Vicuna: An open-source language model based on LLaMA 13B by Business-Lead2679
I think people don't know if they can legally do this under the LLaMA license.
It's part of why the LoRA approach to fine-tuning is so nice: you don't have to share the original weights.
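For anyone curious, here's a rough sketch of what that looks like with Hugging Face's peft library (the model path and config values below are placeholders, purely illustrative):

```python
# Rough sketch of LoRA fine-tuning with Hugging Face's peft library.
# The model path is a placeholder and the hyperparameters are illustrative.
from transformers import AutoModelForCausalLM
from peft import LoraConfig, get_peft_model, TaskType

base = AutoModelForCausalLM.from_pretrained("path/to/llama-13b")  # base weights stay untouched

config = LoraConfig(
    task_type=TaskType.CAUSAL_LM,
    r=8,                 # rank of the low-rank update matrices
    lora_alpha=16,       # scaling factor for the update
    lora_dropout=0.05,
    target_modules=["q_proj", "v_proj"],  # attention projections to adapt
)
model = get_peft_model(base, config)
model.print_trainable_parameters()  # only the small adapter matrices train

# ... fine-tune as usual, then:
model.save_pretrained("my-adapter")  # saves ONLY the adapter, not LLaMA itself
```

The adapter directory contains just the low-rank matrices, so you can distribute it without redistributing LLaMA's weights.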
light24bulbs t1_jeeb0rs wrote
Reply to comment by Alarming_Turnover578 in [P] Introducing Vicuna: An open-source language model based on LLaMA 13B by Business-Lead2679
No, it's a non-commercial license focused on research use cases.
tripple13 t1_jeeaumr wrote
Reply to comment by 314kabinet in [D][N] LAION Launches Petition to Establish an International Publicly Funded Supercomputing Facility for Open Source Large-scale AI Research and its Safety by stringShuffle
the AI Diversity, Equity and Inclusiveness community (AI Ethics)
light24bulbs t1_jeeaqrs wrote
Reply to comment by AlmightySnoo in [P] Introducing Vicuna: An open-source language model based on LLaMA 13B by Business-Lead2679
Agreed. At least this work on top of LLaMA is Apache 2.0.
ChuckSeven t1_jeeae4o wrote
Reply to comment by MysteryInc152 in [D] Can large language models be applied to language translation? by matthkamis
Look, it doesn't matter. You can't claim that LLMs are better if you don't demonstrate it on an established benchmark with a wide variety of translations. How should I know if those Japanese anime translations are correct? For all I know, it might just be "prettier" text but a wrong translation.
It's sad to get downvoted on this subreddit for insisting on very basic academic principles.
a_beautiful_rhind t1_jeeab0q wrote
People using creepy newspeak like "pain points".
sparkpuppy t1_jee9qj3 wrote
Reply to comment by Ricenaros in [D] Simple Questions Thread by AutoModerator
Hello, thank you so much for the detailed explanation! Yes, it definitely helps me have a clearer vision of the meaning of that expression. Have a nice day!
light24bulbs t1_jee9lvm wrote
Reply to comment by Tostino in [D][N] LAION Launches Petition to Establish an International Publicly Funded Supercomputing Facility for Open Source Large-scale AI Research and its Safety by stringShuffle
I agree with you. Looking at papers like Toolformer and so on, we are very close.
We are only a couple of years away from AGI, which is what I've been saying for YEARS while getting yelled at here. The WaitButWhy article in 2016 was dead right.
light24bulbs t1_jee9eey wrote
Reply to comment by gahblahblah in [D][N] LAION Launches Petition to Establish an International Publicly Funded Supercomputing Facility for Open Source Large-scale AI Research and its Safety by stringShuffle
So you went through and signed it?
light24bulbs t1_jee9dou wrote
Reply to comment by glichez in [D][N] LAION Launches Petition to Establish an International Publicly Funded Supercomputing Facility for Open Source Large-scale AI Research and its Safety by stringShuffle
Nice. You signed it then?
light24bulbs t1_jee9clk wrote
Reply to [D][N] LAION Launches Petition to Establish an International Publicly Funded Supercomputing Facility for Open Source Large-scale AI Research and its Safety by stringShuffle
Please everyone, click through to the link and sign it.
hapliniste t1_jee975h wrote
Reply to comment by Art10001 in [P] Introducing Vicuna: An open-source language model based on LLaMA 13B by Business-Lead2679
You can try it in the web demo, and to me it seems better than Vicuna. I guess they'll make an announcement soon.
Alarming_Turnover578 t1_jee6v5w wrote
Reply to comment by big_ol_tender in [P] Introducing Vicuna: An open-source language model based on LLaMA 13B by Business-Lead2679
Isn't LLaMA licensed under the GPL?
MysteryInc152 t1_jee5zba wrote
Reply to comment by ChuckSeven in [D] Can large language models be applied to language translation? by matthkamis
It's not cherry-picked lol.
Wild how everyone will just use that word even when they've clearly not tested the model themselves. I'm just showing you what anyone who's actually used these models for translation will tell you.
Liverpool67 t1_jee58qc wrote
Reply to [P] Introducing Vicuna: An open-source language model based on LLaMA 13B by Business-Lead2679
Tested the demo, but it looks like this model is worse than the base model on logical questions. Maybe the model is useful for basic questions.
Update: Tested with basic questions; still worse than the base model (13B).
a_beautiful_rhind t1_jee547c wrote
Reply to [P] Introducing Vicuna: An open-source language model based on LLaMA 13B by Business-Lead2679
512 context? I used alpaca-native, and even LLaMA + the Alpaca LoRA, with the full 2048 context. It worked fine.
>We plan to release the model weights by providing a version of delta weights that build on the original LLaMA weights, but we are still figuring out a proper way to do so.
This is where the weights currently "are".
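For anyone wondering what applying delta weights would involve, here's a minimal sketch, under the assumption that both checkpoints are plain PyTorch state dicts with matching keys (the file names are made up):

```python
# Minimal sketch of applying released delta weights to the original
# LLaMA weights. Assumes both checkpoints are plain PyTorch state
# dicts with matching keys; file names are placeholders.
import torch

base = torch.load("llama-13b.pt", map_location="cpu")
delta = torch.load("vicuna-13b-delta.pt", map_location="cpu")

# The fine-tuned model is recovered by adding the delta to each tensor.
merged = {name: base[name] + delta[name] for name in base}
torch.save(merged, "vicuna-13b.pt")
```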
Also.. do 30B next!
Edit.. playing with the demo:
>YOUR INPUT VIOLATES OPENAI CONTENT MODERATION API. PLEASE TRY AGAIN.
And "as a language model" replies.. including when I said I want to download the weights. The model says it can't be downloaded and has no "physical form". Ayy lmao.
Please stop training "OpenAI-isms" into models.
Tostino t1_jee40if wrote
Reply to [D][N] LAION Launches Petition to Establish an International Publicly Funded Supercomputing Facility for Open Source Large-scale AI Research and its Safety by stringShuffle
The possibility of creating an autonomous agent on current hardware is not as far-fetched as it may seem. A single capable engineer could conceivably build one by combining insights from the many papers already published in the field: novel algorithms, techniques, and architectures that could be integrated into a coherent, functional system. Moreover, the open-source tools available today, such as langchain/flow and pinecone db (or similar), provide the frameworks needed to assemble an architecture that is self-augmenting and self-refining. Such an architecture could leverage distributed computing, natural language processing, and machine learning to improve its own performance and capabilities over time, potentially matching or even surpassing human-level performance at most tasks.
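To make the "self-refining" part concrete, here is a toy critique-and-revise loop. `llm_complete` is a hypothetical stand-in for whatever model API you'd plug in; nothing here is a real framework.

```python
# Toy sketch of a self-refining agent loop: propose, critique, revise.
# llm_complete() is a hypothetical stand-in for any LLM API call.
def llm_complete(prompt: str) -> str:
    raise NotImplementedError("plug in your model/API of choice")

def solve(task: str, max_rounds: int = 3) -> str:
    # Initial attempt at the task.
    attempt = llm_complete(f"Task: {task}\nPropose a solution.")
    for _ in range(max_rounds):
        # Ask the model to critique its own output.
        critique = llm_complete(
            f"Task: {task}\nAttempt: {attempt}\n"
            "List concrete flaws, or reply OK if there are none."
        )
        if critique.strip() == "OK":
            break
        # Revise the attempt using the critique.
        attempt = llm_complete(
            f"Task: {task}\nAttempt: {attempt}\n"
            f"Critique: {critique}\nWrite an improved attempt."
        )
    return attempt
```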
FermiAnyon t1_jee3nc1 wrote
Reply to comment by turnip_burrito in [D] Turns out, Othello-GPT does have a world model. by Desi___Gigachad
What did you prompt it with? And what do you think of its answer?
hapliniste t1_jee3gvr wrote
Reply to [P] Introducing Vicuna: An open-source language model based on LLaMA 13B by Business-Lead2679
I tried some things in the web demo and it is really good.
What people haven't realised yet is that Koala (another model they haven't published anything about so far) is also available in the web demo, and it is CRAZY GOOD! It's also really fast, because I guess I'm the only one using it right now haha.
I really recommend trying it. It looks like Vicuna is a bit below GPT-3.5 and Koala a bit above, but I haven't tested it enough to be sure yet.
FermiAnyon t1_jee34lx wrote
Reply to comment by mattsverstaps in [D] Turns out, Othello-GPT does have a world model. by Desi___Gigachad
Glad you're here. This would be a really interesting chat for like a bar or a meetup or stunting ;)
But yeah, I'm just giving my impressions. I don't want to make any claims of authority or anything as I'm self taught with this stuff...
But yeah, I have no idea how our brains do it, but when you're building a model, whether it's a neural net or you're just factoring a matrix, you end up with a high-dimensional representation that gets used as an input to another layer or gets used straight away for classification. It may be overly broad, but I think of all of those high-dimensional representations as embeddings, and the dimensionality available for encoding an embedding as the embedding space.
Like if you were into sports and you wanted to organize your room so that distance represents the relationships between equipment. Maybe the baseball is right next to the softball, and the tennis racket is close to the table tennis paddle but a little farther from the baseball stuff. Then you've got some golf clubs, and they're kind of in one area of the room because they all involve hitting things with another thing. Then your kite-flying stuff, your fishing stuff, and your street luge stuff are about as far apart as possible from everything else, because it's not obvious, to me anyway, that they're related. Your room is a two-dimensional embedding space.
When models do it, they just do it with more dimensions and more concepts, but they learn where to put things so that the relationships are properly represented, and they learn all of that from lots of cleverly crafted examples.
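Here's a toy numpy version of that room analogy, with made-up 2-D coordinates for each piece of equipment and Euclidean distance standing in for relatedness:

```python
# Toy version of the "room as embedding space" analogy: 2-D coordinates
# for equipment, with Euclidean distance standing in for relatedness.
# The coordinates are invented purely for illustration.
import numpy as np

room = {
    "baseball":      np.array([1.0, 1.0]),
    "softball":      np.array([1.2, 1.1]),  # right next to the baseball
    "tennis_racket": np.array([2.5, 1.0]),
    "golf_clubs":    np.array([3.0, 2.5]),  # hitting-things-with-things corner
    "fishing_rod":   np.array([9.0, 9.0]),  # far from everything else
}

def dist(a: str, b: str) -> float:
    return float(np.linalg.norm(room[a] - room[b]))

print(dist("baseball", "softball"))     # small: closely related
print(dist("baseball", "fishing_rod"))  # large: unrelated
```

A learned embedding does the same thing with hundreds of dimensions, except the model places the points itself by training on examples.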
light24bulbs t1_jeeb9cx wrote
Reply to comment by gliptic in [P] Introducing Vicuna: An open-source language model based on LLaMA 13B by Business-Lead2679
Nice way to get around the license problem.
Is LoRA really associated with a quality loss? I thought it worked pretty well.