Recent comments in /f/MachineLearning
machineko t1_jecw2v4 wrote
Reply to comment by darkbluetwilight in [D]Suggestions on keeping Llama index cost down by darkbluetwilight
Cerebras-GPT models are Apache-2.0. You should be able to use them for free. Not sure what you mean by charges. Are you referring to using the hosted APIs?
Btw, you should use the ones that are instruction fine-tuned.
machineko t1_jecvhyt wrote
Reply to comment by Evening_Ad6637 in [D] Training a 65b LLaMA model by Business-Lead2679
16GB of RAM is not enough for even the smallest LLaMA 7B model. You can try LoRA with int8 as listed above. Did you try the Python script I linked above?
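To see why 16GB falls short, here's a back-of-envelope sketch of the memory needed just to hold the weights (my own illustrative numbers, not from the thread; activations and KV cache come on top of this):

```python
# Rough memory footprint for loading a model's weights, by precision.
def model_memory_gb(n_params: float, bytes_per_param: int) -> float:
    """Weights-only memory in GB for n_params parameters."""
    return n_params * bytes_per_param / 1e9

params_7b = 7e9
fp16_gb = model_memory_gb(params_7b, 2)  # 2 bytes/param in fp16
int8_gb = model_memory_gb(params_7b, 1)  # 1 byte/param with int8 quantization

print(f"7B in fp16: {fp16_gb:.0f} GB, in int8: {int8_gb:.0f} GB")
```

In fp16 the weights alone are ~14GB, which leaves essentially nothing on a 16GB machine once the OS and activations are counted; int8 quantization roughly halves that, which is why int8 plus LoRA is the usual way to fit a 7B model on small hardware.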
cthorrez t1_jecv5ox wrote
Code quality of my own and my team's code
The reliability of the engineering platforms we use. (spark, gpu clusters, ci/cd build pipelines)
The correctness and completeness of the data we ingest
AlmightySnoo t1_jecum2v wrote
Reply to [P] Introducing Vicuna: An open-source language model based on LLaMA 13B by Business-Lead2679
I think this sub should start enforcing the explicit mention of "NOT FREE (AS IN FREEDOM)" in the title and/or flair when people use the word "open-source" while there are restrictions in place. Yes, technically there's no lie, but it's still misleading (often intentionally), since many conflate open-source with free software (proof in the comments, where people are asking about exactly that). We should be discouraging this trend of "Smile! You should be happy I'm showing you the code, but you should only use it the way I tell you to" that OpenAI started; it's a huge regression, and it feels like we're back to the dark days before the GPL.
Art10001 t1_jecukzj wrote
Reply to [P] Introducing Vicuna: An open-source language model based on LLaMA 13B by Business-Lead2679
A Koala model is listed on the site. What is it?
Purplekeyboard t1_jecuaja wrote
Reply to [P] Introducing Vicuna: An open-source language model based on LLaMA 13B by Business-Lead2679
>Relative Response Quality Assessed by GPT-4
There's no way Bard is 93% as good as ChatGPT. Bard is dumb as hell, comparatively.
phire t1_jects6y wrote
Reply to comment by ktpr in [P] Introducing Vicuna: An open-source language model based on LLaMA 13B by Business-Lead2679
It gets a bit more complicated.
OpenAI can't actually claim copyright on the output of ChatGPT, so licensing something trained on ChatGPT output as MIT should be fine from a copyright perspective. But OpenAI do have terms and conditions that forbid using ChatGPT output to train an AI... I'm not sure how enforceable that is, especially when people put ChatGPT output all over the internet, making it near impossible to avoid in a training set.
As for retraining the LLaMA weights... presumably Facebook do hold copyright on the weights, which is extremely problematic for retraining them and relicensing them.
kulchacop t1_jectkc3 wrote
Reply to comment by joondori21 in [R] TaskMatrix.AI: Completing Tasks by Connecting Foundation Models with Millions of APIs - Yaobo Liang et al Microsoft 2023 by Singularian2501
- TaskMatrix.AI
- LangChain
- Toolformer
- ChatGPT plugins
What else we got?
Art10001 t1_jectiy3 wrote
Reply to [P] Introducing Vicuna: An open-source language model based on LLaMA 13B by Business-Lead2679
Soon: Huemul, Pudu, Condor.
itsyourboiirow t1_jecqjqd wrote
Reply to comment by Evening_Ad6637 in [D] Training a 65b LLaMA model by Business-Lead2679
Training requires significantly more memory than inference, as it has to keep track of the gradient for every parameter. I would check to see how much memory it takes up on your computer.
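A rough estimate of the overhead, assuming fp16 weights and gradients with a standard Adam optimizer (the byte counts are illustrative assumptions, not measurements):

```python
# Training-memory estimate: weights + gradients + Adam optimizer states.
def training_memory_gb(n_params: float, weight_bytes: int = 2,
                       grad_bytes: int = 2, opt_bytes: int = 8) -> float:
    """GB needed for parameters, gradients, and optimizer state.

    Adam keeps two fp32 moment tensors per parameter (8 bytes) on top of
    the fp16 weights (2 bytes) and fp16 gradients (2 bytes).
    Activation memory comes on top of this and depends on batch size.
    """
    return n_params * (weight_bytes + grad_bytes + opt_bytes) / 1e9

print(f"7B model: ~{training_memory_gb(7e9):.0f} GB before activations")
```

That works out to roughly 12 bytes per parameter before activations, which is why full fine-tuning of even a 7B model needs server-class GPUs while LoRA (which only trains small adapter matrices) does not.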
itsyourboiirow t1_jecqc1d wrote
Reply to comment by Nhabls in [D] Training a 65b LLaMA model by Business-Lead2679
This is the only downside I've found. Sometimes it's too darn hard to find an instance.
General-Wing-785 OP t1_jeconzd wrote
Reply to comment by SnooPears7079 in [D] What are your top 3 pain points as an ML developer in 2023? by General-Wing-785
Thank you!
[deleted] t1_jecolnj wrote
[deleted]
ktpr t1_jeco4so wrote
Reply to comment by cathie_burry in [P] Introducing Vicuna: An open-source language model based on LLaMA 13B by Business-Lead2679
I feel like a lot of folks are missing this point. They retrain on ChatGPT output or LLaMA-derived output and assume they can license it as MIT or some such.
nmfisher t1_jeco3nx wrote
Reply to comment by Business-Lead2679 in [D] Training a 65b LLaMA model by Business-Lead2679
Someone also mentioned https://jarvislabs.ai/ to me the other day, haven't used it myself but it looks promising.
big_ol_tender t1_jecny45 wrote
Reply to [P] Introducing Vicuna: An open-source language model based on LLaMA 13B by Business-Lead2679
Stop claiming fine-tuned LLaMA models are open source. They're not open source, and we can't use them for anything real.
MentesInquisitivas t1_jecm73g wrote
Reply to [P] Introducing Vicuna: An open-source language model based on LLaMA 13B by Business-Lead2679
Where are the weights?
SnooPears7079 t1_jecm2bt wrote
Best of luck with your startup
MentesInquisitivas t1_jeclydw wrote
Reply to comment by gmork_13 in [P] Introducing Vicuna: An open-source language model based on LLaMA 13B by Business-Lead2679
ChatGPT is 3.5; they define that as 100% and rate the rest relative to it. GPT-4 is only doing the evaluation.
joondori21 t1_jeckdey wrote
Reply to [R] TaskMatrix.AI: Completing Tasks by Connecting Foundation Models with Millions of APIs - Yaobo Liang et al Microsoft 2023 by Singularian2501
There have been like 5 renditions of this same concept already
gmork_13 t1_jecj9vo wrote
Reply to comment by wind_dude in [P] Introducing Vicuna: An open-source language model based on LLaMA 13B by Business-Lead2679
It'll be filled with copies of people attempting weird jailbreaks haha
[deleted] t1_jecibgx wrote
Reply to comment by Barton5877 in [R] The Debate Over Understanding in AI’s Large Language Models by currentscurrents
[deleted]
cathie_burry t1_jechk0t wrote
Reply to [P] Introducing Vicuna: An open-source language model based on LLaMA 13B by Business-Lead2679
Llama is not to be used for commercial purposes, but can I use something like this to code up part of my business?
pengo t1_jechdk0 wrote
Reply to comment by Barton5877 in [R] The Debate Over Understanding in AI’s Large Language Models by currentscurrents
> The long and short of it being that "understanding" is never going to be the right term for us to use.
Yet still I'm going to say "Wow, ChatGPT really understands the nuances of regex xml parsing" and also say, "ChatGPT has no understanding at all of anything" and leave it to the listener to interpret each sentence correctly.
> I don't know to what degree LLMs have "latent" conceptual connectedness, or whether this is presented only in the response to prompts.
concept, n.
- An abstract and general idea; an abstraction.
- Understanding retained in the mind, from experience, reasoning and imagination.
It's easy to avoid using "understanding" for being imprecise, but any other word you pick has the exact same problem.
pasr9 t1_jecwbqm wrote
Reply to comment by ktpr in [P] Introducing Vicuna: An open-source language model based on LLaMA 13B by Business-Lead2679
AI output is not currently copyrightable in the US.