Recent comments in /f/MachineLearning
machineko t1_jecw2v4 wrote
Reply to comment by darkbluetwilight in [D]Suggestions on keeping Llama index cost down by darkbluetwilight
Cerebras-GPT models are Apache-2.0. You should be able to use them for free. Not sure what you mean by charges. Are you referring to using the hosted APIs?
Btw, you should use the ones that are instruction fine-tuned.
machineko t1_jecvhyt wrote
Reply to comment by Evening_Ad6637 in [D] Training a 65b LLaMA model by Business-Lead2679
16GB of RAM is not enough for even the smallest LLaMA 7B model. You can try LoRA with int8 as listed above. Did you try the Python script I linked above?
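To see why 16GB falls short, here's a back-of-envelope sketch of the memory needed just to hold the weights (my own illustrative numbers, not from the thread; activations and KV cache come on top of this):

```python
# Rough memory footprint for loading a model's weights, by precision.
def model_memory_gb(n_params: float, bytes_per_param: int) -> float:
    """Weights-only memory in GB for n_params parameters."""
    return n_params * bytes_per_param / 1e9

params_7b = 7e9
fp16_gb = model_memory_gb(params_7b, 2)  # 2 bytes/param in fp16
int8_gb = model_memory_gb(params_7b, 1)  # 1 byte/param with int8 quantization

print(f"7B in fp16: {fp16_gb:.0f} GB, in int8: {int8_gb:.0f} GB")
```

In fp16 the weights alone are ~14GB, which leaves essentially nothing on a 16GB machine once the OS and activations are counted; int8 quantization roughly halves that, which is why int8 plus LoRA is the usual way to fit a 7B model on small hardware.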
cthorrez t1_jecv5ox wrote
Code quality of my own and my team's code
The reliability of the engineering platforms we use. (spark, gpu clusters, ci/cd build pipelines)
The correctness and completeness of the data we ingest
AlmightySnoo t1_jecum2v wrote
Reply to [P] Introducing Vicuna: An open-source language model based on LLaMA 13B by Business-Lead2679
I think this sub should start enforcing the explicit mention of "NOT FREE (AS IN FREEDOM)" in the title and/or flair when people use the word "open-source" while there are restrictions in place. Yes, technically there's no lie, but it's still misleading (often intentionally), since many conflate open-source with free software (proof in the comments, where people are asking about exactly that). We should be discouraging this trend of "Smile! You should be happy I'm showing you the code, but you should only use it the way I tell you to" that OpenAI started; it's a huge regression, and it feels like we're back to the dark days before the GPL.
Art10001 t1_jecukzj wrote
Reply to [P] Introducing Vicuna: An open-source language model based on LLaMA 13B by Business-Lead2679
A Koala model is listed on the site. What is it?
Purplekeyboard t1_jecuaja wrote
Reply to [P] Introducing Vicuna: An open-source language model based on LLaMA 13B by Business-Lead2679
>Relative Response Quality Assessed by GPT-4
There's no way Bard is 93% as good as ChatGPT. Bard is dumb as hell, comparatively.
phire t1_jects6y wrote
Reply to comment by ktpr in [P] Introducing Vicuna: An open-source language model based on LLaMA 13B by Business-Lead2679
It gets a bit more complicated.
OpenAI can't actually claim copyright on the output of ChatGPT, so licensing something trained on ChatGPT output as MIT should be fine from a copyright perspective. But OpenAI do have terms and conditions that forbid using ChatGPT output to train an AI... I'm not sure how enforceable that is, especially when people put ChatGPT output all over the internet, making it near impossible to avoid in a training set.
As for retraining the LLaMA weights... presumably Facebook do hold copyright on the weights, which is extremely problematic for retraining them and relicensing them.
kulchacop t1_jectkc3 wrote
Reply to comment by joondori21 in [R] TaskMatrix.AI: Completing Tasks by Connecting Foundation Models with Millions of APIs - Yaobo Liang et al Microsoft 2023 by Singularian2501
- TaskMatrix.AI
- LangChain
- Toolformer
- ChatGPT plugins
What else we got?
Art10001 t1_jectiy3 wrote
Reply to [P] Introducing Vicuna: An open-source language model based on LLaMA 13B by Business-Lead2679
Soon: Huemul, Pudu, Condor.
itsyourboiirow t1_jecqjqd wrote
Reply to comment by Evening_Ad6637 in [D] Training a 65b LLaMA model by Business-Lead2679
Training requires significantly more memory than inference, as it has to keep track of the gradient for every parameter. I would check to see how much memory it takes up on your computer.
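A rough estimate of the overhead, assuming fp16 weights and gradients with a standard Adam optimizer (the byte counts are illustrative assumptions, not measurements):

```python
# Training-memory estimate: weights + gradients + Adam optimizer states.
def training_memory_gb(n_params: float, weight_bytes: int = 2,
                       grad_bytes: int = 2, opt_bytes: int = 8) -> float:
    """GB needed for parameters, gradients, and optimizer state.

    Adam keeps two fp32 moment tensors per parameter (8 bytes) on top of
    the fp16 weights (2 bytes) and fp16 gradients (2 bytes).
    Activation memory comes on top of this and depends on batch size.
    """
    return n_params * (weight_bytes + grad_bytes + opt_bytes) / 1e9

print(f"7B model: ~{training_memory_gb(7e9):.0f} GB before activations")
```

That works out to roughly 12 bytes per parameter before activations, which is why full fine-tuning of even a 7B model needs server-class GPUs while LoRA (which only trains small adapter matrices) does not.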
itsyourboiirow t1_jecqc1d wrote
Reply to comment by Nhabls in [D] Training a 65b LLaMA model by Business-Lead2679
This is the only downside I've found. Sometimes it's too darn hard to find an instance.
General-Wing-785 OP t1_jeconzd wrote
Reply to comment by SnooPears7079 in [D] What are your top 3 pain points as an ML developer in 2023? by General-Wing-785
Thank you!
[deleted] t1_jecolnj wrote
[deleted]
ktpr t1_jeco4so wrote
Reply to comment by cathie_burry in [P] Introducing Vicuna: An open-source language model based on LLaMA 13B by Business-Lead2679
I feel like a lot of folks are missing this point. They retrain on ChatGPT output or LLaMA-derived output and assume they can license it as MIT or some such.
nmfisher t1_jeco3nx wrote
Reply to comment by Business-Lead2679 in [D] Training a 65b LLaMA model by Business-Lead2679
Someone also mentioned https://jarvislabs.ai/ to me the other day, haven't used it myself but it looks promising.
big_ol_tender t1_jecny45 wrote
Reply to [P] Introducing Vicuna: An open-source language model based on LLaMA 13B by Business-Lead2679
Stop claiming fine-tuned LLaMA models are open source. They're not open source, and we can't use them for anything real.
MentesInquisitivas t1_jecm73g wrote
Reply to [P] Introducing Vicuna: An open-source language model based on LLaMA 13B by Business-Lead2679
Where are the weights?
SnooPears7079 t1_jecm2bt wrote
Best of luck with your startup
MentesInquisitivas t1_jeclydw wrote
Reply to comment by gmork_13 in [P] Introducing Vicuna: An open-source language model based on LLaMA 13B by Business-Lead2679
ChatGPT is 3.5; they define that as 100% and rate the rest relative to it. GPT-4 is only doing the evaluation.
joondori21 t1_jeckdey wrote
Reply to [R] TaskMatrix.AI: Completing Tasks by Connecting Foundation Models with Millions of APIs - Yaobo Liang et al Microsoft 2023 by Singularian2501
There have been like 5 renditions of this same concept already
gmork_13 t1_jecj9vo wrote
Reply to comment by wind_dude in [P] Introducing Vicuna: An open-source language model based on LLaMA 13B by Business-Lead2679
It'll be filled with copies of people attempting weird jailbreaks haha
[deleted] t1_jecibgx wrote
Reply to comment by Barton5877 in [R] The Debate Over Understanding in AI’s Large Language Models by currentscurrents
[deleted]
cathie_burry t1_jechk0t wrote
Reply to [P] Introducing Vicuna: An open-source language model based on LLaMA 13B by Business-Lead2679
Llama is not to be used for commercial purposes, but can I use something like this to code up part of my business?
pengo t1_jechdk0 wrote
Reply to comment by Barton5877 in [R] The Debate Over Understanding in AI’s Large Language Models by currentscurrents
> The long and short of it being that "understanding" is never going to be the right term for us to use.
Yet still I'm going to say "Wow, ChatGPT really understands the nuances of regex xml parsing" and also say, "ChatGPT has no understanding at all of anything" and leave it to the listener to interpret each sentence correctly.
> I don't know to what degree LLMs have "latent" conceptual connectedness, or whether this is presented only in the response to prompts.
concept, n.
- An abstract and general idea; an abstraction.
- Understanding retained in the mind, from experience, reasoning and imagination.
It's easy to avoid using "understanding" for being imprecise, but any other word you pick has the exact same problem.
pasr9 t1_jecwbqm wrote
Reply to comment by ktpr in [P] Introducing Vicuna: An open-source language model based on LLaMA 13B by Business-Lead2679
AI output is not currently copyrightable in the US.