Recent comments in /f/MachineLearning
kulchacop t1_jefjgf9 wrote
Reply to [D][N] LAION Launches Petition to Establish an International Publicly Funded Supercomputing Facility for Open Source Large-scale AI Research and its Safety by stringShuffle
Origins of Communism without human governance /s
aliasaria t1_jefj33h wrote
Reply to comment by lxe in [R] LLaMA-Adapter: Efficient Fine-tuning of Language Models with Zero-init Attention by floppy_llama
It's a very different way to finetune a model efficiently.
All these tools try to nudge an existing large model, without having to nudge all the weights.
A simplistic explanation of LoRA is that it freezes the whole pretrained model and learns small low-rank update matrices for some of the existing weights, so only those small factors get nudged.
This tool, instead, adds new learnable weights on top of the existing model, as extra tokens prepended to the prompt at the model's top layers.
One advantage of LoRA, in this case, is that you can merge your LoRA finetuned weights into the original model, and the result is a new model that is exactly the same size and shape as the original. With the technique in this paper, however, the final model is a different shape from the original model. But the concept is arguably simpler.
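To make that merge step concrete, here's a minimal sketch; the dimensions, rank, and init values are made up for illustration:

```python
import torch

# Toy shapes for illustration: W is a frozen pretrained weight matrix,
# A and B are the low-rank factors a LoRA run would learn (r << d).
d, r = 4096, 8
W = torch.randn(d, d)          # pretrained weight, frozen during finetuning
A = torch.randn(d, r) * 0.02   # learned low-rank factor
B = torch.zeros(r, d)          # LoRA initializes the second factor to zero

W_merged = W + A @ B           # exactly the same shape as W, so the merged
                               # model is the same size as the original
```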
currentscurrents t1_jefj08w wrote
Definitely a GPU farm, probably Nvidia A100s, but nobody knows for sure because it's closed source.
If you want to run an image generator locally, head over to /r/StableDiffusion
LartoriaPendragon t1_jefiqoi wrote
Reply to [D] Simple Questions Thread by AutoModerator
What programming languages besides Python are often used in industry for machine learning applications or projects? What are some relevant technologies I should be looking to learn?
aliasaria t1_jefih93 wrote
Reply to comment by CasulaScience in [R] LLaMA-Adapter: Efficient Fine-tuning of Language Models with Zero-init Attention by floppy_llama
A short answer is that it is "just different". It's another way to tweak an existing LLM to do a new task without having to finetune the whole system. Conceptually, this way is simpler than LoRA and seems to work as well as or better than it.
In the paper, the authors mention that one advantage is that you can use this technique to add new modalities. The whole method works by adding to the prompt at the topmost layer(s), so you can inject not just words but also tokens that come from, say, an image. They have an example at the top of page 4 with a picture of a baby opening a door.
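If it helps, here's a rough sketch of the zero-init attention idea. This is not the paper's exact code; the module name, 10-token prompt length, and single-gate design are my simplifications:

```python
import torch
import torch.nn as nn

class ZeroInitAdapterAttention(nn.Module):
    """Sketch: learnable prompt tokens attended to through a zero-init gate,
    so at step 0 the pretrained model's behavior is unchanged."""
    def __init__(self, dim: int, n_prompt: int = 10):
        super().__init__()
        self.prompt = nn.Parameter(torch.randn(n_prompt, dim) * 0.02)
        self.gate = nn.Parameter(torch.zeros(1))  # zero-init gating factor

    def forward(self, h: torch.Tensor) -> torch.Tensor:
        # h: (batch, seq, dim) hidden states from a frozen top layer
        k = v = self.prompt.unsqueeze(0).expand(h.size(0), -1, -1)
        attn = torch.softmax(h @ k.transpose(-2, -1) / h.size(-1) ** 0.5, dim=-1)
        return h + torch.tanh(self.gate) * (attn @ v)  # no-op at initialization

layer = ZeroInitAdapterAttention(dim=64)
out = layer(torch.randn(2, 16, 64))  # identical to the input at step 0
```

For the multimodal case, you'd swap (or augment) the learned prompt with projected image features instead of purely learned tokens.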
bacocololo t1_jefgty4 wrote
Reply to [P] Introducing Vicuna: An open-source language model based on LLaMA 13B by Business-Lead2679
I don't understand: if art made by an AI is free, then let's just say ChatGPT's output is art :)…
mejdounarodni t1_jeff83b wrote
Reply to [D] Simple Questions Thread by AutoModerator
Hey, I don't know how relevant this is, but are there any voice cloning tools for important languages other than English, such as Spanish, Russian, or Mandarin Chinese? So far I have only found them for English and, I think, French. I have seen some sites claiming they work for other languages since, arguably, you can type in text in any language you want... but the phonemes used to recreate what you have written are those of English, so it's a bit absurd, really. Any tips would be appreciated.
hangerguardian t1_jefcq6e wrote
Reply to [P] Introducing Vicuna: An open-source language model based on LLaMA 13B by Business-Lead2679
I don't know if you can call it an open source model without releasing the model...
Pas7alavista t1_jefcp4e wrote
Reply to comment by mattsverstaps in [D] Turns out, Othello-GPT does have a world model. by Desi___Gigachad
Embedding is a way to map the high-dimensional vectors in your input space to a lower-dimensional space.
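For example, a word in a 50,000-token vocabulary is conceptually a 50,000-dimensional one-hot vector, and an embedding layer maps it down to something dense like 128 dimensions (numbers arbitrary):

```python
import torch
import torch.nn as nn

# Map a 50,000-word vocabulary (a 50,000-dim one-hot input space)
# into a dense 128-dimensional embedding space.
embed = nn.Embedding(num_embeddings=50_000, embedding_dim=128)
token_ids = torch.tensor([[17, 942, 3]])  # (batch=1, seq=3)
vectors = embed(token_ids)
print(vectors.shape)  # torch.Size([1, 3, 128])
```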
Sopel97 t1_jefa735 wrote
Reply to comment by phire in [P] Introducing Vicuna: An open-source language model based on LLaMA 13B by Business-Lead2679
"terms and conditions" means that at worst openai will restrict your access to chatgpt, no?
farleyknight t1_jef8a4v wrote
Reply to comment by MentesInquisitivas in [P] Introducing Vicuna: An open-source language model based on LLaMA 13B by Business-Lead2679
I had the exact same question! Just found on the GitHub page
> We plan to release the model weights by providing a version of delta weights that build on the original LLaMA weights, but we are still figuring out a proper way to do so. In this example, we demonstrate the usage of our distributed serving system using OPT models. Later, you can apply similar commands to serve Vicuna, just as shown in our demo.
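In other words, they'd ship only the difference from the base model. A toy sketch of the idea (the names and shapes here are made up):

```python
import torch

# Toy illustration of delta weights: distribute (finetuned - base), and
# anyone who already has the base LLaMA weights can reconstruct the model.
base = {"layers.0.weight": torch.randn(4, 4)}                     # original weights
delta = {k: torch.randn_like(v) * 0.01 for k, v in base.items()}  # released diff
vicuna = {k: base[k] + delta[k] for k in base}                    # reconstructed model
```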
LetGoAndBeReal t1_jef6vjx wrote
Reply to comment by Philpax in [R] TaskMatrix.AI: Completing Tasks by Connecting Foundation Models with Millions of APIs - Yaobo Liang et al Microsoft 2023 by Singularian2501
Right, ReAct seems to be the core pattern that everyone - including LangChain with their agents and OpenAI with their plugins - is using.
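For anyone unfamiliar, the ReAct loop is roughly: the model emits a Thought and an Action (a tool call), the runtime executes the tool and feeds back an Observation, and this repeats until the model gives a final answer. A toy sketch, with a canned `fake_llm` standing in for a real model call:

```python
import re

def fake_llm(transcript: str) -> str:
    """Hypothetical stand-in for a real LLM call; returns canned steps."""
    if "Observation:" not in transcript:
        return "Thought: I should look this up.\nAction: search[LLaMA-Adapter]"
    return "Final Answer: it finetunes LLaMA with learnable prompt tokens."

tools = {"search": lambda q: f"(stub search results for {q!r})"}

def react(question: str, llm=fake_llm, max_steps: int = 5) -> str:
    transcript = f"Question: {question}\n"
    for _ in range(max_steps):
        out = llm(transcript)                     # model emits Thought + Action
        transcript += out + "\n"
        m = re.search(r"Action: (\w+)\[(.*)\]", out)
        if m is None:                             # no tool call: model answered
            return out
        tool, arg = m.groups()
        transcript += f"Observation: {tools[tool](arg)}\n"  # feed result back
    return transcript

print(react("What is LLaMA-Adapter?"))
```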
Scew t1_jef6kiu wrote
Reply to comment by 2muchnet42day in [D][N] LAION Launches Petition to Establish an International Publicly Funded Supercomputing Facility for Open Source Large-scale AI Research and its Safety by stringShuffle
Lol they opened up their AI to be trained for free for "research purposes..." Sounds similar to how certain corporations profited greatly from recent events over the past couple of years... Wonder if they'll even go as far as calling people some kind of hero for helping them make a bigger profit >.>
Scew t1_jef616t wrote
Reply to [D][N] LAION Launches Petition to Establish an International Publicly Funded Supercomputing Facility for Open Source Large-scale AI Research and its Safety by stringShuffle
If a corporation were trying to discredit an open-source alternative to one of their projects, it might look like spreading negative propaganda about the open-source alternative or highlighting its perceived weaknesses. For example:
FUD: The corporation may spread Fear, Uncertainty, and Doubt (FUD) about the open-source alternative, such as by suggesting that it is not secure, reliable, or compatible with other systems.
Highlighting perceived weaknesses: The corporation may highlight perceived weaknesses of the open-source alternative, such as by emphasizing areas where it falls short compared to the corporation's proprietary solution.
Undermining community support: The corporation may attempt to undermine community support for the open-source alternative by spreading misinformation about the project's development or suggesting that it lacks the necessary resources to succeed.
Offering alternative solutions: The corporation may offer alternative solutions that they claim are superior to the open-source alternative, such as by highlighting their own proprietary products or services.
Funding competitors: The corporation may fund competitors who are developing similar solutions to the open-source alternative, with the intention of creating negative publicity or drawing attention away from the alternative.
These tactics can be effective in diminishing support for the open-source alternative, but they can also be perceived as unethical and manipulative, potentially damaging the corporation's reputation and relationship with the open-source community.
Jean-Porte t1_jef5jjj wrote
Reply to comment by kulchacop in [R] TaskMatrix.AI: Completing Tasks by Connecting Foundation Models with Millions of APIs - Yaobo Liang et al Microsoft 2023 by Singularian2501
SingularityNET
KD_A OP t1_jef4p6g wrote
Reply to comment by nbviewerbot in [P] CAPPr: use OpenAI or HuggingFace models to easily do zero-shot text classification by KD_A
ty bro
nbviewerbot t1_jef48jq wrote
Reply to [P] CAPPr: use OpenAI or HuggingFace models to easily do zero-shot text classification by KD_A
I see you've posted a GitHub link to a Jupyter Notebook! GitHub doesn't render large Jupyter Notebooks, so just in case, here is an nbviewer link to the notebook:
https://nbviewer.jupyter.org/url/github.com/kddubey/cappr/blob/main/demos/copa.ipynb
Want to run the code yourself? Here is a binder link to start your own Jupyter server and try it out!
https://mybinder.org/v2/gh/kddubey/cappr/main?filepath=demos%2Fcopa.ipynb
^(I am a bot.) ^(Feedback) ^(|) ^(GitHub) ^(|) ^(Author)
Snohoe1 t1_jef1g8b wrote
Reply to [P] Introducing Vicuna: An open-source language model based on LLaMA 13B by Business-Lead2679
Hope they release the weights before Facebook decides to DMCA them or something... We need to break OpenAI's monopoly sooner rather than later.
vin227 t1_jeezsuf wrote
Reply to comment by fawkesdotbe in [D][N] LAION Launches Petition to Establish an International Publicly Funded Supercomputing Facility for Open Source Large-scale AI Research and its Safety by stringShuffle
Plenty of teething pains with LUMI sadly. Maybe some day it will be reliable.
GirlScoutCookieGrow t1_jeeyz2u wrote
Reply to "[D]" Is wandb.ai worth using? by frodo_mavinchotil
You don't have to log your code. I don't think there's much sense in being paranoid about it anyway. What do you think will happen, they'll go through everyone's code and try to steal ideas? That's silly.
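That said, if you do want to turn it off, recent wandb versions have a `save_code` flag on `wandb.init`; double-check the docs for your version:

```python
import wandb

# save_code controls whether wandb uploads your source files with the run
# (flag name per recent wandb versions; confirm against your version's docs).
run = wandb.init(project="my-project", save_code=False)
run.log({"loss": 0.42})
run.finish()
```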
LinuxSpinach t1_jeexz48 wrote
Reply to comment by Alarming_Turnover578 in [P] Introducing Vicuna: An open-source language model based on LLaMA 13B by Business-Lead2679
Only the code for initializing and training the model has been released under GPL... which leaves a substantial gap toward having anything useful. You would still have to replicate all of the training to produce weights that you can use commercially, which is a bridge too far for most individuals and small businesses.
AllowFreeSpeech t1_jeevp3b wrote
Reply to [D][N] LAION Launches Petition to Establish an International Publicly Funded Supercomputing Facility for Open Source Large-scale AI Research and its Safety by stringShuffle
What bothers me is that most researchers don't care to use any model compression or efficiency techniques. They want others to pay for their architectural inefficiencies. IMO such funding could be a bad idea if it were to stop competition of neural architectures, and a good idea otherwise.
For example, is matrix-matrix multiplication necessary, or can matrix-vector multiplication do the job? Similarly, are dense networks necessary, or can sparse networks do the job? Alternatively, the funding could go toward the engineering of optical and analog hardware that is significantly more power efficient.
shn29 OP t1_jefkbkb wrote
Reply to comment by currentscurrents in [D] [R] On what kind of machine does Midjorney, the art generating AI, runs on? by shn29
Very closed! I couldn't find anything. It sure is fun to play with, and I guess they're training it that way. There's nothing on what they plan to do with it in the future, etc.