Recent comments in /f/MachineLearning
fawkesdotbe t1_jedi3oo wrote
Reply to comment by Disastrous_Elk_6375 in [D][N] LAION Launches Petition to Establish an International Publicly Funded Supercomputing Facility for Open Source Large-scale AI Research and its Safety by stringShuffle
Frankly EuroHPC has been very good with that. See eg LUMI: https://lumi-supercomputer.eu/lumi-consortium/
EuphoricPenguin22 t1_jedhyci wrote
Reply to comment by phire in [P] Introducing Vicuna: An open-source language model based on LLaMA 13B by Business-Lead2679
Using training data without explicit permission is (probably) considered fair use in the United States. There are currently active court cases on this exact issue here in the U.S., most notably Getty Images (US), Inc. v. Stability AI, Inc. That case is still far from a decision, but it will likely be directly responsible for setting precedent on this matter. There are a few other cases happening in other parts of the world, and depending on where you are specifically, different laws or regulations may already be in place that clarify this area of law. I believe there is another case against Stability AI in the UK, and I've heard the EU is considering (or may have already added) an opt-out provision, but I'm not sure.
ReasonableObjection t1_jedg8ki wrote
Reply to comment by grotundeek_apocolyps in [D] AI Explainability and Alignment through Natural Language Internal Interfaces by jackfaker
Could you point me to some fundamental research around intelligence in general (forget the math or the computers)? We have already demonstrated that it does not matter whether intelligence emerges biologically or artificially, or whether it is implemented in silicon or some disgusting wetware... what matters is the emergent behaviors that result from it.
That's the part I don't think people understand: you can remove the computers, remove the humans, remove the coding, and just think about how an intelligent agent would work in any environment, and you arrive at the same dangerous conclusions. We don't currently know how to solve for them.
You observe them in any intelligence; after all, anybody can argue that not everything humans do is beneficial to what mother nature programmed us to do (make more babies).
Again, this is not about math or coding... we just haven't solved some basic questions...
For example, can you give me a definition of intelligence where an inferior general intelligent agent (biological or not) would be able to control a superior one over a long enough timeline? Because all of our definitions currently lead us to conclude the answer is no.
Also, if you have done any looking into this, you would realize that even if we could solve these problems, we currently lack the ability to code the solutions into the models to make sure they are safe.
I'm not trying to overwhelm you with doomer arguments... I'm genuinely curious and searching for opposing views that are actually researched and thought out, versus some hand-waving about how we will fix it in prod, or "haha, you're so dumb because you think Skynet is coming" (this tech will be able to kill us long before it is as cool as Skynet)... I'm asking for evidence because the people who have actually thought about this for 30+ years still haven't been able to solve these very basic problems, which are not math or computer-coding problems.
Any serious researcher that wants to continue despite the danger assumes we will solve the problems before we run out of runway, not that we have solved any of these problems...
I would absolutely bet on human ingenuity to solve these problems given enough time based on the history of human ingenuity... the danger is we will run out of time, not that we can't solve the problem... unfortunately for this particular problem we don't get a do-over like others...
Disastrous_Elk_6375 t1_jedf0de wrote
Reply to [D][N] LAION Launches Petition to Establish an International Publicly Funded Supercomputing Facility for Open Source Large-scale AI Research and its Safety by stringShuffle
This is a great initiative! Let's just hope it doesn't go the way of that classic joke: the French will build it, the Germans will make it funny, the British will teach it about food, etc...
trashacount12345 t1_jeddoei wrote
Reply to comment by TitusPullo4 in [R] The Debate Over Understanding in AI’s Large Language Models by currentscurrents
This is the agreed upon definition in philosophy. I’m not sure what another definition would be besides “it’s not real”.
maizeq t1_jedcga5 wrote
Reply to comment by roselan in [P] Introducing Vicuna: An open-source language model based on LLaMA 13B by Business-Lead2679
~6.5 GB if 4-bit quantised.
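Rough back-of-the-envelope for where that figure comes from (a sketch only; it ignores activation memory, the KV cache, and any layers kept in higher precision):

```python
# ~13B parameters at 4 bits each.
params = 13e9
bits_per_param = 4
total_bytes = params * bits_per_param / 8
print(f"{total_bytes / 1e9:.1f} GB")  # -> 6.5 GB (roughly 6 GiB)
```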
grotundeek_apocolyps t1_jedbi44 wrote
Reply to comment by ReasonableObjection in [D] AI Explainability and Alignment through Natural Language Internal Interfaces by jackfaker
There is indeed a community of people who think in the way that you have described. They don't know what they're talking about and their research agenda is fundamentally unsound. Nothing that you just wrote is based on an accurate understanding of the science or mathematics of real machine learning.
I'd like to give you a better response than that, but I'm honestly not sure how to. What do you say to someone who is interested in math and very enthusiastic about the problem of counting the angels on the head of a pin? I say that not to be insulting, but to illustrate the magnitude of the divide between how you perceive the field of machine learning and the reality of it.
ReasonableObjection t1_jed8z42 wrote
Reply to comment by grotundeek_apocolyps in [D] AI Explainability and Alignment through Natural Language Internal Interfaces by jackfaker
The one part you are wrong about is "an AI agent harming humans despite having been designed not to do so". We don't currently know how to build a model that does not devolve into harming humans, even when we are trying to design it not to do so... Keep in mind that a lot of the academic and theoretical concerns that have been discussed over the last 30 years and sounded like science fiction have become very real in the last 6 months (ahead of most serious researchers' timelines and assumed degree of difficulty) and are currently being demonstrated by existing models like ChatGPT.
I don't understand how it is a made-up concern when all the ways we have to train these models lead to these negative end-states and we don't currently have a solution to the problem... and none of this is a surprise, considering it is exactly what we observe happening in the real world when it comes to emergent intelligent behavior (artificial or biological).
I also don't think a lot of people understand that these are not "coding" problems... we cannot solve the very basic problems that arise from intelligence, let alone code solutions into a model.
Even if we could solve these problems (and I'm not one to bet against human ingenuity over a long enough timeline), it is important to understand that we can't currently code the solutions into our models. We don't code these models, we train them, and we can only observe their external output and alignment and infer that everything is fine. We have no way to verify even that... and we haven't even gotten to the bigger problem that we have no idea what is going on inside to make them spit out those outputs, and, even more importantly, no way to probe their inner alignment (a whole other problem).
I agree with you that until we develop an AGI that is superior to a human agent none of that matters, but the danger is that we don't understand how these models work any better than we understand how a human brain works, and by definition we won't know when the point of no return has been crossed, because of how we design these things...
The danger is running out of runway before somebody accidentally crosses the threshold... If that happens the whole "bad actor uses AI to do bad things" will be the least of our worries. I would argue we already live in that world if you look at things like social media, Facebook newsfeed and etc...
It is really important for people to understand it can get a lot worse... way worse than they can imagine because we humans would not be able to imagine it by definition (we would not be as intelligent).
Edit - the only part where I would argue you are wrong... I absolutely agree with you that less-than-AGI capabilities can and unfortunately will do huge amounts of damage before an AGI ever becomes a threat... hell, if we're lucky, we will use those to kill ourselves before an AGI does, because it could be even worse if an AGI did it...
Philpax t1_jed8vd0 wrote
Deploying anything developed with Python to an end-user's machine
Philpax t1_jed8s0x wrote
shitasspetfuckers t1_jed7vuu wrote
Reply to comment by SeymourBits in [D] I just realised: GPT-4 with image input can interpret any computer screen, any userinterface and any combination of them. by Balance-
Can you please clarify what specifically you have tried, and what was the outcome?
shitasspetfuckers t1_jed796l wrote
Reply to comment by Qzx1 in [D] I just realised: GPT-4 with image input can interpret any computer screen, any userinterface and any combination of them. by Balance-
> Google's Spotlight paper
https://ai.googleblog.com/2023/02/a-vision-language-approach-for.html
grotundeek_apocolyps t1_jed6u66 wrote
Reply to comment by ReasonableObjection in [D] AI Explainability and Alignment through Natural Language Internal Interfaces by jackfaker
There are real concerns about the impacts of AI on the world, and they all pertain to the ways in which humans choose to use it. That is not the subject matter under consideration in "AI alignment" or "AI safety"; the term that is used for this is usually "AI ethics".
"AI alignment" /"safety" is about trying to prevent AIs from autonomously deciding to harm humans despite having been designed to not do so. This is a made up concern about a type of machine that doesn't exist yet that is predicated entirely on ideas from science fiction.
EvenAtTheDoors t1_jed5rft wrote
Reply to comment by Dapper_Cherry1025 in [P] Introducing Vicuna: An open-source language model based on LLaMA 13B by Business-Lead2679
Yeah, I know what you're talking about. The lower-parameter models output text that doesn't truly synthesize new information in surprising ways. It's often shallow and comes off as artificial. Even though it knows a lot, it seems like a sophisticated search engine rather than an actual language model.
phire t1_jed57od wrote
Reply to comment by pasr9 in [P] Introducing Vicuna: An open-source language model based on LLaMA 13B by Business-Lead2679
Hang on, that guidance only covers generated outputs, not weights.
I just assumed weights would be like compiled code, which is also produced by a fully mechanical process but is copyrightable because of its inputs... Then again, most of the training data (by volume) going into machine learning models isn't owned by the company.
yehiaserag t1_jed4dee wrote
Reply to [P] Introducing Vicuna: An open-source language model based on LLaMA 13B by Business-Lead2679
I'm lost: it says open-source... but I can't see any mention of the weights, a download link, or a Hugging Face repo.
On the website it says "We plan to release the model weights by providing a version of delta weights that build on the original LLaMA"
Please no LoRA for that; LoRA is always associated with degraded inference quality.
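For anyone unsure what "delta weights" means in practice, here is a minimal sketch of the idea: you add the released deltas to the original LLaMA weights to recover the full model. The file names and the simple additive scheme are assumptions; the actual release may ship its own conversion script.

```python
import torch

# Hypothetical delta-weight merge: final weight = original LLaMA weight + released delta.
# Paths and the additive scheme are assumptions, not the project's actual tooling.
base = torch.load("llama-13b/consolidated.00.pth", map_location="cpu")
delta = torch.load("vicuna-13b-delta/consolidated.00.pth", map_location="cpu")

# Assumes both checkpoints share the same parameter names.
merged = {name: tensor + delta[name] for name, tensor in base.items()}

torch.save(merged, "vicuna-13b/consolidated.00.pth")
```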
ReasonableObjection t1_jed3b66 wrote
Reply to comment by grotundeek_apocolyps in [D] AI Explainability and Alignment through Natural Language Internal Interfaces by jackfaker
What do you mean?
ATX_Analytics t1_jed1xnu wrote
Reply to [P] Introducing Vicuna: An open-source language model based on LLaMA 13B by Business-Lead2679
It's as good as Bard but way off from ChatGPT. Pretty neat though.
ASlowDanceWithDeath t1_jed1ezw wrote
Reply to [P] Introducing Vicuna: An open-source language model based on LLaMA 13B by Business-Lead2679
Will you be making the weights available?
polawiaczperel t1_jed1e9h wrote
Reply to [P] Introducing Vicuna: An open-source language model based on LLaMA 13B by Business-Lead2679
I was playing with LLaMA 7B, 13B, 30B, 65B, and Alpaca 30B (native and LoRA), but this seems to be much better, and it is only 13B. Nice! Will they share the weights?
Dapper_Cherry1025 t1_jecz0th wrote
Reply to [P] Introducing Vicuna: An open-source language model based on LLaMA 13B by Business-Lead2679
Something about these distillations feels fundamentally different than when interacting with the larger models. The responses feel a lot more... I don't really know? Artificial? Weird way to phrase it, but I definitely get a sense that this method seems to be missing something fundamental, not to say that it couldn't be useful in other cases. Like, to me it is lacking some "spark" of intelligence that you can sorta see with GPT-3.5 and definitely see with GPT-4.
That being said, more models to compare and contrast against are always welcome! And Vicuna does seem able to produce text that is quite amazing for its size! Hell, considering where we were 2 years ago compared to today, it'll be really exciting to see how far these approaches can go over the next couple of months/years.
inglandation t1_jecxwxl wrote
Reply to [P] Introducing Vicuna: An open-source language model based on LLaMA 13B by Business-Lead2679
What happens when we run out of camelids to name those models?
BoiElroy t1_jecx0mw wrote
Reply to [P] Introducing Vicuna: An open-source language model based on LLaMA 13B by Business-Lead2679
Is it "open source" though? ...
If anyone knows, I'd also be curious: if you took a model that was not open source and then fine-tuned it by unfreezing the weights of some intermediate layers, would it always remain not open source because of its initial state?
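For reference, "unfreezing the weights of some intermediate layers" looks roughly like this in PyTorch. The toy model and the choice of which layer to unfreeze are placeholders to illustrate the setup, not any particular released model:

```python
import torch.nn as nn

# Toy stand-in for a pretrained network.
model = nn.Sequential(
    nn.Linear(16, 32),   # early layer (kept frozen)
    nn.Linear(32, 32),   # intermediate layer we choose to fine-tune
    nn.Linear(32, 4),    # head (kept frozen here)
)

# Freeze everything, then unfreeze only the chosen intermediate layer.
for param in model.parameters():
    param.requires_grad = False
for param in model[1].parameters():
    param.requires_grad = True

print([name for name, p in model.named_parameters() if p.requires_grad])
# -> ['1.weight', '1.bias']
```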
pasr9 t1_jecwvck wrote
Reply to comment by phire in [P] Introducing Vicuna: An open-source language model based on LLaMA 13B by Business-Lead2679
Facebook does not hold copyright to the weights for the same reason it does not hold copyright to the output of its models: neither the weights nor the output meet the threshold of copyrightability. Both are new works created by a purely mechanical process that lacks direct human authorship and creativity (two of the prerequisites for copyright to apply).
For more information: https://www.federalregister.gov/documents/2023/03/16/2023-05321/copyright-registration-guidance-works-containing-material-generated-by-artificial-intelligence
Adventurous-Mouse849 t1_jedi4wq wrote
Reply to comment by viertys in [D] Improvements/alternatives to U-net for medical images segmentation? by viertys
For augmentation, that's all bases covered. For more high-level or fully generative tasks I would also suggest mix-match (a convex combination between similar samples), but you can't justify that here because you would have to relabel. Ultimately this does come down to too few images. If there's a publicly available pretrained CT segmentation model, you could fine-tune it on your task, or distill its weights into your model... just make sure they did a good job in the first place.
Also some other notes: I'd suggest sticking with distribution losses, i.e. cross-entropy. U-Net is sensitive to normalization, so I'd also suggest training both with and without normalized inputs.
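A minimal sketch of those two suggestions (cross-entropy as the loss, plus an easy switch for normalized vs. unnormalized inputs); `model` and `optimizer` here stand in for whatever U-Net variant and training setup you're actually using:

```python
import torch.nn.functional as F

def training_step(model, images, masks, optimizer, normalize_inputs=True):
    # Optional per-batch standardisation; dataset-level statistics would also work.
    if normalize_inputs:
        images = (images - images.mean()) / (images.std() + 1e-6)

    logits = model(images)                     # (N, num_classes, H, W)
    loss = F.cross_entropy(logits, masks)      # masks: (N, H, W) integer class labels
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()
```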