Recent comments in /f/MachineLearning
2muchnet42day t1_jee275g wrote
Reply to comment by gahblahblah in [D][N] LAION Launches Petition to Establish an International Publicly Funded Supercomputing Facility for Open Source Large-scale AI Research and its Safety by stringShuffle
It's kinda ironic there already is an OpenAI and it's exactly the opposite.
Let us hope these initiatives get all the necessary support and that they stay loyal to their foundational concepts.
HerrMozart1 t1_jee26ot wrote
Reply to [P] Introducing Vicuna: An open-source language model based on LLaMA 13B by Business-Lead2679
Very impressive results! Any chance to get access to the weights for research purposes?
Barton5877 t1_jee1fw4 wrote
Reply to comment by pengo in [R] The Debate Over Understanding in AI’s Large Language Models by currentscurrents
That the definition of concept you're citing here uses the term "understanding" is incidental - clearly it's a definition of concept in the context of human reasoning.
Whatever terminology we ultimately use for the connectedness of neural networks pre-trained on language is fine by me. It should be as precise to the technology as possible whilst conveying whatever effects of "intelligence" are appropriate.
We're at the point now where GPT-4 seems to produce connections that come from a place that's difficult to find or reverse engineer - or perhaps which simply come from token selections that are surprising.
That's what I take away from a lot of the discussion at the moment - I have no personal insight into the model's design, or the many parts that are stitched together to make it work as it does (quoting Altman here talking to Lex).
turnip_burrito t1_jee0mso wrote
Reply to comment by FermiAnyon in [D] Turns out, Othello-GPT does have a world model. by Desi___Gigachad
Here's what GPT4 chimed in with (it lines up with what I've heard and read before):
===========
In an N-dimensional space with N >> 1, where M random vectors are dispersed with each coordinate sampled from a uniform distribution within a bounded range, we can make a few observations about the distances between these vectors:
High-dimensional space behavior: In high-dimensional spaces, the behavior of distances between random points differs significantly from what we would expect in lower-dimensional spaces like 2D or 3D. In high-dimensional spaces, most points tend to be far away from each other, and the distances between them are often more similar than they would be in lower-dimensional spaces.
Concentration of distances: As the dimensionality N increases, the pairwise distances between the M random vectors tend to concentrate around a specific value. The distances between points in high-dimensional spaces tend to be less varied than in lower-dimensional spaces, and the majority of the distances will be relatively close to the expected value. This phenomenon is known as the "concentration of measure."
Sparse representation: In high-dimensional spaces, the points are often sparsely distributed, which means that there is a lot of empty space between them. This can lead to a counterintuitive property, where increasing the number of dimensions might actually make it easier to distinguish between the points, as they become more "isolated" in the space.
Curse of dimensionality: While the above-mentioned properties might sometimes be useful, high-dimensional spaces can also give rise to the "curse of dimensionality." This term refers to various challenges that arise when working with high-dimensional data, such as increased computational complexity, difficulty in visualizing the data, and the need for a larger number of samples to obtain statistically meaningful results.
To summarize, in an N-dimensional space with N >> 1, the M random vectors will generally be far from each other, with their pairwise distances concentrating around a specific value. High-dimensional spaces exhibit unique properties like the concentration of measure and sparse representation, but they also come with challenges like the curse of dimensionality.
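For what it's worth, the concentration effect described above is easy to check numerically. Here's a minimal numpy sketch (the point counts and dimensions are arbitrary choices, not anything from the paper) showing that the relative spread of pairwise distances shrinks as the dimension grows:

```python
import numpy as np

rng = np.random.default_rng(0)

def distance_spread(n_dims, n_points=500):
    """Mean and relative spread of pairwise Euclidean distances
    between points sampled uniformly from the unit hypercube."""
    points = rng.uniform(0.0, 1.0, size=(n_points, n_dims))
    # Squared distances via the Gram-matrix identity, to avoid
    # materializing an (n_points, n_points, n_dims) array.
    sq = (points ** 2).sum(axis=1)
    d2 = sq[:, None] + sq[None, :] - 2.0 * points @ points.T
    dists = np.sqrt(np.clip(d2, 0.0, None))
    upper = dists[np.triu_indices(n_points, k=1)]
    return upper.mean(), upper.std() / upper.mean()

for n_dims in (2, 10, 100, 1000, 10000):
    mean, rel = distance_spread(n_dims)
    print(f"dims={n_dims:>5}  mean={mean:8.3f}  std/mean={rel:.4f}")
```

The mean distance keeps growing (roughly like sqrt(N/6) for the unit cube), while std/mean collapses toward zero, which is exactly the "most pairs are about equally far apart" behavior.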
gliptic t1_jee0fbk wrote
Reply to comment by yehiaserag in [P] Introducing Vicuna: An open-source language model based on LLaMA 13B by Business-Lead2679
Delta weights don't mean LoRA. They're just the difference between the new weights and the original weights (e.g. an elementwise subtraction, or an XOR of the raw bits).
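In case it helps, here's a rough sketch of applying an additive delta; the file names and the plain state-dict format are assumptions for illustration, not the actual release layout:

```python
import torch

def apply_delta(base_path: str, delta_path: str, out_path: str) -> None:
    """Recover the released weights by adding the delta to the base
    (e.g. original LLaMA) weights, tensor by tensor. Illustrative only:
    this is a subtraction-style delta; XOR schemes operate on the raw
    bytes instead."""
    base = torch.load(base_path, map_location="cpu")
    delta = torch.load(delta_path, map_location="cpu")
    merged = {name: base[name] + tensor for name, tensor in delta.items()}
    torch.save(merged, out_path)

# Hypothetical file names, just to show the call shape:
apply_delta("llama-13b.pt", "vicuna-13b-delta.pt", "vicuna-13b.pt")
```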
FermiAnyon t1_jee0e86 wrote
Reply to comment by turnip_burrito in [D] Turns out, Othello-GPT does have a world model. by Desi___Gigachad
Sounds legit :)
Philpax t1_jee04jo wrote
Reply to comment by General-Wing-785 in [D] What are your top 3 pain points as an ML developer in 2023? by General-Wing-785
It's just difficult to wrangle all of the dependencies; I want to be able to wrap an entire model in a completely isolated black box that I can call into with a C API or similar.
That is, I'd like something like https://github.com/ggerganov/llama.cpp/blob/master/llama.h without having to rewrite the entire model.
For my use cases, native would be good, but web would be a nice-to-have. (With enough magic, a native solution could potentially be compiled to WebAssembly?)
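To make the shape of that concrete, here's a hypothetical sketch of what "black box behind a C ABI" looks like from the host side via ctypes; the library name and every symbol below are made up for illustration, not llama.h's actual API:

```python
import ctypes

# Hypothetical shared library exposing the whole model behind a C ABI.
lib = ctypes.CDLL("./libmodel.so")

# Opaque handle in, bytes out; the host never sees the internals.
lib.model_load.argtypes = [ctypes.c_char_p]
lib.model_load.restype = ctypes.c_void_p
lib.model_generate.argtypes = [ctypes.c_void_p, ctypes.c_char_p,
                               ctypes.c_char_p, ctypes.c_size_t]
lib.model_generate.restype = ctypes.c_int
lib.model_free.argtypes = [ctypes.c_void_p]

handle = lib.model_load(b"weights.bin")
buf = ctypes.create_string_buffer(4096)
lib.model_generate(handle, b"Hello", buf, ctypes.sizeof(buf))
print(buf.value.decode("utf-8"))
lib.model_free(handle)
```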
FermiAnyon t1_jee03oc wrote
Reply to comment by turnip_burrito in [D] Turns out, Othello-GPT does have a world model. by Desi___Gigachad
My pretty tenuous grasp of the idea makes me think of stuff like... if you're measuring Euclidean distance or cosine similarity between two points that represent completely unrelated concepts, what would that distance or that angle be? And ideally, all things that are completely unrelated, if you did a pairwise comparison, would have that same distance or angle. And the embedding space would be large enough to accommodate that. It sounds to me like kind of a limit property that may only be possible to approximate, because there are lots of ideas and only so many dimensions to fit them in...
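That intuition is easy to poke at numerically, for whatever it's worth. A minimal sketch (sizes arbitrary): random directions in high dimensions end up nearly orthogonal to each other, so "unrelated" pairs really do cluster around one angle:

```python
import numpy as np

rng = np.random.default_rng(0)

def cosine_spread(n_dims, n_vecs=1000):
    """Distribution of cosine similarity over all pairs of random unit vectors."""
    v = rng.normal(size=(n_vecs, n_dims))
    v /= np.linalg.norm(v, axis=1, keepdims=True)
    sims = (v @ v.T)[np.triu_indices(n_vecs, k=1)]
    return sims.mean(), sims.std()

for n_dims in (2, 10, 100, 1000, 10000):
    mean, std = cosine_spread(n_dims)
    print(f"dims={n_dims:>5}  mean cos={mean:+.4f}  std={std:.4f}")
```

The spread shrinks like 1/sqrt(N), so with a finite number of dimensions the "everything unrelated is exactly orthogonal" picture can only ever be approximated, which matches the limit-property framing above.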
ReasonableObjection t1_jedzrt7 wrote
Reply to comment by dansmonrer in [D] AI Explainability and Alignment through Natural Language Internal Interfaces by jackfaker
Thank you for your detailed response. So to be clear, you are saying that things like emergent goals in independent agents, or those agents having convergent instrumental goals, are made up or not a problem? Do you have any resources that describe intelligence or solving the alignment problem in ways that are not dangerous? I'm aware of some research that looks promising, but I'm curious if you have others.
ZetaReticullan t1_jedziv4 wrote
Reply to comment by gahblahblah in [D][N] LAION Launches Petition to Establish an International Publicly Funded Supercomputing Facility for Open Source Large-scale AI Research and its Safety by stringShuffle
You must be new to planet earth. The scenario you imagine will NEVER see the light of day. If I have to explain that to you, then you're a looo-ng way from home - boyo!
ZetaReticullan t1_jedzbd5 wrote
Reply to comment by MrFlufypants in [D][N] LAION Launches Petition to Establish an International Publicly Funded Supercomputing Facility for Open Source Large-scale AI Research and its Safety by stringShuffle
!r/UsernameChecksOut
You're far too intelligent for that handle. No, you're NOT WRONG. What you wrote is EXACTLY what would happen, because companies are not going to rest on their laurels and watch their opportunity to dominate the world be snatched away without a "fight" (fair or unfair).
hadaev t1_jedzbbc wrote
Reply to comment by MrFlufypants in [D][N] LAION Launches Petition to Establish an International Publicly Funded Supercomputing Facility for Open Source Large-scale AI Research and its Safety by stringShuffle
I think the main idea is to open-source whatever gets trained on this thing.
OpenAI wants to share their datasets and train a new GPT? Well, nice for everyone.
ZCEyPFOYr0MWyHDQJZO4 t1_jedz3ps wrote
Reply to comment by wind_dude in [P] Introducing Vicuna: An open-source language model based on LLaMA 13B by Business-Lead2679
I'm guessing there's some PII/questionable data that couldn't easily be filtered.
dansmonrer t1_jedya4g wrote
Reply to comment by ReasonableObjection in [D] AI Explainability and Alignment through Natural Language Internal Interfaces by jackfaker
I don't think intelligence in general is something machine learning people even want to define. Psychologists do, with different schools of thought, including behaviorism (which heavily influenced reinforcement learning, and of which B.F. Skinner was one of the main figures) and then cognitivism, theory of mind... The few attempts I have seen at the intersection of psychological science and ML have met heavy backlash from both sides, for reasons both justified and unjustified. The truth is that some people will probably have to go against the tide at some point, but they will also need to ground their approach very well in existing frameworks. Conclusion: try to be excellent in both psychology and ML; the field you are describing has yet to become scientific.
ZetaReticullan t1_jedy3uf wrote
Reply to comment by pasr9 in [P] Introducing Vicuna: An open-source language model based on LLaMA 13B by Business-Lead2679
Citation(s) please.
ZetaReticullan t1_jedxxww wrote
Reply to [P] Introducing Vicuna: An open-source language model based on LLaMA 13B by Business-Lead2679
I said it before, and I'll say it again: this is a WILD time to be alive.
benfavre t1_jedx7pb wrote
Reply to [P] Introducing Vicuna: An open-source language model based on LLaMA 13B by Business-Lead2679
It's a pity that neither weights nor training data are made available.
ChuckSeven t1_jedwq3n wrote
Reply to comment by nullbyte420 in [D] Turns out, Othello-GPT does have a world model. by Desi___Gigachad
Would be nice to have a "philosophy for AI scientists" article, just like the "machine learning mathematics for physicists" work. Something nice and concise.
gahblahblah t1_jedwifr wrote
Reply to [D][N] LAION Launches Petition to Establish an International Publicly Funded Supercomputing Facility for Open Source Large-scale AI Research and its Safety by stringShuffle
This could be it - true open AI. Maybe this could be the answer to AI alignment and democratising AI - empowering humanity as a whole. Disarming the arms race, and working in cooperation.
wind_dude t1_jedvs9b wrote
Reply to comment by gmork_13 in [P] Introducing Vicuna: An open-source language model based on LLaMA 13B by Business-Lead2679
That’d actually be pretty cool to see; you could train some classifiers pretty quickly and pull some interesting stats on how people are using ChatGPT.
Hoping someone publishes the dataset.
anothererrta t1_jedvpu5 wrote
Reply to comment by yehiaserag in [P] Introducing Vicuna: An open-source language model based on LLaMA 13B by Business-Lead2679
If you read the blog post, you will actually see the weights mentioned.
[deleted] t1_jedvojw wrote
Reply to comment by big_ol_tender in [P] Introducing Vicuna: An open-source language model based on LLaMA 13B by Business-Lead2679
[deleted]
[deleted] t1_jedvdeo wrote
Reply to comment by wind_dude in [P] Introducing Vicuna: An open-source language model based on LLaMA 13B by Business-Lead2679
[deleted]
ChuckSeven t1_jeduwfs wrote
Reply to comment by MysteryInc152 in [D] Can large language models be applied to language translation? by matthkamis
The presented examples are intriguing, but your general statements require a proper evaluation. Afaik, no bilingual LLM has yet beaten the state of the art on an established translation benchmark.
pm_me_your_pay_slips t1_jee2xtt wrote
Reply to comment by phire in [P] Introducing Vicuna: An open-source language model based on LLaMA 13B by Business-Lead2679
Perhaps applicable to the generated outputs of the model, but it’s not a clear case for the inputs used as training data. It could very well end up in the same situation as sampling in the music industry, which is transformative, yet people using samples have to “clear” them by asking for permission (usually involving money).