Recent comments in /f/MachineLearning

Pas7alavista t1_jeg5dhh wrote

>so the extra dimensions are unnecessary

Yes, one reason for embedding is to extract relevant features.

Also, any finite-dimensional inner product space has an orthonormal basis, and the math is easiest that way, so there's not much reason to describe a space using non-orthogonal dimensions. There is nothing stopping you from doing so, though.
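As a rough illustration of the orthonormal-basis point, here is a minimal Gram-Schmidt sketch in plain Python. The `skewed` basis and the helper names `dot` and `gram_schmidt` are made up for this example; it only shows that a non-orthogonal basis can be turned into an orthonormal one spanning the same space.

```python
import math


def dot(u, v):
    """Standard Euclidean inner product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def gram_schmidt(vectors, tol=1e-12):
    """Return an orthonormal basis for the span of `vectors`.

    Uses the "modified" Gram-Schmidt update: each projection is removed
    from the running remainder rather than from the original vector.
    """
    basis = []
    for v in vectors:
        w = list(v)
        for b in basis:
            c = dot(w, b)
            w = [wi - c * bi for wi, bi in zip(w, b)]
        norm = math.sqrt(dot(w, w))
        if norm > tol:  # skip vectors that are (numerically) dependent
            basis.append([wi / norm for wi in w])
    return basis


# Hypothetical non-orthogonal basis of R^3, chosen only for illustration.
skewed = [[1.0, 1.0, 0.0], [1.0, 0.0, 1.0], [0.0, 1.0, 1.0]]
ortho = gram_schmidt(skewed)

if __name__ == "__main__":
    for row in ortho:
        print([round(x, 4) for x in row])
```

Both `skewed` and `ortho` describe the same space; the orthonormal version just makes coordinates and lengths easier to compute.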

>Doesn't it suggest a pattern in data if a mapping is found that reduces dimension

Yeah, generally you wouldn't attempt to use ML methods on data where you think there is no pattern.

>Something something Linear algebra

I think you might be thinking about the span and/or basis, but it's hard for me to interpret your question.

2

darthmeck t1_jeg46cc wrote

I don’t know how they’d go about doing this, but there need to be provisions ensuring that it can never become a for-profit agency. OpenAI gained traction by doing cutting-edge research and touting it as open to the public (or at least researchers) and then pulled the rug out from under everyone when they struck gold. If LAION discovers a new architecture that dwarfs the capability of LLMs, they should never be able to say “ok time to start a company and mint billions now!”.

7

Dapper_Cherry1025 t1_jefywqj wrote

Well, that's probably because I specifically asked it to use an internal monologue. I think what I'm trying to say is that each part of its response does seem to flow in a logical way that I found easy to understand. Heck, when I refined my prompt down for 3.5 I was able to get it to admit that it couldn't come up with a solution when I tried to get a more complicated example.

I also find it very interesting that when ChatGPT starts a sentence with something like "Yes, because..." I know right away that the answer is probably incorrect, because after it replies "Yes" it will then try to justify the yes even if it is wrong. However, if you can get it to investigate a problem as shown in the example, it can actually try different things before arriving at a solution.

6

turnip_burrito t1_jefysiz wrote

My prompt:

> Suppose I have an N>>1 dimensional space, finite in extent along any given axis, in which a set of M random vectors are dispersed (each coordinate of each vector is randomly sampled from a uniform distribution spanning some bounded range of the space). What can we say about the distances in this space between the M vectors?

I left my prompt open ended to not give it any ideas one way or another.

Its response makes sense to me. Each squared distance is a sum over N independent coordinates, so by the same averaging that shrinks the standard deviation of a mean of random samples, the spread of the distances relative to their typical size should shrink as dimension N grows. If N is large, then the distribution of pairwise distances will narrow until nearly all points are roughly the same distance from each other. (The random sampling is a way to build in lack of correlation, like how you mentioned unrelated ideas.)

Of course, the reverse is also true: if dimension N is small, then originally "far" points will become closer or farther (which of the two happens is unpredictable and depends on which dimensions are removed) because the averaging over random sample fluctuations disappears.
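A quick way to check this numerically is a small simulation. The sketch below is only illustrative: the function name `relative_spread`, the point count, the range [-1, 1], and the seed are all arbitrary choices, not anything from the original prompt or response. It samples uniform random points and reports the standard deviation of the pairwise distances divided by their mean.

```python
import math
import random
import statistics
from itertools import combinations


def relative_spread(n_dims, n_points=40, seed=0):
    """Std / mean of pairwise Euclidean distances between uniform random points."""
    rng = random.Random(seed)
    points = [
        [rng.uniform(-1.0, 1.0) for _ in range(n_dims)]
        for _ in range(n_points)
    ]
    dists = [math.dist(p, q) for p, q in combinations(points, 2)]
    return statistics.pstdev(dists) / statistics.mean(dists)


if __name__ == "__main__":
    # The ratio should fall as the dimension grows.
    for n in (2, 10, 100, 1000):
        print(n, round(relative_spread(n), 3))
```

If the argument above holds, the printed ratio should fall steadily as the dimension increases, meaning the points look more and more equidistant.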

2

mattsverstaps t1_jefvu45 wrote

So the extra dimensions are unnecessary? I just realised that there could be some situations in which non orthogonal dimensions are preferable. I can’t exactly think of them. Doesn’t it suggest a pattern in data if a mapping is found that reduces the dimension? Like I picture from linear algebra 101 finding a line that everything is a multiple of so one dimension would do and that line is a ‘pattern’? Sorry I’m high.
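The "line that everything is a multiple of" picture can be sketched in a few lines. This is a toy example with a made-up `direction` vector and made-up multipliers: every 2D point lies on one line through the origin, so a single coordinate along that line is enough to rebuild each point exactly.

```python
import math

# Hypothetical direction; every data point is a multiple of it.
direction = [3.0, 4.0]
points = [[t * d for d in direction] for t in (-2.0, 0.5, 1.0, 3.0)]

# Unit vector along the line.
norm = math.hypot(direction[0], direction[1])
unit = [d / norm for d in direction]

# One number per point: its coordinate along the line.
coords = [sum(p_i * u_i for p_i, u_i in zip(p, unit)) for p in points]

# Rebuild the 2D points from the 1D coordinates.
rebuilt = [[c * u for u in unit] for c in coords]

# Largest reconstruction error over all coordinates of all points.
max_error = max(
    abs(a - b) for p, r in zip(points, rebuilt) for a, b in zip(p, r)
)

if __name__ == "__main__":
    print(coords)
    print(max_error)
```

Real data rarely sits exactly on a line, so in practice a small but nonzero reconstruction error is what hints at this kind of pattern.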

2

monks-cat t1_jefqotb wrote

Context radically changes the "distance" between concepts. So in your example isotropy isn't necessarily a desired property of an LLM. In poetry, for example, we combine two concepts that would seemingly be very far apart in the original space but should be mapped rather closely in the embedding.

The problem I see with this whole idea, though, is that a "concept" doesn't inherently seem to be represented by a list of features. Two concepts interacting aren't necessarily the intersection of their features.

I'll try to see if I can come up with concrete examples in language.

2

lacker t1_jefpz2c wrote

I’m a big fan of open source AI research, but creating a new facility doesn’t seem like the way to go. If you’re making a GPU cluster that has to be shared among a bunch of different academic groups, you’ll have to build resource-sharing software, infrastructure tools, etc, and spend all this money on what is essentially an AWS clone.

Wouldn’t it be more effective to simply give this money to AI research groups and let them buy infrastructure from the most cost-effective provider? If AWS works best, fine, if it’s some smaller infrastructure provider, that’s fine too.

This proposal seems like it would actually divert money away from AI, by spending a lot of money rebuilding the standard cluster infrastructure stuff that cloud providers already have.

9

ReasonableObjection t1_jefpe2n wrote

Thank you so much for the thoughtful reply!
Will read into these and may reach out to you with other questions.
Edit - as far as how I'm feeling... at the moment just curious, been asking lots of questions about this the last few days and reading any resources people are kind enough to share :-)

2

WikiSummarizerBot t1_jefl95t wrote

Landauer's principle

>Landauer's principle is a physical principle pertaining to the lower theoretical limit of energy consumption of computation. It holds that an irreversible change in information stored in a computer, such as merging two computational paths, dissipates a minimum amount of heat to its surroundings.

Solomonoff's theory of inductive inference

Solomonoff's uncomputability

>Unfortunately, Solomonoff also proved that Solomonoff's induction is uncomputable. In fact, he showed that computability and completeness are mutually exclusive: any complete theory must be uncomputable. The proof of this is derived from a game between the induction and the environment. Essentially, any computable induction can be tricked by a computable environment, by choosing the computable environment that negates the computable induction's prediction.

1

grotundeek_apocolyps t1_jefl7kd wrote

The crux of the matter is that there are fundamental limitations to the power of computation. It is physically impossible to create an AI, or any other kind of intelligent agent, that can overpower everything else in the physical world by virtue of sheer smartness.

Depending on where you're coming from, this is not an easy thing to understand; it usually requires a lot of education. The simplest metaphor that I've thought of is the speed of light: it seems intuitively plausible that a powerful enough rocket ship should be able to fly faster than the speed of light, but actually the laws of physics prohibit it.

Similarly, it seems intuitively plausible that a smart enough agent should be able to solve any problem arbitrarily quickly, thereby enabling it to (for example) conquer the world or destroy humanity, but that too is physically impossible.

There are a lot of ways to understand why this is true. I'll give you a few places to start.

  • Landauer's principle: infinite computation would require infinite resources
  • Solomonoff induction is uncomputable: the optimal general method of Bayesian induction is literally impossible to compute even in principle
  • chaotic dynamics cannot be predicted: control requires prediction, but the finite precision of measurement and the aforementioned limits on computation mean that our control over the world is fundamentally limited and intelligence can never overcome this fact
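To put a number on the first item: Landauer's principle is usually stated as a minimum of k_B · T · ln 2 joules dissipated per irreversibly erased bit. The sketch below just evaluates that formula; the function name `landauer_limit_joules` and the 300 K example temperature are choices made for this illustration.

```python
import math

# Boltzmann constant in J/K (exact by the 2019 SI definition).
BOLTZMANN_J_PER_K = 1.380649e-23


def landauer_limit_joules(temperature_kelvin, bits_erased=1):
    """Minimum heat dissipated when irreversibly erasing `bits_erased` bits."""
    return bits_erased * BOLTZMANN_J_PER_K * temperature_kelvin * math.log(2)


if __name__ == "__main__":
    # 300 K is an assumed "room temperature" for the example.
    print(landauer_limit_joules(300.0))
    print(landauer_limit_joules(300.0, bits_erased=8 * 10**9))  # ~1 GB
```

The per-bit figure at room temperature is on the order of 10^-21 J, which is tiny but nonzero, so the bound grows without limit as the number of erased bits does.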

The people who have thought about this "for 30+ years" and come to a different conclusion are charlatans. I don't know of a gentler way of putting it. What do you tell someone when they ask you to explain why someone who has been running a cult for 30 years isn't really talking directly to god?

Something to note on the more psychological end of things is that a person's ability to understand things is fundamentally limited by their understanding of their own emotions. The consequence of this is that you should also be thinking about how you're feeling when you're reading hysterical nonsense about the robot apocalypse, because that's going to affect how likely you are to believe things that aren't true. People often fixate on things that have a strong emotional valence, irrespective of their accuracy.

2

KerfuffleV2 t1_jefkhxs wrote

> Something about these distillations feels fundamentally different than when interacting with the larger models.

It may not have anything to do with size. ChatGPT is just adding a lot of comfort-phrases into its response instead of just responding. "Hmm, this is an interesting challenge", "Let's see", etc. Some of that may be based on the system prompt, some of it may be training to specifically produce more natural sounding responses.

All the "Hmm", "interesting challenge" and other stuff that makes it sound like a person isn't adding any information that's relevant to answering the query, though. (Also, you may be paying for those extraneous tokens.)

6