Viewing a single comment thread. View all comments

chaosmosis t1_j45vdll wrote

With enough scale we get crude compositionality, yes. That trend will probably continue, but I don't think it'll take us to the moon.

3

yldedly t1_j45ycm8 wrote

>With enough scale we get crude compositionality, yes.

Depends on exactly what we mean. To take a simple example, if you have cos(x) and x^2, you can compose these to produce cos(x)^2 (or cos(x^2)). You can approximate the composition using a neural network if you have enough data on some interval x \in [a;b]. It will work well even for x that weren't part of the training set, as long as they are in the interval. Outside the interval the approximation will be bad though. But if you take cos(x), x^2 and compose(f, g) as building blocks, and search for a combination of these that approximate the data, the approximation will be good for all real numbers.

In the same way, you can learn a concept like "subject, preposition, object A, transitive verb, object B", where e.g. subject = "raccoon", preposition = "in a", object A = "spacesuit", transitive verb = "playing" and object B = "poker", by approximating it with a neural network, and it will work well if you have enough data in some high-dimensional subspace. But it won't work with any substitutions. Is it fair to call that crude compositionality?

4