
genshiryoku t1_j56btvq wrote

The problem is both the total amount of data and its quality. Humans interacting with an AI like GPT-3 don't generate nearly enough data to properly train a new model, not even over decades of interaction.

The demand for training data scales with the parameter count of the transformer model, while performance improves only logarithmically with more data. This essentially means that, mathematically, transformer models are a losing strategy and aren't going to lead to AGI unless you have an unlimited amount of training data, which we don't.
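
A rough back-of-the-envelope sketch of that ceiling (my numbers, not the commenter's): assume the Chinchilla-style heuristic of roughly 20 training tokens per parameter and guess that something like 30 trillion tokens of usable high-quality text exist.

```python
# Back-of-the-envelope sketch of the data-demand argument.
# Assumptions (mine, not the commenter's): ~20 training tokens per
# parameter (Chinchilla-style heuristic) and ~30 trillion tokens of
# usable high-quality text in existence.

TOKENS_PER_PARAM = 20        # compute-optimal heuristic (assumption)
AVAILABLE_TOKENS = 30e12     # rough guess at the usable text supply (assumption)

for params in (1e9, 10e9, 100e9, 1e12, 10e12):
    tokens_needed = params * TOKENS_PER_PARAM
    status = "fits" if tokens_needed <= AVAILABLE_TOKENS else "exceeds available text"
    print(f"{params:.0e} params -> {tokens_needed:.0e} tokens needed ({status})")
```

Under those assumptions, a compute-optimal model somewhere past the trillion-parameter mark already wants more tokens than exist, which is the ceiling being pointed at here.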

We need a different architecture.

9