Submitted by SejaGentil t3_xyv3ht in MachineLearning
GPT-3 has a prompt limit of about 2048 "tokens", where each token corresponds to roughly 4 characters of text. If my understanding is correct, a deep neural network no longer learns once it has been trained and is only used to produce outputs, and, as such, this limitation comes from the size of the input layer. My question is: what is stopping us from applying the same algorithm we use for training while the network is in use? That would let it keep adjusting its weights and, in a way, provide a form of long-term memory that could let it handle arbitrarily long prompts. Is my line of thinking wrong?
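A rough sketch of what I mean, assuming a publicly available model like GPT-2 as a stand-in (GPT-3's weights aren't accessible, and the learning rate and step count here are arbitrary): after answering each prompt, take one ordinary gradient step on that prompt, so the weights keep a trace of what was seen.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2")
optimizer = torch.optim.SGD(model.parameters(), lr=1e-5)

def respond_and_update(prompt: str) -> str:
    # Generate a reply with the current weights.
    inputs = tokenizer(prompt, return_tensors="pt")
    with torch.no_grad():
        output_ids = model.generate(**inputs, max_new_tokens=50)
    reply = tokenizer.decode(output_ids[0], skip_special_tokens=True)

    # Then take one gradient step on the prompt itself, exactly as in
    # training, so information from it ends up stored in the weights
    # instead of having to fit inside the 2048-token context window.
    model.train()
    loss = model(**inputs, labels=inputs["input_ids"]).loss
    loss.backward()
    optimizer.step()
    optimizer.zero_grad()
    model.eval()
    return reply
```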
asterfield t1_irjc3wd wrote
Disclaimer: I barely know what I’m doing, fact check me.
What you’re describing is called online learning. It can be done, but I imagine it doesn’t work well unless you have a clear signal telling you what the “correct” output was supposed to be for a given example.
You could use user feedback as a quality signal, but you would need a way to trust that the feedback is reliable enough to treat as new training data. Something like the sketch below.
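(Toy example, not a real pipeline: the tiny model, the loss, and the `user_rating` threshold are all placeholder assumptions, just to show the gating idea.)

```python
import torch
import torch.nn as nn

model = nn.Linear(128, 128)  # stand-in for a real language model
optimizer = torch.optim.Adam(model.parameters(), lr=1e-4)
loss_fn = nn.MSELoss()

def maybe_learn(prompt_vec, corrected_vec, user_rating, threshold=0.8):
    """Fold an interaction back into the weights only when the
    user feedback looks trustworthy enough to treat as training data."""
    if user_rating < threshold:
        return  # signal too noisy to treat as ground truth
    optimizer.zero_grad()
    loss = loss_fn(model(prompt_vec), corrected_vec)
    loss.backward()
    optimizer.step()
```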
This is all probably possible, but it adds layers of complexity that they might not be interested in taking on right now.