o_snake-monster_o_o_ wrote
Reply to comment by j4nds4 in [R] "Re3: Generating Longer Stories With Recursive Reprompting and Revision" - Generating stories of 2000+ words (or even much longer) by 0xWTC
I'm pretty sure this is how we're gonna advance these models to the next step. It's a lot easier to think about these things in the context of coding, because coding is thinking but in a very restricted symbolic world.
For example, the next step for coding language models will be to implement a command language that lets them query the code and pull information/intelligence out of it (an LSP, say). Then we use some sort of RL algorithm or a hypernetwork to finetune how the context should be written and organized to maximize efficiency: which information to drop to make room for new information, and so on.
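Something like this toy loop is what I have in mind (my own sketch, every name and the fake index are made up, not any real API): the model emits a command, an LSP-like layer answers it, and a fixed-size buffer evicts old entries. The dumb FIFO eviction here is exactly the part the RL algorithm or hypernetwork would replace with a learned policy.

```python
# Toy "command language" loop: the LM queries an LSP-like layer over the
# codebase, and a fixed-size context buffer decides what to keep.
# All names and data here are hypothetical stand-ins.
from collections import deque

CONTEXT_BUDGET = 6  # max entries kept in the model's working context

def run_command(cmd: str, arg: str) -> str:
    """Stand-in for an LSP-style query layer over the codebase."""
    fake_index = {
        ("definition", "parse_config"): "def parse_config(path): ... (config.py:12)",
        ("references", "parse_config"): "used in main.py:40, cli.py:88",
    }
    return fake_index.get((cmd, arg), "<no result>")

def model_step(ctx: list[str]) -> tuple[str, str]:
    # A real system would let the LM pick the next command from the context;
    # here we just hard-code a short plan.
    plan = [("definition", "parse_config"), ("references", "parse_config")]
    return plan[len(ctx) % len(plan)]

# deque(maxlen=...) silently drops the oldest entry when full -- this FIFO
# eviction is where a learned context-management policy would plug in.
context: deque[str] = deque(maxlen=CONTEXT_BUDGET)

for _ in range(2):
    cmd, arg = model_step(list(context))
    result = run_command(cmd, arg)
    context.append(f"{cmd}({arg}) -> {result}")

print("\n".join(context))
```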
We have this huge GPT context window, but we're filling it up with so much noise! Humans work with highly augmented data (the syntax highlighting in our code editors, for example), so why aren't we augmenting GPT-3's input?
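To make that concrete, here's one cheap flavor of input augmentation, a rough analogue of syntax highlighting (again just a sketch of mine; the `[FUNC]`/`[CONST]` tagging scheme is made up, but Python's `ast` module is real):

```python
# Annotate code with structural tags before it goes into the prompt,
# so the model gets pre-digested structure alongside the raw text.
import ast

source = "def add(a, b):\n    return a + b\n\nTIMEOUT = 30\n"

tree = ast.parse(source)
annotations = []
for node in tree.body:
    if isinstance(node, ast.FunctionDef):
        args = ", ".join(a.arg for a in node.args.args)
        annotations.append(f"# [FUNC] {node.name}({args}) at line {node.lineno}")
    elif isinstance(node, ast.Assign):
        targets = ", ".join(ast.unparse(t) for t in node.targets)
        annotations.append(f"# [CONST] {targets} at line {node.lineno}")

augmented_prompt = "\n".join(annotations) + "\n" + source
print(augmented_prompt)
```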