master3243 t1_izv48yc wrote
Reply to comment by patient_zer00 in [D] - Has Open AI said what ChatGPT's architecture is? What technique is it using to "remember" previous prompts? by 029187
This is it: they have a huge context window and they just feed the previous conversation back in as part of the prompt.
I've seen discussion on whether they use some kind of summarization to fit more context into the same size model, but there's only speculation in that regard.
In either case, it's nothing we haven't seen in recent papers here and there.
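To make the "just feed it in" part concrete, here's a minimal sketch of a chat loop that keeps a rolling transcript and trims it to a token budget. `generate`, `count_tokens`, and the budget are all placeholder assumptions, not OpenAI's actual implementation:

```python
# Minimal sketch of "memory" as a rolling transcript fed back into the model.
# `generate` is a hypothetical stand-in for the model call, and the token
# budget is a guess; none of this is confirmed OpenAI behavior.

MAX_CONTEXT_TOKENS = 4000  # assumed budget; the real window size isn't public


def count_tokens(text: str) -> int:
    return len(text.split())  # crude whitespace proxy for a real tokenizer


def generate(prompt: str) -> str:
    return "..."  # stub: imagine the LLM completing the prompt here


def chat_turn(history: list[str], user_msg: str) -> str:
    history.append(f"User: {user_msg}")
    # Drop the oldest turns until the transcript fits in the context window.
    while count_tokens("\n".join(history)) > MAX_CONTEXT_TOKENS:
        history.pop(0)
    prompt = "\n".join(history) + "\nAssistant:"
    reply = generate(prompt)
    history.append(f"Assistant: {reply}")
    return reply
```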
maxToTheJ t1_izvltcw wrote
It probably runs some basic checks for adversarial text, like long runs of repeated characters (AAAAAAAAA*, BBBBBBBBBBBBB*, [[[[[[[[*) or repeated profanity, and preprocesses the text before feeding it to the model.
EDIT: Only mentioning this since some folks will argue ChatGPT has a crazy long memory (10K tokens) because you can sandwich content around some trivial 9.5K tokens of repetition. They likely also added a bunch of defenses against basic prompt-engineering attacks so people don't get it to say certain things.
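A speculative illustration of the kind of preprocessing meant here: collapse long repeated runs before the text ever reaches the model. The thresholds are arbitrary and nothing below is confirmed behavior:

```python
import re

def collapse_char_runs(text: str, max_run: int = 8) -> str:
    # 'AAAAAAAAAA...' or '[[[[[[[[...' gets cut down to max_run characters.
    return re.sub(r"(.)\1{%d,}" % max_run,
                  lambda m: m.group(1) * max_run, text)

def collapse_word_runs(text: str, max_run: int = 3) -> str:
    # 'profanity profanity profanity profanity' gets cut down to max_run words.
    return re.sub(r"\b(\w+)(?:\s+\1\b){%d,}" % max_run,
                  lambda m: " ".join([m.group(1)] * max_run), text)

def preprocess(text: str) -> str:
    # Run both passes so repeated characters and repeated words are collapsed.
    return collapse_word_runs(collapse_char_runs(text))
```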
zzzthelastuser t1_izx8k9l wrote
> I've seen discussion on whether they use some kind of summarization to fit more context into the same size model
They could unironically use ChatGPT for this task.
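Sketching what that could look like, with the model summarizing its own older turns so the conversation fits back into the context window. The `generate` call is the same hypothetical stand-in as above, not a real API:

```python
def generate(prompt: str) -> str:
    return "..."  # stub standing in for a call to the same LLM

def compress_history(turns: list[str], keep_recent: int = 4) -> list[str]:
    # Replace everything but the last few turns with a model-written summary.
    if len(turns) <= keep_recent:
        return turns
    old, recent = turns[:-keep_recent], turns[-keep_recent:]
    summary = generate(
        "Summarize this conversation in a few sentences:\n" + "\n".join(old)
    )
    return [f"(Summary of earlier conversation: {summary})"] + recent
```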
master3243 t1_izxkwzt wrote
True, using an LLM's own embedding of the past as a summary for that same LLM is a technique I've seen before.
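Roughly, the idea is to encode the old turns into a fixed-size vector and prepend it to the model's input as a soft "memory" token. A toy sketch, where `embed` is a fake stand-in for the LLM's own encoder and the dimension is arbitrary:

```python
import numpy as np

def embed(text: str) -> np.ndarray:
    # Fake stand-in: a real system would use the model's hidden state,
    # not a seeded random vector.
    rng = np.random.default_rng(abs(hash(text)) % 2**32)
    return rng.standard_normal(768)

def build_model_inputs(old_turns: list[str], recent_turns: list[str]):
    # Compress the distant past into one embedding that gets prepended to
    # the input sequence, while recent turns stay as ordinary tokens.
    memory_vec = embed("\n".join(old_turns))   # shape (768,)
    recent_text = "\n".join(recent_turns)
    # The model would consume [memory_vec] followed by tokens(recent_text).
    return memory_vec, recent_text
```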