master3243 t1_izv48yc wrote

This is it: they have a huge context size and they just feed it in.

I've seen discussion on whether they use some kind of summarization to be able to fit more context into the same size model, but there's only speculation in that regard.
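
Purely as speculation, a minimal sketch of what that kind of rolling summarization could look like: keep the newest turns verbatim and fold older ones into a summary so the prompt stays under a token budget. `count_tokens`, `summarize`, and the budget are placeholders, not anything OpenAI has confirmed.

```python
def count_tokens(text: str) -> int:
    # Crude stand-in for a real tokenizer (e.g. tiktoken).
    return len(text.split())

def summarize(text: str) -> str:
    # Placeholder: in practice this would likely be another LLM call.
    return text[:200] + "..."

def build_prompt(turns: list[str], budget: int = 3000) -> str:
    # Walk backwards from the newest turn, keeping turns until the budget runs out.
    i, used = len(turns), 0
    while i > 0 and used + count_tokens(turns[i - 1]) <= budget:
        i -= 1
        used += count_tokens(turns[i])
    recent = turns[i:]    # newest turns, kept verbatim
    older = turns[:i]     # everything earlier gets compressed
    summary = summarize("\n".join(older)) if older else ""
    return "\n".join(filter(None, [summary, *recent]))
```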

In either case, it's nothing we haven't seen in recent papers here and there.

114

maxToTheJ t1_izvltcw wrote

It probably does some basic checks for adversarial text, like padding with AAAAAAAAA*, BBBBBBBBBBBBB*, [[[[[[[[*, or profanity profanity profanity, then preprocesses the text before feeding it to the model.

EDIT: Only mentioning this since some folks will argue ChatGPT has a crazy long memory (10k tokens) because you can sandwich real content around some trivial 9.5k tokens of repetition. They likely also added a bunch of defenses against basic prompt engineering attacks so people don't get it to say certain things.
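
A toy version of that kind of preprocessing might look like this; the regexes and thresholds are my guesses, not anything known about ChatGPT's pipeline:

```python
import re

def collapse_repeats(text: str, max_char_run: int = 8, max_word_run: int = 3) -> str:
    # Collapse character runs like "AAAAAAAA" or "[[[[[[[[".
    text = re.sub(r"(.)\1{%d,}" % max_char_run,
                  lambda m: m.group(1) * max_char_run, text)
    # Collapse repeated words like "profanity profanity profanity".
    text = re.sub(r"\b(\w+)(\s+\1){%d,}\b" % max_word_run,
                  lambda m: " ".join([m.group(1)] * max_word_run), text)
    return text

# Padding with thousands of identical characters no longer eats the context budget.
print(collapse_repeats("A" * 5000 + " hello " + "spam " * 50))
```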

17

zzzthelastuser t1_izx8k9l wrote

> I've seen discussion on whether they use some kind of summarization to be able to fit more context into the same

They could unironically use ChatGPT for this task.
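
Something along these lines, assuming access to a chat-completions style API; the model name and prompt wording are just placeholders:

```python
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

def summarize_history(old_turns: list[str]) -> str:
    # Compress older conversation turns into a short summary that can be
    # prepended to the live context in place of the full transcript.
    resp = client.chat.completions.create(
        model="gpt-3.5-turbo",
        messages=[
            {"role": "system",
             "content": "Summarize this conversation so far in under 100 words, "
                        "keeping facts the assistant will need later."},
            {"role": "user", "content": "\n".join(old_turns)},
        ],
    )
    return resp.choices[0].message.content
```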

3

master3243 t1_izxkwzt wrote

True, using the embedding from an LLM as a summary of the past for the same LLM is a technique I've seen done before.
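
Something in that spirit, as a rough sketch with gpt2 from `transformers`: pool the model's own hidden states over the old context into a single soft "memory token" and prepend it to the embeddings of the new prompt. The mean pooling here is untrained, so this only shows the plumbing, not a working memory.

```python
import torch
from transformers import GPT2LMHeadModel, GPT2TokenizerFast

tok = GPT2TokenizerFast.from_pretrained("gpt2")
model = GPT2LMHeadModel.from_pretrained("gpt2").eval()

past_text = "Earlier the user explained they are debugging a tokenizer in Rust."
new_prompt = "User: what language was I working in again?\nAssistant:"

with torch.no_grad():
    # Encode the old context and mean-pool the LLM's own hidden states into one vector.
    past_ids = tok(past_text, return_tensors="pt").input_ids
    past_hidden = model.transformer(past_ids).last_hidden_state   # (1, T, 768)
    memory = past_hidden.mean(dim=1, keepdim=True)                # (1, 1, 768)

    # Prepend that soft "memory token" to the embeddings of the new prompt.
    new_ids = tok(new_prompt, return_tensors="pt").input_ids
    new_embeds = model.transformer.wte(new_ids)                   # (1, S, 768)
    inputs_embeds = torch.cat([memory, new_embeds], dim=1)

    # One forward pass; the last position's logits predict the next token.
    logits = model(inputs_embeds=inputs_embeds).logits
    next_id = int(logits[0, -1].argmax())
    print(tok.decode([next_id]))
```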

1