Submitted by AutoModerator t3_zcdcoo in MachineLearning
BackgroundFeeling707 t1_j05lywf wrote
Hi, how do local language model inference tools such as KoboldAI's webui keep information? I understand you can only process a certain number of tokens in one go.
Does it just use the last 30 tokens or so in the new batch?
Eventually I run out of memory and can't continue the text adventure. It shouldn't do that, right?
Are there techniques to store info?
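For what it's worth, the common approach is roughly what the question guesses: a sliding window that keeps only the most recent tokens that fit in the model's context limit, dropping the oldest ones. Here's a minimal sketch of that idea; the window and output-budget sizes are illustrative assumptions, not KoboldAI's actual values:

```python
# Sliding-window context management sketch. The numbers below are
# illustrative assumptions; real UIs pick them per model.

def truncate_context(tokens, max_context=2048, reserved_for_output=256):
    """Keep only the newest tokens that still leave room for generation."""
    budget = max_context - reserved_for_output
    return tokens[-budget:] if len(tokens) > budget else tokens

# Stand-in for the token IDs of a long story so far.
history = list(range(5000))
window = truncate_context(history)
print(len(window))   # 1792: 2048 context minus 256 reserved for output
print(window[-1])    # 4999: the newest token is always kept
```

Because old tokens are discarded rather than accumulated, memory use stays bounded; persistent facts (like KoboldAI's "Memory" field) are typically re-prepended to the window on every request instead of being stored in the model.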