
petkow t1_jednuvp wrote

A possibly naive question, but are custom fine-tuned models - similar to the one in this post - the only way for instruction-following LLMs to ingest (larger-scale) new knowledge from sources that were not included in the original training set?

Let's say, for example, I want to summarize a larger scientific article or book (above 50-100 pages), or multiple user-interview transcripts for a corporate use case, with an LLM whose response quality is similar to GPT-4's. Due to token limits, these cannot be put into the prompt directly, if I understand correctly. The new ChatGPT plugins (I still do not have access to them) will not solve this either, as they can only query an external knowledge source (retrieval plugin, web plugin), which amounts to a keyword-based query whose already-truncated result is injected into the prompt. So does summarizing a new, comprehensive corpus beyond the token limit require a new model trained with that corpus added to the training set? Can you recommend the most efficient way to do this?
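To make the concern concrete, here is a minimal, hypothetical Python sketch of the retrieval-style flow I mean. The chunking, the word-count "token budget", and the `call_llm` placeholder are all illustrative assumptions on my part, not any plugin's actual API; the point is just that whatever the retriever selects (and truncates to fit the context window) is the only part of the corpus the model ever sees, so a global summary of a 100-page document does not really fit this pattern.

```python
def chunk(text: str, size: int = 500) -> list[str]:
    """Split a long document into fixed-size word chunks."""
    words = text.split()
    return [" ".join(words[i:i + size]) for i in range(0, len(words), size)]

def score(chunk_text: str, query: str) -> int:
    """Naive keyword overlap between a chunk and the query."""
    q = set(query.lower().split())
    return sum(1 for w in chunk_text.lower().split() if w in q)

def build_prompt(document: str, query: str, token_budget: int = 3000) -> str:
    """Keep only the best-scoring chunks that still fit the context window."""
    ranked = sorted(chunk(document), key=lambda c: score(c, query), reverse=True)
    selected, used = [], 0
    for c in ranked:
        n = len(c.split())          # crude stand-in for a real tokenizer
        if used + n > token_budget:
            break                   # everything beyond the budget is silently dropped
        selected.append(c)
        used += n
    context = "\n---\n".join(selected)
    return f"Use the context below to answer.\n\n{context}\n\nQuestion: {query}"

# Hypothetical usage:
# prompt = build_prompt(open("interview_transcripts.txt").read(),
#                       "What were the recurring complaints?")
# answer = call_llm(prompt)   # call_llm stands in for any chat-completion API
```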
