endless_sea_of_stars t1_jbmda5p wrote
Reply to comment by bivouac0 in [D] Why are so many tokens needed to train large language models? by blacklemon67
> develop a method to separate knowledge retention from language pattern modeling. Think about learning the state capitals. A person quickly learns to say "the capital of X is Y" and can then substitute in different memorized facts. An AI model learns the facts and the sentence patterns all in the same manner.
This sounds like a problem Toolformer is meant to address. Instead of learning all the state capitals, the model learns to emit a tool call: "The capital of Indiana is [QA(Indiana, capital)]."
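A minimal sketch of that idea: the model's raw output contains a tool-call span (written here as `[QA(entity, relation)]`, matching the example above), which a post-processing step executes and replaces with the tool's answer. The call syntax, the `STATE_CAPITALS` table, and the `qa_tool` function are all hypothetical stand-ins, not the actual Toolformer API.

```python
import re

# Hypothetical fact store standing in for a real QA system or knowledge base.
STATE_CAPITALS = {"Indiana": "Indianapolis", "Ohio": "Columbus"}

def qa_tool(entity: str, relation: str) -> str:
    """Toy QA tool: answers (entity, relation) queries from the table."""
    if relation == "capital":
        return STATE_CAPITALS.get(entity, "[unknown]")
    return "[unsupported relation]"

# Matches spans like [QA(Indiana, capital)] in the model's raw output.
CALL_PATTERN = re.compile(r"\[QA\(([^,]+),\s*([^)]+)\)\]")

def execute_tool_calls(generated_text: str) -> str:
    """Replace each [QA(...)] span with the tool's answer."""
    return CALL_PATTERN.sub(
        lambda m: qa_tool(m.group(1).strip(), m.group(2).strip()),
        generated_text,
    )

print(execute_tool_calls("The capital of Indiana is [QA(Indiana, capital)]."))
# -> The capital of Indiana is Indianapolis.
```

The point of the design is that the model only has to learn the sentence pattern and when to invoke the tool; the facts themselves live outside the weights.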