endless_sea_of_stars t1_jbmda5p wrote
Reply to comment by bivouac0 in [D] Why are so many tokens needed to train large language models? by blacklemon67
> develop a method to separate knowledge retention from language pattern modeling. Think about learning the state capitals. A person quickly learns to say "the capital of X is Y" and can then substitute in different memorized facts. An AI model learns the facts and the sentence patterns all in the same manner.
This sounds like a problem Toolformer is meant to address. Instead of learning all the state capitals, the model learns to emit a tool call: "The capital of Indiana is [QA(Indiana, capital)]."
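A minimal sketch of that idea: the model's raw output contains a tool-call span (written here as `[QA(entity, relation)]`, matching the example above), which a post-processing step executes and replaces with the tool's answer. The call syntax, the `STATE_CAPITALS` table, and the `qa_tool` function are all hypothetical stand-ins, not the actual Toolformer API.

```python
import re

# Hypothetical fact store standing in for a real QA system or knowledge base.
STATE_CAPITALS = {"Indiana": "Indianapolis", "Ohio": "Columbus"}

def qa_tool(entity: str, relation: str) -> str:
    """Toy QA tool: answers (entity, relation) queries from the table."""
    if relation == "capital":
        return STATE_CAPITALS.get(entity, "[unknown]")
    return "[unsupported relation]"

# Matches spans like [QA(Indiana, capital)] in the model's raw output.
CALL_PATTERN = re.compile(r"\[QA\(([^,]+),\s*([^)]+)\)\]")

def execute_tool_calls(generated_text: str) -> str:
    """Replace each [QA(...)] span with the tool's answer."""
    return CALL_PATTERN.sub(
        lambda m: qa_tool(m.group(1).strip(), m.group(2).strip()),
        generated_text,
    )

print(execute_tool_calls("The capital of Indiana is [QA(Indiana, capital)]."))
# -> The capital of Indiana is Indianapolis.
```

The point of the design is that the model only has to learn the sentence pattern and when to invoke the tool; the facts themselves live outside the weights.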