Submitted by Zetsu-Eiyu-O t3_10q45pr in MachineLearning
MysteryInc152 t1_j6o62z6 wrote
This is what In-context learning is for.
Giving the model a few examples of a text input and a corresponding fact extraction is all that's necessary.
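The idea above can be sketched as plain prompt construction: demonstrations of input text plus the extraction you want, followed by the new input. Everything here (the example sentences, the `person | birthplace | year` format, the function name) is made up for illustration; the assembled string would then be sent to whatever LLM completion API you use.

```python
# Sketch of an in-context (few-shot) fact-extraction prompt.
# The texts and the extraction format are hypothetical examples.
examples = [
    ("Marie Curie was born in Warsaw in 1867.",
     "person: Marie Curie | birthplace: Warsaw | year: 1867"),
    ("Alan Turing was born in London in 1912.",
     "person: Alan Turing | birthplace: London | year: 1912"),
]

def fact_extraction_prompt(examples, query):
    """Assemble a few-shot prompt: demonstrations first, then the new input,
    ending with an open "Facts:" line for the model to complete."""
    parts = [f"Text: {text}\nFacts: {facts}" for text, facts in examples]
    parts.append(f"Text: {query}\nFacts:")
    return "\n\n".join(parts)

prompt = fact_extraction_prompt(examples, "Ada Lovelace was born in London in 1815.")
print(prompt)
```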
Zetsu-Eiyu-O OP t1_j6o9sz5 wrote
So you just tokenize the output and penalize the model against it? And can all generative models (GPT-2, for example) learn to produce this output the same way?
MysteryInc152 t1_j6okowf wrote
Not sure what you mean by penalize, but say you wanted an LLM that wasn't instruction fine-tuned to translate between two languages it knows.
Your input would be
Language x: "text of language x"
Language y: "translated language x text"
You'd do this for a few examples; 2 or 3 should be enough, or even one depending on the task. Then finally:
Language x: "text you want translated"
Language y: The model would translate the text and output here
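The template above can be built mechanically. This is a minimal sketch, assuming you pass the resulting string to an LLM completion endpoint (not shown); the shot pairs and language names are placeholder examples.

```python
def translation_prompt(src_lang, tgt_lang, shots, text):
    """Build the few-shot translation prompt described above:
    repeated 'Language x: "..." / Language y: "..."' pairs, ending with
    an open target-language line for the model to complete."""
    lines = []
    for src, tgt in shots:
        lines.append(f'{src_lang}: "{src}"')
        lines.append(f'{tgt_lang}: "{tgt}"')
    lines.append(f'{src_lang}: "{text}"')
    lines.append(f"{tgt_lang}:")
    return "\n".join(lines)

# Hypothetical example shots; any language pair the model knows would do.
shots = [("Bonjour", "Hello"), ("Merci beaucoup", "Thank you very much")]
prompt = translation_prompt("French", "English", shots, "Bonne nuit")
print(prompt)
```

The model's continuation after the final `English:` line is taken as the translation.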
All generative transformer LLMs work the same way given enough scale. GPT-2 (only 1.5B parameters) does not have the necessary scale.
Zetsu-Eiyu-O OP t1_j6oztfw wrote
Oh, I see, thanks. I have a few questions about the basics of training a large language model; do you mind if I shoot you a message?
MysteryInc152 t1_j6p0ipa wrote
Sure
Zetsu-Eiyu-O OP t1_j6p49di wrote
Thank you so much! I will drop you a message once I'm at my desk.