Submitted by jaqws t3_10dljs6 in MachineLearning
Acceptable-Cress-374 t1_j4m7mee wrote
Reply to comment by avocadoughnut in [D] Fine-tuning open source models on specific tasks to compete with ChatGPT? by jaqws
> Their current goal is to develop interfaces to gather data, and then train a model using RLHF
Potentially naive question, as I don't have much experience with LLMs. Has anyone tried using existing SotA (paid) models like davinci / gpt3 instead of RLHF? They seem to be pretty good at a bunch of focused tasks, especially in few-shot. Does that make sense?
avocadoughnut t1_j4mci2y wrote
ChatGPT is GPT3 + instruction finetuning + RLHF for alignment. If you're talking about using those models to gather training data, that's against OpenAI's ToS, from what I've heard. The goal is to make something that isn't closed source, something you can run yourself.
sad_dad_is_a_mad_lad t1_j4ohl7t wrote
I don't think there are any laws that protect their data in this way, except perhaps contract law, since there's a ToS you have to accept to use their service. As long as you use it for free, though, I'm not sure there is consideration, and I don't know how they would go about proving misuse or damages.
Certainly it would not be copyright law, given that GPT3 itself was trained on copyrighted data...
Zondartul t1_j4mb6rm wrote
So using a big network to teach a small network? That's a thing people do. See teacher-student learning and distillation.
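For anyone curious what distillation looks like in practice: the core idea is training the small (student) network to match the big (teacher) network's temperature-softened output distribution rather than hard labels. Here's a minimal NumPy sketch of the distillation loss in the style of Hinton et al.; the function names and example logits are just illustrative, not from any particular library.

```python
import numpy as np

def softmax(logits, T=1.0):
    # Temperature-scaled softmax; higher T flattens the distribution,
    # exposing the teacher's "dark knowledge" about non-top classes.
    z = np.asarray(logits, dtype=float) / T
    z -= z.max()  # subtract max for numerical stability
    e = np.exp(z)
    return e / e.sum()

def distillation_loss(teacher_logits, student_logits, T=2.0):
    # KL(teacher || student) on temperature-softened distributions,
    # scaled by T^2 so gradients stay comparable across temperatures.
    p = softmax(teacher_logits, T)
    q = softmax(student_logits, T)
    return float(T * T * np.sum(p * (np.log(p) - np.log(q))))

# The student is penalized for diverging from the teacher's soft targets:
loss_match = distillation_loss([2.0, 1.0, 0.1], [2.0, 1.0, 0.1])  # ~0
loss_mismatch = distillation_loss([2.0, 1.0, 0.1], [0.1, 1.0, 2.0])  # > 0
```

In a real setup you'd minimize a weighted sum of this loss and the usual cross-entropy on ground-truth labels, backpropagating only through the student.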
Acceptable-Cress-374 t1_j4pacws wrote
> See teacher-student learning, and distillation.
Thanks, I'll check it out.