Submitted by noellarkin in MachineLearning
To clarify, I'm not talking about ChatGPT here. I've been testing outputs from GPT-3 davinci-003 against alternatives in terms of output quality, relevance, and ability to follow "instruct"-style prompts (versus vanilla autocompletion).
I tried these: AI21 Jurassic 178B, GPT-NeoX 20B, GPT-J 6B, and FairSeq 13B.
I also compared against GPT-3 davinci-002 and GPT-3 davinci-001.
Of course, I didn't expect the smaller models to be on par with GPT-3, but I was surprised by how much better GPT-3 davinci-003 performed compared to AI21's 178B model. AI21's Jurassic 178B seems comparable to GPT-3 davinci-001.
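If anyone wants to run this kind of side-by-side comparison themselves, here's a minimal sketch of one way to set it up, assuming the legacy `openai` Python SDK (the pre-1.0 `Completion` endpoint, which served these davinci models at the time), an `OPENAI_API_KEY` environment variable, and Hugging Face `transformers` for the open models. The prompt is just a placeholder.

```python
import os

import openai
from transformers import pipeline

openai.api_key = os.environ["OPENAI_API_KEY"]

# Placeholder prompt; swap in whatever instruct-style task you want to compare.
PROMPT = "Explain the difference between a list and a tuple in Python in two sentences."

# The instruct-tuned and earlier GPT-3 endpoints, queried with the same prompt.
for model in ["text-davinci-003", "text-davinci-002", "text-davinci-001"]:
    resp = openai.Completion.create(
        model=model,
        prompt=PROMPT,
        max_tokens=128,
        temperature=0.7,
    )
    print(f"--- {model} ---")
    print(resp["choices"][0]["text"].strip())

# One of the open models, run locally via Hugging Face transformers.
# Note: GPT-J 6B needs roughly 24 GB of RAM/VRAM in float32; point this at a
# smaller model first if you just want to test the plumbing.
generator = pipeline("text-generation", model="EleutherAI/gpt-j-6B")
out = generator(PROMPT, max_new_tokens=128, do_sample=True, temperature=0.7)
print("--- EleutherAI/gpt-j-6B ---")
print(out[0]["generated_text"])
```

Running the same prompt through each model makes the instruct-vs-autocomplete gap pretty obvious: the older/base models tend to continue the text rather than answer the instruction.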
Does this mean that only well-funded corporations will be able to train general-purpose LLMs? It seems to me that just having a large model doesn't do much; it's also about several iterations of training and feedback. How are open source alternatives going to compete?
(I'm not in the ML or CS field, just an amateur who enjoys using these models)
visarga wrote:
I think open source implementations will eventually get there. They probably need much more multi-task and RLHF data, or their initial pre-training included too little code. Training GPT-3.5-style models is like following a recipe, and the formula + ingredients are gradually becoming available.
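To make "formula + ingredients" concrete: a big part of the ingredient list is multi-task instruction data, i.e. large numbers of (instruction, response) pairs across many different tasks, collected before any RLHF step. A minimal sketch of what that kind of supervised fine-tuning data typically looks like on disk (the examples and filename here are made up for illustration):

```python
import json

# Hypothetical (instruction, response) pairs spanning different tasks --
# summarization, translation, code -- the kind of multi-task data the
# "recipe" depends on.
examples = [
    {
        "prompt": "Summarize in one sentence: Mitochondria are organelles that produce most of a cell's chemical energy.",
        "completion": "Mitochondria generate most of the cell's energy.",
    },
    {
        "prompt": "Translate to French: Where is the train station?",
        "completion": "Où est la gare ?",
    },
    {
        "prompt": "Write a Python function that reverses a string.",
        "completion": "def reverse(s):\n    return s[::-1]",
    },
]

# JSONL (one JSON object per line) is the usual format for such datasets.
with open("instruct_data.jsonl", "w", encoding="utf-8") as f:
    for ex in examples:
        f.write(json.dumps(ex, ensure_ascii=False) + "\n")
```

The hard part isn't the format, it's collecting millions of diverse, high-quality pairs like these, which is exactly where the well-funded labs currently have the edge.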