Submitted by Not-Banksy t3_126a1dm in singularity
Scarlet_pot2 t1_je92iud wrote
Reply to comment by ActuatorMaterial2846 in When people refer to “training” an AI, what does that actually mean? by Not-Banksy
Most of this is precise and correct, but it seems like you're saying the transformer architecture is the GPUs? The transformer architecture is the neural network and how it is structured. It's code. The paper "Attention Is All You Need" describes how the transformer architecture is built.
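To make the "it's code" point concrete, here is a minimal sketch of scaled dot-product attention, the core operation described in that paper, written in plain Python. The names `softmax` and `attention` are just my own picks for illustration. A real transformer adds learned projection weights, multiple heads, and feed-forward layers on top of this, and would normally be written in a GPU framework rather than with Python lists.

```python
import math


def softmax(xs):
    """Turn a list of scores into weights that sum to 1."""
    m = max(xs)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def attention(queries, keys, values):
    """Scaled dot-product attention over lists of vectors.

    Illustrative only: no learned weights, no masking, single head.
    """
    d_k = len(keys[0])
    outputs = []
    for q in queries:
        # Similarity of this query to every key, scaled by sqrt(d_k).
        scores = [
            sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d_k)
            for k in keys
        ]
        weights = softmax(scores)
        # Output is the weighted average of the value vectors.
        outputs.append([
            sum(w * v[j] for w, v in zip(weights, values))
            for j in range(len(values[0]))
        ])
    return outputs
```

Nothing here touches a GPU. The hardware only matters once you want to run this kind of math at scale.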
After you have the transformer written out, you train it on GPUs using data you gathered. Free large datasets such as "The Pile" by EleutherAI can be used for training. This part is automatic.
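As a rough illustration of why the training step is automatic, here is a toy training loop in plain Python. It is not a transformer: it fits a two-parameter linear model with gradient descent. The overall shape (predict, measure the error, nudge the weights, repeat) is the same idea. The function name `train`, the learning rate, the epoch count, and the toy dataset are all assumptions made up for this sketch.

```python
def train(data, lr=0.05, epochs=1000):
    """Fit y = w * x + b to (x, y) pairs with plain gradient descent."""
    w, b = 0.0, 0.0
    n = len(data)
    for _ in range(epochs):
        grad_w = 0.0
        grad_b = 0.0
        for x, y in data:
            err = (w * x + b) - y      # prediction error on one example
            grad_w += 2 * err * x / n  # gradient of mean squared error w.r.t. w
            grad_b += 2 * err / n      # gradient of mean squared error w.r.t. b
        # Move the parameters a small step against the gradient.
        w -= lr * grad_w
        b -= lr * grad_b
    return w, b


# Toy dataset generated from y = 2x + 1 (hypothetical, for illustration).
data = [(x, 2.0 * x + 1.0) for x in [0.0, 1.0, 2.0, 3.0]]
w, b = train(data)
print(w, b)
```

Once the loop starts, nobody has to intervene. The human work is in everything around it.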
The parts that involve humans are the data gathering, data cleaning, and designing the architecture before training. Then, after training, humans do finetuning / RLHF (reinforcement learning from human feedback).
Those are the 6 steps: data gathering, data cleaning, architecture design, training, finetuning, and RLHF. Making an AI model can seem hard and like magic, but it can be broken down into manageable steps. It's doable, especially if you have a group of people who specialize in the different steps. Maybe you have someone who's good with the data aspects, someone good at writing the architecture, someone good with finetuning, and some people to do RLHF.