Likelihood of OpenAI moderation flagging a sentence containing negative adjectives about a demographic as 'Hateful'. Submitted by grungabunga t3_11bb3l3 on February 25, 2023 at 3:45 AM in singularity 127 comments 140
Spire_Citron t1_ja0457j wrote on February 25, 2023 at 9:06 PM Reply to comment by TheRidgeAndTheLadder in Likelihood of OpenAI moderation flagging a sentence containing negative adjectives about a demographic as 'Hateful'. by grungabunga The training data is massive and usually not carefully curated because they need so much of it. Permalink Parent 4
starstruckmon t1_ja1102i wrote on February 26, 2023 at 1:09 AM He's talking about the human preference data used for RLHF fine-tuning (which is what turns GPT-3 into ChatGPT). It's not really that massive. Permalink Parent 1