Submitted by JohnyWalkerRed t3_123oovw in MachineLearning
kawin_e t1_jdxz4bh wrote
The Stanford Human Preferences dataset (SHP): https://huggingface.co/datasets/stanfordnlp/SHP
It contains pairwise preferences over responses to posts, i.e. tuples of (post, response_A, response_B), but you can certainly turn it into an instruction dataset by keeping only the responses that meet a certain cut-off. I'm currently aware of one academic/industry group that is already doing this.
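A rough sketch of that filtering idea, in case it helps. The comment does not say what the cut-off is measured on, so this assumes it is the upvote score of the preferred response; `min_score` and `to_instruction_pairs` are made-up names. The field names (`history`, `human_ref_A`, `human_ref_B`, `score_A`, `score_B`, `labels`) are what I believe the SHP dataset card uses, with `labels == 1` meaning response A was preferred, so check them against the card before relying on this.

```python
def to_instruction_pairs(rows, min_score=10):
    """Turn pairwise-preference rows into (instruction, response) pairs.

    Assumed row schema (verify against the SHP dataset card):
      history      - the post text
      human_ref_A  - response A, human_ref_B - response B
      score_A/B    - upvote scores of the two responses
      labels       - 1 if A is preferred, 0 if B is preferred
    min_score is a hypothetical cut-off, not a value from the thread.
    """
    seen = set()
    pairs = []
    for row in rows:
        # Pick the preferred response and its score.
        if row["labels"] == 1:
            response, score = row["human_ref_A"], row["score_A"]
        else:
            response, score = row["human_ref_B"], row["score_B"]
        # Drop preferred responses that do not clear the cut-off.
        if score < min_score:
            continue
        # The same response can win several comparisons; keep it once.
        key = (row["history"], response)
        if key in seen:
            continue
        seen.add(key)
        pairs.append({"instruction": row["history"], "response": response})
    return pairs


# Tiny hand-written rows for illustration only (not real SHP data).
sample_rows = [
    {"history": "How do I sharpen a knife?",
     "human_ref_A": "Use a whetstone.", "score_A": 50,
     "human_ref_B": "Buy a new one.", "score_B": 3, "labels": 1},
    {"history": "How do I sharpen a knife?",
     "human_ref_A": "Use a whetstone.", "score_A": 50,
     "human_ref_B": "Ask a professional.", "score_B": 20, "labels": 1},
    {"history": "Why is the sky blue?",
     "human_ref_A": "It just is.", "score_A": 2,
     "human_ref_B": "Rayleigh scattering.", "score_B": 5, "labels": 0},
]

pairs = to_instruction_pairs(sample_rows, min_score=10)
print(pairs)
```

With real data you would pass in the rows of a split loaded via `datasets.load_dataset("stanfordnlp/SHP")` instead of `sample_rows`, and probably tune the cut-off per subreddit, since score scales differ a lot between communities.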
ninjasaid13 t1_jdy2pqq wrote
>one academic/industry group
Which one?