Recent comments in /f/deeplearning

deepForward t1_je0ktqx wrote

Try the easy way first:

Build a model that only learns chairs, using all the labeled chairs you have, and ignore everything else at first.

Also try image data augmentation and see if it helps.
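For instance, a minimal augmentation sketch with torchvision (assuming a PyTorch-style pipeline, which is just my assumption here, so adapt it to whatever you're using):

```python
# Minimal augmentation sketch, assuming a torchvision/PyTorch pipeline (not specified above).
from torchvision import transforms

train_transforms = transforms.Compose([
    transforms.RandomResizedCrop(224, scale=(0.7, 1.0)),   # random crop/scale around the object
    transforms.RandomHorizontalFlip(),                      # mirror chairs left/right
    transforms.ColorJitter(brightness=0.2, contrast=0.2),   # lighting variation
    transforms.ToTensor(),
])
```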

You are not aiming for the best score; in fact, you don't care about your score as long as you can label new chairs.

You mostly want to tune the model so that you don't get false positives (which would introduce noise into your labels). False negatives are OK, and will occur if you tune the model so that false positives are zero. You can, for instance, tune the threshold on a confidence score or class probability (check the model you're using).
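As a rough illustration (`model`, `model.predict`, and `images` below are placeholders for your own detector and data, not a specific library):

```python
# Hedged sketch: keep only detections above a high confidence threshold, so that
# (almost) no false positives end up in the auto-generated labels.
CONF_THRESHOLD = 0.9   # raise this until manual spot checks show zero false positives

auto_labels = []
for image_id, image in images:                  # hypothetical iterable of (id, image) pairs
    detections = model.predict(image)           # hypothetical: returns (box, score) pairs
    kept = [(box, score) for box, score in detections if score >= CONF_THRESHOLD]
    if kept:                                    # missed chairs (false negatives) are acceptable
        auto_labels.append((image_id, kept))
```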

You can also build a basic image validation tool with Jupyter notebook widgets, Streamlit, or your favorite tool, if you want to quickly validate by hand that there are no false positives. It's a very good exercise.
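A rough sketch of such a tool with ipywidgets in a notebook (`candidates` is a placeholder: a list of (image_path, label) pairs produced by the model, so adapt it to your own format):

```python
# Quick-and-dirty manual validation loop in Jupyter with ipywidgets.
import ipywidgets as widgets
from IPython.display import display, clear_output
from PIL import Image
import matplotlib.pyplot as plt

accepted = []   # detections you confirm by hand
index = 0

def show(i):
    clear_output(wait=True)
    if i >= len(candidates):
        print("Done:", len(accepted), "confirmed labels")
        return
    path, label = candidates[i]
    plt.imshow(Image.open(path))
    plt.axis("off")
    plt.title(label)
    plt.show()
    display(widgets.HBox([keep_btn, reject_btn]))

def on_keep(_):
    global index
    accepted.append(candidates[index])
    index += 1
    show(index)

def on_reject(_):
    global index
    index += 1
    show(index)

keep_btn = widgets.Button(description="Keep", button_style="success")
reject_btn = widgets.Button(description="Reject", button_style="danger")
keep_btn.on_click(on_keep)
reject_btn.on_click(on_reject)
show(index)
```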

Good luck !

1

qphyml t1_jdz7ndt wrote

I think you can do it both ways (with or without filtering) and compare. Just speculating now, but the filtering could potentially affect performance on the other classes (since you change the model's training path for those classes). My guess is that it shouldn't be a big issue, though, so I'd probably go about it the way you described if I had to pick one strategy.

1

BellyDancerUrgot t1_jdx6w01 wrote

The implication was most of the accessible textual data, which is true. The exaggeration was there because it's a language model first and foremost, and previous iterations like GPT-3 and 3.5 were not multimodal. Also, as far as accounts go, that's a huge '?' at the moment, especially going by tweets like this:

https://twitter.com/katecrawford/status/1638524011876433921?s=46&t=kwpwSgfnJvGe6J-1CEe_5Q

The reality is, neither we nor you have the slightest clue what it was trained on, and MSFT has sufficient compute to train on all of the text data on the internet.

When it comes to multimodal media, we don't really need to train a model on the same amount of data as is required for text.

1

ChingChong--PingPong t1_jdwfooc wrote

It was not trained on basically the entire internet. Not even close. Even if they trained it on all the pages Google has indexed, that's not even close to the entire internet, and I'm not even talking about the dark web. Toss in all the data behind user accounts, paywalls, and intranets, then add all the audio and video on every social media and audio/video platform, and OpenAI couldn't afford to train, much less optimize, much less host, a model of that size.

1

AI-without-data OP t1_jdvhnjr wrote

Thank you. But I still don't fully understand.

Do people actually train models that way as well? In the COCO dataset, some images contain objects that belong to the listed classes but are not labeled.

If people follow your suggested method for training the model, they would first need to select the images from the COCO dataset whose objects are perfectly labeled (no missed labels) and use that subset to train the model. Then they would run the model on the remaining data to obtain labels for the objects that aren't annotated, and update the entire dataset accordingly. Is this correct?
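In other words, something roughly like this outline (every function name here is just a placeholder I made up, not a real COCO or library API)?

```python
# Placeholder outline of the workflow described above -- step names only, not real APIs.
clean_subset = [img for img in coco_images if has_complete_annotations(img)]  # 1. keep fully labeled images
model = train_detector(clean_subset)                                          # 2. train on the clean subset

for img in coco_images:
    if img in clean_subset:
        continue
    extra = [d for d in model.predict(img) if d.score >= 0.9]                 # 3. keep high-confidence predictions only
    add_annotations(img, extra)                                               # 4. merge them back into the dataset
```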

1