EdenistTech t1_j097uyn wrote on December 14, 2022 at 11:45 PM

Hello. I have binary classification problem. However, instead of aiming for a high overall prediction rate for the entire training set, I would like to find subsets of features that with a very high probability places a given sample in category X and other subsets that place samples in category Y. In other words a prediction should not be attempted if the conviction of the estimate is low. Does such an algorithm exist?

drewfurlong t1_j0a3zn7 wrote on December 15, 2022 at 3:47 AM

Would you say you're looking for a classifier with high precision, and perhaps low recall?

EdenistTech t1_j0aa5is wrote on December 15, 2022 at 4:39 AM

To some extent yes. But rather than focusing on the true positives of the entire training set, I would be interested in the algorithm carving out subsets of features and values for which precision is very high - higher than the precision of the entire training set. I hope that makes sense?

onionhead888 t1_j0j7xza wrote on December 17, 2022 at 1:22 AM

I think you’re looking for random forests which is a binary split algorithm.