Viewing a single comment thread. View all comments

EdenistTech t1_j097uyn wrote

Hello. I have binary classification problem. However, instead of aiming for a high overall prediction rate for the entire training set, I would like to find subsets of features that with a very high probability places a given sample in category X and other subsets that place samples in category Y. In other words a prediction should not be attempted if the conviction of the estimate is low. Does such an algorithm exist?

1

drewfurlong t1_j0a3zn7 wrote

Would you say you're looking for a classifier with high precision, and perhaps low recall?

2

EdenistTech t1_j0aa5is wrote

To some extent yes. But rather than focusing on the true positives of the entire training set, I would be interested in the algorithm carving out subsets of features and values for which precision is very high - higher than the precision of the entire training set. I hope that makes sense?

1

onionhead888 t1_j0j7xza wrote

I think you’re looking for random forests which is a binary split algorithm.

2