
Immarhinocerous t1_j3g5qgb wrote

What do you want to do with it?

For tabular data of a few million rows or less, you're often much better off using XGBoost or one of the other boosting libraries. Read up on boosting: it's an alternative approach to deep learning. Technically boosting can also be applied to neural networks, including deep ones, but in practice it rarely is, because boosting combines many weak learners, whereas deep learning spends a long training run building one strong learner.
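To make the weak-learner idea concrete, here's a minimal sketch using scikit-learn's `GradientBoostingClassifier` (an assumption on my part: the thread recommends XGBoost, but scikit-learn's implementation illustrates the same mechanism with a more common dependency):

```python
# Minimal sketch of boosting on tabular data: an ensemble of shallow
# trees ("weak learners"), each fit to the errors of the ensemble so far.
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.model_selection import train_test_split

X, y = make_classification(n_samples=2000, n_features=20, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# max_depth=3 keeps each individual tree weak; n_estimators controls
# how many sequential correction steps the ensemble takes.
model = GradientBoostingClassifier(
    n_estimators=200, max_depth=3, learning_rate=0.1, random_state=0
)
model.fit(X_train, y_train)
print(f"test accuracy: {model.score(X_test, y_test):.3f}")
```

Swapping in `xgboost.XGBClassifier` is nearly a drop-in change, since XGBoost ships a scikit-learn-compatible wrapper.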

XGBoost and CatBoost have won many, many Kaggle competitions. My former employer trained all their production models with XGBoost, and they were modeling people's credit scores. There are many reasons to use XGBoost, including training speed (much faster to train than deep neural networks) and interpretability (the model's decision-making process is easier to inspect because, under the hood, it's just decision trees).
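One concrete piece of that interpretability: tree ensembles expose per-feature importance scores out of the box. A sketch with scikit-learn's gradient boosting (XGBoost's scikit-learn wrapper exposes the same `feature_importances_` attribute):

```python
# Sketch: per-feature importances from a tree ensemble, one reason these
# models are easier to inspect than a deep net.
from sklearn.datasets import make_regression
from sklearn.ensemble import GradientBoostingRegressor

# Only 2 of the 5 features actually drive the target.
X, y = make_regression(n_samples=1000, n_features=5, n_informative=2,
                       random_state=0)
model = GradientBoostingRegressor(random_state=0).fit(X, y)

# Importances are normalized to sum to 1; the informative features
# should receive most of the weight.
for i, imp in enumerate(model.feature_importances_):
    print(f"feature {i}: {imp:.3f}")
```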

I mostly use XGBoost, and sometimes fairly simple LSTMs, primarily for financial modeling. XGBoost works well, and its fast training times let me optimize across a wide range of model parameters without spending a bunch of money on GPUs.
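Here's the kind of parameter sweep that fast training makes cheap, sketched with scikit-learn's `GridSearchCV` (the parameter grid is illustrative, not the commenter's actual setup; XGBoost's scikit-learn wrapper slots into `GridSearchCV` the same way):

```python
# Sketch: exhaustive cross-validated search over boosting parameters.
# Cheap per-fit training is what makes this kind of sweep practical.
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.model_selection import GridSearchCV

X, y = make_classification(n_samples=1000, random_state=0)

grid = GridSearchCV(
    GradientBoostingClassifier(random_state=0),
    param_grid={"max_depth": [2, 3], "learning_rate": [0.05, 0.1]},
    cv=3,  # 3-fold cross-validation for each parameter combination
)
grid.fit(X, y)
print("best params:", grid.best_params_)
print(f"best CV accuracy: {grid.best_score_:.3f}")
```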

If you want to do image analysis, though, you do need deep learning for state-of-the-art results. Ditto reinforcement learning. Ditto several other types of problems.

So, it depends.


ForceBru t1_j3gfgvo wrote

> Read up on boosting.

What could a good reading list look like? I read the original papers which introduced functional gradient descent (the theoretical underpinning of boosting), but I can't say they shed much light on these techniques for me.

Is there more recommended reading to study boosting? Anything more recent, maybe? Any textbook treatments?


Immarhinocerous t1_j3gkq83 wrote

Google is your friend here. ChatGPT may even give a decent response.

Start by learning bagging, then learn boosting.
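A quick sketch of the distinction worth learning here, with both ensemble styles run on the same data (scikit-learn used for illustration; the specific models are my choice, not from the comment):

```python
# Bagging vs boosting on the same data:
# - bagging (RandomForest) trains trees independently on bootstrap samples
#   and averages them to reduce variance;
# - boosting (GradientBoosting) trains trees sequentially, each one
#   correcting the residual errors of the ensemble so far.
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier, RandomForestClassifier
from sklearn.model_selection import cross_val_score

X, y = make_classification(n_samples=1000, random_state=0)
bagged = RandomForestClassifier(n_estimators=100, random_state=0)
boosted = GradientBoostingClassifier(n_estimators=100, random_state=0)

for name, model in [("bagging", bagged), ("boosting", boosted)]:
    score = cross_val_score(model, X, y, cv=3).mean()
    print(f"{name}: {score:.3f}")
```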

I find the following site fairly good: https://machinelearningmastery.com/essence-of-boosting-ensembles-for-machine-learning

The explanations are usually approachable, and where an article goes deep, the author usually has another one on the same topic at a simpler level of detail. He has good code samples, and many of his articles go quite in depth, so he caters to a broad range of audiences and is good at sticking to a consistent level of depth within an article. I've even bought some of his materials and they were fairly good, but his free articles are plenty.

There are lots of other sites that will teach you this material; read a few different sources, because these techniques are worth understanding well. Since you said you've read papers on functional gradient descent, you might also find helpful papers by searching scholar.google.com.

This is also a good place to start: https://www.ibm.com/topics/bagging
