[OC] Research Funding vs Human Development: a country's R&D spending correlates with its societal well-being
Submitted by latinometrics t3_125hq7r in dataisbeautiful
Reply to comment by ProLibertateCH in [OC] Research Funding vs Human Development: a country's R&D spending correlates with its societal well-being by latinometrics
You can have causation without correlation.
I've heard this, but I don't entirely understand how it would work. Care to elaborate? Is it because another variable might cancel out the causal effect, leaving no observable correlation?
That’s one case, yes. Think of a situation where the causal effect is only responsible for a small portion of the observed outcomes, so the gross correlation could even run backwards due to other factors despite known causation in part of the sample.
Or there could be other causal factors stemming from the original cause that push back against the observed correlation and need to be accounted for. Certain genes are known to cause breast cancer at such a high rate that everyone who screens positive for them might choose to get double mastectomies, causing breast cancer rates to actually fall among that group. I don’t think this is actually true, but it’s a hypothetical example of a case where no correlation, or even a reversed one, might exist despite known and strong causation.
That's amazing, thanks! Do you happen to know any good subreddit about stats and whatnot?
Not in particular to stats, no, but I also haven’t really looked for something like that.
Maybe the only place I know that might have a higher than usual concentration would be /r/slatestarcodex, which has some small overlapping interest in Bayesian reasoning and is generally more interested than other communities in using stats accurately rather than just as a tool to prove a pre-determined point (sometimes).
People who take antidepressants are more depressed than people who do not. If we just look at the correlation, we might assume that antidepressants cause depression, but the opposite is true.
In this case, there's still a correlation, but the sign is the opposite of the true causal effect of taking antidepressants.
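To make the sign flip concrete, here's a rough simulation sketch in Python. Everything in it is made up for illustration: the severity distribution, the rule that only people above a severity threshold get treated, and the assumed true effect `EFFECT` of -1 point on a depression score. It only shows the mechanism and says nothing about real antidepressant data.

```python
import random

rng = random.Random(1)
EFFECT = -1.0  # ASSUMED true causal effect of treatment on the depression score

treated_scores, untreated_scores = [], []
for _ in range(20000):
    severity = rng.gauss(0.0, 2.0)   # hypothetical underlying severity
    treated = severity > 1.0         # assumption: only the more severe cases get treated
    score = severity + (EFFECT if treated else 0.0) + rng.gauss(0.0, 0.5)
    (treated_scores if treated else untreated_scores).append(score)

mean_treated = sum(treated_scores) / len(treated_scores)
mean_untreated = sum(untreated_scores) / len(untreated_scores)

# Naive comparison: treated minus untreated. It comes out positive even
# though the treatment lowers every treated person's score by 1 point.
naive_gap = mean_treated - mean_untreated
print(naive_gap, EFFECT)
```

The treated group still scores worse on average because severity drives both who gets treated and how depressed they are, so the naive gap has the opposite sign of the effect built into the simulation.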
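Alternatively, consider a car being driven over a hilly road at a constant speed. When the car is going uphill, it's burning more gas. There's no correlation between speed and gas consumption, even though burning more gas does increase speed: the driver adds gas exactly when the hill would otherwise slow the car down, so the two effects cancel.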
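Here's a minimal sketch of the car example in Python, in case it helps. The toy model is an assumption, not real physics: speed is 10 per unit of gas minus 5 per unit of slope, and the driver picks the gas that holds a target speed of 60. The `pearson` and `speed` helpers are just illustrative names.

```python
import random

def pearson(xs, ys):
    """Plain Pearson correlation coefficient."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / (vx * vy) ** 0.5

def speed(gas, slope, noise=0.0):
    # ASSUMED toy model: more gas speeds the car up, a steeper slope slows it down.
    return 10.0 * gas - 5.0 * slope + noise

rng = random.Random(0)
TARGET = 60.0
slopes = [rng.uniform(-2.0, 2.0) for _ in range(10000)]

# The driver compensates for each hill by choosing the gas that holds TARGET.
gas = [(TARGET + 5.0 * s) / 10.0 for s in slopes]
speeds = [speed(g, s, rng.gauss(0.0, 1.0)) for g, s in zip(gas, slopes)]

r = pearson(gas, speeds)                    # observed correlation, close to zero
effect = speed(7.0, 0.0) - speed(6.0, 0.0)  # one extra unit of gas on flat road
print(r, effect)
```

The observed correlation between gas and speed is near zero because gas is used only to cancel the slope, yet holding the slope fixed and adding gas raises speed by the full coefficient.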
Though only in a dataset of 1, right? Or is there something I'm missing?
No, you can have it in datasets of any size. It can happen if X causes Y, but your dataset also contains some other factor Z that prevents Y and happens to correlate strongly with X. For example, if exposure to some substance causes cancer, but people who are exposed to that substance tend to be exposed to vast quantities of it that kill them immediately (thereby preventing the vast majority of them from living long enough to develop cancer), you'd have a definite causation, but no (or even a reversed) correlation.
Good point! I think I was considering this more within the context of a single dataset, without outside knowledge. If I read your example correctly (please correct me if I don't), the dataset described above would not contain evidence of X causing Y, or, if the dampening effect of Z is not complete, X would correlate with Y (though perhaps weakly). Though there may again be something I'm missing ;)
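A quick sketch of that exposure example in Python, with all the probabilities invented so the numbers happen to cancel: exposure raises cancer risk from 5% to 50% among people who live long enough, but 90% of exposed people are assumed to die immediately. The function name `observed_cancer_rate` is hypothetical.

```python
import random

rng = random.Random(2)
N = 100_000

# All three probabilities are ASSUMPTIONS chosen for illustration.
P_CANCER_UNEXPOSED = 0.05
P_CANCER_EXPOSED = 0.50    # risk for exposed people who survive long enough
P_IMMEDIATE_DEATH = 0.90   # exposed people killed before cancer can develop

def observed_cancer_rate(exposed):
    """Fraction of N simulated people who are recorded as developing cancer."""
    cases = 0
    for _ in range(N):
        if exposed and rng.random() < P_IMMEDIATE_DEATH:
            continue  # died immediately, so never recorded as a cancer case
        p = P_CANCER_EXPOSED if exposed else P_CANCER_UNEXPOSED
        if rng.random() < p:
            cases += 1
    return cases / N

rate_exposed = observed_cancer_rate(True)
rate_unexposed = observed_cancer_rate(False)
print(rate_exposed, rate_unexposed)
```

With these made-up numbers the recorded cancer rates in the two groups come out about the same, even though exposure multiplies the risk tenfold for anyone who survives it. Raising `P_IMMEDIATE_DEATH` further would push the exposed rate below the unexposed one, which is the reversed-correlation case.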
Yeah, exactly.