Friday, October 08, 2010

Correlation vs Causality

I was in Chennai a few years ago at a conference, sitting on the terrace of a restaurant with a few colleagues. Sales of cold drinks were at an all-time high, with everybody ordering Coke, beer and so on.
At the same time, I noticed a huge influx of patients at the hospital near the restaurant.
So there must have been a positive correlation between the sales of cold drinks and the inflow of patients to the hospital, meaning that as one went up or down, the other went up or down too.
However, does that mean the cold drinks caused people to be hospitalized? Or vice versa: did people drink because someone was hospitalized?
Neither explanation was correct in this situation. In reality, both the influx of people to the hospital and the sales of cold drinks were caused by the sweltering heat of Chennai. So if we were to forecast the sale of cold drinks, the causal factor would be "temperature" and not the "number of people admitted to the nearby hospital".
And this is precisely one of the key things to watch out for when analyzing the results of a regression analysis. While regression will show you the correlation between two variables, it may take a domain expert to confirm whether there is causality between the two.
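
To see how a hidden common cause can produce a strong correlation, here is a minimal Python sketch. All the numbers are made up for illustration (they are not real data from Chennai), and the variable names and the small `pearson` helper are my own assumptions. Both series are generated from temperature alone, and neither is computed from the other, yet they still come out strongly correlated.

```python
from math import sqrt

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)

# Hypothetical daily temperatures in degrees Celsius (made-up numbers).
temperature = [28, 30, 32, 34, 36, 38, 40]

# Made-up day-to-day wiggles so the series are not perfectly linear.
drink_noise = [3, -2, 4, -1, 2, -3, 1]
patient_noise = [-1, 2, -2, 1, -1, 2, 0]

# Both series depend on temperature only; neither one uses the other.
cold_drink_sales = [10 * t + e for t, e in zip(temperature, drink_noise)]
hospital_patients = [2 * t + e for t, e in zip(temperature, patient_noise)]

r = pearson(cold_drink_sales, hospital_patients)
print(f"correlation(sales, patients) = {r:.2f}")
```

The correlation comes out close to 1 even though, by construction, drinks do not cause hospital visits or the other way round. The code cannot tell you that temperature is the real driver; you only know it because you built the data that way, which is exactly the role the expert plays with real data.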

Wednesday, October 06, 2010

The dangers of serial thinking

I bought a horse for 10$ and sold it to a friend for 20$. I then bought the same horse back from the same person for 30$ and sold it to the same person again for 40$. What was my profit?

If you came up with 10$ as the answer, computed as (20-10) + (20-30) + (40-30), you fell into the trap of serial thinking.

If you came up with 20$ as the answer, you are right, because these are two separate transactions.

Transaction 1 - You bought and sold the horse. Profit = 20 - 10 = 10$
Transaction 2 - You bought and sold the horse. Profit = 40 - 30 = 10$

Total profit = 10 + 10 = 20$

If you are financially savvy, you would work it out like this:

profit = total cash inflow - total cash outflow
       = (40 + 20) - (30 + 10)
       = 60 - 40
       = 20$
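
The same arithmetic can be checked with a few lines of Python. This is only a small sketch; the `transactions` list and the variable names are my own, while the prices come straight from the puzzle. It computes the profit both ways, per transaction and by cash flow.

```python
# Each transaction is a (buy_price, sell_price) pair, taken from the puzzle.
transactions = [(10, 20), (30, 40)]

# Method 1: add up the profit of each independent transaction.
profit_per_transaction = sum(sell - buy for buy, sell in transactions)

# Method 2: total cash inflow minus total cash outflow.
inflow = sum(sell for _, sell in transactions)
outflow = sum(buy for buy, _ in transactions)
profit_cash_flow = inflow - outflow

print(profit_per_transaction, profit_cash_flow)  # prints "20 20"
```

Both methods agree on 20$, because neither one treats the gap between selling at 20$ and buying back at 30$ as a loss.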

Beware the dangers of serial thinking.
Start thinking laterally.