Sunday, September 26, 2010

Bayes theorem

Bayes theorem is the foundation of Bayesian statistics and of the emerging discipline of predictive analytics. The Reverend Thomas Bayes, an 18th century English mathematician and minister, formulated the theorem as a special case of probability theory. He never published it, reportedly fearing that it might not pass rigorous scientific scrutiny. The theorem was found among his papers and published after his death.
Here is the theorem and a few practical applications.

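p(Ri/E) = p(E/Ri) * p(Ri) / sum over j = 1 to n of [p(E/Rj) * p(Rj)]

Here R1, ..., Rn are mutually exclusive events that together cover every possibility, and p(A/B) is read as "the probability of A given B".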

p(Ri) is called the prior probability of event Ri. It represents what we already know about Ri from past history. E is the new evidence that has arrived and that bears on the probability of Ri.

p(Ri/E) is the posterior probability: the probability of event Ri given the evidence E.
p(E/Ri) is the likelihood: the probability of observing the evidence E if event Ri occurs.

As the formula shows, when new evidence surfaces the theorem lets us update our knowledge about the probability of event Ri. p(Ri) was our knowledge of the event based on its history,
while p(Ri/E) is our updated knowledge after taking the evidence E into consideration.
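
The update can be sketched in a few lines of Python. This is only a minimal illustration: the function name `posterior` and the numbers in the usage lines are assumptions made up for this example, not part of any library or data set.

```python
def posterior(priors, likelihoods):
    """Return p(Ri/E) for every i, given the lists p(Ri) and p(E/Ri)."""
    if len(priors) != len(likelihoods):
        raise ValueError("priors and likelihoods must have the same length")
    # Numerator of Bayes theorem for each event Ri: p(E/Ri) * p(Ri)
    joint = [lik * pri for lik, pri in zip(likelihoods, priors)]
    # Denominator: the sum of the numerators over all events
    evidence = sum(joint)
    if evidence == 0:
        raise ValueError("the evidence has zero probability under every event")
    return [j / evidence for j in joint]


# Hypothetical numbers: two events with priors 0.3 and 0.7, and evidence
# that is four times as likely under the first event as under the second.
updated = posterior([0.3, 0.7], [0.8, 0.2])
print(updated)
```

With these made-up inputs the first event, which started at 0.3, ends up more probable than the second, because the evidence favours it strongly.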

These probabilities can be single numbers or can be described by a probability distribution.
Here are a few practical examples where the theorem could be applied.
(An example similar to the first one below was cited in a recent issue of Sloan Management Review.)

1. I know the history of rainfall in my region for the past 10 years. I now have evidence that this year
the temperatures were higher than normal. Suppose we know that higher temperatures correspond to higher rainfall
and that the relationship follows a certain distribution. Using this evidence, we can update the probability of rain this year.
If I calculated the probability from history alone (which is what frequentists would do), I would be ignoring the key evidence that surfaced this year.

2. We know the delivery history of a certain supplier. New evidence has arrived: the supplier's capacity is full because of a new contract signed with another customer. We know the relationship between the supplier's capacity and on-time delivery. Given the evidence, we can find the probability of on-time delivery this month. If we had not used Bayes theorem (and had instead relied on history alone, as a frequentist would), the probability would have been based solely on the supplier's historical performance and would have ignored the critical new evidence.
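
The supplier example can be worked through with numbers. Every figure below is hypothetical and chosen only to show the arithmetic: a 90% historical on-time rate, and an assumed chance of seeing full capacity of 20% in months that end on time versus 70% in months that end late.

```python
# All numbers are hypothetical, chosen only to illustrate the calculation.
p_on_time = 0.9                 # prior: historical share of on-time months
p_late = 1.0 - p_on_time        # prior for the complementary event

# Assumed likelihoods of the evidence "capacity is full" under each event
p_full_given_on_time = 0.2
p_full_given_late = 0.7

# Denominator of Bayes theorem: overall probability of the evidence
p_full = p_full_given_on_time * p_on_time + p_full_given_late * p_late

# Posterior: probability of on-time delivery given that capacity is full
p_on_time_given_full = p_full_given_on_time * p_on_time / p_full

print(f"History alone: {p_on_time:.2f}")
print(f"After the capacity evidence: {p_on_time_given_full:.2f}")
```

Under these assumed figures the probability of on-time delivery drops from 0.90 to about 0.72 once the capacity evidence is taken into account.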

Bayes theorem does have certain limitations and gotchas in practice, as will be seen in future posts.
