Bayes’ Theorem
Psyc 339 02/07/2006
Outline
Sample problems
Bayes’ Theorem
History of Bayes’ Theorem
Applications
Practice questions
One Example
This is a hypothetical problem.
Suppose 1% of the population has HIV.
A test for HIV is said to be 90% accurate:
If a person has HIV, the test correctly returns a positive result 90% of the time.
If a person does not have HIV, the test correctly returns a negative result 90% of the time.
If someone gets a positive test result, what is the probability that he or she has HIV?
Choices: A. Below 10%; B. 10%~30%; C. 30%~50%; D. 50%~70%; E. 70%~90%; F. 90%; G. 90%~100%
One approach to calculate…
Consider 1000 people:

HIV (0.01): 10 people
    Positive (0.9): 9
    Negative (0.1): 1
No HIV (0.99): 990 people
    Positive (0.1): 99
    Negative (0.9): 891

Total positive: 9 + 99 = 108
P(H|+) = 9/(99+9) = 0.083
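The counting approach above can be reproduced in a few lines of Python. This is only a minimal sketch using the numbers from the slide; the variable names are illustrative and not part of the original material.

```python
# Natural-frequency version of the HIV example (numbers from the slide).
population = 1000
base_rate = 0.01          # 1% of people have HIV
hit_rate = 0.90           # p(+ | HIV)
false_alarm_rate = 0.10   # p(+ | no HIV)

with_hiv = population * base_rate        # 10 people
without_hiv = population - with_hiv      # 990 people

true_positives = with_hiv * hit_rate               # 9 people
false_positives = without_hiv * false_alarm_rate   # 99 people

# Among everyone who tests positive, what fraction actually has HIV?
p_hiv_given_positive = true_positives / (true_positives + false_positives)
print(round(p_hiv_given_positive, 3))  # 0.083
```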
Or, look at the problem this way…

                      Test +             Test -
People with HIV       Hit 90%            Miss 10%
People without HIV    False Alarm 10%    Correct Rejection 90%
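Bayes' Theorem

$$p(H \mid +) = \frac{p(+ \mid H)\,p(H)}{p(+ \mid H)\,p(H) + p(+ \mid \bar{H})\,p(\bar{H})}$$

Hit Rate: $p(+ \mid H)$
Base Rate: $p(H)$
False Alarm Rate: $p(+ \mid \bar{H})$, where $\bar{H}$ means "does not have HIV"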
So, another approach to calculate…
$$p(H \mid +) = \frac{(0.9)(0.01)}{(0.9)(0.01) + (0.1)(0.99)} = 0.083$$

Bayesian Calculators:
• http://psych.rice.edu/online_stat/
• http://psych.fullerton.edu/mbirnbaum/calculators/Bayes Calc.htm
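To check the arithmetic without an online calculator, the formula can be wrapped in a small function. This is a sketch, not part of the original slides; the function name `posterior` and its parameter names are hypothetical.

```python
def posterior(hit_rate, base_rate, false_alarm_rate):
    """Return p(H | +) for a yes/no hypothesis H and a positive test."""
    true_pos = hit_rate * base_rate                  # p(+ | H) p(H)
    false_pos = false_alarm_rate * (1 - base_rate)   # p(+ | not H) p(not H)
    return true_pos / (true_pos + false_pos)

# HIV example: hit rate 0.9, base rate 0.01, false alarm rate 0.1
print(round(posterior(0.9, 0.01, 0.1), 3))  # 0.083
```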
Base Rate Neglect
In the HIV example, if your estimate was well above 8.3%, you probably neglected the base rate.
Base rate neglect is a persistent phenomenon in which people do not place sufficient weight on how often the relevant events occur in the first place (their base rates).
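One way to see why the base rate matters is to hold the test fixed and vary only the base rate. The sketch below assumes the 90% hit rate and 10% false alarm rate from the HIV example; the function name and the list of base rates are illustrative choices.

```python
def p_h_given_positive(base_rate, hit_rate=0.9, false_alarm_rate=0.1):
    """Bayes' Theorem for a positive result on a two-outcome test."""
    numerator = hit_rate * base_rate
    return numerator / (numerator + false_alarm_rate * (1 - base_rate))

# Same test, different base rates.
for base_rate in (0.001, 0.01, 0.1, 0.5):
    print(f"base rate {base_rate:>5}: p(H | +) = {p_h_given_positive(base_rate):.3f}")
```

With the test unchanged, the probability of having the condition after a positive result ranges from under 1% at a base rate of 0.001 to 90% at a base rate of 0.5.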
Bayes’ Theorem: A more general form
$$p(H \mid D) = \frac{p(H \,\&\, D)}{p(D)} = \frac{p(D \mid H)\,p(H)}{p(D)}$$

p(H|D): the probability of H given D
p(H&D): the probability of H and D together
p(D): the probability of D (including all the ways that D can happen)
Monty Hall Problem
Suppose you are on a game show, and you’re given a choice of three doors.
Behind one door is a car; behind the others, goats.
You pick a door, and the host, who knows what’s behind the doors, offers to open a second door, which has a goat.
After that, you can switch to the third door, or you can stay with your original choice.
Do you have a better chance to win the car if you switch?
Imagine the situation…
Let us suppose that A is the door you pick first, and B and C are the two other doors.
If you don’t ask the host to open the door, your chance of winning is 1/3.
Suppose you ask the host to open a door, and he opens B, revealing a goat. Then, what is the probability of this datum Db given each of the three hypotheses Ha, Hb, and Hc (the car is behind A, B, or C)?
Finding P(Db)
If the car were in A, he would pick one of the other two doors at random, so p(Db|Ha)=1/2
If the car were in B, he would not pick B, so p(Db|Hb)=0
If the car were in C, he would only pick B, so p(Db|Hc)=1
Apply Bayes’ Theorem
Now we can apply Bayes’ Theorem to calculate p(Hc|Db).

$$\begin{aligned}
p(H_c \mid D_b) &= \frac{p(D_b \mid H_c)\,p(H_c)}{p(D_b)} \\
&= \frac{p(D_b \mid H_c)\,p(H_c)}{p(D_b \mid H_a)\,p(H_a) + p(D_b \mid H_b)\,p(H_b) + p(D_b \mid H_c)\,p(H_c)} \\
&= \frac{(1)(1/3)}{(1/2)(1/3) + (0)(1/3) + (1)(1/3)} = \frac{2}{3}
\end{aligned}$$
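The same calculation can be done for all three doors at once. This is a minimal sketch using exact fractions; the dictionary names are illustrative, and the likelihoods are the values of p(Db|H) derived above.

```python
from fractions import Fraction

# Prior: the car is equally likely to be behind each door.
priors = {"A": Fraction(1, 3), "B": Fraction(1, 3), "C": Fraction(1, 3)}
# Likelihood p(Db | H): probability the host opens B under each hypothesis.
likelihoods = {"A": Fraction(1, 2), "B": Fraction(0), "C": Fraction(1)}

# p(Db) sums over every way the host could end up opening B.
p_db = sum(likelihoods[h] * priors[h] for h in priors)
posteriors = {h: likelihoods[h] * priors[h] / p_db for h in priors}

for door, p in posteriors.items():
    print(door, p)
```

Staying with A wins with probability 1/3, while switching to C wins with probability 2/3.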
For those who don’t believe…
Here is a simulation of the Monty Hall problem:
http://psych.rice.edu/online_stat/
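A simulation can also be run locally. The sketch below is not the simulation on the linked site; it is a small stand-alone version that assumes the host always opens a goat door other than the one you picked, choosing at random when two such doors are available. The function names, trial count, and seed are illustrative.

```python
import random

def play(switch, rng):
    """Play one game; return True if the final pick hides the car."""
    doors = ["A", "B", "C"]
    car = rng.choice(doors)
    pick = rng.choice(doors)
    # The host opens a door that is neither the player's pick nor the car.
    opened = rng.choice([d for d in doors if d != pick and d != car])
    if switch:
        # Move to the one door that is neither picked nor opened.
        pick = next(d for d in doors if d != pick and d != opened)
    return pick == car

def win_rate(switch, trials=100_000, seed=0):
    """Estimate the probability of winning over many simulated games."""
    rng = random.Random(seed)
    return sum(play(switch, rng) for _ in range(trials)) / trials

print("stay:  ", win_rate(switch=False))
print("switch:", win_rate(switch=True))
```

Over many trials the win rate for staying should settle near 1/3 and the win rate for switching near 2/3.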
Bayes' Theorem History
The theorem is named after Thomas Bayes (1702-1761), who first recognized the importance of personal probability.
Application: Spam filter
Bayesian spam filters calculate the probability of a message being spam based on its contents.
Unlike simple content-based filters, a Bayesian spam filter does not rigidly classify an email as spam; it assigns the message a probability of being spam.
Bayesian spam filters can also learn from both spam and legitimate mail, and a well-trained filter produces few false positives.
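To make the idea concrete, here is a toy sketch of how a message's spam probability could be computed from its words. It is not the implementation of any real filter: it assumes words are independent given the class (the "naive Bayes" simplification), and the word probabilities and the prior are made-up numbers for illustration only.

```python
import math

# Made-up word probabilities, for illustration only.
p_word_given_spam = {"free": 0.30, "meeting": 0.02, "winner": 0.20}
p_word_given_ham = {"free": 0.03, "meeting": 0.20, "winner": 0.01}
p_spam = 0.5  # assumed prior p(spam)

def spam_probability(words):
    """Return p(spam | words), treating words as independent given the class."""
    log_spam = math.log(p_spam)
    log_ham = math.log(1 - p_spam)
    for w in words:
        if w in p_word_given_spam:  # skip words we have no estimates for
            log_spam += math.log(p_word_given_spam[w])
            log_ham += math.log(p_word_given_ham[w])
    # Bayes' Theorem: spam / (spam + ham), computed from the log scores.
    return 1 / (1 + math.exp(log_ham - log_spam))

print(round(spam_probability(["free", "winner"]), 3))
print(round(spam_probability(["meeting"]), 3))
```

In a real filter the word probabilities would be estimated from messages the user has marked as spam or not spam, which is the "learning" described above.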
Other Applications:
Microsoft Office Assistant
Google Search Engine
Autonomy Systems
Modeling how neurons behave in very complicated systems
Any other applications?
Practice Question 1
A manufacturer claims that its drug test will detect steroid use (that is, show positive for an athlete who uses steroids) 95% of the time. Your friend on the basketball team has just tested positive. The probability that he uses steroids is:
A. 0.95
B. At most 0.95
C. At least 0.95
D. Not possible to say, based on the information
Practice Question 2
In a population, 70% of the people have a certain condition. A test is developed that has a 40% chance of detecting the condition in a person who has it and a 10% chance of falsely indicating it in a person who does not have it. If a person gets a positive test result, roughly what is the probability they have the condition?
Choices: 0.10; 0.25; 0.33; 0.50; 0.67; 0.75; 0.90
Practice Question 3
Toss a fair coin. If it lands heads up, draw a ball from box 1; otherwise, draw a ball from box 2. If the ball is blue, what is the probability that it was drawn from box 2?
Box 1: p(box1) = .5; p(red ball | box1) = .4; p(blue ball | box1) = .6
Box 2: p(box2) = .5; p(red ball | box2) = .5; p(blue ball | box2) = .5
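Key to Q3
Box 1: p(box1) = .5; p(red ball | box1) = .4; p(blue ball | box1) = .6
Box 2: p(box2) = .5; p(red ball | box2) = .5; p(blue ball | box2) = .5

$$\begin{aligned}
p(\text{box2} \mid \text{blue ball}) &= \frac{p(\text{box2})\,p(\text{blue ball} \mid \text{box2})}{p(\text{blue ball})} \\
&= \frac{p(\text{box2})\,p(\text{blue ball} \mid \text{box2})}{p(\text{box1})\,p(\text{blue ball} \mid \text{box1}) + p(\text{box2})\,p(\text{blue ball} \mid \text{box2})} \\
&= \frac{.5 \times .5}{.5 \times .6 + .5 \times .5} = \frac{.25}{.55} = 0.4545\ldots
\end{aligned}$$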
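The key can be double-checked numerically. This is a short sketch with illustrative dictionary names, using the probabilities given in the question.

```python
p_box = {"box1": 0.5, "box2": 0.5}             # fair coin picks the box
p_blue_given_box = {"box1": 0.6, "box2": 0.5}  # p(blue ball | box)

# p(blue ball): sum over both ways of drawing a blue ball.
p_blue = sum(p_box[b] * p_blue_given_box[b] for b in p_box)
p_box2_given_blue = p_box["box2"] * p_blue_given_box["box2"] / p_blue
print(round(p_box2_given_blue, 4))  # 0.4545
```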
Any Questions?