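My little world.
Here I record my love, my life, and my stories.
What I write and how I write it follows my heart and my mood; the one rule is that only love may come in.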


Saturday, January 14, 2017

Thinking 6-Probability

Probability:

Bayesian view: Probability refers to a subjective degree of confidence. Because one can express confidence that a single event will occur, one can express the probability of a single event.
Frequentist view: Probability is always defined over a reference class, such as an infinite number of coin tosses. Since a single event does not belong to a reference class, it cannot have a relative frequency or a probability.

**When psychologists ask whether people's reasoning about probabilities is normative, they are asking whether the output of the system is the same as what a Bayesian machine would return, not whether the internal process is Bayesian.

Bayes' Theorem:
->Research has been conducted on single-event probabilities, that is, the posterior probability: the probability of a hypothesis given the data, p(H|D).
->p(H|D) = p(H)*p(D|H) / p(D), where p(D) = p(H)*p(D|H) + p(¬H)*p(D|¬H)
->Bayes' Theorem belongs to the normative theory of probability. The claim is that man is apparently not a conservative Bayesian and that our minds are not built to work by the rules of probability.
->From the heuristics and biases research, we can see that people do not reason about probabilities normatively. Many psychologists, but not evolutionary psychologists, accept this and offer explanations as to why this might be.
->To evolutionary psychologists, the deviations are due to the formulation of the problem, not to heuristics. We can reason according to Bayes' theorem if the information is presented in a format that we have evolved to process.
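
As a minimal sketch of the formula above, the posterior can be computed directly from a prior, a hit rate p(D|H), and a false-alarm rate p(D|¬H). The function name `posterior`, its parameter names, and the example numbers are illustrative choices, not part of any library or study.

```python
def posterior(prior, p_d_given_h, p_d_given_not_h):
    """Return p(H|D) by Bayes' theorem.

    prior            -- p(H), the base rate of the hypothesis
    p_d_given_h      -- p(D|H), chance of the data if H is true
    p_d_given_not_h  -- p(D|¬H), chance of the data if H is false
    """
    # Total probability of the data: p(D) = p(H)p(D|H) + p(¬H)p(D|¬H)
    p_d = prior * p_d_given_h + (1 - prior) * p_d_given_not_h
    return prior * p_d_given_h / p_d


# Illustrative numbers only: a 50/50 prior and data four times likelier under H.
print(round(posterior(0.5, 0.8, 0.2), 2))  # 0.8
```

With an even prior the posterior simply mirrors the hit rate; the base rate only starts to matter once the prior moves away from 0.5.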
->Evolutionary Psychology & Probability:


->There is no evolutionary reason why Bayesian reasoning should not have developed. Simple forms of Bayesian inference can be seen across species: even simple sea slugs exhibit habituation, and certainly all vertebrates can be classically conditioned.


Frequentist mind:
->The number of events, which indicates the reliability of the decision, is retained in a frequency format but not in a probability format.
->Permits easy updating as new information is collected.
->Reference classes can be constructed post-hoc as the reference class changes.
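
The first two points can be illustrated with a small sketch. The `Tally` class is hypothetical and the counts are made up for illustration: two tallies can carry the same proportion while differing in sample size, and updating is just an increment.

```python
from dataclasses import dataclass


@dataclass
class Tally:
    """Frequency-format record: how often an outcome occurred out of n events."""
    hits: int = 0
    n: int = 0

    def update(self, outcome: bool) -> None:
        # Updating on a new observation is a simple increment.
        self.n += 1
        if outcome:
            self.hits += 1

    def proportion(self) -> float:
        # Converting to a single probability discards the sample size n.
        return self.hits / self.n


small = Tally(hits=3, n=10)
large = Tally(hits=300, n=1000)
print(small.proportion() == large.proportion())  # same proportion
print(small.n, large.n)  # but very different amounts of evidence

small.update(True)  # one new event: now 4 hits out of 11
print(small)
```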


Base rate neglect:
1. A cab was involved in a hit and run accident at night. Two cab companies, the Green and the Blue, operate in the city. You are given the following data:
  • 85% of the cabs in the city are green and 15% are blue.
  • A witness identified the cab as blue. The court tested the reliability of the witness under the same circumstances that existed on the night of the accident and concluded that the witness correctly identified each one of the colours 80% of the time and failed 20% of the time.
  • What is the probability that the cab involved in the accident was blue rather than green?
  • Prior probability=0.15 (probability of blue cabs), Witness hit rate= 0.8 (identified correct colour), witness false alarm rate=0.20 (failed to identify correct colour)
  • Witness sees a green taxi and mistakenly calls it blue=0.85*0.2=0.17. Witness sees a blue taxi and correctly identifies it=0.15*0.8=0.12.
  • Given the witness reports a blue cab, which could happen 0.17+0.12=0.29 of the time, the probability that it was a blue taxi=0.12/(0.12+0.17)=0.41
People tend to think that it was more likely for the taxi to be blue than green (p>0.50), and many say that p=0.80.
Respondents focus on the witness' accuracy and neglect the base rate of cabs in the city. This error was attributed to the representativeness heuristic.
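
The arithmetic of the cab problem can be checked with exact fractions. This is only a sketch of the calculation given above; the variable names are my own.

```python
from fractions import Fraction

# Numbers from the cab problem above.
p_blue = Fraction(15, 100)        # base rate of blue cabs
p_green = 1 - p_blue              # base rate of green cabs
hit_rate = Fraction(80, 100)      # witness says "blue" when the cab is blue
false_alarm = Fraction(20, 100)   # witness says "blue" when the cab is green

says_blue_and_blue = p_blue * hit_rate        # 0.12
says_blue_and_green = p_green * false_alarm   # 0.17
p_says_blue = says_blue_and_blue + says_blue_and_green  # 0.29

p_blue_given_report = says_blue_and_blue / p_says_blue
print(p_blue_given_report, round(float(p_blue_given_report), 2))  # 12/29 0.41
```

Read as frequencies, 12/29 says that out of every 29 "blue" reports, only 12 involve a blue cab.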

2. The medical diagnosis problem
  • If a test to detect a disease whose prevalence is 1/1000 has a false positive rate of 5%, what is the chance that a person found to have a positive result actually has the disease? Assume that you know nothing about the person's symptoms or signs.
    • p(sick)=0.001, p(healthy)=0.999, test hit rate=1, test false alarm rate=0.05
    • One is sick and has a positive test=1*0.001=0.001. One is healthy but has a positive test=0.999*0.05=0.04995. Probability of a positive test=0.001+0.04995=0.05095
    • Chance that someone with a positive result actually has the disease=0.001/0.05095=0.019627.
    • On average, when 1000 people are tested, 1 is sick and is certain to receive a true positive result. The other 999 people are healthy, but with a 5% false positive rate, 49.95 of them receive a false positive result.
    • There are about 51 positive results in total, but only 1 of them is a true positive from a sick person (p=0.019627).
    • 45% answered 95% (ignoring the base rate), while only 18% gave the correct Bayesian answer of 2%.
  • Cosmides & Tooby conducted a study in other probability and frequency formats.
    • Frequentist with redundant % information: 1 out of every 1000 Americans has disease X. A test has been developed to detect when a person has disease X. Every time the test is given to a person who has the disease, the test comes out positive (i.e., the "true positive" rate is 100%). But sometimes the test also comes out positive when it is given to a person who is completely healthy. Specifically, out of every 1000 people who are perfectly healthy, 50 of them test positive for the disease (i.e., the "false positive" rate is 5%). Imagine that we have assembled a random sample of 1000 Americans. They were selected by a lottery. Those who conducted the lottery had no information about the health status of any of these people. Given the information above on average how many people who test positive for the disease will actually have the disease?
    • Most people get it correct at 2%. The data were interpreted as evidence for modules; however, such a module cannot reason about probabilities because it is domain specific (its input can only be accepted in a frequency format).
-->If probability problems are asked in a frequency format, base rate neglect can be reduced.
-->E.g. the conjunction fallacy revisited ("Linda is a bank teller" is chosen more often in the frequency version) and the Monty Hall problem revisited (higher switch rate in frequency formats than in probability formats).
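
To make the contrast between the two formats concrete, here is a minimal sketch that works the medical diagnosis problem both ways. The function names are hypothetical; the frequency version uses the rounded counts from the Cosmides & Tooby wording (1 sick person and 50 healthy people testing positive).

```python
def posterior_from_probabilities(prevalence, hit_rate, false_positive_rate):
    """Probability format: apply Bayes' theorem to three probabilities."""
    true_pos = prevalence * hit_rate
    false_pos = (1 - prevalence) * false_positive_rate
    return true_pos / (true_pos + false_pos)


def posterior_from_counts(sick_and_positive, healthy_and_positive):
    """Frequency format: share of positive tests that belong to sick people."""
    return sick_and_positive / (sick_and_positive + healthy_and_positive)


# Probability format, with the numbers from the problem.
p = posterior_from_probabilities(0.001, 1.0, 0.05)

# Frequency format: of 1000 people, 1 is sick and tests positive,
# and about 50 of the healthy ones also test positive.
f = posterior_from_counts(1, 50)

print(round(p, 4), round(f, 4))  # 0.0196 0.0196
```

Both routes give about 2%, but the frequency version needs one addition and one division, while the probability version needs the base rate to be multiplied in explicitly.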


Preference Reversals 
->A key phenomenon that violates rational choice theory (RCT).
  • In an experiment, participants are presented with pairs of monetary bets, each with some probability of winning or losing certain amounts. One bet is relatively safe and has a high probability of winning a small amount (P-bet), while the other is riskier and has a small probability of winning a large amount ($-bet). Participants are asked which of the bets they would like to play, and to provide a monetary value for each one when presented individually.
  • Participants prefer the P-bet over the $-bet in the choice phase, but rate the $-bet with a higher monetary value than the P-bet.
  • The monetary value represents a bet's utility, so giving the $-bet a higher value represents a reversal of the participants' preference. The inconsistent behaviour appears to be a strong phenomenon and suggests that human irrationality is systematic and widespread.
  • Tunney examined whether reversals disappear when the bets are presented as frequencies. In fact, reversals were reduced by 20-30% in the frequency format.

Why do frequencies elicit normative reasoning?

>Gigerenzer & Hoffrage argue that Bayesian computations are simpler when presented in a frequency format. We only need to store the absolute frequencies of D occurring with H and D occurring with ¬H. Base rates do not need to be attended to separately, which may be why they are neglected when they are needed in probability formats.
>When questions are posed in frequentist terms, people reason normatively. Humans and animals encode information about uncertain events as natural frequencies rather than as probabilities. Bayesian computations are simpler when information is represented in natural frequencies.
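
The idea of natural sampling can be sketched as a small simulation: observe events one at a time, keep only two tallies, and read the posterior off those tallies. The function `natural_sampling` is hypothetical, the numbers reuse the cab problem, and the random seed is arbitrary.

```python
import random


def natural_sampling(n_events, p_h, p_d_given_h, p_d_given_not_h, seed=0):
    """Tally two counts while observing events one at a time."""
    rng = random.Random(seed)
    d_and_h = 0       # data observed and hypothesis true
    d_and_not_h = 0   # data observed and hypothesis false
    for _ in range(n_events):
        h = rng.random() < p_h
        d = rng.random() < (p_d_given_h if h else p_d_given_not_h)
        if d and h:
            d_and_h += 1
        elif d:
            d_and_not_h += 1
    return d_and_h, d_and_not_h


# Cab numbers: 15% blue, witness says "blue" for 80% of blue and 20% of green cabs.
d_and_h, d_and_not_h = natural_sampling(100_000, 0.15, 0.8, 0.2)

# The posterior needs only the two tallies; no separate base rate is stored.
estimate = d_and_h / (d_and_h + d_and_not_h)
print(d_and_h, d_and_not_h, round(estimate, 2))
```

The base rate is never passed to the final division; it is already carried by how often each kind of event was encountered.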


**Bayesian reasoning can be learned in three ways: rule training, the frequency grid, and the frequency tree. All three methods result in improvement; however, performance after rule training can decline following a retention interval (people begin to neglect base rates again). Both the frequency grid and the frequency tree are realisations of natural sampling of frequencies.
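
As a rough illustration of what a frequency tree looks like, the sketch below prints one for the medical diagnosis problem. The function `frequency_tree` and its text layout are my own, not the training materials from any study, and the counts are the rounded natural frequencies (about 50 healthy people testing positive).

```python
def frequency_tree(population, n_sick, sick_positive, healthy_positive):
    """Return the lines of a two-level frequency tree as plain text."""
    n_healthy = population - n_sick
    return [
        f"{population} people",
        f"  {n_sick} sick",
        f"    {sick_positive} test positive",
        f"    {n_sick - sick_positive} test negative",
        f"  {n_healthy} healthy",
        f"    {healthy_positive} test positive",
        f"    {n_healthy - healthy_positive} test negative",
    ]


# Rounded natural frequencies for the medical diagnosis problem:
# 1 in 1000 is sick, the test always detects the disease, and about
# 50 healthy people test positive.
lines = frequency_tree(1000, 1, 1, 50)
print("\n".join(lines))

# Reading the answer off the tree: sick positives over all positives.
print(1, "out of", 1 + 50, "positive tests is a true positive")
```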



