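My little world.
Here I record my love, my life, and my stories.
What I write and how I write it is entirely up to me; the only rule is that nothing but love may enter.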


Friday, 13 January 2017

Thinking 5-Learning & Choice

The Monty Hall three-door problem
=>Contestants choose one of three doors. One door hides a grand prize and the remaining two doors each hide a booby prize.
=>The contestant must guess which door hides the grand prize, but cannot open the door to see if they are correct.
=>Of the two doors not chosen, one is opened to reveal a booby prize.
=>The contestant is then given the chance to either stick with the original door or switch to the remaining door. This second choice poses the problem. Friedman (1998) found that contestants rarely switched and nearly always stuck with their original choice.
=>Marilyn vos Savant: the correct choice is always to switch.

>>The initial chance of winning the grand prize is 1/3.
>>However, with the strategy of always switching, the chance of getting a booby prize is reduced to 1/3, because switching loses only when the first pick was the grand prize.
>>In other words, the grand prize can be won with a 2/3 chance.
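The 1/3 and 2/3 figures can be checked by enumerating every case exactly. The sketch below is only an illustration: it assumes the host always opens a booby-prize door the contestant did not pick, and picks at random when two such doors are available. The function name win_probability is made up for this example.

```python
from fractions import Fraction

DOORS = (0, 1, 2)

def win_probability(switch):
    """Exact chance of winning the grand prize under a fixed strategy."""
    total = Fraction(0)
    for prize in DOORS:
        for pick in DOORS:
            # Assumption: the host only opens an unchosen booby-prize door.
            host_options = [d for d in DOORS if d != pick and d != prize]
            for opened in host_options:
                # Prize and pick are uniform (1/9); host picks uniformly among options.
                weight = Fraction(1, 9) / len(host_options)
                if switch:
                    final = next(d for d in DOORS if d not in (pick, opened))
                else:
                    final = pick
                if final == prize:
                    total += weight
    return total

print("stick :", win_probability(switch=False))
print("switch:", win_probability(switch=True))
```

Under these assumptions, sticking wins 1/3 of the time and switching wins 2/3 of the time.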








But, why do people stick?
--In laboratory studies (Granberg, 1999; Granberg, 1995), around 80-90% of participants acting as the contestant stick with their initial choice.
--Most people stick with their original choice because of the cognitive illusion the dilemma creates: participants believe that the odds of winning the grand prize are 50:50 whether they switch or stick.
--Participants reported that they would feel worse if they had switched and lost than if they had stayed with their original choice and lost. The Regret Theory?

--Gilovich (1995) asked participants to rate the value of the booby prize in a similar experiment. Those who switched assigned a higher monetary value to the booby prize than those who stayed with their initial choice.
--Thus, the subjective expected utility (EU) of making the wrong choice differs according to whether the error is desirable.
--This suggests that the framing of the problem can influence the choices people make.

The Russian Roulette Dilemma
=>The counterpart to the Monty Hall Dilemma
=>One door conceals a terminal loss and the other doors do not.
=>The optimal strategy is to stick with the original choice to avoid the loss.
=>In this version, participants also tend to choose the sub-optimal alternative, which here is switching, over the optimal alternative of sticking.
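Why sticking is optimal here can be shown with the same kind of enumeration. The notes do not spell out the exact setup, so this sketch assumes three doors, exactly one of which conceals the loss, and a host who always opens a safe door the contestant did not pick. The function name survival_probability is made up for this example.

```python
from fractions import Fraction

DOORS = (0, 1, 2)

def survival_probability(switch):
    """Exact chance of avoiding the loss under a fixed strategy."""
    total = Fraction(0)
    for loss in DOORS:
        for pick in DOORS:
            # Assumption: the host opens an unchosen door that does NOT hide the loss.
            host_options = [d for d in DOORS if d != pick and d != loss]
            for opened in host_options:
                weight = Fraction(1, 9) / len(host_options)
                if switch:
                    final = next(d for d in DOORS if d not in (pick, opened))
                else:
                    final = pick
                if final != loss:
                    total += weight
    return total

print("stick :", survival_probability(switch=False))
print("switch:", survival_probability(switch=True))
```

Under these assumptions the odds are the mirror image of the Monty Hall case: sticking avoids the loss 2/3 of the time, switching only 1/3 of the time.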

Learning to switch:
->People do possess the normative processes needed to learn to switch for the optimal outcome when they are allowed to play the game on successive occasions for real rewards.
->Friedman (1998) showed that the proportion of participants who switched increased from initially <10% to around 30% by the end of the session. Only 6 participants switched more than half of the time.
->The participants were divided and received one of four treatments.
     =>The incentives group received larger financial rewards and penalties.
     =>The track record group were required to record the outcome of each round and their strategies.
     =>The advice group received conflicting explanations about why switching or sticking was always best.
     =>The compare group were given statistics showing that 60% of switch choices won the grand prize compared to only 30% of the stick choices.
->Each group displayed a steady increase in the number of switch choices, rising from 40% to 53%. The trends suggest that, given sufficient rounds, the normative benchmark could be reached.

Choosing Anomalies:

-->If humans possess the cognitive architecture to be rational, then every choice anomaly (abnormality) can be greatly diminished or entirely eliminated in appropriately structured learning environments.
-->This could reduce cases in which people consistently fail to maximise EU, including probability matching and melioration.

-->Probability Matching
     --Even after a large amount of learning, one's asymptotic behaviour is not the optimal behaviour (it does not reach the optimum even in the limit).
     --Probability matching (PM) means predicting class membership in proportion to the class base rates.


>>If the left light turns on 70% of the time, people will predict "left" for the next light 70% of the time and "right" 30% of the time.
>>In that case, judges who predict by matching the probabilities of the events are correct with probability 0.7*0.7 + 0.3*0.3, so they have a 58% chance of guessing the next light correctly.
>>The optimal strategy would be to consistently choose the left light, which leads to a success rate of 70%, not the 50% one might usually expect.
>>Participants reported that the best strategy was to observe the frequencies of each light (calculating reinforcements) and then make matching predictions, which is what most of the participants did.
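The 58% versus 70% comparison follows from two short formulas, sketched here for a two-light task in which one light comes on with probability p. The function names are made up for this example, and it assumes each prediction is made independently of the outcome.

```python
def matching_accuracy(p):
    """Accuracy when predictions are made in proportion to the base rates."""
    # Correct when predicting 'left' and left lights up, or 'right' and right lights up.
    return p * p + (1 - p) * (1 - p)

def maximising_accuracy(p):
    """Accuracy when always predicting the more frequent light."""
    return max(p, 1 - p)

p_left = 0.7
print(f"matching:   {matching_accuracy(p_left):.2f}")
print(f"maximising: {maximising_accuracy(p_left):.2f}")
```

For any p other than 0, 0.5, or 1, matching is strictly worse than always predicting the more frequent light.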




     --Probability matching can be eliminated in an appropriately structured learning environment (one that provides payoffs and feedback on behaviour), and in this case human decision making is rational, i.e. optimisation under constraints.


-->Melioration
     --Derived from the Law of Effect as a theory of the Matching Law.
     --Decisions are based on the reinforcers received, with more time and/or effort invested in whichever alternative is currently better.
     --Any rise/fall in the reinforcement of a response causes the rate of occurrence of that response to change in the same direction.
     --One will keep switching to whichever alternative currently has the highest response ratio, regardless of the effect on the overall rate of reinforcement.
     --Herrnstein et al. examined factors that might determine whether human participants adopt a meliorating or maximising strategy, including the amount of reward, the length of delay until a fixed reward is received, the percentage of left/right side choices in a recent time window, etc.
     --People were better at maximising when the amount of a reward changed, and not as good when the delay until a reward changed. They were also better at maximising when the time window that affected the relative value of the choices was smaller.
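The gap between meliorating and maximising can be illustrated with a toy simulation. Everything specific here is an assumption made up for the example, not taken from Herrnstein's experiments: two alternatives A and B, a window of the last 10 choices, and payoffs in which A always pays 2 points more than B while both payoffs fall as A is chosen more often.

```python
from collections import deque

WINDOW = 10  # hypothetical: number of recent choices that set the payoffs

def payoff(choice, frac_a):
    """Hypothetical payoffs: A always beats B by 2, but both drop as A is chosen more."""
    base = 6.0 if choice == "A" else 4.0
    return base - 4.0 * frac_a

def meliorate(frac_a):
    """Pick whichever alternative pays more right now."""
    return "A" if payoff("A", frac_a) >= payoff("B", frac_a) else "B"

def maximise(frac_a):
    """Always pick B, which keeps both payoffs high under this payoff scheme."""
    return "B"

def mean_payoff(policy, trials=1000):
    """Average payoff per trial when following the given policy."""
    recent = deque(["B"] * WINDOW, maxlen=WINDOW)  # start from a history of B choices
    total = 0.0
    for _ in range(trials):
        frac_a = recent.count("A") / WINDOW  # share of A in the recent window
        choice = policy(frac_a)
        total += payoff(choice, frac_a)
        recent.append(choice)
    return total / trials

print("meliorating:", round(mean_payoff(meliorate), 2))
print("maximising :", round(mean_payoff(maximise), 2))
```

In this toy setup the meliorator always takes the locally better A and ends up earning close to 2 points per trial, while consistently choosing the locally worse B earns 4 points per trial.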

-->Probability matching and melioration can be mostly removed in appropriately structured learning experiments.
-->But impulsive behaviour is relatively difficult to abolish for ratio schedules.



