- Individual differences-psychological traits/characteristics
- Ability-Intelligence tests
- Attitudes-Measure particular beliefs toward something
**Ethics/morals: Because these tests are now so widely used, it is important to establish that they are reliable and valid, i.e. that they measure what they claim to measure. Tests should also be checked for bias, so that everyone has an equal opportunity to understand them.
Creating your questionnaire
1. Question formats
- Open format questions
- asks for some written detail, but has no predetermined set of responses
- Advantages: leads to more qualitative data
- Disadvantages: time consuming to analyse
- Closed format questions
- short questions or statements followed by a number of options.
2. Theoretical literature
- Theoretical literature: ideas that appear in the theoretical literature should be used as a basis
- Experts: recruit experts in the area to suggest items
- Colleagues: brainstorming to generate more items
3. Clarity of questions
- questions must be clear, short and unambiguous
- the psychometric test question must not mean different things to different respondents
4. Avoiding leading questions
- the question should not lead the respondent in a particular direction, for example by phrasing that excuses the behaviour being asked about
5. Reverse wording
- reverse-worded items encourage participants to read each question, rather than fall into a pattern of answering every question the same way
- mixing positively and negatively worded items makes respondents really pay attention to each statement
6. Response formats
- dichotomous scales: yes/no true/false
- frequency: always/sometimes/never scale
- attitude scales: strongly agree-strongly disagree
- numerical scales: rate to what extent the statement describes you
7. Instructions
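Reverse-worded items (step 5) have to be re-coded before the questionnaire is scored. A minimal sketch, assuming a hypothetical 5-point scale in which items 2 and 4 are the reverse-worded ones:

```python
# Reverse-scoring sketch. Assumptions: a 5-point scale (1-5) and a made-up
# questionnaire where items 2 and 4 (0-based indices 1 and 3) are reverse-worded.
REVERSED = {1, 3}   # 0-based indices of reverse-worded items (assumption)
SCALE_MAX = 5       # top of the response scale

def score(responses):
    """Return the total score after re-coding reverse-worded items."""
    total = 0
    for i, r in enumerate(responses):
        if i in REVERSED:
            r = SCALE_MAX + 1 - r   # 1<->5, 2<->4, 3 stays 3
        total += r
    return total

print(score([5, 1, 4, 2, 3]))  # → 21 (the 1 and 2 re-code to 5 and 4)
```

After re-coding, a consistently "high" respondent gets a high score on every item, so item responses can be summed and the internal-reliability checks below behave as intended.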
Classical theory of error in measurement
Observed score = True score + Error
- Any score on a test for an individual on any occasion differs from his true score on account of random error.
- If we were to test an individual on many occasions, a distribution of scores would be obtained around his true score. The mean of this distribution, which is assumed to be normal, approximates the true score.
- The true score is the basis of the standard error of measurement. Thus, if we find a large variance of obtained scores for an individual, there is clearly considerable error of measurement. Since test-retest reliability is the correlation between the obtained scores on two occasions, the higher the test-retest reliability, the smaller the standard error of measurement, according to this model.
- The classical theory of error assumes that any test consists of a random sample of items from the (hypothetical) universe of items relevant to the trait.
- The most important point is that any measurement is likely to involve some error; to make a good questionnaire we want to minimise that error, and we do so by maximising reliability and validity.
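The link between reliability and the standard error of measurement can be sketched numerically: under the classical model, SEM = SD × √(1 − r), so higher reliability shrinks the expected spread of observed scores around the true score. The SD of 15 and reliability of 0.91 below are illustrative assumptions, not figures from the text:

```python
import math

def standard_error_of_measurement(sd, reliability):
    """SEM = SD * sqrt(1 - r): the expected spread of an individual's
    observed scores around their true score (classical test theory)."""
    return sd * math.sqrt(1 - reliability)

# Illustrative numbers (assumptions): an IQ-style scale with SD = 15.
print(round(standard_error_of_measurement(15, 0.91), 2))  # → 4.5
print(round(standard_error_of_measurement(15, 0.99), 2))  # → 1.5
```

Note how raising reliability from 0.91 to 0.99 cuts the measurement error by two thirds, which is the sense in which maximising reliability minimises error.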
Reliability-we are interested in consistency
1. Internal-To what extent do the individual items that make up a test or inventory consistently measure the same underlying characteristic?
- A questionnaire with high internal reliability is one in which all of the questions hang together and measure the same thing.
- The higher the resulting coefficient, the more consistent the items are with one another, and the better the internal reliability.
- Split-half reliability: split the items into two halves; if the test is reliable, there should be a high correlation between scores on the two halves of the test.
- Parallel forms: create a large pool of items that measure the same thing, then administer the two forms to the same group of participants, with an interval between administrations and counterbalancing of form order.
- Cronbach's Alpha: mathematically equivalent to the average of all possible split-half estimates, values up to +1.00, usually a figure of +0.70 or greater indicates acceptable internal reliability
- Kuder-Richardson Formula 20 (KR-20): measures internal reliability for measures with dichotomous choices (yes/no), values up to +1.00; usually a figure of +0.70 or greater indicates acceptable internal reliability.
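Cronbach's alpha can be sketched by hand from the item variances and the variance of the total scores; the response data below are invented purely for illustration:

```python
def cronbach_alpha(items):
    """items: one inner list of responses per item (same respondents in order).
    alpha = k/(k-1) * (1 - sum of item variances / variance of total scores)."""
    k = len(items)                      # number of items
    n = len(items[0])                   # number of respondents

    def var(xs):                        # population variance
        m = sum(xs) / len(xs)
        return sum((x - m) ** 2 for x in xs) / len(xs)

    # Each respondent's total score across all items.
    totals = [sum(item[p] for item in items) for p in range(n)]
    return k / (k - 1) * (1 - sum(var(i) for i in items) / var(totals))

# Made-up responses: two items, four respondents, answering fairly consistently.
data = [[1, 2, 3, 4],
        [2, 2, 3, 5]]
print(round(cronbach_alpha(data), 3))  # → 0.952, above the +0.70 threshold
```

When every item gives identical scores the formula returns exactly 1.0, the ceiling of the coefficient; KR-20 is the same calculation applied to 0/1 dichotomous items.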
2. External-To what degree does a person's measured performance remain consistent across repeated testings?
- test-retest reliability (stability over time): administer the same survey to the same respondents at different points in time. The closer the results, the greater the test-retest reliability of the survey. The correlation coefficient between the two sets of responses is often used as the measure of test-retest reliability.
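Test-retest reliability is simply the Pearson correlation between the two sets of scores. A minimal sketch, using hypothetical scores from five respondents tested on two occasions:

```python
import math

def pearson_r(x, y):
    """Pearson correlation between two administrations of the same test."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

# Hypothetical scores from the same five respondents, two weeks apart.
time1 = [10, 12, 15, 18, 20]
time2 = [11, 12, 14, 19, 21]
print(round(pearson_r(time1, time2), 3))  # close to +1, i.e. stable over time
```

Values near +1 indicate that respondents keep their relative standing across the two occasions, which is what "stability over time" means here.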
3. Inter-rater reliability (or agreement)
- determines the extent to which two or more raters obtain the same result when coding the same response
- Cohen's Kappa: values up to +1.00, larger numbers indicate better reliability, used when there are two raters
- Fleiss' Kappa: an adaptation which works for any fixed number of raters
- Measures agreement, not accuracy
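Cohen's kappa can be sketched straight from its definition, kappa = (p_o − p_e) / (1 − p_e), where p_o is the observed agreement and p_e the agreement expected by chance from each rater's marginal totals. The two raters' codings below are invented for illustration:

```python
from collections import Counter

def cohens_kappa(rater1, rater2):
    """Cohen's kappa for two raters coding the same responses:
    chance-corrected agreement, (p_o - p_e) / (1 - p_e)."""
    n = len(rater1)
    p_o = sum(a == b for a, b in zip(rater1, rater2)) / n  # observed agreement
    c1, c2 = Counter(rater1), Counter(rater2)
    p_e = sum(c1[k] * c2[k] for k in c1) / n ** 2          # chance agreement
    return (p_o - p_e) / (1 - p_e)

# Hypothetical codings of ten open-format responses into categories A and B.
r1 = ["A", "A", "B", "B", "A", "B", "A", "A", "B", "B"]
r2 = ["A", "A", "B", "B", "A", "A", "A", "A", "B", "B"]
print(round(cohens_kappa(r1, r2), 2))  # → 0.8
```

Note that the raters agree on 9 of 10 responses (raw agreement 0.9), yet kappa is only 0.8, because part of that agreement is what two raters guessing from their marginal frequencies would achieve by chance. This is also why kappa measures agreement, not accuracy: both raters could agree perfectly and still both be wrong.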
4. Intra-rater reliability (or agreement)
- The same assessment is completed by the same rater on two or more occasions. These different ratings are then compared, generally by means of correlation.
- Since the same individual completes both assessments, there is a risk that the rater's subsequent ratings are contaminated by memory of the earlier ratings.
**Sources of unreliability: guessing, ambiguous items, test length, instructions, temperature, illness, item order effects, response rate, social desirability
Validity-if it measures what it claims to measure
1. Faith
- simply a belief in the validity of an instrument, without any objective data to back it up; no evidence is sought.
2. Face
- If a test has face validity, it looks like a test that measures the concept it was designed to measure. The more a test appears to measure what it claims to measure, the higher its face validity.
- Face validity bears no relation to true validity; it matters only in so far as adults generally will not co-operate on tests that lack face validity, regarding them as silly or insulting. Face validity, then, simply aids the cooperation of subjects.
3. Content
- The extent to which a measure represents all facets of the phenomena being measured
4. Construct
- Seeks to establish a clear relationship between the construct at a theoretical level and the measure that has been developed.
- Convergent validity: the measure shows associations with measures that it should be related to.
- Discriminant validity: the measure is not related to things that it should not be related to.
5. Predictive
- Assesses whether a measure can accurately predict future behaviour.