anatomyphysiologyblog - STANDARDS SETTING FOR MCQ, OSPE & OSCE - A QUALITY CONTROL MEASURE IN ASSESSMENT
This post describes different types of standard setting methods. Among them, the fixed percent method, Angoff's method (Angoffing) and the Hofstee method are described with their advantages and disadvantages. Commonly used standard setting methods for the objective structured clinical examination (OSCE) are also described.
Standards and setting them:
The term ‘standards’ can be used in a number of ways in relation to testing programs. Some of the types can be summarized as follows:
• Eligibility standards
- Qualifications/educational requirements/other criteria
• Test delivery standards
- Administration conditions, safety procedures, technical specifications etc.
• Content standards
- Outcomes/curricular objectives/specific instructional goals
- The test is prepared from these
• Performance standards
- Establishing cut scores on a test
What is standard setting?
To put it in the simplest way, standard setting refers to the process of determining a cut score for tests. Cizek (1993) has defined standard setting as “the proper following of a prescribed, rational system of rules or procedures resulting in the assignment of a number to differentiate between two or more states or degrees of performance”. This definition highlights a systematic, methodical approach involving experts’ judgments (subjectivity), which take into account the test’s purpose and content, the examinees and the educational setting while determining the cut score. Thus standard setting translates subjectivity, i.e. experts’ judgments, into objectivity, i.e. a numerical value in the form of a cut score.
Why do we need standard setting?
A cut score is a numerical value that determines whether an examinee meets the minimum standard set for a particular test, and thus serves as the basis for passing or failing an examinee. But the question is: how do we know if the cut scores for a given assessment are set appropriately? For the results of an assessment to be credible and widely acceptable, the cut score must be set appropriately. This makes standard setting an important step in the process of test development.
Types of standards
Relative standards and absolute standards
1) Relative standards
Relative standards are based on a comparison among the performances of examinees. The standards are expressed as the number or percentage of examinees. In these methods, the cut scores are set in such a way as to, for example, pass the 60 best performers or separate the top 40% from the bottom 60%. The method is appropriate for entrance/selection examinations where only a limited number of candidates can be accommodated.
2) Absolute standards
Absolute standards are based on how much the examinees know. The standards are expressed as the number or percentage of the test questions. In these methods, the cut scores are set in such a way that, in order to pass, examinees must produce, for example, 60 right answers out of 100 questions, i.e. 60% on the test. The method is appropriate for tests of competence such as final or exit examinations, and licensure and certification examinations.
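The difference between the two types of standard can be sketched in a few lines of Python. This is only an illustration: the marks are invented, and the function names `relative_cut_score` and `absolute_cut_score` are hypothetical helpers, not part of any published procedure.

```python
def relative_cut_score(scores, n_pass):
    """Relative standard: the cut score is the mark of the n_pass-th best examinee.

    Assumption: examinees tied at the cut score also pass.
    """
    ranked = sorted(scores, reverse=True)
    return ranked[n_pass - 1]


def absolute_cut_score(n_questions, percent_required):
    """Absolute standard: the cut score is a fixed share of the test questions."""
    return n_questions * percent_required / 100


# Hypothetical marks out of 100 for ten examinees.
scores = [82, 75, 68, 58, 55, 51, 47, 44, 40, 33]

rel_cut = relative_cut_score(scores, n_pass=4)  # pass the 4 best performers
abs_cut = absolute_cut_score(100, 60)           # 60 right answers out of 100

passed_relative = [s for s in scores if s >= rel_cut]
passed_absolute = [s for s in scores if s >= abs_cut]

print(rel_cut, passed_relative)  # 58 [82, 75, 68, 58]
print(abs_cut, passed_absolute)  # 60.0 [82, 75, 68]
```

With the relative standard the number of passes is fixed in advance and the cut score moves with the cohort. With the absolute standard the cut score is fixed and the number of passes moves.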
Methods for setting standards
Characteristics of methods of standard setting
A standard setting method should have the following characteristics to ensure the credibility of the results it produces:
It should be consistent with the purpose of the test
It should be based on expert judgements
It should consider the ability of examinees
It should consider the educational setting
It should be defensible
It should be credible
It should be supported by published research
It should be feasible (easy to implement, easy to make others understand)
It should be acceptable to all stakeholders
Classification
1) Relative methods
These are based on judgments about groups of test takers, e.g. the fixed percent method
2) Absolute methods
These are of two types:
• based on judgments about test items, e.g. Angoff’s method
• based on judgments about the performance of individual examinees, e.g. the contrasting groups method
3) Compromise methods
These are a compromise between relative and absolute standards, e.g. the Hofstee method
Fixed percent method
The process of this method can be outlined as follows:
• Each judge is asked what percentage of the examinees will pass the test
• The judges can discuss and are free to change their estimates
• The estimates are averaged to determine the cut score
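The steps above can be sketched as follows. The judges' estimates and the marks are invented, and `fixed_percent_cut_score` is a hypothetical helper. Turning the averaged percentage into a mark by taking the score of the last examinee who passes is an assumption of this sketch.

```python
def fixed_percent_cut_score(judge_pass_percents, scores):
    """Average the judges' pass-rate estimates and convert them into a cut score.

    judge_pass_percents: each judge's estimate of the percentage who should pass.
    scores: the marks actually obtained by the examinees.
    """
    mean_pass_percent = sum(judge_pass_percents) / len(judge_pass_percents)

    # Number of examinees allowed to pass, kept between 1 and the cohort size.
    n_pass = round(len(scores) * mean_pass_percent / 100)
    n_pass = max(1, min(len(scores), n_pass))

    # Assumption: the cut score is the mark of the last examinee who passes.
    ranked = sorted(scores, reverse=True)
    return mean_pass_percent, ranked[n_pass - 1]


# Three hypothetical judges and ten hypothetical examinees.
judge_estimates = [40, 35, 45]
scores = [82, 75, 68, 58, 55, 51, 47, 44, 40, 33]

mean_pass_percent, cut = fixed_percent_cut_score(judge_estimates, scores)
print(mean_pass_percent, cut)  # 40.0 58
```

Note that the cut score depends only on the rank order of the cohort, not on the content of the test.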
Advantages
• Easy to use
• Suitable for identifying a certain number of the best (or worst) candidates
Disadvantages
• Independent of test content
• Independent of how much an examinee knows
• Less reliable, which in turn affects the validity of the test
Angoff’s method
The process of this method can be summarized as follows:
• The borderline students are defined
• The difficulty and importance of the test item are explained
• Each judge estimates the proportion of the borderline group that would answer the item correctly
• Judges discuss and can change their ratings
• The process is repeated for each item of the test
• The judges’ estimates are averaged
• The averages are summed up to determine the cut score
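The arithmetic in the last two steps can be sketched as below. The ratings are invented, and `angoff_cut_score` is a hypothetical helper name. Each rating is one judge's estimate of the proportion of borderline students who would answer that item correctly.

```python
def angoff_cut_score(ratings):
    """Compute an Angoff cut score.

    ratings[j][i] is judge j's estimate (0 to 1) of the proportion of
    borderline students who would answer item i correctly.
    """
    n_judges = len(ratings)
    n_items = len(ratings[0])

    # Average the judges' estimates for each item.
    item_means = [
        sum(judge[i] for judge in ratings) / n_judges for i in range(n_items)
    ]

    # Sum the item averages to obtain the cut score in marks.
    return item_means, sum(item_means)


# Four hypothetical judges rating a four-item test.
ratings = [
    [0.50, 0.75, 0.25, 1.00],
    [0.50, 0.25, 0.75, 0.50],
    [0.75, 0.50, 0.25, 0.75],
    [0.25, 0.50, 0.75, 0.75],
]

item_means, cut_score = angoff_cut_score(ratings)
print(item_means)  # [0.5, 0.5, 0.5, 0.75]
print(cut_score)   # 2.25 (out of 4 items)
```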
Advantages
• It focuses attention on item content, thus ensuring the validity of the item
• It is relatively easy to use
• There is a considerable body of published work to support its use
• It is best suited to tests that seek to establish competence
Disadvantages
• It is difficult to define the concept of a "borderline student"
• Judges may feel as if they are pulling numbers out of the air
• The method can be tedious and time consuming, particularly for a long test
Hofstee method
The method can be summarized as follows:
• The purpose of the test is explained
• The nature of the examinees is discussed
• What constitutes adequate/inadequate knowledge is discussed
• Each judge estimates the following
- the minimum acceptable cut score
- the maximum acceptable cut score
- the minimum acceptable fail rate
- the maximum acceptable fail rate
Note: The first two estimates represent absolute standards and the last two represent relative standards.
A final cut score is determined after the test is given, by plotting the scores in a graph.
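The graph step is not spelled out above, so the sketch below assumes one common reading of it. The judges' four estimates are averaged. A straight line is drawn from the point (minimum cut score, maximum fail rate) to the point (maximum cut score, minimum fail rate). The cut score is taken where the observed fail-rate curve comes closest to that line. All data are invented and `hofstee_cut_score` is a hypothetical helper name.

```python
import math


def hofstee_cut_score(scores, judge_estimates):
    """Pick a cut score by intersecting the fail-rate curve with the Hofstee line.

    judge_estimates: one (min_cut, max_cut, min_fail, max_fail) tuple per judge,
    with fail rates given in percent. Assumes max_cut > min_cut after averaging.
    """
    n_judges = len(judge_estimates)
    # Assumption: the judges' four estimates are simply averaged.
    min_cut, max_cut, min_fail, max_fail = (
        sum(est[k] for est in judge_estimates) / n_judges for k in range(4)
    )

    # Line running from (min_cut, max_fail) down to (max_cut, min_fail).
    slope = (max_fail - min_fail) / (max_cut - min_cut)

    best_cut, best_gap = None, None
    for cut in range(math.ceil(min_cut), math.floor(max_cut) + 1):
        # Observed fail rate if this mark were used as the cut score.
        observed_fail = 100 * sum(s < cut for s in scores) / len(scores)
        line_fail = max_fail - (cut - min_cut) * slope
        gap = abs(observed_fail - line_fail)
        if best_gap is None or gap < best_gap:
            best_cut, best_gap = cut, gap
    return best_cut


# Twenty hypothetical examinees and three hypothetical judges.
scores = [35, 42, 45, 48, 50, 52, 54, 55, 57, 58,
          60, 62, 63, 65, 68, 70, 72, 75, 80, 88]
judges = [(48, 58, 10, 35), (50, 60, 10, 40), (52, 62, 10, 45)]

print(hofstee_cut_score(scores, judges))  # 53
```

Only whole-mark cut scores inside the judges' range are searched here. If the fail-rate curve never enters the rectangle defined by the four estimates, the closest point may be a poor compromise.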
Advantages
• It is easy to implement
• Judges are comfortable with the method of making estimates
Disadvantages
• The cut score may not fall in the area defined by the judges’ estimates
• It is not the first choice in a high stakes testing situation
Commonly used standard setting methods for objective structured clinical examinations (OSCEs):
Angoff’s method
For each item in the checklist, the judges estimate the proportion of borderline students who would perform the item correctly. Alternatively, the method can be modified so that the judgment is made at the level of the OSCE station rather than the individual item on the checklist. The estimates are then averaged and summed up to determine the cut score for each OSCE station.
Borderline group method
During an OSCE examination, the examiners assess the performance of a student against each item in the checklist and also assign a global rating (pass/fail/borderline) based on the overall performance of the student at the station.
The scores obtained by the “borderline” performers serve as the basis for determining the cut score.
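A minimal sketch of this calculation for a single station is shown below. The text does not say how the borderline scores are combined, so taking their mean is an assumption of this sketch. The data are invented and `borderline_group_cut_score` is a hypothetical helper name.

```python
from statistics import mean


def borderline_group_cut_score(station_results):
    """Cut score for one OSCE station from the borderline performers' scores.

    station_results: (checklist_score, global_rating) pairs, where the global
    rating is "pass", "fail" or "borderline".
    Assumption: the cut score is the mean checklist score of the borderline group.
    """
    borderline_scores = [
        score for score, rating in station_results if rating == "borderline"
    ]
    if not borderline_scores:
        raise ValueError("no student was rated borderline at this station")
    return mean(borderline_scores)


# Hypothetical checklist scores (out of 20) and global ratings at one station.
station_results = [
    (18, "pass"),
    (11, "borderline"),
    (7, "fail"),
    (13, "borderline"),
    (16, "pass"),
    (12, "borderline"),
]

station_cut = borderline_group_cut_score(station_results)
print(station_cut)  # 12
```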
Guidelines for setting standards
• Assign an appropriate number of judges (at least 6-8 for high stakes testing)
• Select the characteristics the group should have, e.g. mixed professions
• All judges should attend throughout the session
• The characteristics of the examinees should be explained
• Judges should be familiar with the test items and format
• Reliability should be checked
• The method should produce reasonable results
• The results should be acceptable to stakeholders
• Pass rates should be compared against contemporaneous markers of competence
Method of choice?
There is no perfect standard setting method. The choice may depend on various factors determined by the particular circumstances. Besides, regardless of the standard setting method used, a test should cover the appropriate content and be at the appropriate level of difficulty to determine competency.
References
Bejar I. Standard Setting: What Is It? Why Is It Important? Educational Testing Service 2008
Cizek GJ. Reconsidering standards and criteria. Journal of Educational Measurement 1993; 30(2):93-106.
Kaufman DM, Mann KV, Muijtjens AMM, van der Vleuten CPM. A comparison of standard-setting procedures for an OSCE in undergraduate medical education. Acad Med 2000; 75:267-271.
Kramer A, Muijtjens A, Jansen K, Düsman H, Tan L, van der Vleuten C. Comparison of a rational and an empirical standard setting procedure for an OSCE. Medical Education 2003; 37(2):132.
Norcini JJ. Setting standards on educational tests. Medical Education 2003; 37:464-469.

