Personal Challenge Or Easy Way Out?

Can student-designed examinations truly reflect a learner’s level of language acquisition? Bill Hellriegel believes they provide a useful assessment tool.

In an intermediate ESL grammar class taught during Winter Quarter 2000 at UCSB Extension, International Programs, students were required to create their own items for an in-class test. The results were both surprising and satisfying: the test created was neither too hard nor too easy, and the process enabled students to study target structures in greater depth and in a comparative context. Regardless of what we may learn from more rigorous analyses of processes like this one, having students design their own test items provides them with a worthwhile learning experience.

Procedure for creating test items
First, students are told why creating their own test items is useful for them: it provides further practice with, and analysis of, target structures, and, because effective test items depend on it, it develops a greater understanding of usage differences between similar tenses.

The students are then grouped in threes with at least two nationalities in each group so native language use is minimized. Each group chooses which structure it will focus on and develops a set number of items involving that and related structures. Each group also chooses its item format. This choice is critical: it enables students to work with target structures in various contexts and formats and thereby to develop greater linguistic and cognitive flexibility in understanding and using the structures.

Students are supervised as they develop items and are encouraged to review usage for the structures they are contrasting, to ensure both that their designated correct choices are correct and that their distractors are at least somewhat believable. The instructor’s chief functions here are to direct students to sources of information regarding usage and to make sure that test items are, normatively speaking, of moderate difficulty. Actual test difficulty is assessed after administration by looking at overall scores and response patterns.

Once items are drafted, groups exchange their sets and correct each other’s items. For this step, too, the instructor should provide only guidance concerning sources of usage information so that students are required to review meaning and usage patterns, this time for structures worked with by other groups.

Finally, the instructor collects all item sets and types up the complete test. The instructor should change some item content (names, subjects and objects, and, if meaning is not altered significantly, verbs) so that students are less likely to recognize items they themselves created; however, item essentials (structures involved, basic utterance structure) should remain the same.

Test results
Ten students cooperated to create the test, and eight took it the following day. The average score was 72.9/100, roughly what could be expected for a test of medium difficulty taken by a group comprising a range of mastery levels.
The scores clustered as follows:
90-100: 0
80-89: 4
70-79: 1
60-69: 2
0-59: 1
From a total of 24 multiple-choice items,
3 items were missed by 0 students*
8 items by 1 student
2 items by 2 students
4 items by 3 students
3 items by 4 students
2 items by 5 students
1 item by 6 students
0 items by 7, 8, or 9 students, and
1 item by 10 students
*The prompt for one of these items was written incorrectly, so credit was given to all students for the item.
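
The item tally above can be summed with a short Python sketch. The counts are copied from the list; the three bands (0-1, 2-5, and 6 or more students missing an item) and the names used for them are illustrative assumptions that mirror the groupings in the discussion, not part of the original analysis.

```python
# Item tally as reported above: {number of students who missed: number of items}
missed_counts = {0: 3, 1: 8, 2: 2, 3: 4, 4: 3, 5: 2, 6: 1, 7: 0, 8: 0, 9: 0, 10: 1}


def items_in_band(counts, low, high):
    """Count the items missed by between low and high students, inclusive."""
    return sum(n for missed, n in counts.items() if low <= missed <= high)


total_items = sum(missed_counts.values())

# Hypothetical bands chosen only to match the wording of the discussion.
missed_by_few = items_in_band(missed_counts, 0, 1)
missed_by_some = items_in_band(missed_counts, 2, 5)
missed_by_many = items_in_band(missed_counts, 6, 10)

print(f"Total items: {total_items}")
print(f"Missed by 0-1 students: {missed_by_few}")
print(f"Missed by 2-5 students: {missed_by_some}")
print(f"Missed by 6 or more students: {missed_by_many}")
```

Grouping the tally this way makes it easy to compare how many items fell into the easiest, middle, and hardest bands.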

Discussion
That few items were missed by all or almost all of the students, that several were missed by none or by only one student, and that almost half were missed by 2-5 students indicate that the test was neither very easy nor very difficult, but only somewhat easy. Interestingly, there was no pattern of students doing better than their classmates on items they had created, although informal comments indicated that they had recognized these items. It seems that a student who does not understand the usage patterns of the target structures in an item will not necessarily answer that item correctly, even if he or she created it. Of course, this hypothesis must be tested under more controlled conditions, but it suggests that the beneficial influence of student access to test items before test administration is smaller than we might expect. It also suggests that the chief benefit of students creating their own test items is a greater understanding of target structures, which makes the activity very worthwhile.

A final note: An advanced group of students might attempt to develop, for each item, one distractor to be chosen by all but the best students, one to be chosen by only some students, and a third to be chosen only by the poorest students. It is unclear how the students might accomplish this; however, it seems probable that their attempt could provide them with deeper understanding of usage patterns as well as greater metacognitive awareness of their own learning processes and those of their classmates. This would be true regardless of how accurate the students were in assessing the difficulty of the distractors they designed.


Bill Hellriegel, ESL Instructor, University of California, Santa Barbara Extension, International Programs.
