The essential guide to item analysis

Want to know what your exam data is really telling you, and which questions are performing well or badly? A lot of time and effort goes into filling your question bank, and understanding how your questions perform is just as important as creating them. Here are two questions to ask that will let you analyse the performance of your items without needing a PhD in psychometrics.

Is it an easy or difficult question?

How difficult or easy an item is depends on the overall performance of the candidates who answered it. You can quickly find out how difficult an item is by looking at its average mark. The average mark goes by many names, and you might hear people use alternatives such as the p-value, difficulty index or easiness index. We’re not going to bog you down with jargon; these are essentially the same thing.

Simply put, the higher the average mark, the easier the item. When the average mark is displayed as a percentage, 0% means that no one got the question right, so it is an extremely difficult question, while 100% means that everyone answered correctly (an easy question).

So how easy or difficult should a question be? There are no strict rules, but generally you want an assessment whose items have a spread of average marks between 20% and 80%. Any item with an average mark below 20% or above 80% should be looked at more closely and, if necessary, revised.
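As a minimal sketch of the average-mark check described above, the Python below works out each item's average mark as a percentage and flags anything outside the 20% to 80% range. The function names, the `max_mark` parameter and the response data are hypothetical, made up purely for illustration; they are not taken from any particular exam system.

```python
def average_mark(item_scores, max_mark=1):
    """Average mark for one item, as a percentage of the marks available.

    item_scores: the mark each candidate achieved on this item.
    max_mark: marks available for the item (assumed 1 for right/wrong items).
    """
    if not item_scores:
        raise ValueError("item_scores must not be empty")
    return 100 * sum(item_scores) / (len(item_scores) * max_mark)


def needs_review(avg_pct, low=20, high=80):
    """True when the average mark falls below `low` or above `high`."""
    return avg_pct < low or avg_pct > high


# Hypothetical responses: 1 = correct, 0 = incorrect, ten candidates per item.
responses = {
    "Q1": [1, 1, 1, 1, 1, 1, 1, 1, 1, 0],  # 9 of 10 correct
    "Q2": [1, 0, 1, 1, 0, 1, 0, 1, 0, 1],  # 6 of 10 correct
    "Q3": [0, 0, 0, 1, 0, 0, 0, 0, 0, 0],  # 1 of 10 correct
}

for item, scores in responses.items():
    pct = average_mark(scores)
    note = "look more closely" if needs_review(pct) else "within range"
    print(f"{item}: {pct:.0f}% ({note})")
```

With this data Q1 and Q3 are flagged and Q2 is not. A flag is only a prompt to look at the item, not a verdict on it.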

Just because a question is relatively easy doesn’t mean that it should be removed from an exam. The question might cover essential information that every candidate should know. However, you wouldn’t want too many easy questions.

Once you have an idea of how difficult an item is, the next question to ask yourself is:

Does the question distinguish between top and bottom performers?

The difficulty level of an item should be considered alongside what is called the discrimination index.

What is the discrimination index? It tells you how well an item differentiates between students who did well in the test overall (top performers) and those who did not (bottom performers). Ideally you want an item that the top performers answer correctly more often than the bottom performers, so that the item distinguishes between the two groups. The discrimination index is given on a scale from -1 to 1. As a general rule, the more positive the discrimination index, the better the quality of the question.
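The article does not say how the discrimination index is calculated, and exam systems differ. One common approach, sketched here as an assumption rather than as the method behind any particular tool, is the upper-lower method: rank candidates by total score, take a top group and a bottom group of equal size, and subtract the proportion of the bottom group who answered the item correctly from the proportion of the top group who did. The 27% group size is a widely used convention, not a rule, and the function name and data are hypothetical.

```python
def discrimination_index(item_scores, total_scores, fraction=0.27):
    """Upper-lower discrimination index for a right/wrong (0 or 1) item.

    item_scores: 0 or 1 per candidate for this item.
    total_scores: each candidate's overall test score, in the same order.
    fraction: share of candidates in each group (assumed 27% here).
    """
    if len(item_scores) != len(total_scores):
        raise ValueError("item_scores and total_scores must be the same length")
    n = len(total_scores)
    if n < 2:
        raise ValueError("need at least two candidates")
    group_size = max(1, round(n * fraction))

    # Rank candidate positions by overall score, highest first.
    order = sorted(range(n), key=lambda i: total_scores[i], reverse=True)
    top, bottom = order[:group_size], order[-group_size:]

    p_top = sum(item_scores[i] for i in top) / group_size
    p_bottom = sum(item_scores[i] for i in bottom) / group_size
    return p_top - p_bottom  # ranges from -1 to 1


# Hypothetical data for ten candidates.
totals = [95, 90, 85, 70, 65, 60, 55, 40, 35, 30]
item = [1, 1, 1, 1, 0, 1, 0, 0, 1, 0]
print(round(discrimination_index(item, totals), 2))
```

A value of 1 means everyone in the top group and no one in the bottom group answered correctly; a negative value means the bottom group outperformed the top group on that item.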

Below are some guidelines for interpreting what a given discrimination index value means:

The final trick

Now that you understand how to tell how difficult a question is and whether it can distinguish between top and bottom performers, the final trick is to consider the two together. We will look at a few examples.

An item with an average mark of 25% (a difficult question) and a negative discrimination index (bottom group outperforming the top group) could indicate that the topic that the question covers has not been taught or that the way the question has been written makes it ambiguous. This question needs to be rewritten before it can be used in another assessment.

An item of moderate difficulty (average mark of 55%) with a very good discrimination index (0.6) is a good question, as it distinguishes your top performers from your bottom performers. It could also indicate weaknesses in your bottom group’s knowledge. This item could be used in another exam without revision.

A discrimination index of 0 means that the question does not discriminate at all between bottom and top performers. However, if the question is easy (average mark of 85%), this is not unexpected: with an easy question, both the top and bottom groups will tend to answer correctly. If the question was intended to test basic knowledge, then it is fine. If it was meant to test more complex skills, such as analysis and application of knowledge, then it needs reviewing.
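The three examples above can be sketched as a simple rule of thumb that reads the two statistics together. This is only an illustration: the function name, the wording of the notes and the `good_discrimination` cut-off of 0.3 are all hypothetical placeholders, so substitute whatever guidelines your own institution uses.

```python
def review_note(avg_pct, discrimination, good_discrimination=0.3):
    """Suggest a next step from an item's average mark and discrimination index.

    good_discrimination is a placeholder cut-off, not a value from the article.
    """
    if discrimination < 0:
        # Bottom group outperformed the top group.
        return "rewrite before reuse"
    if avg_pct < 20 or avg_pct > 80:
        # Very easy or very hard items may legitimately discriminate poorly.
        return "check against intended purpose"
    if discrimination >= good_discrimination:
        return "reuse without revising"
    return "review wording and content"


# The three examples from the text; -0.2 stands in for an unspecified
# negative discrimination index in the first one.
for avg_pct, disc in [(25, -0.2), (55, 0.6), (85, 0.0)]:
    print(f"{avg_pct}% / {disc}: {review_note(avg_pct, disc)}")
```

The order of the checks matters: a negative discrimination index is treated as a problem whatever the difficulty, whereas a low but non-negative value is only judged after the item's difficulty has been taken into account.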

Asking the right questions

Asking these two simple questions will start you off on the right foot in understanding the performance of your questions. You will know which questions you can leave alone and which might need to be reviewed before they are reused in another exam. Statistics produced after an exam should never be looked at in isolation. Item statistics should be viewed alongside the context of the question to make sure that the question is doing the job it was written for.

Taking the time to analyse the performance of items will not only help improve the quality of future exams but also enhance your skills in writing questions and in identifying areas of a course where the content may need to be adapted to take account of students’ misconceptions.