Standard Setting Simplified – Ebel

What is Ebel?

Ebel is a standard setting method which requires exam reviewers to categorise each item in an exam according to its relevance and difficulty. Each categorisation is then looked up in a matrix to determine the probability of a borderline candidate getting that question right.

Once all questions have been standard set, Ebel (like Angoff) determines a cut-off mark for the exam based on the performance of candidates in relation to a defined standard (absolute), rather than how they perform in relation to their peers (relative). Reviewers make a judgement on individual exam items (test-centred) as opposed to exam candidates (examinee-centred). It is often used in high stakes exams.

How is Ebel Calculated?

Initially, an experienced exam manager will set the matrix determining what proportion of ‘just passing’ students would be expected to get questions right according to how difficult and how relevant the questions are. Different institutions will label the matrix headings slightly differently, but typically the matrix will be 3 by 3 with difficulty along the top and relevance down the side. In an example matrix of this kind, a borderline student would be expected to get an ‘easy’, ‘essential’ question right 70% of the time, while getting a ‘hard’, ‘supplementary’ question right only 20% of the time.
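As a rough illustration, a matrix like this could be held as a simple lookup table. The sketch below is only an example: the 70% and 20% cells come from the description above, while every other value and the name borderline_probability are assumptions made up for illustration, not recommended figures.

```python
# Hypothetical 3 by 3 Ebel matrix: relevance down the side, difficulty along the top.
# Each cell is the assumed probability that a borderline candidate answers correctly.
# Only essential/easy (0.70) and supplementary/hard (0.20) come from the text;
# all other cells are illustrative assumptions.
EBEL_MATRIX = {
    "essential":     {"easy": 0.70, "moderate": 0.60, "hard": 0.50},
    "important":     {"easy": 0.60, "moderate": 0.50, "hard": 0.35},
    "supplementary": {"easy": 0.50, "moderate": 0.35, "hard": 0.20},
}


def borderline_probability(relevance: str, difficulty: str) -> float:
    """Look up the matrix cell for one reviewer's categorisation of one item."""
    return EBEL_MATRIX[relevance][difficulty]


print(borderline_probability("essential", "easy"))      # 0.7
print(borderline_probability("supplementary", "hard"))  # 0.2
```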

Individual exam reviewers are then asked to categorise each question item according to how difficult it is (i.e. easy, moderate, hard) and how relevant (i.e. essential, important, supplementary). An average of the exam reviewers’ responses is usually then taken to give a final judgement for each item. Once the likelihood of a borderline candidate getting each individual question item right has been determined, then a cut-off mark for the whole exam can be calculated. In the case of Maxexam, all the calculations are done by the system.
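The calculation described above can be sketched in a few lines. This is a minimal example under stated assumptions, not how Maxexam or any particular system does it: it assumes one mark per question, averages the matrix probabilities across reviewers for each item, and sums them to get the cut-off mark. The matrix values (apart from 0.70 and 0.20) and the sample judgements are invented for illustration.

```python
from statistics import mean

# Hypothetical matrix; only the 0.70 and 0.20 cells come from the article.
MATRIX = {
    "essential":     {"easy": 0.70, "moderate": 0.60, "hard": 0.50},
    "important":     {"easy": 0.60, "moderate": 0.50, "hard": 0.35},
    "supplementary": {"easy": 0.50, "moderate": 0.35, "hard": 0.20},
}


def item_probability(judgements):
    """Average the matrix probability over all reviewers for one item.

    judgements: list of (relevance, difficulty) tuples, one per reviewer.
    """
    return mean(MATRIX[relevance][difficulty] for relevance, difficulty in judgements)


def cut_score(exam):
    """Sum the per-item probabilities (assumes one mark per question)."""
    return sum(item_probability(judgements) for judgements in exam.values())


# Invented judgements from two reviewers on a three-question exam.
exam = {
    "Q1": [("essential", "easy"), ("essential", "easy")],
    "Q2": [("important", "moderate"), ("essential", "moderate")],
    "Q3": [("supplementary", "hard"), ("supplementary", "hard")],
}

cut = cut_score(exam)
print(f"Cut-off: {cut:.2f} of {len(exam)} marks ({cut / len(exam):.0%})")
# Cut-off: 1.45 of 3 marks (48%)
```

Averaging probabilities is only one way to combine reviewers; an institution might instead agree a single category per item first and then look that up.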

What if the experts disagree?

If reviewers’ opinions differ significantly, they may hold a meeting to discuss the questions on which they disagree.

At this meeting the reasons for choosing each difficulty and relevance category will be discussed and a decision made – often taking into account the experience of the exam reviewers for that subject, e.g. whether they have taught the subject being tested or know the curriculum better.
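One way to decide which questions go to such a meeting is to flag items where the reviewers' implied probabilities are far apart. The sketch below is an assumption about how that might be done: the function name flag_for_discussion, the 0.20 threshold, the matrix values (other than 0.70 and 0.20) and the sample data are all hypothetical.

```python
# Hypothetical matrix; only the 0.70 and 0.20 cells come from the article.
MATRIX = {
    "essential":     {"easy": 0.70, "moderate": 0.60, "hard": 0.50},
    "important":     {"easy": 0.60, "moderate": 0.50, "hard": 0.35},
    "supplementary": {"easy": 0.50, "moderate": 0.35, "hard": 0.20},
}


def flag_for_discussion(exam, max_spread=0.20):
    """Return item ids whose reviewer probabilities differ by more than max_spread.

    exam: dict mapping item id to a list of (relevance, difficulty) tuples.
    max_spread: assumed threshold; an institution would choose its own.
    """
    flagged = []
    for item, judgements in exam.items():
        probs = [MATRIX[relevance][difficulty] for relevance, difficulty in judgements]
        if max(probs) - min(probs) > max_spread:
            flagged.append(item)
    return flagged


# Invented judgements: reviewers roughly agree on Q1 but not on Q2.
judgements = {
    "Q1": [("essential", "easy"), ("important", "easy")],
    "Q2": [("essential", "easy"), ("supplementary", "hard")],
}

print(flag_for_discussion(judgements))  # ['Q2']
```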

Backing up your Ebel

In order to get the most accurate cut-off mark for an exam, it is preferable to have as many exam reviewers as possible. In addition to the judges’ labelling of difficulty for an item, it is good practice to take a sample of past marks and candidates’ expected results to reinforce this method.

Another way to confirm Ebel is working well in an institution is to use another standard setting method, such as borderline regression post exam, to provide results based on real candidate data for comparison. If the results don’t reflect the standard that would be expected of the students taking the exam, the standard setting method can be re-evaluated.

Should you use Ebel?

Ebel is a well-established method of standard setting and is widely used. It’s most often used in high stakes written exams such as MCQs due to the costs of implementation, and is most reliable when supported by another standard setting method.

In order to ensure Ebel is working well for you it is vital to make sure that the matrix is right and this is sometimes tricky. Please see our blog ‘The Problem with Ebel’ for more information about this.

Advantages of using Ebel

It is easier for exam reviewers – Standard setters are likely to find it simpler to make a judgement about how difficult or relevant a question item is (e.g. moderately difficult and essential) than what the probability is that a borderline student would get that particular question right, which is what they are required to do for Angoff.

Provides an overview of exam difficulty – Once standard setting is completed it provides a good summary of how all the questions have been classified – so it is easy to see whether an exam has a high proportion of difficult questions for example.
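A summary like this is easy to produce once each question has a final classification. The example below is a small sketch with invented data; the variable names and the four sample questions are assumptions for illustration only.

```python
from collections import Counter

# Hypothetical final classification per question: (relevance, difficulty).
final_classification = {
    "Q1": ("essential", "easy"),
    "Q2": ("important", "hard"),
    "Q3": ("essential", "moderate"),
    "Q4": ("supplementary", "hard"),
}

# Count how many questions fall into each difficulty and relevance category.
by_difficulty = Counter(difficulty for _, difficulty in final_classification.values())
by_relevance = Counter(relevance for relevance, _ in final_classification.values())

hard_share = by_difficulty["hard"] / len(final_classification)

print(dict(by_difficulty))
print(dict(by_relevance))
print(f"{hard_share:.0%} of questions are classed as hard")  # 50% of questions are classed as hard
```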

Pass mark is determined prior to the exam – You can see both what the pass mark will be and the likelihood of a ‘just passing’ student passing the exam before it is sat.

Holds up in court – Like Angoff, Ebel is widely used, and if questioned the Ebel method would hold up in court.

Disadvantages of using Ebel

Relatively time consuming and costly – as numerous standard setters are required to determine the difficulty and relevance of every single question item.

Requires back-up – As it doesn’t use real exam data to determine the cut-off mark, it is considered more accurate and reliable if backed up by a method based on real candidate data, e.g. borderline regression.

Matrix difficult to get right – It is difficult to get the probabilities in the matrix spot on, even though this would be done by an expert in the field. This should not be underestimated, and in fact we have written a blog explaining this issue: ‘The Problem with Ebel’.

Requires digital software – While we wouldn’t see this as a disadvantage, some might! Due to the high number of judges required to establish a reliable result, gathering all the responses and establishing the cut-off mark without digital software would be incredibly time consuming and laborious.

In summary, Ebel is a standard setting method that requires subject experts to make judgements about how difficult and relevant each item in an exam is. Those responses are then mapped onto a pre-determined matrix to give the probability of a borderline candidate getting each question right, from which an overall pass mark can be calculated. It is widely used in high stakes exams and holds up in court if challenged.