Here’s how the method of testing can change student scores

Steve Graham, Arizona State University

Students who recently took the Partnership for Assessment of Readiness for College and Careers (PARCC) test scored lower when they took it on a computer than when they used paper and pencil.

This might not matter much if the results of these tests played a minimal role. But they do not. Test scores are used for accountability purposes at the federal, state and local levels. In some states, test scores play a role in student graduation and the evaluation of teachers and principals.

The question is, does the method of test taking actually influence test results?

I have been researching factors that influence test performance when students write essays. Such essays are written with paper and pencil or on a computer. Based on research that I coauthored in 2011, the answer to this question is yes. But there are several caveats.

In contrast to the findings from the PARCC test, we found that students writing on a computer scored higher than students writing with paper and pencil. This finding did not apply to all students, though. Students with little experience using a computer to write had higher scores when writing by hand.

Computer versus pencil-and-paper tests

In the last five years, two partnerships of U.S. states funded by the federal program Race to the Top were tasked with developing assessments for determining if students were on track or ready for college and the world of work.

The consortia developed computer-based assessments that, among other things, would make scoring easier, sharing results faster and conducting assessments cheaper. Many states, but not all, agreed to use these tests to assess students’ academic progress in multiple grades across the school years.

For tests developed by one of the consortia, the Partnership for Assessment of Readiness for College and Careers (PARCC), students obtained higher scores in English/language arts on the paper-and-pencil version than on the computer one.

By contrast, I obtained very different results in my review of seven scientific studies of factors that influence test results. The writing performance of students assessed on a computer was 21 percentile points higher than that of students who wrote with paper and pencil.

Another review I conducted, of 18 scientific studies, found the same 21-percentile-point advantage when students used a computer for writing in the classroom.

Computer-based assessments

So why are there differences between the PARCC test results and the findings from the scientific studies I reviewed? A likely explanation involves students’ experience with the method of testing.

Computers can underestimate writing achievements.
Samuel Mann, CC BY

My review of four scientific studies showed that students with little experience using computers as an assessment tool scored 18 percentile points lower on a computer than when they composed their essays using paper and pencil.

In other words, a student’s mastery of the method of testing matters. For students with little experience, computer assessments underestimate their writing achievement.

To get a sense of how method of testing can influence writing performance, imagine you are asked to write something for a test using a Chinese typewriter. This is a very complex writing tool designed to create 6,000 characters. Top typing speeds are 11 characters per minute.

Even if you reach this benchmark, you will have no hope of typing fast enough to get all your thoughts down on paper before some of your ideas slip from memory. If you are not proficient with this typewriter, then the problem is even worse. As you hunt for the next character, your memory is taxed even further, resulting in even more ideas being lost.

As this example illustrates, the method of testing can interfere with a student’s performance. If a student is not adequately familiar with the testing tool or it is cumbersome to use, time and energy must be devoted to using it.

This is time and energy that could more profitably be devoted to answering test questions.

Pencil-and-paper assessments

These kinds of problems are not limited to tests taken on a typewriter or computer; they can occur with paper-and-pencil tests too.

Students’ handwriting is not always fast enough for them to record all of their ideas before some of them slip from memory. This is a problem even for college students.

In a study with University of London undergraduates, handwriting fluency accounted for 40 percent of the variance in their scores on a timed-essay writing test.

Legibility of response can influence results on a pen-and-paper test.
Dennis S. Hurd, CC BY-NC-ND

With paper-and-pencil tests, there is an additional complicating factor. Scores on handwritten tests can be influenced by the legibility of the response. A less legible response can score 35 percentile points lower than the same response written neatly and legibly.

Making the matter even more complicated, a typed paper is scored more harshly than the same handwritten paper.

In a review of five scientific studies, I found that the score for a typed version of a handwritten text dropped by 18 percentile points. According to teachers involved in these studies, spelling and grammar errors were more visible in the typed version than in the handwritten version of the same paper.

So, the method of testing makes a difference in the following ways: if students are not adept at taking a test on a computer, they score higher on the same paper-and-pencil test. If they are adept with a computer, they score higher on the computer test. Students’ performance is further moderated by handwriting fluency and legibility on paper-and-pencil tests and by the number of spelling and grammar errors on computer tests.

Why use digital tools

What testing methods should schools use? Should computer-based assessments be abandoned, in view of recent PARCC results?

In the best of all possible worlds, students should be allowed to use the method of testing they are most proficient with when taking tests. However, this is unlikely to happen, as it adds another level of complexity and cost to test taking. So, one alternative for groups like PARCC is to statistically adjust scores to reflect the differences between test-taking modes.

Abandoning computer-based tests would be a mistake. These assessments have the potential to move schools from 19th-century writing tools to 21st-century tools.

As high-stakes assessments go increasingly digital, schools will make word processing and other digital composing tools a common staple. Studies have shown that students who use such tools over time become better writers than those who continue to write with paper and pencil.

At the end of the day, testing must produce something positive. Better writing tools in the classroom would be a step in the right direction.

Steve Graham, Professor of Leadership and Innovation, Arizona State University

This article was originally published on The Conversation.
