Automated Essay Evaluation for Faster Formative Feedback

By Nicholas Walker

Many ESL teachers still consider automated essay scoring and feedback to be an impossible dream. But with recent advancements in automated grammar checking technology, and the arrival of other natural language processing tools, the notion of using computers to provide feedback on and score an essay might not be quite so impossible as it once seemed. 

One online grammar checker website now offers to evaluate academic essays, argument essays, and film-analysis essays in just two seconds and for free. Teachers can now begin integrating free automated essay evaluations into their course plans. 

But what should teachers make of automated essay scoring? First, let’s consider the benefits.

Time savings

Teachers spend a lot of time giving formative feedback on student writing. Often, correction time intrudes into teachers’ evenings and weekends, leading to burnout and a shortage of qualified teachers, according to The Observer (Tapper, 2018).

Consider this simple calculation: 

10 minutes of feedback per essay x 30 students = 5 hours of correction

If a computer could score every essay in two seconds without adding to the teacher’s workload, the time saved could be put toward providing higher-order feedback, helping the students in the group who are at risk of failing, or just enjoying a correction-free weekend.
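
For teachers who want to run the same calculation with their own numbers, here is a minimal sketch in Python. The function name and its parameters are illustrative assumptions, not part of any tool mentioned in this article.

```python
def correction_hours(minutes_per_essay: float, students: int) -> float:
    """Return the total correction time, in hours, for one set of essays."""
    return minutes_per_essay * students / 60


# The example from the text: 10 minutes per essay for a class of 30.
print(correction_hours(10, 30))  # 5.0
```

Changing either number shows how quickly the workload grows with class size or with more detailed feedback.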

Meaningful practice

Because of the usual impact of writing correction on a teacher’s workload, experienced teachers limit the number of writing assignments they set for their students. If a computer could give effective formative feedback and scoring on the first draft of an essay, the teacher would only need to score the final draft. With the correction load cut in half, the teacher might be inclined to assign a greater number of meaningful writing activities. Writing practice with a grammar focus is likely to be superior to fill-in-the-blank exercises because it can come with a communicative purpose for a specific target audience.

Lingering doubts

There are downsides, too. Despite the positive impact automated evaluation would have on writing instruction and teachers’ workload, ESL writing teachers may harbour doubts about the validity of feedback coming from a computer. Computers don’t think about the ideas in the essay they are scoring, they don’t know how much you have improved, and they don’t care about how the feedback will make you feel (Miller, 2015). Furthermore, computers have no imagination. They do not construct a model of the world you describe in your essay as they read through it; they only look for surface errors and take measurements. As such, computers can be fooled by clever nonsense and will tend to give lower scores to brilliant non-conformist writing because it is eccentric (Monaghan & Bridgeman, 2005).

Don’t make perfection the enemy of good pedagogy

Nevertheless, the wish for a perfect timesaving solution to formative writing evaluation should not stand in the way of good pedagogy. Consider what the Virtual Writing Tutor’s Film-Analysis evaluation system can do. 

  • Count your words
  • Calculate your average sentence length
  • Check to see if the first sentence is a question 
  • Measure how much background information you have given about the film you are discussing
  • Count the number of literary analysis terms you have used in your thesis
  • Measure how debatable your thesis is to see if it takes a strong stance
  • Check your topic sentences to make sure they are short and to the point
  • Count your cohesion words and phrases
  • Check for quotes and in-text citations
  • Check that you have paraphrased your thesis statement in your conclusion
  • Check for a suggestion and a prediction
  • Check for a “Works Cited” section
  • Score your vocabulary range
  • Check for grammar, punctuation, and spelling errors

And VirtualWritingTutor.com can do all of that in two seconds with detailed colour-coded comments and scores to indicate where revision is needed most. As a source of formative feedback on early drafts, this kind of automated scoring will allow teachers to do what teachers do best: to dramatize the presence of an intelligent human reader and look for evidence of the mastery of the skills being taught in the lessons.
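
To give a sense of why measurements like these take a computer so little time, here is a rough sketch of a few of the simpler checks from the list above. It is an illustration only and not the Virtual Writing Tutor’s actual code: the function names, the naive sentence splitter, and the plain text search for “Works Cited” are all assumptions made for this example.

```python
import re


def split_sentences(text: str) -> list[str]:
    """Naive splitter (an assumption): break after ., ! or ? plus whitespace."""
    parts = re.split(r"(?<=[.!?])\s+", text.strip())
    return [part for part in parts if part]


def surface_checks(text: str) -> dict:
    """Run a few simple surface measurements on an essay."""
    sentences = split_sentences(text)
    # Words are runs of letters, digits, apostrophes, and hyphens.
    words = re.findall(r"[A-Za-z0-9'’-]+", text)
    return {
        "word_count": len(words),
        "sentence_count": len(sentences),
        "average_sentence_length": len(words) / len(sentences) if sentences else 0.0,
        "first_sentence_is_question": bool(sentences) and sentences[0].endswith("?"),
        "has_works_cited": "works cited" in text.lower(),
    }


essay = "Is Jaws a horror film? It is a thriller about fear. Works Cited"
print(surface_checks(essay))
```

A splitter this simple will trip over abbreviations such as “Dr.” and over quoted dialogue, which is one reason real systems rely on more sophisticated natural language processing tools.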

What you may notice about the list above is that essay scoring can now start to provide feedback on achievement. Automated essay rating systems have been primarily focused on measurements of proficiency as a low-cost alternative to hiring additional expert teachers to score entrance exams (Monaghan & Bridgeman, 2005). But classroom teachers don’t focus on indicators of general proficiency in every assessment. For example, when you teach beginners and intermediates, you don’t expect a native speaker’s level of writing just yet. What you want to know is if your student has used the discourse model you taught in class. You also want to know if the student is trying to eliminate errors, to use the vocabulary from the lessons, to link ideas using transition words, and to use evidence to support his or her argument.   

How to access the automated essay writing evaluation

You can begin using the Virtual Writing Tutor’s automated essay evaluation system with your students immediately. Navigate to the Virtual Writing Tutor grammar check website. Click on the “Essay Tests” menu item and select “Actively Engaged in Academic Writing.” Then click on the “Film Analysis” button and “Start.”

You will see (1) an accordion section labelled “Film Analysis Essay question,” which describes the assignment in detail. Within the instructions, you can find a link to a sample essay that you can use to test the system. Under that is a text area with (2) a “Word Count” button and a timer. Write or paste your text in (3) the text area and click (4) “Finished” to receive feedback and a score.

How to interpret your score

After you click “Finished,” the system will calculate (5) a score for the entire essay. The total assignment score is calculated by averaging an “Essay structure and content score,” a “Vocabulary” score, a “Scholarship” score, and a “Language accuracy” score. 
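
As a rough illustration of how four section scores combine into one, here is a minimal sketch. It assumes a simple unweighted average of four percentages, which is the most direct reading of “averaging.” The function name, parameter names, and sample scores are hypothetical.

```python
from statistics import mean


def assignment_score(structure_and_content: float, vocabulary: float,
                     scholarship: float, language_accuracy: float) -> float:
    """Average four section scores (each 0-100), assuming equal weights."""
    return mean([structure_and_content, vocabulary, scholarship, language_accuracy])


# Hypothetical section scores for one essay.
print(assignment_score(80.0, 90.0, 70.0, 100.0))  # 85.0
```

With equal weights, a weak result in one section pulls the total down by only a quarter of the shortfall, so students should look at the section scores, not just the total, to see where revision is needed most.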

Under the assignment score, the system provides measurements of (6) writing quantity and measurements of writing quality in accordion sections. Writing quantity measurements do not affect your score directly. They include word count, sentence count, paragraph count, and a count of the number of questions in your essay. Writing quality measurements do not affect your score, either. They include your average sentence length, sentence length variance, and counts of clichés, exclamation marks, and first-person pronouns. For some writing teachers, these are indicators of quality. For other teachers, they are not.

The (7) “Essay content and structure” score is an average of scores for each of the four paragraphs of the essay. Clicking on each score opens an accordion section with colour-coded comments and suggestions for improvement. For example, the system checks to see if the introduction starts with a question, contains film-related words, and ends with a thesis statement that takes a strong stance on the movie in question. 

Body paragraphs are checked for a strong topic sentence, transition words for cohesion, quotes, and in-text citations.

The system then checks that the thesis has been paraphrased and not repeated in the conclusion. It also checks for a recommendation and a prediction to end the essay.  

Students can provide (8) feedback to the Virtual Writing Tutor’s developers on any problems with the comments or scoring, and students can (9) print a copy of all feedback to bring to class to discuss with their teacher. 

The (10) “Vocabulary” section displays a score based on the number of film-related and literary analysis words in the essay. The (11) “Scholarship” section displays a score based on a count of the number of works cited at the bottom of the essay. The (12) “Language accuracy” section bases its score on a count of the number of grammar, punctuation, spelling, and capitalization errors. The system tolerates a couple of errors without applying a penalty, so essays with only one or two errors will receive a score of 100% in this section. Each of these sections conceals the details of its score in a collapsible accordion section to keep from overwhelming students with too many details all at once.
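
The tolerance for one or two errors can be pictured with a short sketch. Only the tolerance of two errors comes from the description above. The size of the penalty for each additional error is not stated, so it is left as a value the caller must supply, and the linear penalty itself is an assumption made for illustration.

```python
def language_accuracy_score(error_count: int, penalty_per_error: float,
                            tolerated_errors: int = 2) -> float:
    """Return a 0-100 score that ignores the first few errors.

    penalty_per_error is hypothetical: the real penalty is not documented here.
    """
    penalised_errors = max(0, error_count - tolerated_errors)
    return max(0.0, 100.0 - penalised_errors * penalty_per_error)


# With an arbitrary penalty of 5 points per error beyond the first two:
print(language_accuracy_score(2, penalty_per_error=5.0))  # 100.0
print(language_accuracy_score(4, penalty_per_error=5.0))  # 90.0
```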

The future of automated scoring

Automated scoring of student essays will eventually catch on. However, to win over teachers, automated evaluation systems will have to focus on promoting learning, using their artificial intelligence to give useful formative feedback on early drafts of an essay before the final draft hits the teacher’s desk for a summative score. In the future, the revision process might start to resemble an online writing game in which micro-revisions lead to added points and leveling up. The more revisions, the better. The less burnout among teachers, the better, too.

References

Miller, B. (2015, July 8). Researcher studies teachers’ use of automated essay scoring software. Retrieved December 10, 2019, from Phys.org website: https://phys.org/news/2015-07-teachers-automated-essay-scoring-software.html

Monaghan, W., & Bridgeman, B. (2005). E-rater as a quality control on human scorers. ETS R&D Connections, (2). Retrieved from https://www.ets.org/research/policy_research_reports/publications/periodical/2005/cwyf

Tapper, J. (2018, May 13). Burned out: Why are so many teachers quitting or off sick with stress? The Observer. Retrieved from https://www.theguardian.com/education/2018/may/13/teacher-burnout-shortages-recruitment-problems-budget-cuts