Examining Automated Caption Accuracy

Project Overview

Captions are an essential part of using video for teaching and learning. While Automatic Speech Recognition (ASR) has made significant advances recently, there are still issues with accuracy and context in transcripts generated by ASR, particularly in fields with specific terminology.

Human transcription, or human correction of ASR transcripts, is the way to achieve 100% accuracy, but it comes at a cost in money and time that is not feasible for most teachers or learning institutions.

This study seeks to understand the current state of widely available ASR transcript generation tools by comparing transcripts generated by these systems against transcripts known to be 100% accurate. An additional goal of the study is to identify, for future study, user decisions and practices or equipment choices and usage that may affect ASR transcript quality.
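As an illustration of what comparing an ASR transcript against a reference transcript can look like, the sketch below computes word error rate (WER), a commonly used ASR accuracy measure. The project description does not say which metric the study uses, so WER is an assumption here, and the function name `word_error_rate` and the sample sentences are hypothetical.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Return WER: word-level edit distance divided by reference length.

    Illustrative only; the study's actual metric is not stated in the source.
    Text is lowercased and split on whitespace; punctuation is not stripped.
    """
    ref = reference.lower().split()
    hyp = hypothesis.lower().split()
    if not ref:
        raise ValueError("reference transcript must contain at least one word")

    # prev[j] holds the edit distance between the reference words
    # processed so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]  # deleting all i reference words to match an empty hypothesis
        for j, hyp_word in enumerate(hyp, start=1):
            substitution_cost = 0 if ref_word == hyp_word else 1
            curr.append(min(
                prev[j] + 1,                      # deletion (word missing from ASR output)
                curr[j - 1] + 1,                  # insertion (extra word in ASR output)
                prev[j - 1] + substitution_cost,  # match or substitution
            ))
        prev = curr
    return prev[-1] / len(ref)


# Hypothetical example: one substituted word in a six-word reference.
reference_text = "the cat sat on the mat"
asr_text = "the cat sat on a mat"
print(f"WER: {word_error_rate(reference_text, asr_text):.3f}")
```

A lower value indicates a closer match to the reference. Because insertions count as errors, WER can exceed 1.0 when the ASR output is much longer than the reference.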

Project Team: Scott Schopieray, Betsy Sneller, Kate Sonka, Daniel Trego

Scott Schopieray

Dr. Scott Schopieray is the Assistant Dean for Academic and Research Technology in the College of Arts & Letters at Michigan State University. He is a core team member of the Enhanced Digital Learning Initiative (EDLI), where he focuses on institutional strategy, motivation to teach with technology, and technological structures to support digital teaching and learning. Dr. Schopieray is also Associate Director of MESH Research, a center focusing on the future of digital scholarly publishing.