Adaptation approaches for pronunciation scoring with sparse training data

Bibliographic Details
Main authors:	Landini, F., Ferrer, L., Franco, H., Karpov A., Mporas I., Potapova R., ASM Solutions Ltd.
Format:	SER
Subjects:	Computer-assisted language learning; Log-likelihood ratio; MAP adaptation; Pronunciation scoring; Computer aided instruction; E-learning; Grading; Linguistics; Automatic speech recognition; Computer assisted language learning; Computer assisted language learning systems; Log likelihood ratio; Pronunciation quality; Scoring performance; Speech recognition
Online access:	http://hdl.handle.net/20.500.12110/paper_03029743_v10458LNAI_n_p87_Landini
Contributed by:	Biblioteca Digital - Facultad de Ciencias Exactas y Naturales (UBA), Universidad de Buenos Aires

Description
Summary:	In Computer Assisted Language Learning systems, pronunciation scoring consists of assigning a score that grades the overall pronunciation quality of a student's speech. In this work, a log-likelihood ratio computed with respect to two automatic speech recognition (ASR) models was used as the score. One model represents native pronunciation, while the other captures non-native pronunciation. Different approaches to obtaining each model and different amounts of training data were analyzed. The best results were obtained by training an ASR system on a separate large corpus without pronunciation quality annotations and then adapting it to the native and non-native data, sequentially. Nevertheless, when the models are trained directly on the native and non-native data, pronunciation scoring performance is similar. This is a surprising result considering that word error rates for these models are significantly worse, indicating that ASR performance is not a good predictor of pronunciation scoring performance on this system. © Springer International Publishing AG 2017.
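
The log-likelihood ratio score described in the summary can be illustrated with a minimal sketch. This is not the authors' implementation: the one-dimensional Gaussians below are a toy stand-in for the native and non-native ASR models, and the names `gaussian_loglik` and `llr_score`, the per-frame averaging, and all parameter values are assumptions made purely for illustration.

```python
import math


def gaussian_loglik(x, mean, var):
    """Log-density of a 1-D Gaussian (toy stand-in for an ASR acoustic model)."""
    return -0.5 * (math.log(2 * math.pi * var) + (x - mean) ** 2 / var)


def llr_score(frames, native, nonnative):
    """Average per-frame log-likelihood ratio of native vs. non-native model.

    `native` and `nonnative` are hypothetical (mean, variance) pairs.
    Averaging over frames is an assumption, used here so that utterances
    of different lengths yield comparable scores.
    """
    if not frames:
        raise ValueError("need at least one frame")
    total = sum(
        gaussian_loglik(x, *native) - gaussian_loglik(x, *nonnative)
        for x in frames
    )
    return total / len(frames)


# Illustrative, made-up model parameters and feature values.
native_model = (0.0, 1.0)
nonnative_model = (2.0, 1.0)

score_a = llr_score([0.1, -0.2, 0.3], native_model, nonnative_model)
score_b = llr_score([1.9, 2.2, 2.1], native_model, nonnative_model)
```

Under these assumptions, a positive score means the utterance is better explained by the native model and a negative score means it is better explained by the non-native model, so `score_a` comes out positive and `score_b` negative.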
