Student: Javier Saldaña
Supervisors: Eva Navas, Inma Hernáez
Defense date: September 2022
Description:
Speech recognition is one of the main fields within Natural Language Processing, and it is widely used across professional domains. Automatic speech recognizers have advanced greatly in recent decades, as both computing power and algorithms have improved substantially. Nevertheless, data availability remains an issue when building Artificial Intelligence models: most corpora are privately owned, and only a few companies with sufficient financial resources can afford to obtain data.
We hold that science should be free and that anyone with the required knowledge should be able to develop their own speech recognizer. Hence, in our project, we evaluate the general performance of a well-known open-source recognizer, DeepSpeech, on a large publicly available corpus, Common Voice, for both Spanish and Basque. In our experiment, we test the model by varying three important parameters: the version of the corpus, the use or omission of a scorer, and the presence or absence of repetitions within the training set. We also carry out a statistical evaluation of the content of our corpora and give our opinion on the current validation policy of the Common Voice corpora.
Keywords: Speech