Of the two biological signals investigated in this coordinated project, this subproject will capture and process the surface electromyographic (sEMG) signals produced by the speech production apparatus.

Sensors placed on parts of the face and throat will record the signals generated by the muscles involved in speech production. From these signals alone, without any acoustic input, an artificial voice signal will be generated using deep learning techniques.

Although silent speech interfaces (SSIs) can be used in other contexts (for example, to keep a telephone or remote conversation private), our project focuses on giving a voice to people who have undergone a total laryngectomy. After a period of intensive training, these people generally acquire so-called esophageal speech, whose characteristics differ notably from those of healthy speech. Because they retain control over their speech articulators, silent speech data reflecting articulator movements can be captured and converted into artificial speech.

We believe that sEMG-based SSI devices can significantly improve the quality of life for these people.

During the project, databases of EMG and speech signals will be recorded and made available to the research community. The project will also advance the use of deep neural networks by contributing new learning architectures. It will be carried out in collaboration with international experts in the field of silent speech and with the association of laryngectomized patients of Bizkaia, not only to obtain data but also, and more importantly, to evaluate and validate the techniques developed.

Objectives of the project

Coordinated project (SP1 + SP2)
  • To explore advances in the application of state-of-the-art deep generative neural network architectures to improve the quality and intelligibility of current SSIs based on EMG and ECoG.
  • To develop corpora, databases, protocols and best practices for research on SSI in the Spanish language.
  • To establish a new research line and, consequently, a research infrastructure for SSI in Spain.
  • To strengthen the links between two of the most consolidated research groups on speech technologies at the national level: Aholab at UPV/EHU and SiGMAT at UGR.
Objectives for SP1
  • Establish an infrastructure for the acquisition and processing of EMG signals to enable research in the field of EMG-based SSI. This infrastructure includes the necessary electronic sensors, interfaces and computing capacity.
  • Develop a high-quality baseline EMG-based direct speech synthesis system using DNNs, including the necessary databases.
  • Investigate novel architectures to overcome the problem of inter-session and inter-speaker variability.
  • Validate EMG-based SSI for use by laryngectomees.

People

Research team of subproject 1

Work team of subproject 1