Student: Aitor Valdivielso González
Supervisors: I. Hernáez, D. Erro
Abstract:
Audio de-identification is the technique of transforming speech uttered by a speaker A into a voice signal that, while preserving the message, seems to have been uttered by a speaker B. Within this context, the main goal of this Master’s Thesis is to present a novel approach to reversible audio de-identification that improves on the performance of existing algorithms in several respects.
The first stage of the algorithm analyses the input audio signal and extracts the parameters that represent it, using a Harmonic Model for its characterization. The next step modifies those parameters according to an effective criterion, so that the resynthesized audio sounds as if uttered by a different speaker. Since the system is required to be reversible, the information describing the parameter modification is embedded into the signal both as metadata and as watermarked information. This information is encrypted so that only a person or computer holding the right decryption algorithm is able to revert the process and recover the original audio. The receiver on the other side of the communications link, given the right tools, can first extract the hidden information and then synthesize the original audio with the right parameters for further analysis.
The program accomplishes several tasks. First, the original audio is fully de-identified, so that neither a potential listener nor Speaker Identification (SID) software is capable of identifying the source speaker. In addition, the intelligibility of the message is assured in both the de-identified audio and the recovered one. The voice in the de-identified audio track still sounds natural, so that an attacking listener does not regard it as suspicious. Finally, the recovered voice is as similar to the original input signal as the technique allows.
These properties have been used to evaluate the performance of the algorithm and to rate its suitability.
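As a rough illustration of the analysis, modification and reversal stages summarized in this abstract, the following Python sketch represents one frame of a harmonic model, alters it, and restores it from encrypted side information. Everything in it is an assumption made for illustration: the names `HarmonicFrame`, `deidentify` and `reidentify`, the use of a single pitch-scaling factor as the modification, and the XOR keystream, which is a toy placeholder and not a vetted cipher. The abstract does not specify the actual modification criterion or encryption scheme, and the watermark embedding is not modelled here.

```python
import hashlib
import math
import struct
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class HarmonicFrame:
    """One analysis frame: fundamental frequency (Hz) and harmonic amplitudes."""
    f0: float
    amplitudes: List[float]


def synthesize(frame: HarmonicFrame, sample_rate: int, n_samples: int) -> List[float]:
    """Resynthesize a frame as a sum of cosines at integer multiples of f0."""
    return [
        sum(a * math.cos(2 * math.pi * (k + 1) * frame.f0 * n / sample_rate)
            for k, a in enumerate(frame.amplitudes))
        for n in range(n_samples)
    ]


def _keystream(key: bytes, length: int) -> bytes:
    """Toy keystream (SHA-256 over key + counter); illustrative only, not secure."""
    out = b""
    counter = 0
    while len(out) < length:
        out += hashlib.sha256(key + counter.to_bytes(4, "big")).digest()
        counter += 1
    return out[:length]


def xor_crypt(data: bytes, key: bytes) -> bytes:
    """Encrypt or decrypt (the same operation) by XOR with the keystream."""
    return bytes(d ^ k for d, k in zip(data, _keystream(key, len(data))))


def deidentify(frame: HarmonicFrame, pitch_factor: float,
               key: bytes) -> Tuple[HarmonicFrame, bytes]:
    """Scale f0 (assumed modification) and return the encrypted factor as side info."""
    modified = HarmonicFrame(frame.f0 * pitch_factor, list(frame.amplitudes))
    side_info = xor_crypt(struct.pack(">d", pitch_factor), key)
    return modified, side_info


def reidentify(frame: HarmonicFrame, side_info: bytes, key: bytes) -> HarmonicFrame:
    """Decrypt the side info and undo the f0 scaling."""
    (pitch_factor,) = struct.unpack(">d", xor_crypt(side_info, key))
    return HarmonicFrame(frame.f0 / pitch_factor, list(frame.amplitudes))


original = HarmonicFrame(f0=120.0, amplitudes=[1.0, 0.5, 0.25])
masked, payload = deidentify(original, pitch_factor=1.5, key=b"shared-secret")
restored = reidentify(masked, payload, key=b"shared-secret")
```

In this sketch only a receiver holding the same key can recover the scaling factor from `payload` and restore the original fundamental frequency. A real system would modify more than the pitch and would carry the payload inside the audio itself rather than alongside it.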
Once completed, the algorithm offers numerous tools suitable for applications within the research scope of this Master’s Degree. The main service targeted by this algorithm is satellite phone communications. The system will give satcom links an additional layer of security in this era of global criminal hacking and privacy violations. These communications are carried by GEO and LEO satellite systems such as Thuraya, SkyTerra, TerreStar and Iridium. There are more than 25,000 satphone receivers distributed around the world, so their users constitute a significant community. Furthermore, the use of these devices after catastrophes and terrorist attacks has been increasing since 9/11.