« Privacy Preserving Speech Processing »

The ARMEDIA team is organising a seminar on automatic speech processing, open to the whole lab,

on Thursday 18 May at 2 pm (room H218, in the Etoile building of Télécom SudParis, with a videoconference link to the NanoInnov site in Saclay).

Title: "Privacy Preserving Speech Processing"

Speaker: Dr. Gérard Chollet, CNRS emeritus researcher and a member of SAMOVAR (ARMEDIA) since last December

Biography: Dr. Gérard Chollet's education prior to the doctoral level was centered on Mathematics (DUES-MP), Physics (Maîtrise), and Engineering and Computer Sciences (DEA). He studied Linguistics, Electrical Engineering and Computer Science at the University of California, Santa Barbara, where he was awarded a PhD in Computer Science and Linguistics.
He taught courses in Phonetics, Speech Processing and Psycholinguistics in the Speech and Hearing department at Memphis State University in 1976-1977. He then held a dual affiliation with the Computer Science and Speech departments at the University of Florida in 1977-1978.
He joined CNRS (the French public research agency) in 1978 at the Institut de Phonétique in Aix-en-Provence. Since then, CNRS has been his main employer. In 1983, he joined a newly created CNRS research unit, LTCI (Laboratoire de Traitement et Communication de l'Information), at ENST (currently Telecom-ParisTech). Dr. Gérard Chollet headed the speech group there before leaving temporarily for IDIAP (a research laboratory of the 'Fondation Dalle Molle' in Martigny, Switzerland) in 1992.
In 1985, he spent a sabbatical year at IPO in Eindhoven, The Netherlands, where he developed the 'temporal decomposition' technique with Steven Marcus. That technique has been quite successful and is still being developed at AT&T, CUED, ICP, LAFORIA, IMT, Intelligent Voice, etc.
From 1996 to 2012, he worked full time at ENST, managing research projects and supervising doctoral work. In July 2012, CNRS granted him emeritus status. He accepted a Visiting Professor position at Boise State University, where he taught in 2012-2013. He was a Nokia Research Fellow at the University of Eastern Finland, Joensuu, in 2014. He is now Vice President for Research at Intelligent Voice, London, UK. In 2016, he joined the CNRS research unit SAMOVAR at Telecom SudParis.
Dr. Gérard Chollet teaches graduate courses every year (in Paris, Lausanne, and Boise) in Speech, Signal Processing and HCI. He has supervised more than 40 doctoral theses.
His main research interests are in phonetics, automatic audio-visual speech processing, spoken dialog systems, multimedia, pattern recognition, biometrics, privacy-preserving cloud computing, digital signal and image processing, speech pathology, speech training aids, etc. He operates as a private consultant for companies and laboratories on projects related to his expertise.

Abstract:
As computational and communications infrastructure has expanded in its capabilities, so has the exposure of its users to unintended and undesired consequences. This is particularly true for voice-based services and communication. Increasing numbers of people are using voice-based services for a variety of purposes, and large amounts of private voice data are being stored on cloud platforms. In each of these actions, however, the user unwittingly gives away highly private data: their voice. Voice is a legally accepted biometric. A person's voice carries information about their gender, origins, health, emotional state, age, … In using any service, the user gives away not only the content of their speech but also this information. A malicious server, or an eavesdropper, may obtain unintended demographic information about the user by analyzing the voice, and may sell this information. It may also edit recordings to fabricate utterances the user never spoke. Merely encrypting the data for transmission or storage does not protect the user, since the recipient (the server) must ultimately have access to the data in the clear (i.e. in decrypted form) in order to process it.
In this tutorial, we will discuss solutions for privacy-preserving sound processing, which enable a user to employ sound- or voice-processing services without being exposed to risks such as those above. We will describe the basics of privacy-preserving techniques for data processing, including homomorphic encryption, oblivious transfer, secret sharing, and secure multi-party computation.
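To give a concrete flavour of one of these techniques, here is a minimal sketch of additive secret sharing over a prime field (an illustration of ours, not part of the tutorial material); the modulus, the three-party split, and all values are illustrative assumptions. Homomorphic encryption plays an analogous role: a server can compute on encrypted data while only the user can decrypt the result.

    # Toy additive secret sharing over a prime field (illustrative only,
    # not a production protocol). A secret is split into n shares that are
    # individually uniform; only the sum of all shares (mod P) reconstructs it.
    import secrets

    P = 2**61 - 1  # Mersenne prime used as the field modulus (assumption)

    def share(secret, n):
        """Split `secret` into n additive shares modulo P."""
        shares = [secrets.randbelow(P) for _ in range(n - 1)]
        return shares + [(secret - sum(shares)) % P]

    def reconstruct(shares):
        """Recombine shares; any strict subset looks uniformly random."""
        return sum(shares) % P

    s1 = share(42, 3)
    s2 = share(7, 3)
    print(reconstruct(s1))                                     # 42
    # Shares are additively homomorphic: summing share-wise yields shares
    # of the sum, the basic fact exploited by secure multi-party computation.
    print(reconstruct([(a + b) % P for a, b in zip(s1, s2)]))  # 49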
We will describe how these can be used to build secure "primitives" for computation, which enable users to perform basic steps of a computation without revealing information, and we will discuss the privacy issues raised by these operations. We will then briefly present schemes that employ these techniques for privacy-preserving signal processing and biometrics, and delve into their uses for voice processing, including authentication, classification and recognition, discussing computational and accuracy issues.
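As a purely illustrative sketch of such a "primitive" (again our assumption, not the speaker's material): below, two simulated parties compute the inner product of secret-shared vectors using Beaver multiplication triples, the kind of building block on which a privacy-preserving speaker-verification score could rest. The trusted dealer, the feature vectors, and the field size are all made up for the example.

    # Toy two-party secure multiplication and inner product with Beaver
    # triples (simulated in one process; a trusted dealer hands out the
    # triples, whereas real protocols generate them cryptographically).
    import secrets

    P = 2**61 - 1  # prime field modulus (assumption)

    def share2(x):
        """Split x into two additive shares modulo P."""
        a = secrets.randbelow(P)
        return a, (x - a) % P

    def beaver_triple():
        """Dealer: shares of random a, b and of c = a*b."""
        a, b = secrets.randbelow(P), secrets.randbelow(P)
        return share2(a), share2(b), share2((a * b) % P)

    def secure_mul(x_sh, y_sh):
        """Shares of x*y, computed without either party seeing x or y."""
        (a0, a1), (b0, b1), (c0, c1) = beaver_triple()
        # The parties open only the masked values d = x - a and e = y - b,
        # which reveal nothing about x and y themselves.
        d = (x_sh[0] - a0 + x_sh[1] - a1) % P
        e = (y_sh[0] - b0 + y_sh[1] - b1) % P
        # x*y = c + d*b + e*a + d*e; the constant d*e is added by one party.
        return ((c0 + d * b0 + e * a0 + d * e) % P,
                (c1 + d * b1 + e * a1) % P)

    def secure_inner_product(xs, ys):
        """Inner product of two vectors held as secret shares."""
        acc0 = acc1 = 0
        for x, y in zip(xs, ys):
            z0, z1 = secure_mul(share2(x), share2(y))
            acc0, acc1 = (acc0 + z0) % P, (acc1 + z1) % P
        return (acc0 + acc1) % P

    client_vec = [3, 1, 4, 1, 5]   # quantised voice features (made up)
    server_vec = [2, 7, 1, 8, 2]   # enrolled speaker template (made up)
    print(secure_inner_product(client_vec, server_vec))        # 35
    print(sum(a * b for a, b in zip(client_vec, server_vec)))  # 35

In such a setting only the final score (or a thresholded accept/reject decision) would be revealed; the comparison itself can also be carried out obliviously in a full protocol.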
Finally, we will close with a discussion of the current state of the art, future directions, and avenues for legal and scientific research.