Science fiction novels and films have always been a source of inspiration to those of us who work in technology development. The computer HAL in 2001: A Space Odyssey is a classic reference to what are known as speech and language technologies, which are developed to facilitate and humanize communication between people and machines.
Today, we no longer need to turn to science fiction to explain the opportunities offered by these technologies. Applications as popular as Siri, the personal assistant produced by Apple, use speech and language technologies to communicate orally with users, reply to their questions, give recommendations and execute actions. However, the aim is to go even further and to devise machines that understand not only words, but also all the additional information that can be derived from a voice: who is speaking, their age, personality, accent, state of mind, satisfaction with the service, etc., and then use this information to respond in the most appropriate way, for example with the correct level of formality in the language or emotion in the voice.
TALP UPC (Center for Language and Speech Technologies and Applications, a member of CIT UPC) has over 25 years’ experience in the development of speech and language technologies, and in collaborating with companies to convert these technologies into solutions.
In applications such as the automation and improvement of call centres, speech technology can be used to understand and respond to clients’ most common questions, such as checking the balance of a bank account. However, even when the communication is between people, speech and language technologies enable us not only to transcribe the conversation, but also to analyse it to obtain a summary or to assess, for example, the client’s level of satisfaction or dissatisfaction according to his/her tone of voice.
Subtitling of films or television programmes is another of the real applications of technology developed at TALP. Television channels such as TV3 subtitle dozens of programmes every day, some of which are broadcast live, and the use of speech technology is essential to continue to provide quality subtitles at a lower cost.
Another important application of these technologies is machine translation of text and voice. TALP UPC has taken part in various European R&D projects on machine translation and regularly participates in competitive evaluations of the quality of the systems that have been developed, with good results.
Looking ahead, cinema has again provided a reference in the recent film “Her”, in which the talking operating system can understand and transmit emotions. To improve results and provide innovative solutions in all fields, speech recognition systems in the coming years must go beyond words and understand as much paralinguistic information as possible. The analysis of feelings and emotions in speech and language is a growing research area, as is the generation of more expressive voice and text.
Another related application that is already beginning to be commonly used by companies is the automated analysis of opinions. This natural language processing technology has become one of the best tools for extracting information from the millions of messages exchanged daily in social networks and the media, in order to monitor a company’s presence, image and reputation.
Dr. José Adrián Rodríguez Fonollosa
Researcher at the Language and Speech Technologies and Applications Center (TALP UPC) and
winner of the 1st Prize in General Electric's GE Flight Quest 2 (2014)