RESEARCH AREA
Speech Recognition

Automatic speech recognition (ASR) is the process of converting a speech signal, captured by a telephone set or a microphone plugged into a PC, into the linguistic information uttered by the speaker. Voice is the most natural means of interaction, and ASR can be used in telephone services, in any human-machine interface for entering commands, in dictation machines, etc. The technology developed at LPTV is based on hidden Markov models (HMMs), is speaker-independent, and is able to handle natural language and continuous speech. So far, the R&D efforts have focused on the Spanish language (see demo). A version of the LPTV ASR engine supports a cellular service at a telco.

Speech technology for language learning

One of the most important applications of speech technology is in the field of language learning. LPTV is currently leading a project on learning English as a second language in Chile. ASR and prosody parameter estimation are being investigated as tools to assess pronunciation quality.

Speaker Recognition

Speaker recognition is the process of automatically recognizing who is speaking, using the speech signal as biometric information. Speaker recognition is divided into speaker identification (SI) and speaker verification (SV). SI consists of associating a recorded utterance with one out of N speakers; consequently, SI is a 1:N classification problem. In SV, the idea is to confirm or deny the identity claimed by the speaker; as a result, SV is a 1:1 problem. LPTV has focused on text-dependent SV (see demo), and a prototype system is currently being evaluated.
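
The difference between the 1:N and 1:1 decisions can be illustrated with a minimal sketch. This is not the LPTV system: it assumes each speaker is represented by a made-up fixed-length feature vector and uses cosine similarity as the score, whereas a real system would score utterances against statistical speaker models. The names `identify`, `verify`, `enrolled` and the threshold value are hypothetical.

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length feature vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def identify(utterance, enrolled):
    """SI (1:N): pick the enrolled speaker with the highest score."""
    return max(enrolled, key=lambda name: cosine(utterance, enrolled[name]))

def verify(utterance, enrolled, claimed, threshold=0.8):
    """SV (1:1): accept the claimed identity only if its score passes the threshold."""
    return cosine(utterance, enrolled[claimed]) >= threshold

# Toy enrolled models and a test utterance (illustrative values only).
enrolled = {
    "ana": [1.0, 0.0, 0.0],
    "luis": [0.0, 1.0, 0.0],
    "eva": [0.0, 0.0, 1.0],
}
utterance = [0.9, 0.1, 0.0]

print(identify(utterance, enrolled))        # compares against all N speakers
print(verify(utterance, enrolled, "ana"))   # compares against the claimed speaker only
print(verify(utterance, enrolled, "luis"))
```

Note that SI always returns one of the N enrolled speakers, while SV needs a threshold, whose value trades false acceptances against false rejections.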

Robust Speech Processing

Robustness is one of the main problems faced by ASR and SV systems. Some of the LPTV research results on robustness to additive noise, channel mismatch and coding-decoding distortion have been published in the most important international journals and conferences in the field of speech technology.
QoS on the Internet for real-time applications

The Internet was designed for elastic traffic based on TCP, which can adapt its transmission rate to network conditions. However, the development of several new real-time applications has raised the problem of how to guarantee QoS levels. At LPTV, the problem of real-time protocols has been tackled by applying the least mean squares (LMS) algorithm to make UDP applications TCP-friendly and to reduce bandwidth discontinuities.
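
The text does not say how LMS is applied to the transmission rate, so the following is only a generic sketch of the LMS algorithm itself: an adaptive linear predictor that estimates the next sample of a series from the previous ones. Here the series is assumed to be a sequence of normalized throughput measurements; the function name `lms_predict`, the filter order and the step size `mu` are illustrative choices, not parameters taken from LPTV's work.

```python
def lms_predict(samples, order=2, mu=0.1):
    """One-step-ahead LMS prediction.

    Returns the list of predictions for samples[order:] and the final weights.
    Assumes samples are scaled to roughly [0, 1]; mu must be small relative
    to the input power for the adaptation to remain stable.
    """
    weights = [0.0] * order
    predictions = []
    for n in range(order, len(samples)):
        # Most recent sample first.
        window = samples[n - order:n][::-1]
        # Filter output: prediction of samples[n].
        estimate = sum(w * x for w, x in zip(weights, window))
        # Prediction error drives the weight update.
        error = samples[n] - estimate
        weights = [w + mu * error * x for w, x in zip(weights, window)]
        predictions.append(estimate)
    return predictions, weights

# A constant normalized rate: the predictor should converge towards it.
rates = [1.0] * 60
predictions, weights = lms_predict(rates)
print(predictions[-1], weights)
```

A sender could, under these assumptions, use such a prediction to adjust its rate gradually instead of reacting abruptly to each new measurement, which is one way to read the goal of reducing bandwidth discontinuities.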

 

Speech Transmission over IP

Speech transmission over the Internet is affected by packet loss and coding-decoding distortion. The problems of ASR accuracy and subjective quality evaluation in IP networks have also been addressed at LPTV.

Usability evaluation of dialogue systems

The concept of usability attempts to measure how well users can use an interface to achieve specific objectives with effectiveness, efficiency and satisfaction in a specified context of use. At LPTV, usability evaluation is employed to optimize the design of dialogue systems from the user's point of view and to assess the feasibility of a given service supported by ASR or SV.
   
   
Speech Processing and Transmission Laboratory