|
|
| Speech Recognition
Automatic speech recognition (ASR) is the process of converting a speech signal,
captured by a telephone set or a microphone plugged into a PC, into the linguistic
information uttered by the speaker. Voice is the most natural means of interaction,
and ASR can be used in telephone services, in any human-machine interface that
accepts spoken commands, in dictation machines, etc. The technology developed at LPTV is
based on hidden Markov models (HMMs), is speaker-independent, and handles natural language and
continuous speech. So far, the R&D efforts have focused on the Spanish language
(see demo). A version of the LPTV ASR engine
supports a cellular service at a telco.
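The internals of the LPTV engine are not described on this page. As a generic illustration of how an HMM-based recognizer picks the most likely hidden state sequence for a series of acoustic observations, the following minimal sketch implements Viterbi decoding on a toy two-state model. The state names (`sil`, `speech`), the observation symbols (`low`, `high`) and all probabilities are hypothetical values chosen only for the example.

```python
def viterbi(observations, states, start_p, trans_p, emit_p):
    """Return the most likely state sequence for `observations`."""
    # best[t][s] = (probability of the best path ending in s at time t,
    #               previous state on that path)
    best = [{s: (start_p[s] * emit_p[s][observations[0]], None) for s in states}]
    for obs in observations[1:]:
        prev = best[-1]
        column = {}
        for s in states:
            # Pick the predecessor that maximizes the path probability.
            p, back = max((prev[r][0] * trans_p[r][s], r) for r in states)
            column[s] = (p * emit_p[s][obs], back)
        best.append(column)

    # Backtrack from the most likely final state.
    state = max(states, key=lambda s: best[-1][s][0])
    path = [state]
    for column in reversed(best[1:]):
        state = column[state][1]
        path.append(state)
    return path[::-1]


# Toy model (all numbers are illustrative assumptions).
states = ["sil", "speech"]
start_p = {"sil": 0.8, "speech": 0.2}
trans_p = {
    "sil": {"sil": 0.7, "speech": 0.3},
    "speech": {"sil": 0.2, "speech": 0.8},
}
emit_p = {
    "sil": {"low": 0.9, "high": 0.1},
    "speech": {"low": 0.2, "high": 0.8},
}

decoded = viterbi(["low", "high", "high", "low"], states, start_p, trans_p, emit_p)
print(decoded)
```

A real continuous-speech recognizer works with log-probabilities, continuous acoustic features and far larger state spaces, but the dynamic-programming recursion is the same.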
|
|
Speech technology for language learning
One of the most important applications of speech technology is in the field
of language learning. LPTV is currently leading a project in Chile on learning English as a second language.
ASR and prosody parameter estimation are being investigated as tools to assess pronunciation quality.
|
|
Speaker recognition is the process of automatically recognizing who is
speaking, using the speech signal as biometric information. Speaker
recognition is divided into speaker identification (SI) and speaker
verification (SV). SI consists of associating a recorded utterance with
one out of N speakers. Consequently, SI is a 1:N classification problem.
In SV, the goal is to confirm or deny the identity claimed by the speaker.
As a result, SV is a 1:1 problem. LPTV has focused on text-dependent
SV (see demo),
and a prototype system is currently being evaluated.
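The difference between the 1:N and 1:1 formulations can be made concrete with a minimal sketch. It assumes each speaker is represented by a fixed-length feature vector and that similarity is measured with the cosine; these choices, the speaker names, the vectors and the threshold are hypothetical and are not the LPTV system.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def identify(utterance, enrolled):
    """SI (1:N): return the enrolled speaker that best matches the utterance."""
    return max(enrolled, key=lambda name: cosine(utterance, enrolled[name]))


def verify(utterance, enrolled, claimed, threshold=0.8):
    """SV (1:1): accept or reject the claimed identity only."""
    return cosine(utterance, enrolled[claimed]) >= threshold


# Hypothetical enrolled speaker models and a test utterance.
enrolled = {
    "alice": [1.0, 0.0, 0.2],
    "bob": [0.1, 1.0, 0.0],
    "carol": [0.0, 0.2, 1.0],
}
utterance = [0.9, 0.1, 0.3]

print(identify(utterance, enrolled))         # compares against all N models
print(verify(utterance, enrolled, "alice"))  # compares against one model
print(verify(utterance, enrolled, "bob"))
```

Identification always returns one of the N enrolled speakers, whereas verification scores a single claimed model against a threshold, so its cost does not grow with the number of enrolled speakers.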
|
|
Robustness is one of the main problems faced by ASR and SV systems.
Some of the LPTV research results on robustness to additive noise, channel
mismatch and coding-decoding distortion have been published in the most
important international journals and conferences in the field of
speech technology.
|
|
QoS on the Internet for real-time applications
The Internet was designed for elastic traffic based on TCP, which can
adapt its transmission rate to network conditions. However, the
development of several new real-time applications has raised the problem of how
to guarantee QoS levels. At LPTV, the problem of real-time protocols has been
tackled by applying the LMS algorithm to make UDP applications TCP-friendly
and to reduce bandwidth discontinuities.
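This page does not say how the LMS algorithm is applied to rate control. As a generic illustration only, the sketch below shows the LMS (least mean squares) weight update used as a one-step-ahead predictor over a series of normalized throughput samples. The function name `lms_predict`, the filter order and the step size `mu` are assumptions made for the example, and the sketch does not implement a TCP-friendly rate controller.

```python
def lms_predict(samples, order=2, mu=0.1):
    """Run an LMS one-step-ahead predictor over `samples`.

    Returns (predictions, weights). predictions[i] is the estimate of
    samples[order + i] computed from the `order` samples before it.
    """
    weights = [0.0] * order
    predictions = []
    for n in range(order, len(samples)):
        # Input vector: the most recent `order` samples, newest first.
        x = samples[n - order:n][::-1]
        y_hat = sum(w * xi for w, xi in zip(weights, x))
        error = samples[n] - y_hat
        # LMS update: move the weights along the input, scaled by the error.
        weights = [w + mu * error * xi for w, xi in zip(weights, x)]
        predictions.append(y_hat)
    return predictions, weights


# Hypothetical normalized throughput measurements (1.0 = nominal rate).
throughput = [1.0] * 200
predictions, weights = lms_predict(throughput)
print(predictions[-1], weights)
```

The step size must be small relative to the input power for the update to converge; with inputs normalized to about 1.0 and an order of 2, `mu = 0.1` is well inside the stable range.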
|
|
Speech Transmission over IP
Speech transmission over the Internet is affected by packet loss and coding-decoding distortion.
The problems of ASR accuracy and subjective quality evaluation in IP networks have also
been addressed at LPTV.
|
|
Usability evaluation of dialogue systems
Usability measures how well users can employ an interface
to achieve specific objectives with effectiveness, efficiency and satisfaction
in a specified context of use. At LPTV, usability evaluation is employed to optimize
the design of dialogue systems from the user's point of view and to assess the feasibility
of a given service supported by ASR or SV. |
|
| |
|
| |
|