A Bridge for the Problem of Deaf Telephony
William Tucker, Meryl Glaser and Jason Penton
Consider making a telephone call to your mother. You pick up the phone, dial
the number and talk to her about the weather, what you had for dinner, etc. Now
imagine that you are one of the 4 million people in South Africa who are hearing
impaired or Deaf. You are Deaf, cannot speak, but use sign language. Your mother
is not Deaf. She can sign, too, but she lives on the other side of town, and you
need to talk to her - now. You cannot simply pick up the phone and call Mom. You
have to get someone who can talk to her on your behalf - someone who can relay
your sign language in speech to Mom over the phone, and translate Mom's reply
from the phone into sign language for you. In order to have a synchronous
exchange with anyone on the Public Switched Telephone Network (PSTN), you must
use a relay. There is an alternative, however - a Deaf Telephone.
In South Africa, Telkom has built a Deaf Telephone called the Teldem. The Teldem is
a text telephone that converts keypad characters to tones using Baudot encoding,
transmits the tones over a normal PSTN connection, and then converts the tones
on the other end back into characters for display on another Teldem. Even though
the Teldem has several major drawbacks, it is a usable device and almost 800
users in South Africa have them installed. They pay a modest R14/month rental in
addition to the normal call charges. Because a Teldem can only exchange
characters with another Teldem, that user community remains small. You can only
Teldem to Mom if she also has one.
Other alternatives come to mind. What about using the Short Message Service (SMS)
on a cellphone? There are about 9 million cellphones in South Africa, roughly
double the number of South African PSTN landlines. You could SMS to Mom, but how
will you know that she has received your SMS? SMS is fundamentally an
asynchronous technology. Mom could reply via SMS instantly to your SMS, but if
her cellphone is engaged, turned off or the battery is flat, you do not receive
any feedback telling you that is the case. What if it is an emergency? SMS is
just not synchronous - so it is not useful when you have to know that Mom is
actually communicating with you.
You could also use the Internet with a Personal Computer (PC), and use e-mail
or Instant Messaging (IM) instead of SMS. E-mail suffers from the same
asynchronous disadvantages as SMS. IM has promise because you can tell if Mom is
online or not. If she is, the volley of text messages is nearly instantaneous,
despite inherent latency as the text travels across the Internet. But what if,
like most South Africans, neither you nor Mom has a PC at home? In addition,
South Africa still has one of the most expensive Internet access environments in
the world. To make matters worse, because you are Deaf, you are likely to have
very basic literacy skills, not to mention limited Computer Literacy.
There is another way. What if we could build a bridge between a Deaf user's
Deaf Telephone and a normal PSTN or a cellular handset? A collaborative effort
between an Audiologist at the University of Cape Town (UCT) and a pair of
Computer Scientists at the University of the Western Cape (UWC) and Rhodes
University has come up with a very interesting solution. Together, these
researchers have designed and developed a prototype of an automated voice/text
relay that uses the Internet, Text to Speech (TTS) and Speech to Text (STT)
technologies and the PSTN (see diagram below). The prototype is called Telgo323
(Teldem Goes H.323), and here is how it works.

You, the Teldem user, initiate a call to your Mom by dialling into a Teldem
Gateway. The gateway prompts you (via text) for your Mom's phone number and sets
up the call to her over the PSTN via another Internet Protocol (IP)/PSTN gateway
called the Telephone Gateway. Mom's phone rings and she picks up the phone. You
type on the Teldem, and the Teldem Gateway decodes the Teldem's Baudot encoded
character tones into ASCII characters. These characters are buffered until the
Teldem user has finished typing the message, after which the text string is
converted to speech using an open source TTS tool such as Festival.
The speech is then sent to the PSTN user via the Telephone Gateway. As a result
Mom essentially 'hears' your typed text. She replies, and the chain of events is
reversed. The Telephone Gateway transforms Mom's speech into text with an STT
tool, sends the text over the Internet to the Teldem Gateway and finally, that
gateway encodes the text into tones that the Teldem can understand, and you read
what Mom has said.
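The decoding and buffering step in the Teldem Gateway can be sketched roughly as follows. This is an illustration only, not the Telgo323 code. It assumes the tones have already been demodulated into 5-bit codes, that the Teldem uses the ITA2 variant of Baudot (only a small subset of the table is shown, and the Teldem's real character set may differ), and that a carriage return marks the end of a typed message. The names TeldemBuffer and decode_messages are hypothetical.

```python
# Shift codes from ITA2; using CR as "message finished" is an assumption.
LTRS, FIGS, END_OF_MESSAGE = 0x1F, 0x1B, 0x08

# Partial ITA2 table: 5-bit code -> (letters-shift char, figures-shift char).
ITA2_SUBSET = {
    0x01: ("E", "3"),
    0x03: ("A", "-"),
    0x04: (" ", " "),
    0x06: ("I", "8"),
    0x10: ("T", "5"),
    0x18: ("O", "9"),
    0x1C: ("M", "."),
}


class TeldemBuffer:
    """Collects decoded characters until the user finishes a message."""

    def __init__(self):
        self.figures = False  # Baudot is stateful: start in letters shift
        self.chars = []

    def feed(self, code):
        """Consume one 5-bit code; return a finished message or None."""
        if code == LTRS:
            self.figures = False
        elif code == FIGS:
            self.figures = True
        elif code == END_OF_MESSAGE:
            message = "".join(self.chars)
            self.chars = []
            return message
        elif code in ITA2_SUBSET:
            self.chars.append(ITA2_SUBSET[code][1 if self.figures else 0])
        # Codes outside the subset are silently ignored in this sketch.
        return None


def decode_messages(codes):
    """Decode a stream of 5-bit codes into a list of complete messages."""
    buf = TeldemBuffer()
    return [m for m in map(buf.feed, codes) if m is not None]
```

Each string returned by decode_messages is what would be handed to the TTS tool. Holding text back until the message is complete lets the speech be synthesised as a whole phrase rather than letter by letter.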
The Telgo323 idea was sparked by an interesting sequence of events. The
first event involved Meryl Glaser, a Lecturer of Audiology at UCT and William
Tucker, a Lecturer of Computer Science at UWC. Both Glaser and Tucker were
members of a Telkom/Siemens/THRIP
Centre of Excellence based in the Western Cape. Tucker heard Glaser present
a talk on a Deaf community field trial of the Teldem at the South African
Telecommunication Networks and Applications Conference (SATNAC) in 2000. At the
time, Tucker was working with IP telephony and noted the similarities between
Teldem texting and Internet-based chat. They soon began collaborating and at
SATNAC 2001, they delivered a paper
that mapped out a series of deaf telephony bridges, starting with a human relay
and finishing with a fully automated relay. That talk inspired Jason
Penton to build the Telgo323 prototype as part of his Computer Science Master's
research at Rhodes University. His research focus is building H.323
applications. It is interesting to note that Tucker realised that one of
Penton's other prototypes, a system that reads out email over the phone, is
applicable to another disabled community - blind people.
The Telgo323 prototype currently works in only one direction - from you to
Mom. The main reason is that TTS technology actually works pretty well right
now, but STT is another matter. It is tough to train an STT tool over the phone,
and the open source STT tools really do not work very well, even on a PC with a
sophisticated sound card. That is why Telgo323 is designed to "plug and
play" the STT tool so that as the technology improves, we can slot a new
tool into the Telgo323 architecture. The generalised design also allows us to
port the application to various domains. We are currently porting Telgo323 to
the Session Initiation Protocol (SIP) at UWC.
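One way to picture the "plug and play" design is a gateway that depends only on a small STT interface, so that engines can be swapped without touching the rest of the relay. The sketch below is an assumption about how such a design might look, not the actual Telgo323 architecture. SpeechToText, PlaceholderRecogniser, CannedRecogniser and TelephoneGateway are hypothetical names, and no real STT library is called.

```python
from typing import Protocol


class SpeechToText(Protocol):
    """Anything that can turn audio into text can be slotted in."""

    def transcribe(self, audio: bytes) -> str: ...


class PlaceholderRecogniser:
    """Stand-in used while no adequate STT engine is available."""

    def transcribe(self, audio: bytes) -> str:
        return "[speech not recognised]"


class CannedRecogniser:
    """Stand-in that always returns a fixed transcript, e.g. for demos."""

    def __init__(self, transcript: str):
        self.transcript = transcript

    def transcribe(self, audio: bytes) -> str:
        return self.transcript


class TelephoneGateway:
    """Hypothetical gateway that relays the hearing user's reply as text."""

    def __init__(self, stt: SpeechToText):
        self.stt = stt

    def relay_reply(self, audio: bytes) -> str:
        # Baudot text telephones have no lower case, so upper-case the text.
        return self.stt.transcribe(audio).upper()
```

Replacing the engine is then a one-line change, for example gateway.stt = CannedRecogniser("hello"), and the rest of the relay is unaffected.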
The reason for the SIP port, TelgoSIP, is that we want to open up the
accessibility to a range of voice users on the Internet. Because of the inherent
capabilities of H.323 and SIP gateways (entities that bridge between the PSTN
and Voice over IP (VoIP) worlds), Telgo323 and TelgoSIP will not only bridge
between the Teldem and the PSTN, but also between the Teldem and any IP-situated
"softphone". A softphone is a VoIP-enabled chat tool like Dialpad
or OpenPhone. Terminating a VoIP call is still mostly illegal
in South Africa. See the Department of Communications' website
for more details. In the future, though, we expect VoIP to be legalised, and
also to completely overtake and replace the PSTN as we know it. Therefore, the
Telgo323 architecture not only scales to future advances in STT and TTS
technologies, but it also scales to the ongoing convergence in the
telecommunications industry.
This leads to another fundamental advantage of our Deaf Telephony bridge. In
effect, if we can manage to bridge between a Deaf and a hearing user with this
technology, we can also use it to bridge between text and voice users on the
Internet and the PSTN, regardless of how or where they are connected.
Theoretically, we could bridge between a cellphone SMS sender and a voice user
on the Internet or the PSTN. Likewise, we could bridge a Personal Digital Assistant
(PDA) on a wireless Local Area Network (LAN) to an Internet-connected IM user
using voice or text. We could also bridge IM to a voice or text user on a
landline or cellphone (yes, landline phones can already support some degree of
texting). The possibilities are endless, and each of these bridges needs to be
built and then tested in an actual user community.
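The generalisation can be made concrete by treating every endpoint as carrying either text or voice and asking which conversion a bridge must apply in each direction. The sketch below is only a way of modelling the idea. The endpoint labels and the conversion function are hypothetical and do not come from the Telgo323 code.

```python
from enum import Enum


class Medium(Enum):
    TEXT = "text"
    VOICE = "voice"


# Endpoint kinds mentioned in the article, mapped to the medium they carry.
# The labels themselves are illustrative.
ENDPOINT_MEDIUM = {
    "teldem": Medium.TEXT,
    "sms": Medium.TEXT,
    "im": Medium.TEXT,
    "pstn_phone": Medium.VOICE,
    "cellphone_voice": Medium.VOICE,
    "softphone": Medium.VOICE,
}


def conversion(sender: str, receiver: str):
    """Return the conversion needed from sender to receiver, if any."""
    src = ENDPOINT_MEDIUM[sender]
    dst = ENDPOINT_MEDIUM[receiver]
    if src is dst:
        return None  # text-to-text or voice-to-voice: no conversion
    return "TTS" if src is Medium.TEXT else "STT"
```

For a Teldem user calling a landline, conversion("teldem", "pstn_phone") gives "TTS" and the reverse direction gives "STT", which matches the two halves of the relay described earlier.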
The first bridge to trial is the Deaf Telephony bridge, Telgo323/SIP.
However, we still need to establish and trial the manual bridge (human relay as
described above) as a baseline for several reasons. The human relay call centres
are already in place throughout the developed world, subsidised mostly by
government and the relevant telco. As yet, Telkom is resistant to providing this
service, even on a small scale for research purposes. We feel that STT
technology is currently limited to domain-oriented vocabulary, e.g. weather and
cities, and that Telgo323 requires a general purpose open vocabulary system that
is just not feasible at the moment. Aside from the research aspect, there also
needs to be market take-up. The 4 million hearing impaired/Deaf people in South
Africa will not take up an automated system that does not work well (due to poor
generalised STT) no matter how cheap (or free) it is. Therefore, our approach
would be to utilise a human relay call centre in order to 1) establish a market,
2) assess how manual relay is used in order to incorporate requirements into the
automated Telgo323/SIP system and 3) use the human relay as a benchmark
against which to measure the automated bridge.
The trial outcomes will obviously feed into the research and development
cycles of the Deaf Telephony Bridge, and the bridges to follow. In the end, the
goals are to find out if Mom and you are really going to use this system or not,
and to learn how to build usable, scalable and billable bridges. The
collaboration continues . . . .
The work has been, and continues to be, partially sponsored by the Telkom/Siemens/THRIP
Centre of Excellence in ATM and Broadband Networks and their Applications at UCT,
the Telkom/DiData/Lucent/THRIP Centre of Excellence in Distributed Multimedia at
Rhodes and the recently launched Telkom/Cisco/THRIP Centre of Excellence in IP
and Internet Computing at UWC. A website with links to all of Telkom's Centres
of Excellence can be found at www.botany.uwc.ac.za/coe
Article by: Bill Tucker (University of the Western Cape), btucker@uwc.ac.za;
Meryl Glaser (University of Cape Town), mglaser@uctgsh1.uct.ac.za;
and Jason Penton (Rhodes University), j.penton@cs.ru.ac.za.