Efficient Use of Voice as a Channel for Delivering Public Services
Kapil Kant Kamal (1), Manish Kumar (1), Bharat Varyani (1) and Kavita Bhatia (2)
(1) Centre for Development of Advanced Computing, Mumbai, India
(2) Department of Electronics and Information Technology, Delhi, India
Keywords: Automated Call Distributor, Automated Speech Recognition, Computer Telephone Integration, Interactive Voice Response System, Text to Speech.
Abstract: Delivering information and services to citizens is a key task of government. It is the responsibility of the government to keep its citizens informed and to deliver public services to them on a timely basis. This information is required for making critical decisions and forming opinions. For good governance and transparency, it is essential that services and information are delivered on time. Delivering information and services through conventional methods such as paper forms and e-forms is problematic in countries where a large section of the population is illiterate, so more efficient methods need to be employed for information sharing and data capture. With live human interaction and local language support, an Interactive Voice Response System (IVRS) can be an effective method through which data can be captured and information about services can be shared even with the illiterate population. This paper discusses the issues involved in implementing an IVR system and in making voice a channel for delivering services to citizens. It is based on an investigation of the potential of IVRS services, and it also discusses the real-time IVRS requirements for successful implementation of government projects and how IVR systems can increase acceptability, reduce citizens' query time, and make public delivery systems more efficient. We propose a nationwide single number for accessing all government services in the user's local language. The paper also includes a case study of the Department of Agriculture & Cooperation, Ministry of Agriculture, showing how an IVR system has helped farmers. Such an IVRS may be replicated by other government departments wherever necessary, for the convenience of citizens.
1 INTRODUCTION
Countries around the world are making full use of ICT tools to deliver government services electronically and have started offering transactional services. In many developing countries, governments face difficulty in delivering public services in rural areas because of low literacy. Voice is the oldest and most natural means of information exchange between human beings, and recent advances in automated processes and systems have made the voice channel strong enough for reaching out to citizens. Voice has some advantages over the conventional methods of information capture and sharing. An IVR system with supporting systems such as Automated Speech Recognition (ASR) and Text to Speech (TTS) can be employed to enable voice as a new channel for delivering public services. The literacy rate in the rural sections of developing countries is still very low compared to that of the urban population, and given the deep penetration of mobile subscriptions in rural areas, voice-based delivery of services can be very effective. With support for multiple languages, voice can be of great value in countries like India, where over 20 languages are spoken. Most of the world's 3.6 billion mobile subscribers (Anne Bouverot, 2012) from the developing nations use their mobile phones primarily for calls. IVR services can be used in diverse domains, including news and information feeds to citizens, agricultural discussion (such as information about markets, weather and crops, advisory agents on call, and expert systems for fertilizer recommendation), community dialogue (Agarwal, 2009), access to health information (Sherwani et al., 2007), group voice calling for distributing information to citizens across a large geographical region at once, and feedback on school meals (Mishra, 2010; Grover, 2012).
In this paper, we outline the flow of technical approaches for creating a scalable IVRS platform for delivering public services anytime, anywhere. We also introduce the combined use of different voice architectures that can be used to create an effective voice platform. Compared to prior solutions, the IVR platform offers two key novelties. First, it seamlessly connects Internet-based users with phone-based users. Both sets of users can contribute and retrieve audio messages from a repository in the automated IVR system. Departments can connect to the IVRS through the Internet to post audio recordings for automatic broadcast to mobile phones. Second, the IVR system scales across geographically distributed access points, enabling affordable access via local phone calls (Vashistha, 2012).

Kamal K., Kumar M., Varyani B. and Bhatia K.
Efficient Use of Voice as a Channel for Delivering Public Services.
DOI: 10.5220/0005375806260631
In Proceedings of the 17th International Conference on Enterprise Information Systems (ICEIS-2015), pages 626-631
ISBN: 978-989-758-097-0
Copyright © 2015 SCITEPRESS (Science and Technology Publications, Lda.)
2 TECHNOLOGIES USED FOR
AUTOMATED AND
INTELLIGENT IVR SYSTEM
IVR systems may primarily become an assistive device for callers and agents during a conversation. IVR helps make a conversation more meaningful by collecting and conveying information to one or both parties. In that sense, IVR is a thin intermediate layer that can amplify the impact of talk by making it more interactive and by providing context. Some of the technologies used to enhance IVR systems are listed below.
2.1 Text to Speech (TTS) Systems
The goal of TTS is to convert input text into natural-sounding speech in order to transmit information from a machine to a person. For example, a citizen dials an IVRS number to check the status of an application he/she has filed, and the IVRS reads out the status fetched from the concerned department's server by converting the received text into speech using a TTS engine. Simple systems string together words that were recorded in isolation, and the artefacts of such a scheme are often perceptible. The methodology used in TTS is to exploit audio representations of speech for synthesis, together with linguistic analyses of the text to extract correct pronunciations (what is being said, in a given context of region and language) and prosody in context (the "melody" of a sentence; how it is being said). Synthesis systems are commonly evaluated in terms of three characteristics: accuracy of rendering the input text (does the TTS system pronounce, e.g., acronyms, names, URLs and email addresses as a knowledgeable human would?), intelligibility of the resulting voice message (measured as the percentage of a test set that is understood), and perceived naturalness of the resulting speech (does the TTS sound like a recording of a live human?). Text to Speech systems can be used to broadcast citizen services such as weather information and crop details to farmers, status updates, etc., in addition to banking and telecom services (Richard, 2006).
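One part of rendering accuracy is normalising the text before it is synthesised. The sketch below is a hypothetical illustration, not taken from any particular TTS engine: it spells out application numbers digit by digit and expands a few acronyms so that a synthesiser would read a status message the way a person would. The function name, the acronym table and the sample message are assumptions made for this example.

```python
import re

# Hypothetical acronym table; a real deployment would maintain its own.
ACRONYMS = {"IVRS": "I V R S", "KCC": "K C C"}

DIGIT_WORDS = ["zero", "one", "two", "three", "four",
               "five", "six", "seven", "eight", "nine"]


def normalise_for_tts(text):
    """Rewrite text so a speech synthesiser reads IDs and acronyms clearly."""
    # Spell out every run of digits one digit at a time.
    def spell_digits(match):
        return " ".join(DIGIT_WORDS[int(d)] for d in match.group())

    text = re.sub(r"[0-9]+", spell_digits, text)
    # Expand known acronyms into separately spoken letters.
    for short, spoken in ACRONYMS.items():
        text = re.sub(rf"\b{short}\b", spoken, text)
    return text


print(normalise_for_tts("Application 4071 is approved by KCC."))
```

The normalised string would then be passed to whichever TTS engine the deployment uses; that step is left out here because it depends on the engine.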
2.2 Automatic Speech Recognition
Automatic Speech Recognition means understanding voice input and performing the required task, or the ability to match the voice input against a provided or acquired vocabulary. The task is to get a computer to understand spoken language. By "understand" we mean to react appropriately and to convert the input speech into another medium, e.g. text. Speech recognition is therefore sometimes referred to as speech-to-text (STT). An Automatic Speech Recognition system is very important in delivering government services: there are hundreds of services, and it is extremely difficult to access them through a common number without an accurate ASR system.
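Matching recognised speech against a vocabulary can be sketched as follows. This is a minimal illustration that assumes the ASR engine has already produced a text transcript; it uses approximate string matching from Python's standard `difflib` module rather than any acoustic model. The service vocabulary, the service identifiers and the cutoff value are hypothetical.

```python
import difflib

# Hypothetical vocabulary mapping spoken keywords to service identifiers.
SERVICE_VOCABULARY = {
    "weather": "weather_forecast",
    "market price": "commodity_prices",
    "crop advisory": "crop_advisory",
    "application status": "application_status",
}


def match_service(transcript, cutoff=0.6):
    """Return the service whose keyword best matches the transcript, or None."""
    candidates = difflib.get_close_matches(
        transcript.lower().strip(), list(SERVICE_VOCABULARY), n=1, cutoff=cutoff
    )
    return SERVICE_VOCABULARY[candidates[0]] if candidates else None


print(match_service("wether"))
print(match_service("market prices"))
print(match_service("xyz"))
```

A transcript that matches nothing returns `None`, at which point a system could fall back to a key-press menu or a human agent.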
2.3 IP-Telephony
With the introduction of cutting-edge technologies, Internet Protocol (IP) based networks are increasingly being used as an alternative to the traditional circuit-switched telephone network. The different flavours of IP telephony provide, to varying degrees, alternative means of originating, transmitting, and terminating voice and data transmissions that would otherwise be carried by the public switched telephone network (PSTN) (Craig, 2000).
2.3.1 IP based Audio and Video Calling
Audio and video calling can be done over an IP network. Through the use of the Session Initiation Protocol (SIP), point-to-point communications are no longer restricted to voice calls but can be extended to multimedia such as video. IVR systems with live video of the caller make true value interaction with the caller possible. With the introduction of full-duplex video, IVR systems will be able to read emotions and facial expressions. Video calling may in future support remote biometric detection such as iris scans or other biometric means. Recordings of the caller may be stored to monitor certain transactions and can be used to reduce identity fraud (lyle-kenya.com).
2.3.2 Unified Communications in the SIP
Contact Centre
With the introduction of SIP contact centres (automated, menu-driven SIP systems), traditional barriers to automation are breaking down. As calls are queued in the SIP contact centre, the IVR system can provide treatment or automation, wait for a fixed period, or play music. Inbound calls to a SIP contact centre must be queued or terminated against a SIP end point; SIP IVR systems can replace agents directly through applications deployed using B2BUAs (Back-to-Back User Agents).
2.4 Automatic Call Distributor (ACD)
In telephony, an automatic call distributor (ACD), or automated call distribution system, is a device or system that distributes incoming calls to a specific group of agent terminals based on the customer's need, the call type, and the agents' skill sets. It is often part of a computer telephony integration (CTI) system. An ACD is often the first point of contact when calling many larger businesses. An ACD uses digital storage devices to play greetings or announcements, but typically routes a caller without prompting for input. An IVR, by contrast, can play announcements and request input from the caller (lyle-kenya.com).
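Skill-based distribution of this kind can be illustrated with a short sketch. It assumes a simple first-match policy; real ACD products use richer policies such as longest-idle or priority queues. The `Agent` class, its fields and the sample agents are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Agent:
    name: str
    languages: set
    skills: set
    available: bool = True


def route_call(agents, language, skill):
    """Pick the first available agent who speaks the language and has the skill."""
    for agent in agents:
        if agent.available and language in agent.languages and skill in agent.skills:
            agent.available = False  # mark busy so the next call goes elsewhere
            return agent
    return None  # no match: the caller stays in the queue


agents = [
    Agent("A1", {"hindi"}, {"crop"}),
    Agent("A2", {"marathi", "hindi"}, {"weather", "crop"}),
]
first = route_call(agents, "hindi", "crop")
second = route_call(agents, "hindi", "crop")
third = route_call(agents, "hindi", "crop")
print(first.name, second.name, third)
```

Returning `None` when nobody is free is where queue treatment, such as music or a waiting announcement, would take over.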
3 SERVICES THROUGH IVRS
In nations like India, where there are 22 official languages and around 25% of the population is still illiterate (en.wikipedia.org), an IVR system can be a powerful medium for delivering public services to citizens.
An IVRS application can be used to offer Citizen to Government (C2G) and Government to Citizen (G2C) services in the citizen's local language. Various government services are in high demand and receive a lot of enquiries from citizens. Keeping the magnitude of the population in mind, it is not surprising that these services draw a huge volume of enquiries. Handling these enquiries is an overhead for the government, and automating such processes can reduce it.
3.1 Inbound Interactive Voice
Response
In an inbound IVRS service, the citizen calls the interactive voice response system. The IVR system has a predefined menu for users, which contains an introduction to the service as well as an information-gathering menu.
An IVRS can act as an auto receptionist that attends customers' calls. It may guide callers to the desired department or person, or may register or respond to their queries and complaints (ivrsdevelopment.com).
Menu Options:
• Messages need to be kept short and should include some prominent key words.
• The function needs to be announced first, followed by the key required to activate it.
• Customers should be given two or three chances to select an option.
• The system should transfer a caller to an operator if no option is chosen.
• A repeat facility should be provided; the best practice is for the repeat to occur automatically rather than relying on the customer selecting to hear the options again.
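The menu guidelines above can be sketched as a small loop. This is a hypothetical illustration in which key presses are simulated as a list, with `None` standing for a timeout; the menu entries, the limit of three attempts and the function names are assumptions.

```python
# Hypothetical menu: DTMF key -> service name.
MENU = {"1": "weather", "2": "market_prices", "3": "crop_advisory"}
MAX_ATTEMPTS = 3  # the guidelines suggest two or three chances


def build_prompt(menu):
    """Announce each function first, then the key that activates it."""
    return " ".join(
        f"For {name.replace('_', ' ')}, press {key}." for key, name in menu.items()
    )


def run_menu(key_presses, menu=MENU, max_attempts=MAX_ATTEMPTS):
    """Consume simulated key presses and return (outcome, prompts played)."""
    played = []
    for attempt in range(max_attempts):
        played.append(build_prompt(menu))  # the prompt repeats automatically
        key = key_presses[attempt] if attempt < len(key_presses) else None
        if key in menu:
            return menu[key], played
    return "operator", played  # no valid option chosen: hand over to a person


outcome, prompts = run_menu(["9", None, "2"])
print(outcome, len(prompts))
```

Replaying the prompt inside the loop, instead of offering a "press 0 to repeat" option, follows the automatic-repeat practice described in the last guideline.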
For example, IVRS can be used across the healthcare sector, including hospitals, physicians, nursing homes, diagnostic laboratories, pharmacies, medical device manufacturers and other components, for efficient workflow. Broadly, the healthcare sector implements IVRS applications for the following workflows (ivrsdevelopment.com):
• IVRS Auto Attendant
• Patient Information using IVRS
• Patient and Other Records Management
Figure 1: Example Flow of Inbound IVR System.
Figure 1 shows an example flow of the inbound IVR system. The citizen dials a predefined number to access the service that he/she wants to avail. The network signal travels through the PSTN network to the IP network and then reaches the IP calling stack server. This IP calling stack server is connected with the different service providers. By identifying the DTMF (Dual-Tone Multi-Frequency) signal, the calling server provides the appropriate response to the user.
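Identifying a DTMF signal amounts to mapping a pair of tones to a key: each key is sent as one low-group (row) frequency plus one high-group (column) frequency from the standard DTMF grid. The sketch below assumes that the two dominant frequencies have already been measured, for example by a tone detector in the telephony stack; the 20 Hz tolerance and the function name are assumptions made for illustration.

```python
LOW_FREQS = (697, 770, 852, 941)       # row tones in Hz
HIGH_FREQS = (1209, 1336, 1477, 1633)  # column tones in Hz
KEYPAD = ("123A", "456B", "789C", "*0#D")


def decode_dtmf(low_hz, high_hz, tolerance_hz=20):
    """Map a detected (low, high) tone pair to the key it encodes, or None."""
    def nearest(value, table):
        # Index of the closest nominal frequency, if it is within tolerance.
        index = min(range(len(table)), key=lambda i: abs(table[i] - value))
        return index if abs(table[index] - value) <= tolerance_hz else None

    row, col = nearest(low_hz, LOW_FREQS), nearest(high_hz, HIGH_FREQS)
    if row is None or col is None:
        return None
    return KEYPAD[row][col]


print(decode_dtmf(770, 1336))
print(decode_dtmf(941, 1209))
```

The decoded key is what a menu handler would then use to select the service.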
3.2 Outbound Call Notification Voice
Response
An outbound IVR system is used for generating notification and advisory calls to users. It is most effective in rural areas, where illiteracy is the major problem; messages in the local or regional language make the voice IVR channel a strong mode of communication. Using TTS at the web application end makes the service more flexible, with cross-language Text to Speech message generation and transmission.
Figure 2: Example Flow of Outbound IVR System.
Figure 2 shows the flow of an IVR system for generating outbound calls. A user can generate single or multiple calls through the application server, which is connected to the IP network. A network interface card converts the IP signal to a telecom signal and transfers it to the PSTN, and the call rings on the requested number.
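Generating single or multiple outbound calls can be thought of as building one call job per recipient, each carrying the message in that recipient's language. The sketch below covers only that preparation step and assumes the application server hands the jobs to the telephony stack afterwards. The `CallJob` class, the language codes, the placeholder message texts and the phone numbers are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class CallJob:
    phone: str
    language: str
    message: str


# Placeholder advisory texts per language; a TTS engine would voice these later.
ADVISORY = {
    "hi": "<advisory text in Hindi>",
    "en": "<advisory text in English>",
}


def build_call_jobs(recipients, advisory=ADVISORY, default_language="en"):
    """Create one outbound call job per recipient in the recipient's language."""
    jobs = []
    for phone, language in recipients:
        # Fall back to the default language when no translation exists.
        lang = language if language in advisory else default_language
        jobs.append(CallJob(phone, lang, advisory[lang]))
    return jobs


jobs = build_call_jobs([("0000000001", "hi"), ("0000000002", "ta")])
for job in jobs:
    print(job.phone, job.language)
```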
3.3 IVR Systems with Human Agents
IVR systems with human interaction can add more value for citizen satisfaction. The hardships that citizens experience in browsing through IVRS menus are done away with by the introduction of human agents, which improves the Government-citizen relationship. A call centre with local language agents is available and reachable to local citizens. A user can call over the PSTN network or over the IP network to the IP calling stack server, which can convert telephony signalling to IP signalling.
Figure 3: Example flow of Call Centre with human agents.
Figure 3 shows an example flow of a call centre with human agents. The citizen dials a predefined number; the network signal travels through the PSTN network to the IP network and then reaches the IP calling stack server. This IP calling stack server is connected with the different agents directly or through an automated call distribution system. By identifying the DTMF (Dual-Tone Multi-Frequency) signal, the calling server rings an appropriate and available agent.
Examples of some IVR Systems:
A. An IVR system to inform mothers during pregnancy about day-to-day health precautions, the vaccination of their newborn babies, upcoming seasonal diseases and their cure, etc.
B. Campaigns for social causes such as polio vaccination, as well as weather forecasting and disaster management, can be conducted through IVRS.
4 ADVANTAGES &
CHALLENGES
4.1 Advantages
• IVRS can provide government services to the public 24/7.
• IVR frees department staff from repetitive functions (like data entry and monitoring the phones) and enables them to attend to customers instead, since inputs can be recorded automatically.
• IVR can help decrease the amount of paper that a department uses.
• Real-time service tracking and grievance handling in the local language.
4.2 Challenges
• The greatest challenge of IVR systems is that many people simply dislike talking to machines.
• Accuracy of ASR and TTS in countries with multiple languages.
• IVR call quality.
• Good script and menu design.
• Maintaining and improving customer satisfaction.
5 CASE STUDY - mKISAN
mKisan, an initiative of the Department of Agriculture & Cooperation, Ministry of Agriculture, Government of India, is a mobile-based interactive agricultural advisory service. It consists of advisories from experts on crops and livestock (covering topics such as insects, diseases and nutrition), agro bulletins, market information on crop prices, weather forecasts and a farmer helpline. Video-based dissemination of agricultural advisories and best practices will also be tested under this project. A mobile-based feedback mechanism and farmer knowledge-sharing tools have been developed and deployed.
5.1 mKisan IVRS Outbound Calls
These are used for obtaining feedback from farmers on the advisories they receive from experts in response to their queries, and on the quality of the information given to them by KCC (Kisan Call Centre) agents. A farmer can rate the advisory or answer given by a KCC agent on a scale of 1 to 5. This service is available in 12 different Indian languages (mkisan.gov.in).
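Collecting a 1 to 5 rating over the phone can be sketched as validating the key the farmer presses and averaging the valid responses. This is an illustrative example, not the mKisan implementation; the function names and the sample key presses are assumptions, and `None` stands for a call in which no key was pressed.

```python
def parse_rating(dtmf_key):
    """Return the rating as an int if the key is 1-5, otherwise None."""
    if dtmf_key in {"1", "2", "3", "4", "5"}:
        return int(dtmf_key)
    return None


def average_rating(keys):
    """Average the valid ratings, ignoring timeouts and invalid keys."""
    ratings = [r for r in map(parse_rating, keys) if r is not None]
    return sum(ratings) / len(ratings) if ratings else None


# Made-up key presses: three valid ratings, one timeout, one invalid key.
print(average_rating(["5", "4", None, "9", "3"]))
```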
5.2 mKisan IVRS Inbound Calls
Farmers and all other stakeholders can call one number to obtain crop information, weather information, commodity prices or any other information from a predefined menu, and also to give feedback on the services (mkisan.gov.in).
Figure 4: Total Calls by Kisan Call Centre.
Figure 4 shows the remarkable growth in acceptance of government services, depicting the number of inbound and outbound calls from 2009 to 2014.
5.3 Kisan Call Centres
The aim of the KCC is to answer farmers' queries over a telephone call in their own language and dialect. Call centres operate in 14 different locations covering all the States of India. A countrywide common eleven-digit toll-free number, 1800-180-1551, has been allotted for the Kisan Call Centre. This number is accessible through mobile phones and landlines of all telecom networks, including private service providers. Replies to farmers' queries are given in 22 local languages. Kisan Call Centre agents, known as Farm Tele Advisors (FTAs), are graduates or above in Agriculture or allied disciplines with excellent communication skills in the respective local language, and they respond to farmers' queries instantly.
Figure 5: Total Calls IVRS Based Rating System of mKisan.
Figure 5 shows the statistics of the mKisan agent rating system. The figure contains the total calls generated by mKisan and how many of them were picked up, rated by users, or not replied to. Through the mKisan IVRS, access to agricultural services is envisaged to be highly useful in the interior of rural areas, where the penetration of other communication media is very low and access to the Internet is very limited.
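Statistics of the kind reported in Figure 5 could be derived from per-call outcome records. The sketch below is hypothetical and rests on one assumption about the categories: every generated call is either not picked up, picked up and rated, or picked up but not replied to. The outcome labels and the sample list are made up and are not the figures from mKisan.

```python
from collections import Counter


def summarise_calls(outcomes):
    """Count call outcomes and report the share of generated calls picked up."""
    counts = Counter(outcomes)
    total = len(outcomes)
    picked = counts["rated"] + counts["not_replied"]
    return {
        "generated": total,
        "picked": picked,
        "rated": counts["rated"],
        "not_replied": counts["not_replied"],
        "pick_rate": picked / total if total else 0.0,
    }


# Made-up sample of four calls, for illustration only.
summary = summarise_calls(["rated", "not_picked", "not_replied", "rated"])
print(summary)
```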
6 CONCLUSIONS
This paper presents a comprehensive study of how voice can be used as a new channel for delivering citizen services. The use of emerging technologies such as ASR, TTS and ACD makes the voice channel more efficient and effective, and the introduction of automation reduces effort. The case study on mKisan shows that voice can become an effective channel for delivering citizen services. Given the high tele-density and penetration of mobile subscriptions in all parts of the country, IVR systems can be very useful to citizens, especially those in rural areas where Internet accessibility is restricted and the literacy rate is very low. IVRS will greatly benefit sections of society such as senior citizens, the poor, women and the physically challenged, as the telephone provides an easily accessible channel for government services.
With the advancement of these technologies, all government services can be made available through a nationwide single number. Lengthy IVRS menus can be shortened through advanced ASR systems, with services identified directly by recognizing the user's voice. TTS systems can be employed to respond to the user in a human-like voice.
ACKNOWLEDGMENTS
We are thankful to Dr. Zia Saquib (Executive Director, C-DAC, Mumbai) and the Mobile Seva team of C-DAC, Mumbai for their direct as well as indirect contributions to this paper. We also thank the anonymous reviewers for their valuable insights and comments.
REFERENCES
Anne Bouverot, 2012. “A Keynote Address”. GSM Association Mobile World Congress.
Agarwal, S., Kumar, A., Nanavati, A. A., and Rajput, N., 2009. “Content Creation and Dissemination by-and-for Users in Rural Areas”. In International Conference on Information and Communication Technologies and Development.
Sherwani, J., Ali, N., Mirza, S., Fatma, A., Memon, Y., Karim, M., Tongia, R., and Rosenfeld, R., 2007. “Health line: Speech-based access to health information by low-literate users”. In International Conference on Information and Communication Technologies and Development.
Mishra, A., 2010. “Using the mobile to track midday meal scheme”. Economic Times.
Grover, A., Calteaux, K., and Barnard, E., 2012. “A voice service for user feedback on school meals”. ACM DEV '12 Proceedings of the 2nd ACM Symposium on Computing for Development.
Vashistha, A. and Thies, W., 2012. “IVR Junction: Building Scalable and Distributed Voice Forums in the Developing World”. Microsoft Research India.
Richard C. Dorf, 2006. “Circuits, Signals, and Speech and Image Processing”.
Magre, S. B., Janse, P. V., and Deshmukh, R. R., 2014. “A Review on Feature Extraction and Noise Reduction Technique”. Volume 4, Issue 2, February 2014, ISSN: 2277 128X.
Craig McTaggart and Tim Kelly, June 2000. “IP Telephony Workshop”. ITU New Initiatives Programme, Geneva.
http://www.lyle-kenya.com/main/ivr-interactive-response-system/
http://www.ivrsdevelopment.com/ivrs_healthcare.htm
http://mkisan.gov.in/KCC/CallResponse.aspx
http://en.wikipedia.org/wiki/List_of_countries_by_literacy_rate, accessed in January 2015.
EfficientUseofVoiceasaChannelforDeliveringPublicServices
631