Efficient Use of Voice as a Channel for Delivering Public Services

Kapil Kant Kamal, Manish Kumar, Bharat Varyani and Kavita Bhatia

Centre for Development of Advanced Computing, Mumbai, India

Department of Electronics and Information Technology, Delhi, India

Keywords: Automated Call Distributor, Automated Speech Recognition, Computer Telephone Integration, Interactive Voice Response System, Text to Speech.

Abstract: Delivering information and services to citizens is a key task of government. It is the responsibility of the government to keep its citizens informed and to deliver public services to them on a timely basis. Citizens need this information to make critical decisions and to form opinions. For good governance and transparency, it is essential that services and information are delivered on time. Delivering information and services through conventional methods such as paper forms and e-forms is problematic in countries where a large section of the population is illiterate, so more efficient methods need to be employed for information sharing and data capture. With live human interaction and local language support, an Interactive Voice Response System (IVRS) can be an effective method through which data can be captured and information about services can be shared, even with the illiterate population. This paper discusses the issues involved in implementing an IVR system and in making voice a channel for delivering services to citizens. It is based on an investigation into the potential of IVRS services. It also discusses the real-time IVRS requirements for the successful implementation of government projects, and how IVR systems can increase acceptability, reduce citizens' query time and make public delivery systems more efficient. We propose a nationwide single number for accessing all government services in the user's local language. The paper also includes a case study of the Department of Agriculture & Cooperation, Ministry of Agriculture, depicting how an IVR system has helped farmers. Such an IVRS may be replicated by other government departments wherever necessary, for the convenience of their customers.

1 INTRODUCTION

Countries around the world are making full use of ICT tools to deliver government services electronically and have started offering transactional services. In many developing countries, governments face difficulty in delivering public services in rural areas because of low literacy. Voice is the oldest and most natural means of information exchange between human beings, and recent advances in technology and in automated processes and systems have made the voice channel strong enough to reach out to citizens. Voice has some advantages over conventional methods of capturing and sharing information. An IVR system, together with support systems such as Automated Speech Recognition (ASR) and Text to Speech (TTS), can be employed to enable voice as a new channel for delivering public services. The literacy rate in the rural sections of developing countries is still very low compared to that of the urban population, and given the deep penetration of mobile subscriptions in rural areas, voice-based delivery of services can be very effective. With support for multiple languages, voice can be of great value in countries like India, where over 20 languages are spoken. Most of the world's 3.6 billion mobile subscribers (Anne Bouverot, 2012) from the developing nations use their mobile phones primarily for calls. IVR services can be used in diverse domains, including news and information feeds to citizens, discussion of agriculture (such as information about markets and weather, crop advisory agents on call, and expert systems for fertilizer recommendation), community dialogue (Agarwal, 2009), access to health information (Sherwani et al., 2007), group voice calling to distribute information to the citizens of a large geographical region at once, feedback on school meals (Mishra, 2010) (Grover, 2012), etc.

In this paper, we have outlined the flow of technical approaches for creating a scalable IVRS platform for delivering public services anytime, anywhere. We have also introduced the combined use of different voice architectures, which can be used to create an effective voice platform. Compared to prior solutions, the IVR platform offers two key novelties. First, it seamlessly connects Internet-based users with phone-based users. Both sets of users can contribute audio messages to, and retrieve them from, a repository in the automated IVR system. Departments can connect to the IVRS through the Internet to post audio recordings for automatic broadcast to mobile phones. Second, the IVR system scales across geographically distributed access points, enabling affordable access via local phone calls (Vashistha, 2012).

Kamal K., Kumar M., Varyani B. and Bhatia K.
Efficient Use of Voice as a Channel for Delivering Public Services.
DOI: 10.5220/0005375806260631
In Proceedings of the 17th International Conference on Enterprise Information Systems (ICEIS-2015), pages 626-631
ISBN: 978-989-758-097-0
© 2015 SCITEPRESS (Science and Technology Publications, Lda.)

2 TECHNOLOGIES USED FOR AUTOMATED AND INTELLIGENT IVR SYSTEM

IVR systems may become, primarily, an assistive device for callers and agents during a conversation. IVR will help make a conversation more meaningful by collecting information and conveying it to one or both parties. In that sense, IVR will be a thin intermediate layer that can amplify the impact of talk by making it more interactive and by providing context. Some of the technologies used to enhance IVR systems are listed below.

2.1 Text to Speech (TTS) Systems

The goal of TTS is to convert input text into natural-sounding speech in order to transmit information from a machine to a person. For example, a citizen dials an IVRS number to check the status of an application that he/she has filed, and the IVRS reads out the status fetched from the concerned department's server by converting the received text into speech using a TTS engine. Such systems often string together words spoken in isolation, and the artefacts of this scheme are often perceptible. The methodology used in TTS is to exploit audio representations of speech for synthesis, together with linguistic analyses of the text to extract the correct pronunciation (what is being said, in the given context of region and language) and the prosody in context (the “melody” of a sentence; how it is being said). Synthesis systems are commonly evaluated in terms of three characteristics: accuracy of rendering the input text (does the TTS system pronounce, e.g., acronyms, names, URLs and email addresses as a knowledgeable human would?), intelligibility of the resulting voice message (measured as the percentage of a test set that is understood), and perceived naturalness of the resulting speech (does the TTS sound like a recording of a live human?). A Text to Speech system can be used to broadcast citizen services such as weather information and crop details to farmers, status updates, etc., in addition to banking and telecom services (Richard, 2006).
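The status-check example above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the implementation of any deployed system: the function names `spell_out` and `build_status_prompt`, the format of the application identifier and the wording of the prompt are all hypothetical. The sketch shows one common text-preparation step that supports accurate rendering, namely spelling out an identifier character by character so that a TTS engine does not read it as a single large number.

```python
# Hypothetical sketch: prepare the text that an IVRS would hand to a TTS engine.
# Function names, identifier format and prompt wording are assumptions.

DIGIT_WORDS = {
    "0": "zero", "1": "one", "2": "two", "3": "three", "4": "four",
    "5": "five", "6": "six", "7": "seven", "8": "eight", "9": "nine",
}


def spell_out(identifier: str) -> str:
    """Spell an identifier character by character so it is read unambiguously."""
    spoken = []
    for ch in identifier:
        if ch in DIGIT_WORDS:
            spoken.append(DIGIT_WORDS[ch])
        elif ch.isalpha():
            spoken.append(ch.upper())
        # Separators such as "-" or "/" are dropped; they carry no spoken value.
    return " ".join(spoken)


def build_status_prompt(application_id: str, status: str) -> str:
    """Build the sentence that a TTS engine would convert to speech."""
    return (
        f"The status of application number {spell_out(application_id)} "
        f"is: {status}."
    )


if __name__ == "__main__":
    print(build_status_prompt("AB-1042", "under review"))
```

In a real deployment the resulting string would be passed to whichever TTS engine the platform uses, in the caller's language; that step is deliberately left out here.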

2.2 Automatic Speech Recognition

Automatic Speech Recognition (ASR) means understanding voice input and performing the required task, or the ability to match voice input against a provided or acquired vocabulary. The task is to get a computer to understand spoken language. By “understand” we mean to react appropriately and to convert the input speech into another medium, e.g. text. Speech recognition is therefore sometimes referred to as speech-to-text (STT). An ASR system is very important in delivering government services: there are hundreds of services, and it is extremely difficult to provide access to all of them through a common number without an accurate ASR system.
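Matching recognised speech against a provided vocabulary can be illustrated as follows. The sketch assumes that an ASR engine has already produced a text transcript; it only covers the final step of mapping that transcript onto a list of services. The service names, the function name `match_service` and the similarity cutoff are assumptions chosen for illustration, and the fuzzy matching uses Python's standard `difflib` module rather than any ASR-specific technique.

```python
# Hypothetical sketch: map an ASR transcript to a service in a fixed vocabulary.
# The vocabulary, function name and cutoff are illustrative assumptions.
import difflib
from typing import Optional

SERVICE_VOCABULARY = [
    "weather information",
    "crop advisory",
    "market prices",
    "application status",
]


def match_service(transcript: str, cutoff: float = 0.6) -> Optional[str]:
    """Return the closest service name, or None if nothing is similar enough."""
    # Normalise case and whitespace before comparing.
    cleaned = " ".join(transcript.lower().split())
    matches = difflib.get_close_matches(
        cleaned, SERVICE_VOCABULARY, n=1, cutoff=cutoff
    )
    return matches[0] if matches else None


if __name__ == "__main__":
    for heard in ["Market price", "wether information", "hello"]:
        print(heard, "->", match_service(heard))
```

When no service is similar enough, the function returns `None`, which an IVRS could treat as a cue to fall back to a keypad menu or a human agent.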

2.3 IP-Telephony

With the introduction of cutting-edge technologies, Internet Protocol (IP) based networks are increasingly being used as an alternative to the traditional circuit-switched telephone network. The different flavours of IP telephony provide, to varying degrees, alternative means of originating, transmitting and terminating voice and data transmissions that would otherwise be carried by the public switched telephone network (PSTN) (Craig, 2000).

2.3.1 IP based Audio and Video Calling

Audio and video calling can be carried over an IP network. Through the use of the Session Initiation Protocol (SIP), point-to-point communications are no longer restricted to voice calls but can be extended to multimedia such as video. IVR systems with live video of the caller make truly valuable interaction with the caller possible. With the introduction of full-duplex video, IVR systems will gain abilities such as reading emotions and facial expressions. Video calling could be the future of remote biometric detection, such as iris scans or other biometric means. Recordings of the caller may be stored to monitor certain transactions and can be used to reduce identity fraud (lyle-kenya.com).

2.3.2 Unified Communications in the SIP Contact Centre

With the introduction of SIP contact centres (automated, menu-driven SIP systems), traditional barriers to automation are breaking down. While calls are queued in the SIP contact centre, the IVR system can provide treatment or automation, wait for a fixed period, or play music. Inbound calls to a SIP contact centre must be queued or terminated against a SIP end point; SIP IVR systems can be used to replace agents directly through applications deployed as back-to-back user agents (B2BUA).

2.4 Automatic Call Distributor (ACD)

In telephony, an automatic call distributor (ACD), or automated call distribution system, is a device or system that distributes incoming calls to a specific group of agent terminals based on the customer's need, the type of call and the agents' skill sets. It is often part of a computer telephony integration (CTI) system. An ACD is often the first point of contact when calling many larger businesses. An ACD uses digital storage devices to play greetings or announcements, but typically routes a caller without prompting for input. An IVR, by contrast, can play announcements and request input from the caller (lyle-kenya.com).
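The routing behaviour described above can be sketched as a simple skill-based matcher. This is an illustrative assumption about how such routing might look, not a description of any particular ACD product: the `Agent` and `Call` classes, their fields and the first-match policy are all hypothetical.

```python
# Hypothetical sketch of ACD-style routing: pick an available agent whose
# language and skills match the call. Class and field names are assumptions.
from dataclasses import dataclass
from typing import List, Optional, Set


@dataclass
class Agent:
    name: str
    languages: Set[str]
    skills: Set[str]
    available: bool = True


@dataclass
class Call:
    caller: str
    language: str
    topic: str


def route_call(call: Call, agents: List[Agent]) -> Optional[Agent]:
    """Return the first free agent matching language and topic, else None."""
    for agent in agents:
        if (
            agent.available
            and call.language in agent.languages
            and call.topic in agent.skills
        ):
            agent.available = False  # the agent is now busy with this call
            return agent
    return None  # no match: the caller would remain in the queue


if __name__ == "__main__":
    agents = [
        Agent("agent-1", {"hindi"}, {"weather"}),
        Agent("agent-2", {"hindi", "marathi"}, {"crop advisory"}),
    ]
    chosen = route_call(Call("caller-1", "marathi", "crop advisory"), agents)
    print(chosen.name if chosen else "queued")
```

A production ACD would add queue ordering, priorities and agent release once a call ends; the sketch keeps only the matching rule.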

3 SERVICES THROUGH IVRS

In nations like India, where there are 22 official languages and around 25% of the population is still illiterate (en.wikipedia.org), an IVR system can be a powerful medium for delivering public services to citizens.

An IVRS application can be used to offer Citizen to Government (C2G) and Government to Citizen (G2C) services to each citizen in his/her local language. Various government services are in high demand and receive a lot of enquiries from citizens. Given the size of the population, it is not surprising that these services draw a huge volume of enquiries. Handling these enquiries is an overhead for the government, and automating such processes can reduce it.

3.1 Inbound Interactive Voice Response

In an inbound IVRS service, the citizen calls the interactive voice response system. The IVR system presents a predefined menu that introduces the service and also gathers information from the caller.

An IVRS can act as an auto-receptionist that attends to callers. It may guide them to the desired department or person, or it may register or respond to their queries and complaints (ivrsdevelopment.com).

Menu Options:
• Messages need to be kept short and should include some prominent key words.
• Each function needs to be announced first, followed by the key required to activate it.
• Customers should be given two or three chances to select an option.
• The system should transfer the caller to an operator if no option is chosen.
• A repeat facility should be provided; the best practice is for the repeat to occur automatically rather than relying on the customer choosing to hear the options again.
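The menu rules listed above can be sketched as a small loop. This is a minimal illustration, assuming that the telephony platform supplies a function to play a prompt and a function to read one key press; here they are passed in as the hypothetical parameters `play` and `read_key`, and the limit of three attempts is one choice within the "two or three chances" guideline.

```python
# Hypothetical sketch of the menu rules. `play` and `read_key` stand in for
# telephony-platform functions and are assumptions, as is MAX_ATTEMPTS.
from typing import Callable, Dict, Optional

MAX_ATTEMPTS = 3  # within the "two or three chances" guideline
OPERATOR = "operator"


def run_menu(
    options: Dict[str, str],
    play: Callable[[str], None],
    read_key: Callable[[], Optional[str]],
) -> str:
    """Announce the options, repeat automatically, fall back to an operator."""
    # Announce the function first, then the key that activates it.
    prompt = " ".join(
        f"For {name}, press {key}." for key, name in options.items()
    )
    for _ in range(MAX_ATTEMPTS):
        play(prompt)      # the repeat happens without the caller asking for it
        key = read_key()  # None models a timeout with no key pressed
        if key in options:
            return options[key]
    play("Transferring you to an operator.")
    return OPERATOR


if __name__ == "__main__":
    presses = iter([None, "9", "2"])  # timeout, invalid key, valid key
    menu = {"1": "weather information", "2": "crop advisory"}
    print(run_menu(menu, print, lambda: next(presses)))
```

Passing the platform functions in as parameters keeps the menu logic testable without any telephony hardware.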

For example, IVRS can be used across the healthcare sector, including hospitals, physicians, nursing homes, diagnostic laboratories, pharmacies, medical device manufacturers and other components, for efficient workflow. Broadly, the healthcare sector implements IVRS applications for the following workflows (ivrsdevelopment.com):
• IVRS Auto Attendant
• Patient Information using IVRS
• Patient and Other Records Management

Figure 1: Example Flow of Inbound IVR System.

Figure 1 shows an example flow of the inbound IVR system. The citizen dials a predefined number to access the service that he/she wants to avail. The call signal travels through the PSTN to the IP network and then reaches the IP calling stack server. This server is connected to the different service providers. By identifying the DTMF (Dual-Tone Multi-Frequency) signal, the calling server provides the appropriate response to the user.
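To make the DTMF identification step concrete, the sketch below generates the tone pair for a key and decodes it again. The keypad frequencies are the standard DTMF low and high groups; everything else is an assumption made for illustration, including the function names, the 8000 Hz sampling rate, the 50 ms tone length and the choice of the Goertzel algorithm as the detector. The paper does not state how its calling server detects key presses.

```python
# Sketch: encode a key as a DTMF tone pair and decode it with the Goertzel
# algorithm. Function names, sample rate and tone length are assumptions.
import math
from typing import List

ROW_FREQS = [697, 770, 852, 941]      # Hz, standard DTMF low group
COL_FREQS = [1209, 1336, 1477, 1633]  # Hz, standard DTMF high group
KEYPAD = ["123A", "456B", "789C", "*0#D"]
SAMPLE_RATE = 8000  # Hz, assumed telephony sampling rate


def dtmf_tone(key: str, duration: float = 0.05) -> List[float]:
    """Return samples of the two summed sine waves that encode `key`."""
    if len(key) != 1:
        raise ValueError(f"not a DTMF key: {key!r}")
    for r, row in enumerate(KEYPAD):
        if key in row:
            low, high = ROW_FREQS[r], COL_FREQS[row.index(key)]
            break
    else:
        raise ValueError(f"not a DTMF key: {key!r}")
    n = int(SAMPLE_RATE * duration)
    return [
        math.sin(2 * math.pi * low * t / SAMPLE_RATE)
        + math.sin(2 * math.pi * high * t / SAMPLE_RATE)
        for t in range(n)
    ]


def goertzel_power(samples: List[float], freq: float) -> float:
    """Signal power at `freq`, computed with the Goertzel recurrence."""
    coeff = 2 * math.cos(2 * math.pi * freq / SAMPLE_RATE)
    s_prev, s_prev2 = 0.0, 0.0
    for x in samples:
        s = x + coeff * s_prev - s_prev2
        s_prev2, s_prev = s_prev, s
    return s_prev2 ** 2 + s_prev ** 2 - coeff * s_prev * s_prev2


def decode_dtmf(samples: List[float]) -> str:
    """Pick the strongest row and column frequency and look up the key."""
    row = max(range(4), key=lambda i: goertzel_power(samples, ROW_FREQS[i]))
    col = max(range(4), key=lambda i: goertzel_power(samples, COL_FREQS[i]))
    return KEYPAD[row][col]


if __name__ == "__main__":
    print(decode_dtmf(dtmf_tone("5")))
```

The sketch works on a clean synthetic tone; a real detector would also need thresholds to reject speech, noise and silence.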

3.2 Outbound Call Notification Voice Response

An outbound IVR system is used to generate calls that deliver notifications and advisories to users. It is more effective in rural areas, where illiteracy is a major problem: delivering the message in the local regional language makes the IVR voice channel a strong mode of communication. Using TTS at the web application end makes the service more flexible, with cross-language text-to-speech message generation and transmission.

Figure 2: Example Flow of Outbound IVR System.

Figure 2 shows the flow of an IVR system for generating outbound calls. A user can generate single or multiple calls through the application server, which is connected to the IP network. A network interface card converts the IP signal into a telecom signal and transfers it to the PSTN, and the requested number rings.
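The application-server step of this flow, turning one advisory into many outbound calls, can be sketched as follows. The sketch stops before dialling: it only prepares one job per recipient with the message text in that recipient's language, which a TTS engine and the telephony layer would then handle. The `Recipient` and `CallJob` classes, the language codes, the placeholder phone numbers and the fallback to a default language are all assumptions made for illustration.

```python
# Hypothetical sketch: turn one advisory into per-recipient outbound call jobs.
# Class names, language codes and the fallback rule are assumptions.
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class Recipient:
    phone: str
    language: str


@dataclass
class CallJob:
    phone: str
    language: str
    text: str  # text a TTS engine would convert to speech before dialling


def build_call_jobs(
    recipients: List[Recipient],
    messages: Dict[str, str],
    default_language: str = "en",
) -> List[CallJob]:
    """Pair each recipient with the message in their language, if available."""
    if default_language not in messages:
        raise ValueError("messages must include the default language")
    jobs = []
    for recipient in recipients:
        language = (
            recipient.language
            if recipient.language in messages
            else default_language
        )
        jobs.append(CallJob(recipient.phone, language, messages[language]))
    return jobs


if __name__ == "__main__":
    advisory = {"en": "Rain is expected tomorrow.", "hi": "(Hindi text here)"}
    people = [Recipient("0000000001", "hi"), Recipient("0000000002", "ta")]
    for job in build_call_jobs(people, advisory):
        print(job)
```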

3.3 IVR Systems with Human Agents

IVR systems with human interaction can add more value and improve citizen satisfaction. Introducing human agents improves the government-citizen relationship by doing away with the hardships that citizens experience in navigating IVRS menus. A call centre staffed with local-language agents is available and reachable to local citizens. Users can call over the PSTN or over an IP network to reach the IP calling stack server, which can convert telephony signalling to IP signalling.

Figure 3: Example flow of Call Centre with human agents.

Figure 3 shows an example flow of a call centre with human agents. The citizen dials a predefined number; the call signal travels through the PSTN to the IP network and then reaches the IP calling stack server. This server is connected to the different agents, either directly or through an automated call distribution system. By identifying the DTMF (Dual-Tone Multi-Frequency) signal, the calling server rings an appropriate and available agent.

Examples of some IVR systems:

A. An IVR system to inform mothers during pregnancy about day-to-day health precautions, about the vaccination of their newborn babies, about upcoming natural diseases and their cures, etc.

B. Campaigns for social causes such as polio vaccination, as well as weather forecasting and disaster management, can be run through IVRS.

4 ADVANTAGES & CHALLENGES

4.1 Advantages

• IVRS can provide government services to the public 24/7.
• IVR frees department staff from repetitive functions (such as data entry and monitoring the phones) and enables them to address customers instead, as inputs can be recorded automatically.
• IVR can help decrease the amount of paper that a department uses.
• Real-time service tracking and grievance handling in the local language.

4.2 Challenges

• The greatest challenge of IVR systems is that many people simply dislike talking to machines.
• Accuracy of ASR and TTS in countries with multiple languages.
• IVR call quality.
• Good script and menu design.
• Maintaining and improving customer satisfaction.

5 CASE STUDY - mKISAN

mKisan, an initiative of the Department of Agriculture & Cooperation, Ministry of Agriculture, Government of India, is a mobile-based interactive agriculture advisory service. It consists of advisories from experts on crops and livestock (covering topics such as insects, diseases and nutrition), agro bulletins, market information on crop prices, weather forecasts and a farmer helpline. Video-based dissemination of agriculture advisories and best practices will also be tested under this project. A mobile-based feedback mechanism and farmer knowledge-sharing tools have been developed and deployed.

5.1 mKisan IVRS Outbound Calls

Outbound calls are used to obtain feedback from farmers on the advisories they receive from experts in response to their queries, and on the quality of the information given to them by KCC (Kisan Call Centre) agents. A farmer can rate the advisory or answer given by a KCC agent on a scale of 1 to 5. This service is available in 12 different Indian languages (mkisan.gov.in).
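The 1-to-5 rating collected on these feedback calls could be summarised as in the sketch below. It is an assumption-laden illustration rather than the mKisan implementation: the function name `summarise_ratings`, the representation of each call's response as a single key press (or `None` when the farmer did not respond), and the sample inputs are all hypothetical.

```python
# Hypothetical sketch: summarise 1-to-5 ratings collected as key presses.
# The input representation and function name are assumptions; the sample
# values below are invented for illustration, not mKisan data.
from typing import Dict, Iterable, Optional

VALID_RATINGS = {"1", "2", "3", "4", "5"}


def summarise_ratings(
    responses: Iterable[Optional[str]],
) -> Dict[str, Optional[float]]:
    """Count calls, count valid ratings and compute the average rating."""
    responses = list(responses)
    ratings = [int(r) for r in responses if r in VALID_RATINGS]
    average = sum(ratings) / len(ratings) if ratings else None
    return {
        "calls": len(responses),
        "rated": len(ratings),
        "average": average,
    }


if __name__ == "__main__":
    # None models a call with no response; "9" models an invalid key press.
    print(summarise_ratings(["5", "4", None, "9", "3"]))
```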

5.2 mKisan IVRS Inbound Calls

Farmers and all other stakeholders can call a single number to obtain crop information, weather information and commodity prices from a predefined menu, and also to give feedback on the services or to obtain any other information from the predefined menu (mkisan.gov.in).

Figure 4: Total Calls by Kisan Call Centre.

Figure 4 shows the remarkable growth in the acceptance of government services, depicting the number of calls, both inbound and outbound, from 2009 to 2014.

5.3 Kisan Call Centres

The aim of the KCC is to answer farmers' queries over a telephone call in their own language and dialect. Call centres operate from 14 different locations covering all the states of India. A countrywide common eleven-digit toll-free number, 1800-180-1551, has been allotted to the Kisan Call Centre. This number is accessible from mobile phones and landlines of all telecom networks, including private service providers. Replies to farmers' queries are given in 22 local languages. Kisan Call Centre agents, known as Farm Tele Advisors (FTAs), are graduates or above in agriculture or allied disciplines with excellent communication skills in the respective local language; they respond to farmers' queries instantly.

Figure 5: Total Calls IVRS Based Rating System of mKisan.

Figure 5 shows the statistics of the mKisan agent rating system: the total number of calls generated by mKisan, and how many of them were picked up, rated by users, or not replied to. Through the mKisan IVRS, it is envisaged that access to agricultural services can be highly useful in the interior of rural areas, where the penetration of other communication media is very low and access to the Internet is very limited.

6 CONCLUSIONS

This paper presents a comprehensive study of how voice can be used as a new channel for delivering citizen services. Technologies such as ASR, TTS and ACD make the voice channel more efficient and effective, and the introduction of automation reduces effort. The case study on mKisan shows that voice can become an effective channel for delivering citizen services. Given the high tele-density and the penetration of mobile subscriptions in all parts of the country, IVR systems can be very useful to citizens, especially those in rural areas where Internet access is restricted and the literacy rate is very low. IVRS will immensely benefit sections of society such as senior citizens, the poor, women and the physically challenged, as the telephone provides an easily accessible channel for government services.

With advances in technology, all government services can be made available through a nationwide single number. Lengthy IVRS menus can be shortened through advanced ASR systems, with services identified directly by recognising the user's voice. TTS systems can be employed to respond to the user in a human-sounding voice.

ACKNOWLEDGMENTS

We are thankful to Dr. Zia Saquib (Executive Director, C-DAC, Mumbai) and the Mobile Seva team of C-DAC, Mumbai for their direct and indirect contributions to this paper. We also thank the anonymous reviewers for their valuable insights and comments.

REFERENCES

Anne Bouverot, 2012. “A Keynote Address, GSM Association Mobile World Congress”.
Agarwal, S., Kumar, A., Nanavati, A. A., and Rajput, N., 2009. “Content Creation and Dissemination by-and-for Users in Rural Areas”. In International Conference on Information and Communication Technologies and Development.
Sherwani, J., Ali, N., Mirza, S., Fatma, A., Memon, Y., Karim, M., Tongia, R., and Rosenfeld, R., 2007. “HealthLine: Speech-based access to health information by low-literate users”. In International Conference on Information and Communication Technologies and Development.
Mishra, A., 2010. “Using the mobile to track midday meal scheme”. Economic Times.
Grover, A., Calteaux, K., and Barnard, E., 2012. “A voice service for user feedback on school meals”. In ACM DEV '12: Proceedings of the 2nd ACM Symposium on Computing for Development.
Vashistha, A., and Thies, W., 2012. “IVR Junction: Building Scalable and Distributed Voice Forums in the Developing World”. Microsoft Research India.
Richard C. Dorf, 2006. “Circuits, Signals, and Speech and Image Processing”.
S. B. Magre, P. V. Janse, R. R. Deshmukh, 2014. “A Review on Feature Extraction and Noise Reduction Technique”. Volume 4, Issue 2, February 2014, ISSN: 2277 128X.
Craig McTaggart, Tim Kelly, June 2000. “IP Telephony Workshop”. ITU New Initiatives Programme, Geneva.
http://www.lyle-kenya.com/main/ivr-interactive-response-system/
http://www.ivrsdevelopment.com/ivrs_healthcare.htm
http://mkisan.gov.in/KCC/CallResponse.aspx
http://en.wikipedia.org/wiki/List_of_countries_by_literacy_rate, accessed in January 2015.

EfficientUseofVoiceasaChannelforDeliveringPublicServices

631