The Non-Force Interaction Theory for Reflex System Creation with Application to TV Voice Control

Iurii Teslia, Nataliia Popovych, Valerii Pylypenko and Oleksandr Chornyi

Kyiv National University of Construction and Architecture, Povitroflotsky Avenue, Kyiv, Ukraine

International Research and Training Center for Information Technologies and Systems, National Academy of Sciences of Ukraine, Kyiv, Ukraine

Keywords: Non-Force Interaction, Information Theory, Speech Recognition, Reflex Systems, Self-organization, Phoneme-by-Phoneme Recognizer.

Abstract: The paper presents the main aspects and conclusions of the theory of non-force interaction and discusses its application to the creation of artificial intelligence systems. A method for calculating the reaction to non-force actions in the sphere of intellectual activity and a universal model of an intellectual reflex system are proposed. On this basis, a reflex voice system for controlling technical devices is developed. The article describes the system and the results of using it to control a TV, in particular: controlling the TV's functions with voice commands; ignoring commands that are not addressed to the system; learning new commands and the desired reactions to the user's requests; and adjusting the system's behaviour based on the user's speech. The work aims to demonstrate the possibilities of the theory of non-force interaction for studying the mechanisms of the brain and for creating, on this basis, artificial systems whose “intelligence” approaches human intelligence.

1 PROBLEM STATEMENT

Despite the considerable amount of work and financial resources invested in speech recognition research, there are still no effective multipurpose tools for understanding oral speech. Without solving this problem it is difficult to talk about successful automation of all spheres of human activity. The lack of good solutions that are valid across many areas of human activity calls for a fundamentally new approach to this task. The theory of non-force interaction (Teslia, 2005) provides significant help in finding a solution to this problem. It gives a formal basis for describing processes of information exchange in biological systems in general and in the human brain in particular. This opens a prospect for the development of new tools for solving many problems in cybernetics.

1.1 Analysis of the Main Research and Publications

As examined by Anusuya and Katti (2009, p. 181), a number of interesting methods and approaches have been developed in the area of human speech processing systems. Global giants of the software industry, such as Nuance Communications, Google and Apple, have implemented the most effective ones within commercially successful systems.

The most notable achievement in this area is a joint product of Apple and Nuance: Siri, a personal assistant and question-answering system. The application uses natural language processing to answer questions and offer advice. In developing Siri, Apple used the results of 40 years of research conducted by the “Artificial Intelligence Centre” and the work of research groups from the most famous universities in the world. This study is, perhaps, the largest artificial intelligence project to date. Google also offers a very high quality solution for speech recognition and voice control for the Android operating system, the “Voice Actions” system. All of these solutions are designed for mobile platforms.

Two of these companies have also created voice control systems for TV: Nuance with its Dragon TV platform, and Google with Google TV. Both systems are a supplement to the TV, implemented as a separate hardware module with the corresponding software platform.

Teslia I., Popovych N., Pylypenko V. and Chornyi O. The Non-Force Interaction Theory for Reflex System Creation with Application to TV Voice Control. DOI: 10.5220/0004754702880296. In Proceedings of the 6th International Conference on Agents and Artificial Intelligence (ICAART-2014), pages 288-296. ISBN: 978-989-758-015-4. © 2014 SCITEPRESS (Science and Technology Publications, Lda.)

1.2 The Unsolved Part of the Problem

All of these systems have one distinguishing feature in common: they are built using an approach based on large amounts of training data and the use of “cloud” computing centres (Jyothi, Johnson, Chelba and Strope, 2012, p. 41). This requires a permanent Internet connection to transmit the speech to a remote server for processing. In addition, the systems listed above poorly support many languages, e.g. Ukrainian and Russian.

1.3 The Purpose of the Article

It is therefore necessary to develop speaker-independent speech recognition tools that can be easily implemented and adapted to different languages and do not require large computing resources. This article presents a new technology for creating systems able to respond to commands expressed in natural language.

2 THE MAIN MATERIAL OF RESEARCH

As biologists have shown, the development of reflexes to various influences lies at the basis of the functioning of living beings. A reflex (from Lat. reflexus, reflected) is a stereotypical reaction of a living organism to a stimulus, mediated by the nervous system (Purves, Williams, White and Mace, 2004). The assumption of the reflex nature of the higher centres of the brain was first developed by the physiologist I. Sechenov. Before him, physiologists and neurologists did not dare to raise the question of a physiological analysis of mental processes; this was the field of expertise of psychologists. The ideas of I. Sechenov were further developed in the works of I. Pavlov, who discovered methods of objective experimental research of the functions of the cerebral cortex, developed the method of generating conditioned reflexes and worked out a theory of higher nervous activity. A great contribution to the formation of the theory of reflexes was made by Charles S. Sherrington (Nobel Prize in Physiology or Medicine, 1932), who discovered the coordination, mutual inhibition and facilitation of reflexes.

At the level of simple biological objects, reflexes allow the production of the “right” response to the state of the environment. At the human level, reflexes are developed not only as a reaction to physical influence, but also as a reaction to informational impact in the socio-political sphere and to the actions of other people, such as a teacher at school, colleagues, etc. Of course, these reflexes are very complex and ambiguous and cannot be represented by a simple stimulus–response model. They are better represented by the model “set of actions” – “the most favourable reaction from the standpoint of a positive attitude”.

The principles of developing artificial intelligence systems based on these ideas are examined in the theory of non-force interaction (Teslia, 2013b). The practical output of this theory is the creation of reflex intelligent systems, in particular systems for: evaluation of investment proposals in development; natural language access to databases; assessment of the impact of harmful substances in the water resources of a region on the health of the population; and prediction of the outcomes of sport events. These systems are described by Teslia (2010). Their main advantages are the ease of their creation and their effectiveness in solving various intellectual tasks.

2.1 Fundamentals of the Theory of Non-force Interaction

The main idea of the theory of non-force interaction (NFI) is that any interaction in nature first leads to a change in the material object’s internal organization (introformation), which in turn leads to a change in the object’s behavior (motion) (Teslia, 2005). For one-dimensional motion, introformation is represented by a geometric model (Fig. 1) with two domains of displacements (DD).

The difference and the sum of the sizes of the DD are taken as the measures that “generate” motion (Teslia, 2005). Let us define $d$ as the object’s certainty about the displacement in the direction $Z$, and $i$ as the object’s awareness about the displacement in the direction $Z$.

Using relativistic mechanics we obtain the relations between the velocity of drift ($v$), the probability of displacement ($p$), the certainty ($d$) and the awareness ($i$) of material objects (Teslia, 2013b):





21



∙⇒



2



(1)

TheNon-ForceInteractionTheoryforReflexSystemCreationwithApplicationtoTVVoiceControl

289









1



(2)

0.5





1



1



2

(3)









1

(4)

0.5



2

(5)

Figure 1: Geometrical interpretation of introformation.
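The relations between probability, certainty and awareness can be illustrated with a short sketch. It assumes the closed form $d = (p - 0.5)/\sqrt{p(1-p)}$ for the certainty, together with $i^2 = d^2 + 1$ and $p = 0.5 + d/(2i)$; the function names are illustrative and are not taken from the source.

```python
import math

def certainty(p: float) -> float:
    """Certainty d for a probability p in (0, 1); the sign follows p - 0.5."""
    return (p - 0.5) / math.sqrt(p * (1.0 - p))

def awareness(d: float) -> float:
    """Awareness i recovered from certainty via i^2 = d^2 + 1."""
    return math.sqrt(d * d + 1.0)

def probability(d: float) -> float:
    """Probability p recovered from certainty: p = 0.5 + d / (2 i)."""
    return 0.5 + d / (2.0 * awareness(d))

if __name__ == "__main__":
    # Round trip p -> d -> i -> p for a few sample probabilities.
    for p in (0.1, 0.5, 0.8):
        d = certainty(p)
        print(p, round(d, 4), round(awareness(d), 4), round(probability(d), 4))
```

Under these assumptions, a probability of 0.5 corresponds to zero certainty and unit awareness, and the round trip from probability to certainty and back returns the starting value.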

Based on the law of conservation of momentum, the sum of certainties is obtained (Teslia, 2013a):

$$d_{\Sigma} = \sum_{j} d_j \qquad (6)$$

where $d_{\Sigma}$ is the total certainty of the objects in a closed system and $d_j$ is the certainty of material object $j$.



From the formula for the relativistic addition of velocities we obtain the operation of addition of certainties (Teslia, 2005):

$$\Delta d = d_1 \cdot i_2 - d_2 \cdot i_1 \;\Rightarrow\; d_1 = \Delta d \cdot i_2 + d_2 \cdot \Delta i \qquad (7)$$

where $\Delta d$ is the difference of certainties of the objects 1 and 2; $d_1$ is the certainty of object 1; $d_2$ is the certainty of object 2; $i_1$ is the awareness of object 1; $i_2$ is the awareness of object 2; and $\Delta i$ is the awareness that corresponds to $\Delta d$ by (4).
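The addition of certainties can be checked algebraically. The short derivation below is a sketch that assumes the difference of certainties has the form $\Delta d = d_1 i_2 - d_2 i_1$ and that every certainty and awareness pair satisfies $i^2 = d^2 + 1$; the auxiliary quantity $\Delta i = i_1 i_2 - d_1 d_2$ is introduced here for illustration only.

```latex
\begin{align*}
\Delta d &= d_1 i_2 - d_2 i_1, \qquad \Delta i = i_1 i_2 - d_1 d_2, \\
\Delta i^2 - \Delta d^2 &= i_1^2 i_2^2 + d_1^2 d_2^2 - d_1^2 i_2^2 - d_2^2 i_1^2
  = (i_1^2 - d_1^2)(i_2^2 - d_2^2) = 1, \\
\Delta d \, i_2 + d_2 \, \Delta i &= d_1 i_2^2 - d_2 i_1 i_2 + d_2 i_1 i_2 - d_1 d_2^2
  = d_1 (i_2^2 - d_2^2) = d_1 .
\end{align*}
```

Since $i > |d|$ for every object, $\Delta i$ is positive, so it equals $\sqrt{\Delta d^2 + 1}$, and adding the difference back to object 2 recovers the certainty of object 1.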



NFI theory rests on the assumption that the functioning of our brain is based on the same laws. The given interpretation can therefore be used to create artificial intelligent systems whose computing elements interact with each other under the same laws as everything else in nature.

Consider an example. Let the ball $B_0$ move with speed $v_0$ in the direction $Z$ (the probability of displacement is $p_0$). Another ball $B_1$ catches up with it and hits it. As a result, the speed of the ball $B_0$ changes to $v_1$ (the probability of displacement is $p_1$). If instead it is hit by the ball $B_2$, which moves with speed $v_{B_2}$, then the speed of the ball $B_0$ takes the value $v_2$ (the probability of displacement is $p_2$). As the collisions are perfectly elastic, we can calculate the speed $v_{12}$ of the ball $B_0$ after collisions with both balls $B_1$ and $B_2$ and obtain the following:

$$v_{12} = f(v_0, v_1, v_2) \qquad (8)$$

Let us now consider a non-force impact on an intellectual system. Let the probability of some behavior (response) $R$ be $p(R)$. Upon exposure to $A_1$ the probability of the reaction becomes $p(R/A_1)$, and upon exposure to $A_2$ it becomes $p(R/A_2)$. Then the probability of the reaction $R$ upon exposure to both $A_1$ and $A_2$ is

$$p(R/A_1, A_2) = f\big(p(R),\, p(R/A_1),\, p(R/A_2)\big) \qquad (9)$$

And what is the probability of a reaction upon many exposures? Here an algorithm that follows from the law of conservation of momentum can be used. It is on this basis that reflexes in living organisms can be generated.

The only criterion of truth is practice. To begin with, the theoretical model has been confirmed in a series of computer experiments on natural-language texts (Teslia, 2005): the statistical regularities in Russian texts turned out to correspond to the given equations. In addition, several reflex intelligent systems have been developed on the basis of this theory (Teslia, 2013a). It is therefore tempting to apply NFI theory to the creation of a reflex system for TV voice control.

2.2 Introformation Method for Solving the Problem of Speech Understanding

The question arises whether the process of voice interaction can be represented using the models developed in the theory of non-force interaction. Traditionally, speech recognition systems are based on the principle: “spoken language” → “representation of speech as a set of linguistic constructions” → “speech understanding”. Based on the theory of non-force interaction, another model of natural language recognition can be suggested: “spoken language” → “estimation of non-force (informational) impact on the reaction” → “reaction (understanding or behaviour)” = REFLEX.

In a system built upon the NFI theory, the recognition of the informative part of a voice command can be organized differently. Let us assume that any repetitive acoustic phenomenon of speech (and not just speech) can be associated with some symbol of a finite alphabet. To reduce the size of the alphabet, sufficiently similar sounds can be mapped to the same symbol. Thus any voice command can be associated with a phrase consisting of alphabet symbols. This is very similar to phonetic transcription, but with this approach we are not limited to the phonemes of a particular language or to the acoustic phenomena of speech. From the perspective of the voice command system, each symbol of such a phrase, and especially each combination of symbols, is an “impact” on the system, and the corresponding action of the system is a “reaction”.

At the learning stage a voice command is transformed into a phrase consisting of alphabet symbols, and this phrase is associated with the desired reaction of the system. Using the proposed introformation method, it is then possible to evaluate the impact of the symbol combinations taken from the phrase on the selected reaction. In this way a so-called “base of reflexes” can be built, which stores the magnitude of the impact of each sequence of symbols on the selection of a specific reaction of the system.

At the recognition stage, combinations of symbols are used to determine the most likely reaction. The magnitudes of the impact of the symbol combinations are taken from the already trained “base of reflexes”.

This paper proposes to build a voice command understanding system using the above-stated principles.

Let us adapt the introformation method proposed by Teslia (2013b) to the problem of speech understanding:

1. Formation of the base of reflexes (RB), which holds the statistical information on the pairs “external influence (the utterance)” → “the correct response”.

2. Based on the probabilities of reactions stored in the RB, the certainty (Teslia, 2010) of the reflex voice-activated control system (RVCS) with respect to these reactions is calculated using (3). Let us use the following notation: $p(R)$ is the unconditional probability of the reaction $R$; $p(R/A_j)$ is the probability of the reaction $R$ provided that there was an action $A_j$ (some fragment of human speech). Writing $p_0 = p(R)$ and $p_j = p(R/A_j)$:

$$d_j = \begin{cases} 0.5\sqrt{\dfrac{p_j}{1-p_j} + \dfrac{1-p_j}{p_j} - 2}, & p_j \ge 0.5 \\[2ex] -0.5\sqrt{\dfrac{p_j}{1-p_j} + \dfrac{1-p_j}{p_j} - 2}, & p_j < 0.5 \end{cases} \qquad j = 0, \dots, n \qquad (10)$$

where $d_j$, $j = 1, \dots, n$, is the certainty of the reaction $R$ provided that there was an action $A_j$ on the RVCS ($d_0$ is the certainty of the reaction if there was no action on the RVCS).

3. The awareness of the RVCS in relation to these reactions is calculated from the known probabilities using (2):

$$i_j = \frac{1}{2\sqrt{p_j(1-p_j)}}, \qquad j = 0, \dots, n \qquad (11)$$

where $i_j$ is the awareness of the system about the reaction $R$ under the influence $A_j$ ($i_0$ is the system’s awareness regarding the reaction $R$ in the absence of any action on the system).

4. Using Teslia’s (2013b) method, the total increment of the certainty of the system’s reaction over all the impacts on the system is calculated using (6) and (7):

$$\Delta d = (d_1 \cdot i_0 - d_0 \cdot i_1) + \dots + (d_n \cdot i_0 - d_0 \cdot i_n) = \sum_{j=1}^{n} (d_j \cdot i_0 - d_0 \cdot i_j) \qquad (12)$$

where $\Delta d$ is the total increment of certainty of the RVCS’s reaction.

5. The increment of the RVCS’s awareness is calculated using (4):

$$\Delta i = \sqrt{\Delta d^2 + 1} \qquad (13)$$

where $\Delta i$ is the increment of the system’s awareness.

6. The new certainty of the reaction $R$ is calculated using (7):

$$d = \Delta d \cdot i_0 + d_0 \cdot \Delta i \qquad (14)$$

where $d$ is the new certainty of the reaction $R$.

7. The new awareness of the RVCS is calculated using (4):

$$i = \sqrt{d^2 + 1} \qquad (15)$$

where $i$ is the new awareness of the RVCS.

8. The probability of the reaction $R$ is estimated using (5):

$$p(R/A_1, \dots, A_n) = 0.5 + \frac{d}{2i} \qquad (16)$$

where $p(R/A_1, \dots, A_n)$ is the estimate of the probability of the reaction $R$ under the actions $A_j$, $j = 1, \dots, n$.

The idea of the above method is that it yields the expected “response” to an impact in a way that complies with well-known and experimentally verified physical laws. By assumption, the interaction of neurons is based on the same laws and proceeds in accordance with the proposed model of non-force interaction. On this basis it is possible to create artificial introformation processors that work as neurons do, unlike the classic formal neurons well known in cybernetics, which resemble natural neurons no more than a paper boat resembles an ocean vessel. More sophisticated and complex structures that respond to stimulation (actions) can be developed on this basis. All this is embodied in reflex intelligent systems capable of accumulating information about the operational environment and developing adequate responses (reflexes) to everything in this environment.

The result of the work of the authors' research team in 2012 is the reflex voice control system, created using the above method. Let us review its features.

3 REFLEX VOICE-ACTIVATED CONTROL SYSTEM FOR TECHNICAL DEVICES

The reflex voice-activated control system (RVCS) is designed to accept free-language input (Pylypenko, 2007) of control commands and content into a technical device. RVCS increases the efficiency of the technical device and provides a voice interface, which frees the operator's hands.

The system consists of two main modules: an automatic phoneme-by-phoneme recognizer and the Kernel module.

The phoneme-by-phoneme recognizer (Pylypenko, 2009) is an external binary application. It transforms a voice command into a sequence of symbols from a finite alphabet.

RVCS implements the introformation method of reflex generation described above.

The input data for the system is a voice command represented as a sound wave. The output data is a control action on the controlled object, e.g. execution of an identified command according to the parameters specified by voice.

While operating, the system generates the necessary visual and sound messages, which make it possible to trace the process of command identification and the responses to commands and, if necessary, to change the system’s behaviour in real time.

Let us examine the main parts of the system and its operation scheme as applied to TV voice control (Fig. 2).

Figure 2: Structure of RVCS.

The Phoneme-by-Phoneme Recognizer. The phoneme transcribing algorithm developed by Pylypenko (2009) builds a phoneme sequence for a speech signal without relying on a dictionary. The constructed phoneme generative automaton (Fig. 3) can synthesize all possible continuous speech model signals for any phoneme sequence. Phoneme-by-phoneme recognition is then applied to an unknown speech signal.

ICAART2014-InternationalConferenceonAgentsandArtificialIntelligence

292

Figure 3: The phoneme generative automata.

The first step of phoneme recognition is feature extraction. The speech signal is converted into a sequence of parameter vectors using a fixed 25 ms frame and a frame shift of 10 ms. Each frame is then pre-emphasized with the filter:

$$H(z) = 1 - 0.97 z^{-1} \qquad (17)$$

A Hamming window is applied. A fast Fourier transform is used to convert the time-domain frames into frequency-domain spectra. These spectra are averaged into 26 triangular bins arranged at equal mel-frequency intervals. 12-dimensional mel-frequency cepstral coefficients (MFCCs) are obtained by cosine transformation and liftering. The log energy is added as the 13th front-end parameter.
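The first stages of this front end (framing, pre-emphasis and windowing) can be sketched in plain Python. The sketch assumes a 16 kHz sampling rate, which the source does not state, and applies the pre-emphasis filter frame by frame; the function names are illustrative. The FFT, mel filter bank and cosine transform stages are omitted.

```python
import math

def pre_emphasize(samples, coeff=0.97):
    """Apply y[n] = x[n] - coeff * x[n-1]; the first sample is passed through."""
    return [samples[0]] + [samples[n] - coeff * samples[n - 1]
                           for n in range(1, len(samples))]

def hamming(length):
    """Hamming window w[n] = 0.54 - 0.46 cos(2 pi n / (length - 1))."""
    return [0.54 - 0.46 * math.cos(2.0 * math.pi * n / (length - 1))
            for n in range(length)]

def frames(samples, sample_rate=16000, frame_ms=25, shift_ms=10):
    """Split a signal into pre-emphasized, Hamming-windowed frames."""
    size = sample_rate * frame_ms // 1000    # 400 samples at 16 kHz (assumed rate)
    shift = sample_rate * shift_ms // 1000   # 160 samples at 16 kHz (assumed rate)
    window = hamming(size)
    out = []
    for start in range(0, len(samples) - size + 1, shift):
        chunk = pre_emphasize(samples[start:start + size])
        out.append([s * w for s, w in zip(chunk, window)])
    return out
```

Under the assumed sampling rate, one second of audio yields 98 overlapping frames of 400 samples each.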

These 13 front-end parameters are expanded to 39 by appending the first- and second-order differences of the static coefficients. Cepstral mean normalization is applied to compensate for a constant channel.

Input to the feature extraction algorithm: a digitized sound wave.

Output from the feature extraction algorithm: a sequence of feature vectors.
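The expansion from 13 to 39 parameters and the mean normalization can be sketched as follows. The source does not specify the difference formula, so a simple central difference with repeated edge frames is assumed here; the function names are illustrative.

```python
def cepstral_mean_normalize(vectors):
    """Subtract the per-dimension mean over the utterance from every vector."""
    count = len(vectors)
    dims = len(vectors[0])
    means = [sum(v[k] for v in vectors) / count for k in range(dims)]
    return [[v[k] - means[k] for k in range(dims)] for v in vectors]

def differences(vectors):
    """Central differences between neighbouring frames (edge frames repeated)."""
    padded = [vectors[0]] + list(vectors) + [vectors[-1]]
    return [[(nxt - prv) / 2.0 for prv, nxt in zip(padded[t], padded[t + 2])]
            for t in range(len(vectors))]

def expand(static):
    """Append first- and second-order differences to each static vector."""
    delta = differences(static)
    delta2 = differences(delta)
    return [s + d1 + d2 for s, d1, d2 in zip(static, delta, delta2)]
```

Each 13-dimensional static vector passed to `expand` comes back as a 39-dimensional vector: the static part followed by its first- and second-order differences.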

The phoneme-by-phoneme recognizer also uses an acoustic model. This model uses hidden Markov models (HMMs) with 64-mixture Gaussian probability density functions for acoustic modelling. Gaussians with diagonal covariances are used. All units are modelled by 3 left-to-right states with a skip transition. 56 Russian context-free phonemes and a pause unit are used. The Viterbi algorithm is used for pattern matching.

The experimental accuracy of finding a phoneme at the right place for a known utterance is approximately 70%.

Input to the recognizer: the digitized sound wave that carries the voice command.

Output from the recognizer: a set of phonemes that represents the voice command.

Example of an input phrase: “Quickly, turn on channel eighteen for me”.

Example of an output: kwɪklitərnɑntʃænəletinfɔrmi.

The result of acoustic wave recognition is stored in the DB of Phonemes.

The RVCS Kernel contains the program implementation of:

• the introformation method;
• the algorithm for extracting combinations of phonemes;
• the learning algorithm, which accumulates statistics.

The Kernel uses the DB of reflexes to store the magnitudes of the influence of different sets of phonemes on the choice of a reaction.

The Configuration DB includes:

• the commands dictionary;
• the work protocol;
• the setup configuration.

Input to the Kernel:

• from the recognizer: the set of phonemes;
• from the DB of reflexes: the magnitudes of the influence of each combination of phonemes (from the input phrase) on the choice of every possible reaction in which these combinations were used;
• from the Configuration DB: a formal description of the interfacing protocol for interaction with the TV and the format of a command that should be sent to the TV for execution.

Output from the Kernel: an executable command with parameters.

Example of output: Speaker (Teslia); Command (Change channel); Channel name (); Channel number: Tens (10), Units (8).
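The Kernel output in this example can be pictured as a small record. The sketch below is hypothetical: the class name `Command`, its field names and the rule that the channel number is the sum of the “tens” and “units” values are assumptions made for illustration, chosen so that Tens (10) and Units (8) give channel 18.

```python
from dataclasses import dataclass

@dataclass
class Command:
    """A command with parameters, as produced by the Kernel (illustrative)."""
    speaker: str
    action: str
    channel_name: str = ""
    tens: int = 0
    units: int = 0

    def channel_number(self) -> int:
        """Combine the 'tens' and 'units' reflex outputs into one number."""
        return self.tens + self.units

# The example from the text: Tens (10), Units (8).
cmd = Command(speaker="Teslia", action="Change channel", tens=10, units=8)
```

Here `cmd.channel_number()` combines the two parameters into the channel that the adapter would be asked to switch to.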

Once a voice command has been transformed into “a command with parameters”, it can be executed by an adapter module, which performs the control actions. The adapter module contains the program implementation of the algorithm for controlling the technical device.

Input to the adapter: the command with parameters, transformed into a formal representation.

Result: a change of the parameters of the technical device.

Example: turning on TV channel 18.

The RVCS is very simple. It implements a reflex behaviour model and functions in two modes: training and control. The base of reflexes is formed in training mode (with a teacher). In control mode the RVCS produces a reaction to the speaker's appeal. Self-training is also implemented in this mode (for the case when the speaker is not satisfied with the system’s reaction).

Every reflex class is implemented in a separate system component. The classes of reflexes are: announcer, command, channel name, level of number, level of tens, level of units.

Every system component can be represented as a separate “introformation neuron”. Its input data is the complete input set of phonemes and/or the reactions of other “introformation neurons”. Its output data is the reaction that is passed to other “introformation neurons” or to the technical device.

The structure of each component (reflex class) can be represented as a combination of three tables (Fig. 4). The tables store: the combinations of phonemes (P) that were found in all earlier commands; all possible reactions of the system (R); and the relationships between combinations of phonemes and reactions (L).

Figure 4: Structure of a component/reflex.

The DB of reflexes can be represented as a combination of such components (Fig. 5).

Let us consider the tables one by one. Note that the tables are filled during the training phase.

1. Table of phonemes (P). This table stores all combinations of phonemes from 2 to 10 symbols in length that were found in all voice commands during training. It also stores the number of times each combination was encountered in all commands.

Thus, during training, the set of phonemes from the example (kwɪklitərnɑntʃænəletinfɔrmi) is split into combinations of 2 symbols (kw, wɪ, ɪk, kl, li, etc.), 3 symbols (kwɪ, wɪk, ɪkl, etc.), 4 symbols, and so on up to 10 symbols. The absolute number of occurrences of each such n-gram in all commands is then stored in this table (Table 1).

2. Table of reactions (R). This table stores the system’s reactions. It contains: all possible reactions of the RVCS; the number of times each reaction was found in the training sample; and the certainty of each reaction irrespective of which combination of phonemes influenced it. The reaction “I don’t know” ensures the openness of the system (Table 2).

Figure 5: Structure of the base of reflexes.

Table 1: Phonemes sets table fragment.

P. Id   Set of Phonemes   Quantity
1       kw                699
2       wɪ                500
3       kwɪ               412
4       wɪk               388
5       itər              155
6       tərn              123

Table 2: Reactions table fragment.

R. Id   Quantity   Reaction              Certainty
1       0          ** I don’t know **    -63.0178547
2       21         What do you need      -13.4087956
3       24         I’m talking to you    -12.5749908
4       106        Bless you             -6.0310190
5       736        Turn on channel N     -2.1567512
6       722        Turn on               -7.3254428
7       131        Forward               -5.4169182
8       39         Harmful               -9.9272670

Thus, when at the training stage a user indicates that the combinations of phonemes from the example should lead to the “Turn on channel N” reaction, the certainty of this reaction is recalculated and the counter of occurrences of this reaction (in the entire training set) is incremented.

3. Table of relationships (L). This table links table P and table R. It contains:

• Quantity: how many times each reaction was required when a given set of phonemes was present in the input data;
• Certainty of the reaction conditioned by the presence of this set of phonemes (Table 3).

In training mode the above tables accumulate information about which input phrases led to which reactions. In control mode this information is used to produce appropriate reactions to the speaker’s appeals using the introformation method described above.
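The splitting of a phoneme sequence into combinations of 2 to 10 symbols, as described for table P, can be sketched as follows. The sketch assumes that every Unicode character stands for one alphabet symbol, which is a simplification (the recognizer's real alphabet is not given in the source), and the function names are illustrative.

```python
from collections import Counter

def phoneme_ngrams(symbols, min_len=2, max_len=10):
    """Yield every contiguous combination of min_len..max_len symbols."""
    for n in range(min_len, max_len + 1):
        for start in range(len(symbols) - n + 1):
            yield "".join(symbols[start:start + n])

def update_table_p(table_p, symbols):
    """Add the n-gram counts of one command to the phoneme table P."""
    table_p.update(phoneme_ngrams(symbols))
    return table_p

# The beginning of the example command, one character per symbol (assumption).
table_p = update_table_p(Counter(), list("kwɪkli"))
```

After this call `table_p` holds one count for each of “kw”, “wɪ”, “ɪk”, “kl”, “li”, “kwɪ” and so on; feeding further commands through `update_table_p` accumulates the quantities of the kind shown in Table 1.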

Table 3: Connections table fragment.

P. Id     R. Id                 Quantity   Certainty
1 (kw)    2 (what do you n.?)   1          1.16683067
1 (kw)    5 (turn on ch.)       2          0.32652037
1 (kw)    6 (turn on)           2          3.62813851
1 (kw)    1 (don’t know)        0          25.6847898
2 (wɪ)    5 (turn on ch.)       1          0.98438145
2 (wɪ)    1 (don’t know)        0          44.4985958
3 (kwɪ)   5 (turn on ch.)       1          0.98438145
3 (kwɪ)   1 (don’t know)        0          44.4985958
4 (wɪk)   5 (turn on ch.)       1          0.98438145
4 (wɪk)   1 (don’t know)        0          44.4985958
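The three tables of one component can be pictured as plain counters. The sketch below is a hypothetical data structure, not the authors' implementation: it stores only the quantities, and it estimates the conditional probability of a reaction as a relative frequency. How the certainty values shown in Tables 2 and 3 are derived from the counts (for example, for reactions with zero quantity) is not specified in the source and is not reproduced here.

```python
from collections import Counter, defaultdict

class ReflexComponent:
    """Counts for one reflex class: tables P, R and L as plain counters."""

    def __init__(self):
        self.p = Counter()             # combination -> occurrences
        self.r = Counter()             # reaction -> occurrences
        self.l = defaultdict(Counter)  # combination -> reaction -> count

    def train(self, combinations, reaction):
        """Record that a command with these combinations required `reaction`."""
        self.r[reaction] += 1
        for combo in combinations:
            self.p[combo] += 1
            self.l[combo][reaction] += 1

    def conditional(self, reaction, combo):
        """Relative-frequency estimate of p(reaction / combo); 0.0 if unseen."""
        seen = self.p[combo]
        if not seen:
            return 0.0
        return self.l.get(combo, Counter())[reaction] / seen

component = ReflexComponent()
component.train(["kw", "wɪ"], "Turn on channel N")
component.train(["kw"], "Turn on")
```

The resulting conditional probabilities could then be passed to the introformation method to obtain the certainty of each reaction.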

The complete set of phonemes is supplied as input to every system component. The reflex voice-activated control system does not require:

• creating dictionaries;
• performing morphological, syntactic and semantic analysis of the text;
• isolating the words and the commands.

The system reacts to the audio stream and knows how to “extract” its informative part (based on the maximum certainty), in the same way as a human does.

To test these ideas, RVCS was embodied in a TV voice control system, GUT (Teslia, 2013a).

To describe the training procedure, it is necessary to emphasize that the system consists of two separate modules:

• the phoneme-by-phoneme recognizer, which translates a sound wave into a set of phonemes;
• the reflex voice control system, which recognizes a command based on this set.

Both modules are built on the principles of supervised learning, so two separate training data sets were used.

To train the phoneme-by-phoneme recognizer, namely to construct the acoustic model, we used a set of 500 sound files with words along with their phonetic transcriptions. The phonetic transcriptions were labeled manually. This set consisted of the words most frequently used for TV control.

To train the RVCS, namely to construct the DB of reflexes, we used a set of 2000 different commands to the TV. This training data set comprises sets of phonemes along with the corresponding reactions of the system. Each set of phonemes was produced from its command by the already trained phoneme-by-phoneme recognizer.

The training samples for the reflex system also had some specific features:

• spontaneous speech with different levels of sound and noise was used;
• the sentences were pronounced at different paces.

The reliability of the developed system was evaluated on a control sample of 600 commands:

• 200 simple commands;
• 200 short sentences (3 to 5 words) that contained commands;
• 200 long sentences (at least 8 words) that also contained commands.

Experimental results:

• for simple commands: 98% correctly recognized actions;
• for short sentences: 90%;
• for long sentences: 86%;
• false alarms (reactions to natural background voices in the room): 81 occurrences in 1 hour of work.

The experimental results show good performance of the system on all three types of commands.
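The three groups of the control sample can be aggregated into a single figure. The arithmetic below is a worked illustration derived from the reported percentages and group sizes; the overall rate itself is not stated in the source.

```latex
\begin{align*}
\text{correct} &= 0.98 \cdot 200 + 0.90 \cdot 200 + 0.86 \cdot 200 = 196 + 180 + 172 = 548, \\
\text{overall accuracy} &= \frac{548}{600} \approx 91.3\%, \\
\text{false-alarm rate} &= \frac{81}{60\ \text{min}} = 1.35\ \text{per minute}.
\end{align*}
```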

4 CONCLUSIONS AND PROSPECTS FOR FUTURE RESEARCH

The paper shows that, based on the non-force model of interaction, fundamentally new artificial intelligence systems can be created in many areas of human activity. The theory of non-force interaction reveals the root causes and the laws of interaction, including the laws of interaction of the basic elements of the human brain, the neurons. It may also give a different view of known physical laws; thus, a probabilistic interpretation of mechanical motion was proposed by Klapchenko and Teslia (2011).

The reflex voice-activated control system for technical devices presented in this article is a clear example of the practical application of NFI theory to building systems that adequately respond to information given in natural language. The main advantages of the reflex system are:

Independence from dictionaries, as the system responds to sounds, not words. Therefore, it can be controlled by clapping (once, twice, etc.), whistling, slurred speech (e.g. that of deaf people), etc. To demonstrate this, the system has been trained to respond properly to commands in German without any readjustments or changes in software.

Economy of resources. Its database includes only the combinations of phonemes that occur in the input stream and the estimates of their influence on the system's response.

Ease of teaching. The system will be delivered trained in one or more languages. If necessary, the user can train the system further without reconfiguring the phoneme-by-phoneme recognizer. To train the system, the user utters a phrase or produces a sound (it will be represented by some set of “phonemes”) and presses the key combination that corresponds to the desired response to this command. Repeating this step with the command expressed in other words leads to better memorizing. Generally, we do the same while teaching a child.

Openness. In the case of conflicting instructions, the system chooses the one with the stronger non-force impact. In the case of commands with approximately equal impact, the system's reaction is “I don't know” (see the first row in Table 2). The reaction is the same in the case of an entirely new command, as a reaction to this command has not been developed yet.
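The openness rule can be sketched as a simple selection function. This is an illustration under stated assumptions: the source does not say how “approximately equal impact” is measured, so the `margin` threshold and the function name `choose_reaction` are hypothetical.

```python
def choose_reaction(certainties, margin=0.1, unknown="I don't know"):
    """Pick the reaction with the highest certainty, or `unknown` on a near tie.

    `certainties` maps each candidate reaction to its certainty; `margin`
    is a hypothetical threshold for "approximately equal impact".
    """
    if not certainties:
        return unknown  # entirely new command: no reaction developed yet
    ranked = sorted(certainties.items(), key=lambda kv: kv[1], reverse=True)
    if len(ranked) > 1 and ranked[0][1] - ranked[1][1] < margin:
        return unknown  # conflicting commands with nearly equal impact
    return ranked[0][0]
```

For example, a clear winner is returned as is, while two candidates whose certainties differ by less than the margin produce the “I don't know” reaction.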

REFERENCES

Anusuya, M. A. and Katti, S. K. 2009, ‘Speech Recognition by Machine: A Review’, International Journal of Computer Science and Information Security, vol. 6, no. 3, pp. 181-205.

Jyothi, P., Johnson, L., Chelba, C. and Strope, B. 2012, ‘Large-scale discriminative language model reranking for voice-search’, Proceedings of the NAACL-HLT 2012 Workshop: Will We Ever Really Replace the N-gram Model? On the Future of Language Modeling for HLT, Association for Computational Linguistics, Stroudsburg, pp. 41-49.

Klapchenko, V. I. and Teslia, I. 2011, ‘Probabilistic interpretation of mechanical motion’, Cornell University Library, [online] Available at: arxiv.org/pdf/1102.0441 [Accessed: 20 Sep 2013].

Purves, D., Williams, S., White, L. and Mace, A. 2004, Neuroscience, Sunderland, Mass: Sinauer.

Pylypenko, V. 2007, ‘Extra Large Vocabulary Continuous Speech Recognition Algorithm based on Information Retrieval’, Proceedings of the 8th Annual Conference of the International Speech Communication Association, Antwerp, pp. 1461-1464.

Pylypenko, V. 2009, ‘Raspoznavanie klyuchevykh slov v potoke rechi pri pomoshchi foneticheskogo stenografa’ [Recognition of keywords in the flow of speech using phoneme-by-phoneme recognizer], Rechevye tekhnologii, no. 1, pp. 75-79, Russian.

Teslia, I. 2005, Nesilovoe vzaimodeystvie [Non-force interaction], Kondor, Kiev (Ukraine), Russian.

Teslia, I. 2010, Vvedenie v informatiku prirody [Introduction to Informatics of Nature], Maklaut, Kiev (Ukraine), Russian.

Teslia, I. 2013a, Prezentatsiya real'nogo primeneniya TNV [Presentation of the actual use of the Non-force interaction theory], [online] Available at: http://www.youtube.com/watch?v=m_pYXVndpbc [Accessed: 20 Sep 2013].

Teslia, I. 2013b, ‘Theory of non-violent interaction’, International Journal “Information Theories and Applications”, vol. 20, no. 1, pp. 88-99.

ICAART2014-InternationalConferenceonAgentsandArtificialIntelligence

296