The NInA Framework: Using Gesture to Improve Interaction and Collaboration in Geographical Information Systems

Daniel Y. T. Chino¹, Luciana A. S. Romani², Letricia P. S. Avalhais¹, Willian D. Oliveira¹, Renata R. V. Gonçalves³, Caetano Traina Jr.¹ and Agma J. M. Traina¹

¹Institute of Mathematics and Computer Science, University of São Paulo, São Carlos, Brazil
²Embrapa Agriculture Informatics, Campinas, Brazil
³Cepagri-Unicamp, Campinas, Brazil
Keywords:
Natural User Interface (NUI), GIS, Kinect, NOAA-AVHRR images, Image-based Information Systems.
Abstract:
Nowadays, Geographical Information Systems (GIS) have expanded their functionalities, including the interactive display and exploration of spatiotemporal data with several views on larger screens. These systems still maintain a traditional navigation method based on keyboard and mouse, interaction devices that are not well suited for large screens nor for collaborative work. This paper aims at showing the applicability of new devices to fill the usability gap in the scenario of large-screen presentation, interaction and collaboration. New gesture-based devices have been proposed and adopted in games and medical applications, for example. This paper presents the NInA Framework, which allows the integration of a natural user interface (NUI) into GIS, with the advantage of being expandable as new demands are posed to those systems. The validation process of our NInA Kinect-based framework was carried out through user experiments involving specialists and non-specialists in TerrainViewer, a geographical information system, as well as experts and non-experts in the Kinect technology. The results showed that a NUI approach demands a short learning time: with just a couple of interactions and instructions, the user is ready to go. Moreover, the users demonstrated greater satisfaction, which can lead to improved productivity.
1 INTRODUCTION
Advances in several knowledge areas, such as electronics, physics, computer science and mathematics, have led to an increasing volume of data collected, stored and employed for analysis and decision making. One of the areas that has benefited from these advances is sensor technology, which now obtains better images from satellites for Earth observation. Orbital sensors aboard polar and geostationary satellites capture images of the Earth's surface every day, generating a huge volume of data in different spatial and temporal resolutions and in different light spectra.
Data provided by these sensors can be used in different scientific areas, such as Geography, Meteorology and, more recently, Agriculture. The large amount of data and their higher resolution are better viewed on large screens, which allow specialists to spot fine details with less effort. Therefore, the interaction devices should allow easy and natural ways to display and to manipulate the views. This is the main motivation of this paper: to present a framework that allows specialists to easily manipulate large amounts of data on big screens with natural and intuitive interaction devices, as well as to carry out collaborative work and analysis.
This paper also presents a case study applied to a remote sensing image system, which improves the process of decision making in agrometeorological and Geographical Information Systems (GIS). Specialists in remote sensing increasingly depend on computational systems to deal with satellite images. As a consequence, there is an increasing demand for technology (software and hardware) suitable to receive, distribute and manipulate data recorded by orbital sensors. However, the majority of such technology available today is intended for commercial use, such as
Erdas (http://www.erdas.com/), Idrisi (http://www.clarklabs.org/) and ArcGIS (http://www.arcgis.com), which are not open source.
These applications include functionalities to open images, extract values, visualize layers, perform map algebra and project images in 3D, among others.
GIS have expanded their functionalities to include geovisualization, which studies the interactive display and exploration of spatiotemporal data (Stannus et al., 2011). Recently, GIS have become popular especially due to software such as Google Earth. However, these systems still rely on the same traditional navigation method based on keyboard and mouse. New gesture-based devices have recently been proposed and adopted in games and medical areas, for example, making the interaction much more practical and simple.
Basically, the interaction with a software tool is provided by the keyboard and mouse, which are suitable for the majority of interface options. However, layers in 3D views are not just for display: they are used to describe surface aspects and can be observed at oblique angles. Thus, images in 3D views must be handled in a different manner from the traditional approach employed for 2D views.
Our focus is on how to take advantage of new interaction possibilities provided by natural user interfaces, such as gesture and body movement interpretation, to improve the user/computer interaction. New devices, such as the Microsoft Kinect, have been widely used in games to track the players' movements, making the games much more realistic. Additionally, these games allow more than one player to act together. Natural movements associated with collaboration aspects can also strongly contribute to improving the interaction in systems from other areas, such as geographical information systems and satellite imagery systems.
In this context, this paper proposes an innovative gesture-based interaction approach for satellite imagery systems. Our framework uses the Kinect technology to improve the interaction when handling 3D visualizations. The validation process of our Kinect-based framework was carried out through experiments including specialists and non-specialists in geographical information systems, as well as experts and non-experts in the Kinect device. The results showed that, at the very first interactions, beginners not trained on this device found it more difficult to manipulate the interface, but after a while everyone reported this model of interaction to be much more intuitive and enjoyable than using the keyboard and mouse.
This paper is organized as follows. In the next section we present the background needed to follow the ideas of this paper. The proposed framework is detailed in Section 3. The user experiments are reported in Section 4, and we conclude the work in Section 5.
2 BACKGROUND
One of the most troublesome problems in developing a computer system is the design of its user interface. This is an important issue, because technologies with complex interfaces can become useless. Interactive graphical interfaces offer the advantage of allowing users with different computer skills to easily interact with the systems.
Researchers in Human-Computer Interaction have proposed numerous methods and techniques to contribute to the improvement of user interface design processes. The majority of these methods consider users as the center of the design process. In this sense, several works have investigated different aspects related to human beings, such as cognition, perception, the motor system and memory, among others.
All of those aspects must be considered, but one of foremost importance is the time that users take to complete a task, as shorter times imply more intuitive designs. According to Fitts's law, the time to point at a target grows with the distance to the target and shrinks as the target gets larger (Fitts, 1954). Thus, the designer must know the capacity of the human motor system to accurately control movements in order to better place the interface elements on the screen (Fitts, 1954). In addition, the learning curve should also be considered when a new design is proposed for a system, taking into account both beginners and experts.
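For reference, a commonly used (Shannon) formulation of Fitts's law expresses the movement time $MT$ as a function of the distance $D$ to the target and its width $W$, with device-dependent constants $a$ and $b$ fitted empirically:

    $MT = a + b \log_2\left(\frac{D}{W} + 1\right)$

Larger targets and shorter distances therefore reduce the predicted time, which is why the placement and size of interface elements matter for interaction design.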
The time to perform a task and the learning curve become even more important when the user interface is designed to be manipulated through new interaction modalities, such as touch and gesture. When users employ their hands in three-dimensional space to interact with the interface instead of moving a device, we call this type of interaction a "gestural user interface". In gestural interfaces, the learning curve tends to shrink, since users reproduce the same movements performed in the real world and thus probably spend less time learning new interaction metaphors (Rizzo et al., 2005; Stannus et al., 2011). Although gestures are a powerful mode of interaction, cultural
TheNInAFramework-UsingGesturetoImproveInteractionandCollaborationinGeographicalInformationSystems
59
differences and conventions can make it difficult to interact with such interfaces (Norman, 2010), leading to an adaptation process.
Gestural user interfaces are also called natural user interfaces (NUI). They are defined as interfaces in which users are able to interact with computers as they interact with the world (Jain et al., 2011). According to Malizia and Bellucci (2012), the term natural in NUI is mainly used to differentiate them from classical user interfaces, which use artificial devices whose operation has to be learned. However, current gesture interfaces are based on a set of gestures that also must be learned. According to Norman (2010), gestures are a valuable contribution as an interaction technique, but still require more research involving the use of this form of interaction in different systems, so that we can better understand how to use them. Moreover, conventions and standards should be developed, so that the same gestures mean the same things in different systems. In other words, researchers need to assess the feedback, guidance, error correction and other issues related to the use of gestures and the benefits of this form of interaction.
Malizia and Bellucci (2012) affirm that there is still a long way to go for natural interfaces, since natural gestures alone are not enough for computer systems: the interaction is still artificial, because the designer of the system still prescribes the set of gestures to be used.
The Microsoft Kinect is one of the recent devices aimed at supporting gestural interfaces. The Kinect sensor has been available to the community since November 2010. It contains a color sensor, an infrared emitter, an infrared depth sensor and a multi-array microphone. Combining these sensors, Kinect provides streams of color images, depth images and audio. Through the infrared sensor, Kinect is also able to track people's actions, simultaneously recognizing up to six users. Of these, up to two users can be tracked in detail, providing spatial information about the joints of the tracked users over time. The Kinect for Windows API (http://www.microsoft.com/en-us/kinectforwindows) enables the application to access and manipulate the data collected by the sensor (Microsoft, 2013).
Recently, several works have explored natural user interfaces in different contexts. Stannus et al. (2011) developed a gesture-based system for the Google Earth application. Boulos et al. (2011) applied depth sensors such as the Microsoft Kinect and ASUS Xtion to control 3D virtual globes such as Google Earth (including its Street
View mode), Bing Maps 3D and NASA World Wind. Richards-Rissetto et al. (2012) explored Microsoft's Kinect to create a system to virtually navigate, through a prototype 3D GIS, a digitally reconstructed ancient Maya city.
3 OUR PROPOSED FRAMEWORK
In this section we present the NInA framework (Natural Interaction for Agrometeorological Data). This framework provides a complete set of functionalities for natural interaction with GIS. We also present a case study named TerrainViewer, a tool to visualize remote sensing images. After a brief introduction to TerrainViewer, the following sections highlight the NInA gesture module.
3.1 TerrainViewer
TerrainViewer is a tool capable of presenting agrometeorological data through three-dimensional rendered models that include altitude data. The tool combines topographic data with remote sensing images from different sources, such as images from the AVHRR sensors (Advanced Very High Resolution Radiometer) aboard NOAA (National Oceanic and Atmospheric Administration) satellites and from NASA's Shuttle Radar Topography Mission (SRTM). The rendered models assist agrometeorological specialists in finding relationships between remote sensing data and their respective geographic position and elevation. Thus, specialists can analyze huge amounts of data in a short period of time and focus on a region of interest.
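As an illustration of this kind of rendering, the sketch below drapes a co-registered satellite image over an elevation grid to produce a simple textured height mesh. It is only a minimal sketch: the function name, array layout and scale factor are our assumptions for illustration, not TerrainViewer's actual implementation.

    import numpy as np

    def build_terrain_mesh(elevation, image, z_scale=0.0005):
        """Build vertices, per-vertex colors and triangle indices from an
        elevation grid (rows x cols, meters) and a co-registered RGB image
        (rows x cols x 3, values in [0, 1]). z_scale maps meters to scene units."""
        rows, cols = elevation.shape
        # Regular grid in the XY plane; elevation drives the Z coordinate.
        xs, ys = np.meshgrid(np.linspace(0, 1, cols), np.linspace(0, 1, rows))
        vertices = np.stack([xs, ys, elevation * z_scale], axis=-1).reshape(-1, 3)
        colors = image.reshape(-1, 3)  # satellite pixel draped on each vertex
        # Two triangles per grid cell.
        faces = []
        for r in range(rows - 1):
            for c in range(cols - 1):
                i = r * cols + c
                faces.append([i, i + 1, i + cols])
                faces.append([i + 1, i + cols + 1, i + cols])
        return vertices, colors, np.array(faces)

The resulting arrays can be handed to any rendering pipeline (e.g., OpenGL vertex buffers) to obtain the rotatable, translatable and scalable model described next.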
The TerrainViewer tool was originally developed with a simple and robust manipulation scheme, based just on keyboard and mouse. With these devices it is possible to rotate, translate and scale the model, meeting the basic needs of specialists when performing image and data analyses. However, the sophisticated and highly detailed handling of a 3D scene called for a more flexible and natural interface. The answer to those needs was the NInA Framework.
3.2 The NInA Framework
Many studies regarding more natural computer interfaces, as well as the increasing number of systems from different domains using these interfaces, show that there are effective gains in allowing users to interact with tools in the most intuitive way.
ICEIS2013-15thInternationalConferenceonEnterpriseInformationSystems
60
Figure 1: NInA framework architecture.
Based on this premise, the NInA Framework was developed to provide natural interactions for GIS.
The NInA framework is a novel approach that brings the advantages of natural interaction to the several visual analysis requirements of satellite imagery systems. It is structured in a two-layer architecture. The Streaming Processing Core Layer (SPCL) receives data from several devices. The data is processed and sent to the Interaction Recognition Layer (IRL), which is responsible for detecting the user actions. The actions generate events that are sent to the applications, where they are handled to update the visualizations. Figure 1 shows an overview of the NInA framework and its dataflow.
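A minimal sketch of this two-layer dataflow is given below. All class and method names are illustrative assumptions (the paper does not publish NInA's API): raw device frames enter the SPCL modules, their outputs feed the IRL modules, and the IRL emits high-level events that the application (e.g., TerrainViewer) consumes.

    from dataclasses import dataclass
    from typing import Callable, Dict, List

    @dataclass
    class Frame:
        """A single time-stamped sample from an input stream (video, audio, depth)."""
        stream: str
        data: object
        timestamp: float

    class SPCLModule:
        """Streaming Processing Core Layer module: raw streams in, processed stream out."""
        def process(self, raw: Dict[str, Frame]) -> Frame:
            raise NotImplementedError

    class IRLModule:
        """Interaction Recognition Layer module: processed streams in, events out."""
        def detect(self, processed: Dict[str, Frame]) -> List[str]:
            raise NotImplementedError

    class NInAPipeline:
        """Wires SPCL modules to IRL modules and forwards events to the application."""
        def __init__(self, spcl: Dict[str, SPCLModule], irl: List[IRLModule],
                     on_event: Callable[[str], None]):
            self.spcl, self.irl, self.on_event = spcl, irl, on_event

        def push(self, raw: Dict[str, Frame]) -> None:
            # 1) SPCL: turn raw device data into structured streams.
            processed = {name: mod.process(raw) for name, mod in self.spcl.items()}
            # 2) IRL: recognize user actions in the processed streams.
            for module in self.irl:
                for event in module.detect(processed):
                    # 3) Application: e.g., TerrainViewer updates its visualization.
                    self.on_event(event)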
The SPCL receives data streams of video, audio and depth maps, which can be captured by several devices, including video cameras, microphones, motion sensing controllers and multi-touch gadgets. The SPCL was designed with a modular architecture. Each module receives a set of raw input streams and outputs the processed data stream required by the second layer. For instance, the skeleton module receives raw depth data and outputs a stream containing the 3D positions of body joints. A module may also receive the output of another module. A summary of all modules is shown in Table 1.
The first module, Audio, processes the input audio stream, i.e., filtering noise and detecting speech directly in the frequency domain. In this module, a text stream is also extracted for further use in the IRL. The Depth module computes a matrix of depth values from the signals provided by the infrared (IR) sensor. In the Hand module, video and depth data are used to segment the position and contour of the user's hands. The segmentation is used to detect the fingers' positions and to determine the hand state (opened/closed). With the depth values, the Skeleton module computes the 3D positions of the user's body joints.
The current version of NInA focuses on the Kinect device, because its range of sensors provides a fairly complete natural interaction experience. The Kinect streams are received by the SPCL using the Microsoft Kinect for Windows API and drivers, and the SPCL modules process the streams aided by the Kinect API. However, the SPCL modules extend these outputs with a focus on GIS, producing data that the Kinect API does not provide, such as finger positions, which are very useful for natural (gesture) interaction over maps.
Table 1: Requirements and outputs of the SPCL modules.

Module   | Input Data                  | Output Stream
Audio    | Audio Stream                | Audio; Text
Depth    | IR signals                  | Depth Matrix
Hand     | Depth Matrix; Video Stream  | Fingers Position; Hand State (opened/closed)
Skeleton | Depth Matrix                | 3D Position of Body Joints
In the next step, the SPCL output is sent as an input stream to the IRL, where it is processed into an interactive control scheme. Table 2 shows the input and output events of each IRL module. In the current version there are two main modules in the IRL: Gesture and Voice Control. The Gesture module receives the output of the Hand and Skeleton modules of the previous layer and triggers the detected events. All detected events are processed as if the visualization target were a real object in front of the user. The Voice Control module receives only the Audio module output, and the triggered event carries the detected voice as audio and text to be used in the action decision.
In both modules we manipulate the input streams using a time-based sliding window approach. The capture rate is 30 frames per second and the window size is 15 frames. For each window slide, the modules process their input, detect a specific action and trigger the respective events.
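To make the timing concrete, the following sketch implements such a time-based sliding window (30 frames per second, 15-frame window, one evaluation per slide). Only the two rates come from the text; the buffer type and the detector callback are our assumptions.

    from collections import deque

    CAPTURE_FPS = 30   # frames per second of the input streams
    WINDOW_SIZE = 15   # frames per sliding window (half a second)

    class SlidingWindow:
        """Accumulates per-frame samples and runs a detector on every full window."""
        def __init__(self, detector, size=WINDOW_SIZE):
            self.buffer = deque(maxlen=size)
            self.detector = detector

        def push(self, sample):
            self.buffer.append(sample)
            if len(self.buffer) == self.buffer.maxlen:
                # Window is full: let the module inspect it and, possibly,
                # trigger the corresponding event.
                return self.detector(list(self.buffer))
            return None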
Figure 2 presents the flowchart of how the Gesture module triggers events. The IRL receives video, depth and skeleton streams from the Kinect device, combining them to detect the main user by choosing the nearest one.
TheNInAFramework-UsingGesturetoImproveInteractionandCollaborationinGeographicalInformationSystems
61
Figure 2: Processing streams and events in the Gesture module from NInA.
Table 2: Requirements and outputs of the IRL modules.

Module        | SPCL Output     | Event
Gesture       | Hand; Skeleton  | Stretch/Compress Object; Closed Hand Movement; Opened Hand Movement; Hand Up
Voice Control | Audio           | Voice Command
With the main user chosen, the skeleton positions, the hand status and the fingers' positions are computed.
The Skeleton and Hand streams are sent as input to the Window Handler in the Gesture module of the IRL. The Window Handler receives the streams until the sliding window is full. Thereafter, the window is processed, generating the IRL output events; e.g., if the user moves both opened hands in front of his/her body, the Gesture module triggers a Stretch/Compress Object event.
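As a concrete, simplified example of a detector that could be plugged into the sliding window sketched above, the code below flags a Stretch/Compress Object event when both hands stay open and the distance between them grows or shrinks consistently over the window. The sample layout and the 0.15 m threshold are assumptions made for illustration, not NInA's actual parameters.

    import math

    def hand_distance(sample):
        """Euclidean distance between the 3D positions of the two hands in one frame."""
        return math.dist(sample["left_pos"], sample["right_pos"])

    def stretch_compress_detector(window, min_change=0.15):
        """Return an IRL event name if both hands are open and moved apart/together
        by more than min_change (meters) between the first and last frame."""
        if not all(s["left_state"] == "opened" and s["right_state"] == "opened"
                   for s in window):
            return None
        change = hand_distance(window[-1]) - hand_distance(window[0])
        if change > min_change:
            return "StretchObject"    # hands moving apart
        if change < -min_change:
            return "CompressObject"   # hands moving together
        return None

A Gesture module instance would then simply be SlidingWindow(stretch_compress_detector), with analogous detectors for the closed/opened hand movements and the Hand Up gesture.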
3.3 TerrainViewer and NInA
Since the release of the Kinect, TerrainViewer users have been enthusiastic about the new possibilities that this technology could provide, based on the results recently presented in the literature about NUI. The NInA framework has been developed with an incremental approach, based on the requests of specialists regarding the TerrainViewer functionalities. This section explains how NInA was built to meet the proposed requirements through these natural interactions.
The set of actions was defined by the specialists, and their natural gestures were chosen among the events proposed for the NInA Framework. Table 3 presents the association between actions and gestures that results in the NUI version of TerrainViewer.
Table 3: Gestures and the respective actions in the NInA Framework + TerrainViewer.

Action        | Gesture
Zoom In/Out   | Stretch/Compress Object
Rotation      | Closed Hand Movement
Translation   | Opened Hand Movement
Initial State | Hand Up
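On the application side, a thin dispatch routine is enough to turn the NInA events into the TerrainViewer actions listed in Table 3. The sketch below mirrors that mapping; the viewer method names and the zoom factor are hypothetical.

    def handle_nina_event(viewer, event, payload=None):
        """Map NInA gesture events to TerrainViewer actions (Table 3)."""
        if event == "StretchObject":
            viewer.zoom(1.1)                  # zoom in
        elif event == "CompressObject":
            viewer.zoom(1 / 1.1)              # zoom out
        elif event == "ClosedHandMovement":
            viewer.rotate(payload["delta"])   # hand displacement drives rotation
        elif event == "OpenedHandMovement":
            viewer.translate(payload["delta"])
        elif event == "HandUp":
            viewer.reset_view()               # back to the initial state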
ICEIS2013-15thInternationalConferenceonEnterpriseInformationSystems
62
Figure 3: Scaling a map visualization using the NInA
Framework.
All gestures were implemented considering the rendered model as if it were placed in front of the user. For instance, if the user puts both hands in front of his/her body and moves one hand away from the other, describing the stretching-object gesture, the Stretch Object event is triggered and the model view zooms in as the resulting action. As the model gets larger, the user feels like he/she is stretching the model, as shown in Figure 3. The zoom-out action is analogous, and it is recognized by the Compress Object gesture, when both hands move toward each other. Another gesture recognized by the NInA framework occurs when the user moves one closed hand in front of his/her body. With this gesture, the user feels like he/she is grabbing one part of the rendered model and changing its angle, as shown in Figure 4. This results in the rotation action on the model being displayed. The same gesture executed with an opened hand results in the translation action, where the position of the model changes depending on the hand position (Figure 5). Finally, although it is not quite a natural interaction, by raising one or both hands above his/her head (Hand Up gesture) for at least two seconds the user resets the visualization to its initial state.
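Unlike the other gestures, the reset depends on a dwell time rather than on motion, so a small frame counter suffices. The sketch below assumes the 30 fps rate mentioned earlier and the two-second hold described above; the joint names are illustrative.

    class HandUpDwell:
        """Triggers the Hand Up (reset) event after a hand stays above the head for 2 s."""
        def __init__(self, fps=30, hold_seconds=2.0):
            self.required = int(fps * hold_seconds)
            self.count = 0

        def update(self, hand_y, head_y):
            # Count consecutive frames with at least one hand above the head.
            self.count = self.count + 1 if hand_y > head_y else 0
            if self.count >= self.required:
                self.count = 0
                return "HandUp"
            return None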
It is important to highlight that the interaction movements provided by NInA are the same ones a user makes when pointing to an image during a discussion or analysis in collaborative work. Moreover, when the user is standing up for a plenary presentation employing large screens, the use of this kind of interaction makes the whole process natural, simple and easy.
Figure 4: Rotation of a map visualization using the NInA
Framework.
Figure 5: Translation of a map visualization using the NInA
Framework.
4 ANALYSIS WITH USERS
In order to evaluate the NInA framework incorporated into the TerrainViewer system, we conducted a user experiment involving six subjects with different profiles. Although the number of users in the test was small, they were a representative sample of potential users. Furthermore, the GIS application involves
TheNInAFramework-UsingGesturetoImproveInteractionandCollaborationinGeographicalInformationSystems
63
a specific group of users, which makes it more difficult to find a large number of testers to be involved in the experiment. The main objectives of this user-based experiment were to assess:
1. facilities and difficulties of using Kinect-based interaction;
2. users’ satisfaction;
3. robustness and performance of the system;
4. usability of the system.
The group of participants was composed of specialists in cartography, forest engineering, agricultural engineering and computer engineering, with different levels of knowledge about GIS (experts and non-experts). According to their levels of ability in manipulating the Kinect device, the users were classified as:
a) beginners: two of them had never used it before;
b) intermediaries: two users had used it once or twice;
c) upper-intermediary: one user had already used it with games for more than 10 hours;
d) expert: one of them had used it with games for more than 100 hours.
The user test was conducted in a usability laboratory, where the Kinect sensor was installed and the system was projected on a large screen, as shown in Figure 6. Although the best way of observing users would be in their own work environment, we were not able to install the Kinect sensor in the users' offices because of the distance required to perform the test (more than one meter from the device).
Figure 6: Devices used in the users’ test.
The experiment was recorded by two cameras, one positioned in front of the users and another at their side, to capture the users' facial expressions as well as their gestures during each interaction (Figure 7). Audio was also recorded to allow assessing expressions, interjections and comments during the experience. In addition, two observers took notes during the test.
Figure 7: User's interaction recorded by cameras, as well as the user's representation and the TerrainViewer system projected on the screen.
The whole test took around four hours and was
organized in four different steps:
1. Interaction through Kinect. First, the users were invited to freely explore the system using NInA for some minutes. After that, one observer asked them to perform three tasks with the 3D satellite image presented on the screen: rotation/translation, zoom in/out and restarting the map to the initial position. No instruction was provided to the users during this step. At the end, one of the observers explained all the commands needed to perform the suggested tasks. Then, all users redid the tasks.
2. Interaction using Mouse. After all users finished the tests using the Kinect with NInA, the observer invited them to manipulate the TerrainViewer system with the mouse. First they could explore the system without previously defined tasks. After a few minutes, the users had to execute the same sequence of tasks performed with the Kinect and NInA.
3. Answering a Survey. After their interaction with the Kinect/NInA and keyboard/mouse, the users reported their experience in a survey about the difficulties/ease encountered during the interaction, satisfaction/frustration, and the adoption of the Kinect/NInA technology.
4. Brainstorming and Discussion. When all users had finished the test, they participated in a brainstorming session with two observers. They talked about their impressions, doubts and the commands to interact with the system.
As expected, during the exploratory phase, the beginners demonstrated more difficulty interacting with the Kinect sensor than the experts did. They moved their
ICEIS2013-15thInternationalConferenceonEnterpriseInformationSystems
64
arms and hands without restraint, making broad and quick movements. Moreover, they also moved their bodies around without facing the sensor. Their hands were pointed at the screen instead of being directed towards the sensor. Although they initially could not fully control the object, their expressions were relaxed and they enjoyed the activity, regardless of whether they were making mistakes or not. They laughed and made funny comments, as when the map went out of focus on the screen and they joked, saying "come on, come on!".
Only one of them became frustrated when unable to zoom out the map, saying, "Why does it not listen to me? And now, what do I do?" (talking to the map, which had disappeared from the screen due to too much zoom). The less experienced users neither positioned their bodies in front of the sensor to be recognized nor looked at the windows showing the drawn skeleton of the body. The focus of their attention was just on the system display.
On the contrary, the experts positioned their hands in front of their bodies, expecting to be recognized by the sensor. They made slow and precise movements. Their faces showed concentration while attempting to associate the system response with each gesture. In general, they spoke little throughout the exploratory phase. The focus of the experts was not just on the system's windows, but also on what the sensor was picking up. Their gaze went from one window to another on the screen, trying to follow every movement as well as the system responses.
During the next phases, in which the users had to perform a sequence of tasks, their attention increased. All of them tried to accomplish the tasks in the best way, but they had difficulties with the zoom task. All of them successfully rotated the object. However, only the most experienced managed to zoom in, and only two were able to zoom out the object. The last task, placing the map back in its initial state, was not satisfactorily performed by any of the subjects. Several users commented on the difficulty with the zoom out, attributing the problem to the distance from the sensor, the delay in the response or the system calibration for closed-hand movements.
After the first two steps (free exploration and sequence of tasks), the developer showed all subjects the movements programmed for each activity. After the explanations, everyone wanted to repeat the script to make sure they would be able to perform all tasks. Many of them agreed that it seemed much easier after the tips, especially the command to return to the initial state by raising one or both hands. Several users commented that this last task (back to the initial state) was the most fun, and they kept raising their arms at various times, laughing.
The next step, performing the same sequence of tasks using the mouse, was carried out satisfactorily by all. Although the activity with the mouse seemed simple, they showed no enthusiasm.
Apart from the cameras that captured the users' gestures and speech, observers took notes and the users answered a questionnaire with the following questions:
What is your academic background?
What is your level of experience with Kinect?
What was the difficulty level with the system employing Kinect/NInA?
Is the use of the system intuitive?
What was the level of satisfaction? Was the experience fun?
Would you adopt this technology for interacting with a GIS?
The answers are summarized in Table 4. According to the answers, also confirmed by the recordings, the users had more difficulty initially because they did not know the new interaction technology (Kinect). Although one of the users is an expert in GIS, as this user was not familiar with gestural interaction, the performance of the activities was impacted, which did not occur when using the mouse because of previous training with it. However, after receiving instructions from the system developer, the user interacted with the system without difficulty. This shows the need to present basic instructions on the screen (similar to those presented in various video games) and/or to train users before the interaction. Nevertheless, it was confirmed that the learning time is very short, which is a plus of this technique.
Despite some initial difficulties, only one user felt that this form of interaction is non-intuitive. On the other hand, all users stated that moving the hands closer together or farther apart to perform the zoom command was the first movement that occurred to them. Some of them commented that using the scroll wheel on the mouse to accomplish the same task is far less intuitive, for example.
The degree of satisfaction was high, as we could see from the expressions of each one during the experiment. They smiled, made jokes, grimaced when they got lost or were unsure about what to do, commented on something funny, and no one wanted to stop using the system. When asked about the possibility of adopting the technology in their work, all subjects were in favour of it.
During the brainstorming session, each user had the opportunity to report their experiences and feelings, as well as to make suggestions.
TheNInAFramework-UsingGesturetoImproveInteractionandCollaborationinGeographicalInformationSystems
65
Table 4: Information about the users and their answers to the questionnaire applied after the interaction through Kinect.

User  | Profession               | Kinect experience  | Difficulty degree | Intuitive | Satisfaction | Adopt technology?
User1 | Cartography              | beginner           | high              | yes       | high         | yes
User2 | Forest engineering       | intermediary       | medium            | yes       | high         | yes
User3 | Agricultural engineering | upper-intermediary | low               | yes       | high         | yes
User4 | Agricultural engineering | beginner           | high              | yes       | high         | yes
User5 | Computer engineering     | intermediary       | medium            | no        | high         | yes
User6 | Computer engineering     | expert             | low               | yes       | high         | yes
They commented on the difficulty of making subtle movements to zoom and asked if this command could be calibrated to be less rigid. Several of the subjects suggested different movements to rotate the object instead of making a fist. They requested a gesture to stop interacting, with the possibility of returning later. They commented that it would be very interesting to use the tool for collaborative work, which is hard to do with the mouse. Furthermore, they suggested that a system with this type of interaction would be appropriate for use in teaching and presentations, since the gestures are natural movements that they usually make.
5 CONCLUSIONS
This paper presented the NInA framework, aimed at providing resources to easily add a natural user interface (NUI) based on the Kinect technology to geographical information systems. The modules of NInA make it a simple but powerful basis for building natural interfaces based on gestures, with the advantage of being expandable as new needs arrive at the system, and easily portable to other devices.
The need for NUI becomes more urgent as the volume of data to be analyzed, as well as the size of the maps and presentations, grows, demanding the use of very large displays to allow comfortable interaction. Therefore, the use of mouse and keyboard was clearly regarded as no longer convenient, since they limit the users' movements. In this scenario, NInA is the natural choice, since the user is not limited by holding a device. Moreover, when the analysis involves collaborative work, the other analysts easily recognize the user's movements and can participate as a group.
The analyses performed with the users in the case study with the TerrainViewer system showed that the learning time is quite short: just a couple of interactions and instructions and the user is ready to go. Beyond that, the users felt greater satisfaction employing the Kinect/NInA framework to interact with the system, which can lead to improved productivity.
ACKNOWLEDGEMENTS
The authors are grateful for the financial support granted by CNPq, CAPES, FAPESP, Microsoft Research, Embrapa Agricultural Informatics and Cepagri/Unicamp, and to Agritempo for the data.
REFERENCES
Boulos, M. N. K., Blanchard, B. J., Walker, C., Montero, J., Tripathy, A., and Gutierrez-Osuna, R. (2011). Web GIS in practice X: A Microsoft Kinect natural user interface for Google Earth navigation. International Journal of Health Geographics, 10(45):1–14.

Fitts, P. M. (1954). The information capacity of the human motor system in controlling the amplitude of movement. Journal of Experimental Psychology, 47:381–391.

Jain, J., Lund, A., and Wixon, D. (2011). The future of natural user interfaces. In Proceedings of the 2011 Annual Conference Extended Abstracts on Human Factors in Computing Systems, CHI EA '11, pages 211–214, New York, NY, USA. ACM.

Malizia, A. and Bellucci, A. (2012). The artificiality of natural user interfaces. Communications of the ACM, 55(3):36–38.

Microsoft (2013). Kinect for Windows SDK. Available at http://msdn.microsoft.com/en-us/library/hh855347.aspx. Accessed January 16, 2013.

Norman, D. A. (2010). Natural user interfaces are not natural. Interactions, 17(3):6–10.

Richards-Rissetto, H., Remondino, F., Agugiaro, G., Robertsson, J., von Schwerin, J., and Girardi, G. (2012). Kinect and 3D GIS in archaeology. In Proceedings of the 18th International Conference on Virtual Systems and Multimedia (VSMM'12), pages 331–337. IEEE Computer Society.

Rizzo, A., Kim, G. J., Yeh, S.-C., Thiebaux, M., Hwang, J., and Buckwalter, J. G. (2005). Development of a benchmarking scenario for testing 3D user interface devices and interaction methods. In Proceedings of the 11th International Conference on Human Computer Interaction, Las Vegas, NV.

Stannus, S., Rolf, D., Lucieer, A., and Chinthammit, W. (2011). Gestural navigation in Google Earth. In Proceedings of the 23rd Australian Computer-Human Interaction Conference, pages 269–272, New York, USA. ACM.
ICEIS2013-15thInternationalConferenceonEnterpriseInformationSystems
66