Knowledge Engineering and Ontologies for Object Manipulation in
Collaborative Virtual Reality
Manolya Kavakli
Visor (Virtual and Interactive Simulation of Reality) Research Group, Virtual Reality Lab,
Department of Computing, Macquarie University, North Ryde, Sydney, Australia
Keywords: Collaborative Engineering, Virtual Reality, Object Manipulation.
Abstract: This paper describes an ontology for a collaborative engineering task. The task is to take apart an
interactive 3D model in 3D space using virtual reality and to manipulate an object. The project examines a
virtual environment in which two engineers can perform a number of tasks for manipulating object parts,
controlling a Wiimote inside an immersive projection system. The interface recognizes the engineers' hand
gestures, passes commands to a VR modelling package via a gesture recognition system, performs the actions
on the 3D model of the object, and renders the result on the immersive projection screen. We use retrospective
protocol analysis for knowledge engineering and ontology building, analysing the cognitive processes involved.
1 INTRODUCTION
“A major problem in the design and application of
intelligent systems is to capture and understand: the
data and information model that describes the
domain; the various levels of knowledge associated
with problem solving; and the patterns of
interaction, information and data flow in the
problem solving space. Domain ontologies facilitate
sharing and re-use of data and knowledge between
distributed collaborating systems.” (Ugwu et al.,
2001). We need ontologies for the following
reasons:
- To have a shared understanding of the topic
- To enable reuse of domain knowledge
- To make domain assumptions explicit
- To analyze domain knowledge
Ontologies have become core components of many
large applications, yet the development of
applications has not kept pace with the growing
interest (Noy and McGuinness, 2001). This paper
describes an ontology for collaborative engineering
platforms using virtual reality and knowledge
acquisition techniques. The paper shows that a
common ontology facilitates interaction and
negotiation between engineers (agents) and other
distributed systems. The paper discusses the findings
from the knowledge acquisition, their implications in
the design and implementation of collaborative
virtual reality systems, and gives recommendations
on developing systems for collaborative design and
object manipulation in the engineering sector.
2 OBJECT MANIPULATION IN
VIRTUAL REALITY
The first efforts on object manipulation can be traced
back to the late 1970s. Parent (1977) proposed a system
which was capable of sculpting 3D-data. The
significant problem solved within the system was
hidden-line elimination by choosing planar
polyhedral representation. Parry (1986) developed a
system using constructive solid geometry (CSG) that
could carry out only a number of simple sculpting
tasks, using traditional devices such as mice and
keyboards as the input medium. Coquillart (1990)
developed a sculpting system using 3D free-form
deformation which was more capable of generating
arbitrarily shaped objects in comparison to Parry’s
system. Mizuno et al. (1999) built a system for
virtual woodblock printing by carving a workpiece
in the virtual world using CSG. Recent
developments in VR led to a number of important
innovations. Pederson (2000) proposed Magic
Touch as a natural user interface that consists of an
office environment containing tagged artifacts and
wearable wireless tag readers placed on the user’s
hands. Bowman and Billinghurst (2002) attempted to
develop a 3D sketchpad for architects. However, the
3D interface with menus did not meet the
expectations of architects, and there was a need for a
greater understanding of users’ perceptions and
abilities in 3D interface development. Salomon
(2005) introduced non-uniform rational B-spline
(NURBS) deformation to the VSculpt system of Lau
et al. (2003) with the integration of CyberGloves. In this
system, the users could generate arbitrarily shaped
objects by manipulating a number of control points,
which required the users to learn parametric control
techniques. Jagnow and Dorsey (2006) applied
haptic displacement maps to process the graphics
data in an efficient manner in a virtual sculpting
system. In this system, models could be described by
a series of partitioned local slabs, each representing
a vector field. However, haptic displacement maps
could not be applied to dynamic scenes that
change frequently. In spite of these innovations,
there are still a number of core questions waiting to
be answered. These are as follows:
How can we develop a robust method for object
manipulation, configuring complex engineering and
design systems by using VR technology?
How can we support the communication between
geographically separated engineers and the
CAD/CAM model of the product?
2.1 Collaborative Engineering in VR
In this project, we have developed a collaborative
engineering platform to investigate the nature of
shared information. The project examines a virtual
environment in which two engineers can perform a
number of tasks for manipulating object parts,
controlling a Wiimote inside an immersive projection
system (Figure 1). The engineers wear
stereoscopic goggles and have the benefit of being
able to work with a stereo image. The interface
recognizes the engineers' hand gestures, passes
commands to a VR modelling package via a gesture
recognition system, performs the actions on the 3D
model of the object, and renders the result on the
immersive projection screen. The Wiimote (Wii Remote) is the
primary controller for Nintendo's Wii console. The
main feature of the Wii Remote is its motion sensing
capability, which allows the user to interact with and
manipulate items on screen via movement and
pointing through the use of accelerometer and
optical sensor technology.
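As an illustration, the sketch below shows how raw Wiimote-style accelerometer samples might be smoothed and mapped to coarse manipulation commands. The sensor read is a hypothetical stub (the real system polls the device through a Wiimote library), and the threshold and command names are assumptions for illustration only, not our deployed mapping.

    import random

    def read_accelerometer():
        # Hypothetical stand-in for a Wiimote accelerometer read (in g units);
        # a real system would poll the device through a Wiimote library.
        return (random.uniform(-1, 1), random.uniform(-1, 1), random.uniform(-1, 1))

    def smooth(samples):
        # Moving average over a window of (x, y, z) samples to suppress jitter.
        n = len(samples)
        return tuple(sum(s[i] for s in samples) / n for i in range(3))

    def classify(sample, threshold=0.6):
        # Map a smoothed acceleration vector to a coarse manipulation command.
        x, y, z = sample
        if max(abs(x), abs(y), abs(z)) < threshold:
            return "POINT"  # steady hand: treat as pointing
        axis = max(enumerate((x, y, z)), key=lambda p: abs(p[1]))[0]
        return ("TRANSLATE_X", "TRANSLATE_Y", "TRANSLATE_Z")[axis]

    window = [read_accelerometer() for _ in range(5)]
    print(classify(smooth(window)))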
The expected outcomes of this study are:
- Novel human computer interaction techniques; and
- Ontologies demonstrating the structure of cognitive actions of engineers in object manipulation.
Figure 1: Collaboration in Co-DeSIGN.
2.2 Requirements Analysis
The objective of this project is to design and build a
collaborative platform (Co-DeSIGN) for
disassembling a mechanical product using Virtual
Reality technology. Each task the mechanical
engineers are to perform using this collaborative
platform corresponds to a module in the system
architecture. The main tasks and modules are
specified as follows:
2.2.1 Explore and Navigate
This module manages the exploration and navigation
in the virtual world. The user is expected to explore
a 3D object and move around it. The user must
control a cursor to perform different actions to
complete the task. The actions are as follows:
Visualisation: The user must see, perceive, and
investigate the product. We must create a point of
view to represent the sight of the engineer.
Navigation: While the user is able to move
around the product, he must be able to zoom in/out,
rotate, and translate his point of view.
Interaction: The user must be able to control the
cursor using the Wiimote. The user must have
control over the depth of the cursor, moving it closer
to or farther from the product. The cursor icon must
change to reflect the action being performed by the
user. A minimal sketch of these navigation actions follows.
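The sketch below is a conceptual illustration of the zoom and rotate actions as operations on a point of view; in the actual module these operations drive the Vizard camera rather than a plain data structure, so the class and method names here are illustrative assumptions.

    import math
    from dataclasses import dataclass, field

    @dataclass
    class PointOfView:
        # Conceptual point of view orbiting the product at the origin.
        position: list = field(default_factory=lambda: [0.0, 1.7, -3.0])
        yaw: float = 0.0        # orbit angle around the product, in degrees
        distance: float = 3.0   # distance to the product

        def zoom(self, amount):
            # Zoom in (amount < 0) or out (amount > 0), never entering the product.
            self.distance = max(0.5, self.distance + amount)
            self._update()

        def rotate(self, degrees):
            # Orbit the viewpoint around the product.
            self.yaw = (self.yaw + degrees) % 360.0
            self._update()

        def _update(self):
            rad = math.radians(self.yaw)
            self.position[0] = self.distance * math.sin(rad)
            self.position[2] = -self.distance * math.cos(rad)

    pov = PointOfView()
    pov.rotate(45)   # orbit 45 degrees around the product
    pov.zoom(-1.0)   # move one unit closer
    print(pov.position)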
2.2.2 Disassemble
This module manages the disassembling process.
We have categorised all of the actions the user needs
to perform in order to disassemble the given
KnowledgeEngineeringandOntologiesforObjectManipulationinCollaborativeVirtualReality
285
mechanical product. In reality, the number of
possibilities and tools a technician can use is
practically infinite. In our application, we need
metaphors to realize a subgroup of these actions and
possibilities. In the future, we may be able to simulate
more actions by adding new modules to this
application. We hope to be able to integrate a variety
of mechanical links between various parts of the
product. The actions relevant to this module are
specified as follows:
Movement: The user must be able to move
various parts of the product.
Selection: The user needs to select various parts.
Integration or Disintegration: The user must be
able to perform specific actions on the selected parts,
such as mounting parts with a screwdriver or taking
them apart.
Collision detection: The application must
manage the collision between various parts.
Logical decision-making: The application must
manage the disassembling scenario with a logic
engine. For example, the user is not allowed to
perform every task at any point in the scenario (he may
need to remove the base first in order to disassemble the
parts above it).
Position handling: Finally, the application needs
to manage the various states of the parts and know their
positions, as well as where they belong as parts
of the product. A minimal sketch of the logic engine idea follows.
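One way to realise such a logic engine is a dependency table over parts: a part can be removed only when every part blocking it has already been removed. The part names below are illustrative, not the exact decomposition of the assembly in Figure 2.

    # Which parts must already be removed before a part becomes free.
    DEPENDENCIES = {
        "moving_base":   set(),               # free from the start
        "bearing_left":  {"moving_base"},
        "bearing_right": {"moving_base"},
        "slide_rail_1":  {"moving_base"},
        "slide_rail_2":  {"moving_base"},
        "base":          {"slide_rail_1", "slide_rail_2"},
    }

    def can_remove(part, removed):
        # A part may be removed only once every part blocking it is gone.
        return DEPENDENCIES[part] <= removed

    removed = set()
    for attempt in ["base", "moving_base", "slide_rail_1"]:
        if can_remove(attempt, removed):
            removed.add(attempt)
            print("removed", attempt)
        else:
            print("blocked:", attempt, "requires", DEPENDENCIES[attempt] - removed)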
2.2.3 Collaborate
This module manages the collaboration of the users
with each other. The engineers must be able to
collaborate to perform the tasks together. This
involves not only communicating with each
other, but also following the partner's task
performance. Some tasks may require both users to
perform a specific action at the same time in order to
complete the task. The number of different tasks the team
must perform together depends on the chosen
scenario. We have limited the number of users of
this collaborative platform to two. Therefore, there
is no need to build a complex network to manage
the collaboration in our case. A simple network
in which one engineer can communicate with the other
without a central server is sufficient; a minimal sketch
of such a peer-to-peer exchange closes this subsection.
We define the actions required for the collaborative
process as follows:
Processing the State of Actions: The application
must keep a record of actions of both users, perform
these actions subsequently on the product, and
inform the users about what the other is doing or has
done in a timely manner.
Speech Processing: The application must allow
both users to speak with each other.
Task Processing: Specific tasks must be
completed only if both users have performed the
right actions at the right time.
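As a sketch of the serverless exchange described above, each engineer's application could send its action records directly to the peer over UDP and apply whatever it receives each frame. The host, port, and message fields are placeholder assumptions, not our deployed configuration.

    import json
    import socket

    PEER = ("192.168.1.42", 9000)   # the other engineer's machine (placeholder)

    def make_socket(listen_port=9000):
        # Each application both listens and sends; no central server is involved.
        sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
        sock.bind(("", listen_port))
        sock.setblocking(False)
        return sock

    def send_action(sock, action, part):
        # Broadcast one action record (e.g. "select", "move") to the peer.
        sock.sendto(json.dumps({"action": action, "part": part}).encode(), PEER)

    def poll_peer(sock):
        # Apply any action record the peer has sent since the last frame.
        try:
            data, _ = sock.recvfrom(4096)
            return json.loads(data)
        except BlockingIOError:
            return None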
3 EXPERIMENTATION
The system setup includes two immersive projection
systems and two Wiimotes. The 3D model is
generated in the Catia 3D modelling software and
transferred to the Vizard virtual environment. Following
this, we conducted pilot studies to test the system.
In this application, the goal to be reached by the
team is not to disassemble a new product, but to
disassemble a well-known product in the fastest and
most efficient way possible. The process that
leads to task completion depends on the product the
team must disassemble. There are instructions to
follow and there is often a unique way to
disintegrate a product. Users must follow a specific
order. The assembly given to the user is composed
of a base with two slide rails and a moving base
borne by two bearings, as shown in Figure 2. We
conducted pilot experiments in which two engineers
tested the system. They used the interface
(Co-DeSIGN) to disassemble a product in VR,
collaborating at a distance.
In the future, we plan to have 20 mechanical
engineers test the system at ENSAM, France, and
at the VR Lab, Australia. All participants will be
videotaped while performing the task in a design
session of 15 minutes in duration. Having completed
the task, the model disassembled by the engineers is
displayed, and participants are shown the video
records of their own engineering session. Then, we
ask them to interpret the reasoning behind their hand
gestures, speech, and motor actions. We also give
them a questionnaire to assess the quality of the
system they have used in comparison to
traditional methods.
Thus, we collect retrospective engineering
protocols. We plan to use matrix analytic methods to
give a probability distribution of paths of
consecutive actions in the cognitive processes; a
minimal sketch of this estimation follows.
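The sketch below illustrates the idea on an invented sequence of action codes: counting first-order transitions between consecutive coded actions yields an empirical transition matrix from which path probabilities can be derived. The coded sequence is fabricated for illustration; real input comes from the segmented protocols.

    from collections import Counter, defaultdict

    # Invented sequence of coded cognitive actions from one protocol.
    coded = ["Dc", "L", "Prn", "Psg", "Dc", "Prp", "Fo", "Dc", "L", "Prn"]

    # Count first-order transitions between consecutive actions.
    counts = defaultdict(Counter)
    for a, b in zip(coded, coded[1:]):
        counts[a][b] += 1

    # Normalise each row into transition probabilities.
    transition = {
        a: {b: n / sum(nexts.values()) for b, n in nexts.items()}
        for a, nexts in counts.items()
    }
    print(transition["Dc"])   # e.g. {'L': 0.67, 'Prp': 0.33}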
To specify usability requirements in system
development, it is important to understand how
humans perceive the world, how they store and
process information, how they solve problems, and
how they physically manipulate objects. We use the
Task Analysis method (the study of the way people
perform tasks with existing systems) to model the
KEOD2012-InternationalConferenceonKnowledgeEngineeringandOntologyDevelopment
286
system. This involves not only a hierarchy of tasks
and subtasks, but also a plan that specifies the
order and conditions in which subtasks are performed.
Knowledge-based task analysis includes building
taxonomies of the objects and actions involved. A
minimal sketch of such a task hierarchy follows.
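As an illustration of what we record, a task hierarchy can be held as nested structures in which each task carries its subtasks and a plan stating the order and conditions in which they are performed. The entries below mirror the modules of Section 2.2 but are a simplified assumption, not the full analysis.

    # A task with its subtasks and the plan governing their order.
    TASK_HIERARCHY = {
        "disassemble product": {
            "plan": "do 1; repeat 2-3 until all parts are removed; 4 when required",
            "subtasks": {
                "1 explore and navigate": ["visualise", "navigate", "interact"],
                "2 select part":          ["point cursor", "confirm selection"],
                "3 remove part":          ["check logic engine", "move part"],
                "4 collaborate":          ["signal partner", "perform joint action"],
            },
        }
    }

    def walk(tasks, depth=0):
        # Print the hierarchy with its plans, indenting by depth.
        for name, body in tasks.items():
            print("  " * depth + name)
            if isinstance(body, dict):
                print("  " * (depth + 1) + "plan: " + body["plan"])
                walk(body["subtasks"], depth + 1)
            else:
                for sub in body:
                    print("  " * (depth + 1) + sub)

    walk(TASK_HIERARCHY)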
Figure 2: The assembly used in the experiments.
4 METHODOLOGY
The methodology is based on the simulation of
cognitive processes in object visualisation, drawing,
and manipulation. Kavakli et al. (1998) conducted a
series of experiments on artists' freehand sketching
and found that objects are drawn part by part 90% of the time.
Later, they explored the nature of the design process
(Kavakli et al., 1999) and found that there was
evidence for the coexistence of certain groups of
cognitive actions in sketching (Kavakli and Gero,
2001a), which resembles mental imagery processing.
Investigating the concurrent cognitive actions of
designers, they found that the expert's cognitive
actions are well organized and clearly structured,
while the novice's cognitive performance is
divided into many groups of concurrent actions
(Kavakli and Gero, 2001b, 2002, 2003). This
structural organisation can be exploited to model an
intelligent system to be used for object
manipulation, especially in teams involving novice
and expert engineers. VR technology in this paper
refers to the interface that enables the user to interact
with a VE. It includes computer hardware in the
form of peripherals such as visual display and
interaction devices used to create and maintain a 3D
VE. A VR interface provides immersion, navigation,
and interaction. The project described in this paper
examines a VE in which an engineer can manipulate
the parts of a 3D object using a pointer, motion
trackers, and stereoscopic goggles. As stated by
Kjeldsen (1997), hand gestures occur in space rather
than on a surface; consequently, positioning is
inherently 3D. This can obviously be an advantage
when developing gesture-based object manipulation
systems. The usability of 3D interaction techniques
depends upon both the interface software and the
physical devices used. However, little research has
addressed the issue of mapping 3D input devices to
interaction techniques and applications.
Our approach is to investigate 3D object
manipulation in collaborative engineering, treating
hand gestures and design protocols as the language
of the engineering process. We focus on the structures in
visual cognition and explore bases for rudimentary
cognitive processes to integrate them into an
intelligent VR system. The results provided by
protocol analysis studies are used to construct a user
interface for both visual cognition and hand-gesture
recognition. Retrospective protocol analysis is
influential in understanding visual cognition in the
engineering process.
4.1 Retrospective Protocol Analysis
Retrospective Protocol Analysis involves the following
stages:
Identifying the part–based structure of the
object: The completed model is decomposed into
parts to be used as a reference for the coding of
related cognitive actions.
Interpretation of video protocols: We transcribe
the verbal protocols of designers from video records
for the analysis of engineering protocols.
Segmentation of design protocols: Transcribed
engineering protocols are divided into segments. A
cognitive segment consists of cognitive actions that
appear to occur simultaneously.
Coding: We code cognitive actions of designers
using a coding scheme developed by Suwa et al.
(1998). In the coding scheme, the contents of what
engineers see, attend to, and think of are classified
into four information categories, namely: depicted
elements, their perceptual features and spatial
relations, functional thoughts, and knowledge. There
are four modes of cognitive actions (Kavakli et al.,
1999): physical (drawing actions, moves, looking
actions), perceptual, functional, and conceptual
(goals). Each mode has a number of subgroups.
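For concreteness, the coding scheme can be held as a small taxonomy mapping action codes to their modes. The codes below follow the labels used in Figure 3 (Dc, L, Prn, Prp, Psg, Fo, Fn); the remaining entries are indicative placeholders rather than the complete Suwa et al. (1998) scheme.

    # Action codes grouped under the four modes of cognitive actions.
    CODING_SCHEME = {
        "physical":   {"Dc": "create a new depiction",
                       "L": "look at previously drawn depictions",
                       "M": "move (motor action)"},
        "perceptual": {"Prn": "create or attend to a new relation",
                       "Prp": "attend to a perceptual feature",
                       "Psg": "discover a space as a ground"},
        "functional": {"Fo": "associate a depiction with a function",
                       "Fn": "conceive of a new function"},
        "conceptual": {"G": "set up a goal"},
    }

    def mode_of(code):
        # Return the mode that a cognitive-action code belongs to.
        for mode, subgroups in CODING_SCHEME.items():
            if code in subgroups:
                return mode
        raise KeyError(code)

    assert mode_of("Prn") == "perceptual"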
In the sample (Figure 3), the goals of bisecting the
building and splitting the space trigger a number of
perceptual actions driven by drawing a circle (Dc:
create a new depiction). Perceptual actions attending
to relations between the object features
(Prn1 and Prn2: create or attend to a new relation)
KnowledgeEngineeringandOntologiesforObjectManipulationinCollaborativeVirtualReality
287
are dependent on Drawing a circle (Dc) and Looking
at (L1) previously drawn depictions (line 67). One of
these perceptual actions (Prn1) triggers the
Discovery of a space (Psg: discover a space as a
ground). We will particularly focus on correlations
between the cognitive actions coded "Dc, L, Prp,
Prn, Fo, Fn" as the path to discoveries, based on
Kavakli and Gero (2002). Our main task is to
focus on motion tracking, as well as on the relationship
between the physical (especially moves) and
perceptual actions. We improve the category of
physical actions (moves) in the existing coding
scheme.
Figure 3: Coded cognitive segment.
4.2 Ontology Development
The Artificial-Intelligence literature contains many
definitions of an ontology; many of these contradict
one another. In this paper, similar to Noy and
McGuinness (2001), we consider an ontology as a
formal explicit description of concepts in a domain
of discourse (classes (sometimes called concepts)),
properties of each concept describing various
features and attributes of the concept (slots
(sometimes called roles or properties)), and
restrictions on slots (facets (sometimes called role
restrictions)). An ontology together with a set of
individual instances of classes constitutes a
knowledge base. In reality, there is a fine line where
the ontology ends and the knowledge base begins.
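As an illustration only (we have not committed to a particular formalism), these building blocks might be encoded as follows; the class, slot, and facet names are assumptions for the gesture domain.

    from dataclasses import dataclass, field
    from typing import Optional

    @dataclass
    class Slot:
        # A property of a concept; facets restrict its permitted values.
        name: str
        facets: dict = field(default_factory=dict)

    @dataclass
    class OntologyClass:
        # A concept in the domain of discourse, optionally subclassing a parent.
        name: str
        parent: Optional[str] = None
        slots: list = field(default_factory=list)

    # A fragment of a gesture ontology expressed with these primitives.
    gesture = OntologyClass("Gesture", slots=[
        Slot("handedness", {"allowed": {"left", "right", "both"}}),
        Slot("phase", {"allowed": {"prepare", "stroke", "retract"}}),
    ])
    grasp = OntologyClass("Grasp", parent="Gesture")

    # An individual instance: together with the ontology, it forms a knowledge base.
    instance = {"class": "Grasp", "handedness": "right", "phase": "stroke"}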
In this project, our aim is to lay the foundations of
ontology development for gesture recognition
systems to be used by an intelligent user interface.
Currently, we are working on the development of an
ontology for gestures. We need to account for a
wide range of physical actions (hand gestures), as
described by Mulder (1996); a sketch encoding this
taxonomy follows the list:
Goal directed manipulation: Changing position
(lift, move, heave, raise, etc.), Changing orientation
(turn, spin, rotate, revolve, twist), Changing shape
(mold, squeeze, pinch, etc.), Contact with the object
(grasp, seize, grab, etc.), Joining objects (tie, pinion,
nail, etc.).
Indirect manipulation: (Whet, set, strop)
Empty-handed gestures: (twiddle, wave, snap,
point, hand over, give, take, urge, etc.)
Haptic exploration: (touch, stroke, strum, thrum,
twang, knock, throb, tickle, etc.)
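A first, flat encoding of this taxonomy might look as follows; the verbs are taken from the categories above, truncated where the original lists end in "etc.".

    GESTURE_TAXONOMY = {
        "goal_directed_manipulation": {
            "changing_position":    ["lift", "move", "heave", "raise"],
            "changing_orientation": ["turn", "spin", "rotate", "revolve", "twist"],
            "changing_shape":       ["mold", "squeeze", "pinch"],
            "contact_with_object":  ["grasp", "seize", "grab"],
            "joining_objects":      ["tie", "pinion", "nail"],
        },
        "indirect_manipulation": ["whet", "set", "strop"],
        "empty_handed_gestures": ["twiddle", "wave", "snap", "point", "give", "take"],
        "haptic_exploration":    ["touch", "stroke", "strum", "thrum", "twang", "knock"],
    }

    def category_of(verb):
        # Return the taxonomy path of a gesture verb, e.g. "grasp".
        for top, body in GESTURE_TAXONOMY.items():
            if isinstance(body, dict):
                for sub, verbs in body.items():
                    if verb in verbs:
                        return (top, sub)
            elif verb in body:
                return (top,)
        return None

    assert category_of("grasp") == ("goal_directed_manipulation", "contact_with_object")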
In the design of a hand-gesture-based interface, we
plan to address the following issues (Kjeldsen, 1997):
object selection, action selection (pose and position,
pose and motion, multiple pose), action modifiers
and rhythm of interaction (syntax of hand gestures).
We will explore the semantics of pause (action stops
then continues), comma (action completed and
repeated) and retraction (another action). Assuming
that hand gestures generally have a Prepare-Stroke-
Retract cycle, we develop a vocabulary of hand
gestures such as:
Prepare/Pose/Pause/Select/Retract,
Prepare/Pose/Comma/Pose/Stroke/Retract.
The following syntax may be used to describe a
hand-gesture interface, and a phrase can be further
decomposed to implement the issues described
above:
Gesture -> Prepare <Stroke> Retract
Stroke  -> [Phrase Comma]* Phrase
Phrase  -> [Pose|Motion Pause]* Pose|Motion
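A recursive-descent recogniser for this syntax is sketched below over a list of segmenter tokens, treating <Stroke> as optional; this is a sketch of the grammar only, not our recognition system.

    def parse_gesture(tokens):
        # Gesture -> Prepare <Stroke> Retract   (<Stroke> taken as optional)
        i = expect(tokens, 0, "Prepare")
        if i < len(tokens) and tokens[i] != "Retract":
            i = parse_stroke(tokens, i)
        i = expect(tokens, i, "Retract")
        return i == len(tokens)

    def parse_stroke(tokens, i):
        # Stroke -> [Phrase Comma]* Phrase
        i = parse_phrase(tokens, i)
        while i < len(tokens) and tokens[i] == "Comma":
            i = parse_phrase(tokens, i + 1)
        return i

    def parse_phrase(tokens, i):
        # Phrase -> [Pose|Motion Pause]* Pose|Motion
        if i >= len(tokens) or tokens[i] not in ("Pose", "Motion"):
            raise SyntaxError("expected Pose or Motion at position %d" % i)
        i += 1
        while i + 1 < len(tokens) and tokens[i] == "Pause" \
                and tokens[i + 1] in ("Pose", "Motion"):
            i += 2   # a Pause followed by Pose/Motion continues the phrase
        return i

    def expect(tokens, i, symbol):
        if i >= len(tokens) or tokens[i] != symbol:
            raise SyntaxError("expected %s at position %d" % (symbol, i))
        return i + 1

    print(parse_gesture(["Prepare", "Pose", "Pause", "Motion",
                         "Comma", "Pose", "Retract"]))   # True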
In this paper, we discuss general issues to consider
and offer one possible process for developing an
ontology. We describe an iterative approach to
ontology development: we start with a rough first
pass at the ontology, and then revise and refine the
evolving ontology and fill in the details.
5 CONCLUSIONS
In this project, we build a hybrid reality system,
where the user’s hands form dynamic input devices
that can interact with the virtual 3D models of
objects in a Virtual Environment (VE). In the current
phase, we have been working to complete the gesture
ontologies that feed the gesture recognition system;
we will then start the experimentation with a
large number of participants. As stated by McNeill
(2006), gestures can be conceptualized as objects of
cognitive inhabitance and as agents of social
KEOD2012-InternationalConferenceonKnowledgeEngineeringandOntologyDevelopment
288
interaction. Inhabitance seems utterly beyond
current modelling, but an agent of interaction may
be modelable. Coordinative structures in
collaborative engineering may help explain the
essential duality of language which is at present
impossible to model by a computational system.
ACKNOWLEDGEMENTS
This project has been sponsored by the Australian
Research Council Discovery grant DP0988088 to
Kavakli, titled “A Gesture-Based Interface for
Designing in Virtual Reality”, and by two French
government scholarships given to Stephane Piang-
Song (2008) and Joris Boulloud (2009) to complete
their internships towards the degree of MEng at the
Department of Computing, Macquarie University.
REFERENCES
Bowman, D. and Billinghurst, M. 2002. Special Issue on
3D Interaction in Virtual and Mixed Realities: Guest
Editors' Introduction. Virtual Reality, vol. 6, no.3, pp.
105-106.
Coquillart, S., 1990. Extended Free-Form Deformation: A
sculpturing tool for 3D geometric modeling. In
SIGGRAPH ’90, vol 24, number 4, pages 187-196,
ACM, August.
Jagnow, R., Dorsey, J., 2006. Virtual Sculpting with
Haptic Displacement Maps. robjagnow.com/research/
HapticSculpting.pdf, last accessed: 10th Jun 2006.
Kavakli, M., Scrivener, S.A.R, Ball, L.J., 1998. The
Structure of Sketching Behaviour, Design Studies, 19,
485-517.
Kavakli, M., Suwa, M., Gero, J. S., Purcell, T., 1999.
Sketching interpretation in novice and expert
designers, in Gero, J.S., and Tversky, B. (eds), Visual
and Spatial Reasoning in Design, University of
Sydney, Sydney, 209-219.
Kavakli, M., Gero, J. S., 2001a. Sketching as mental
imagery processing, Design Studies, Vol 22/4, 347-
364.
Kavakli, M., Gero, J. S., 2001b. Strategic Knowledge
Differences between an Expert and a Novice, Preprints
of the 3rd Int. Workshop on Strategic Knowledge and
Concept Formation, University of Sydney, 44-68.
Kavakli, M., Gero, J. S., 2002. Structure of Concurrent
Cognitive actions: A Case Study on Novice & Expert
Designers, Design Studies, Vol 23/1, 25-40.
Kavakli, M., Gero, J. S., 2003. Difference between expert
and novice designers: an experimental study, in U
Lindemann et al (eds), Human Behaviour in Design,
Springer, pp 42-51.
Kjeldsen, F. C. M.,1997. Visual Interpretation of Hand
Gestures as a Practical Interface Modality, PhD
Thesis, Grad. Sch. of Arts and Sci., Columbia Uni.
Lau, R., Li, F., Ng, F., 2003. VSculpt: A Distributed
Virtual Sculpting Environment for Collaborative
Design, IEEE Trans. on Multimedia, 5(4), 570-580.
McNeill, D., 2006. Gesture and Thought, The Summer
Institute on Verbal and Non-verbal Communication
and the Biometrical Principle, Sept. 2-12, 2006, Vietri
sul Mare (Italy), organized by Anna Esposito,
http://mcneilllab.uchicago.edu/pdfs/dmcn_vietri_sul_mare.pdf (last accessed on 19.8.2012).
Mizuno, S., Okada, M., Toriwaki, J., 1999. Virtual
sculpting and virtual woodblock printing as a tool for
enjoying creation of 3d shapes. FORMA, volume 15,
number 3, pages 184-193, 409, September.
Mulder, A., 1996. Hand Gestures for HCI, Hand Centered
Studies of Human Movement Project, Technical
Report 96-1, School of Kinesiology, Simon Fraser
University, February.
Noy, N. F. and McGuinness, D. L. 2001. Ontology
Development 101: A Guide to Creating Your First
Ontology. Stanford Knowledge Systems Laboratory
Technical Report KSL-01-05 and Stanford Medical
Informatics Technical Report SMI-2001-0880, March.
Parry, S. R., 1986. Free-form deformations in a
constructive solid geometry modeling system. PhD
thesis. Brigham Young University.
Parent, R. E., 1977, A system for sculpting 3-D data. In
SIGGRAPH’77, volume 11, pages 138-147. ACM,
July.
Pederson, T., 2000. Human Hands as a link between
physical and virtual, Conf. on Designing Augmented
Reality Systems (DARE2000), Helsinore, Denmark,
12-14 April.
Salomon, D., 2005. Curves and Surfaces for Computer
Graphics. Springer Verlag August 2005. ISBN 0-387-
24196-5. LCCN T385.S2434 2005. xvi+461 pages.
Scrivener, S. A. R., Harris, D., Clark, S. M., Rockoff, T.,
Smyth, M., 1993. Designing at a Distance via Real-
time Designer-to-Designer Interaction, Design Studies,
14(3), 261-282.
Suwa, M., Gero, J. S., Purcell, T., 1998. Macroscopic
analysis of design processes based on a scheme for
coding designers' cognitive actions, Design Studies
19(4), 455-483.
Ugwu, O. O., Anumba, C. J., Thorpe, A., 2001. Ontology
development for agent-based collaborative design,
Engineering, Construction and Architectural
Management, Vol. 8 Iss: 3, pp.211 – 224.
KnowledgeEngineeringandOntologiesforObjectManipulationinCollaborativeVirtualReality
289