AN ANALYSIS OF RULES-BASED SYSTEMS TO IMPROVE

SWRL TOOLS

Adriano Rivolli, João Paulo Orlando and Dilvan A. Moreira

Dept. of Computer Science, ICMC - USP, Av. Trabalhador Sancarlense, 400, São Carlos, Brazil

Keywords: Rule acquisition, Rule visualization, Rule management, OWL, SWRL.

Abstract: The Semantic Web has renewed interest in rule-based software systems and their development.
Semantic Web Rule Language (SWRL) is a rule language that enables Horn-like rules to be combined with
Web Ontology Language (OWL) knowledge bases to provide even more expressivity. However, as rule-based
web systems mature, the number of rules they use grows, making them difficult to manage. Developers
face problems when trying to understand and control the large rule sets they create. To address this
problem, techniques and tools are needed to organize, view and create new rules as part of a large rule
set. This work presents strategies and techniques developed to improve SWRL tools, based upon a
survey of rule tools, a study of the state of the art and the analysis of representative rule sets.

1 INTRODUCTION

The use of rules in the Semantic Web has grown,
renewing and increasing interest in rule-based
software systems (Zacharias, 2008). The
Semantic Web Rule Language (SWRL) provides
even more expressivity to the Web Ontology
Language (OWL), which is a powerful language for
building ontologies that specify high-level
descriptions of Web content (SWRL Submission,
2004).

With the growing use of rules in the Semantic

Web, users and developers have encountered some

problems, particularly when the rule set becomes

large or the rules are complex (Hassanpour,

O’Connor and Das, 2009). Thus, they need tools for

the creation, management and visualization of rules

allowing: knowledge acquisition without ambiguity,

inconsistency and rule duplication; and rule (and

rule set) visualization that improves the

understanding of knowledge content.

Protégé is the most widely used freely available,

platform-independent technology for developing and

managing ontologies (Rubin, Noy and Musen,

2007). We take it as a representation of the state-of-

the-art in such tools. It uses two tabs for writing

SWRL rules: the SWRLTab (O’Connor, Musen and

Das, 2009) and Axiomé (Hassanpour, O’Connor and

Das, 2009). Both tools emphasize rule visualization.

However, their use has shown that support for large
SWRL rule bases is deficient and has to be
improved.

At the same time, there is a considerable number
of Business Rules Systems (BRS) that were built to

solve problems similar to the ones SWRL rule

systems face today. Unfortunately, there is little

research data about how rule systems are being

designed, written and debugged; and what

challenges their rule developers face (Zacharias,

2008).

This work comprises: a survey of rule
tools, their main features and user interfaces; a
study of the state of the art related to SWRL; and an
analysis of the characteristics of a rule set.

2 SWRL RULES

The SWRL format is a simple Horn-like rule

structure that combines with an OWL knowledge

base to provide more expressivity. Each rule consists

of two parts: the antecedent (body) and the

consequent (head) that are formed by zero or more

atoms. Atoms, in these rules, can be of the form

C(x), P(x,y), sameAs(x,y) or differentFrom(x,y),

where C is an OWL description, P is an OWL

property, and x,y are variables, OWL individuals or

OWL data values (SWRL Submission, 2004). The

W3C Submission defines six SWRL atom types:

class; individual property; data valued property; data


Rivolli A., Paulo Orlando J. and A. Moreira D..

AN ANALYSIS OF RULES-BASED SYSTEMS TO IMPROVE SWRL TOOLS.

DOI: 10.5220/0003439901910194

In Proceedings of the 13th International Conference on Enterprise Information Systems (ICEIS-2011), pages 191-194

ISBN: 978-989-8425-56-0

© 2011 SCITEPRESS (Science and Technology Publications, Lda.)

range; same/different; and builtins (Hassanpour,

O’Connor and Das, 2009).

Atoms may refer to individuals, data literals,

individual variables or data variables. Each SWRL

atom type supports a number of arguments and

types. Finally, the human readable variables are

indicated using the standard convention of prefixing

them with a question mark. Using this syntax, a rule

would be written like this (SWRL Submission,

2004):

parent(?x,?y) ∧ brother(?y,?z) ->

uncle(?x,?z)
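The antecedent/consequent structure described above can be sketched as a small data model. This is only an illustration: the class names `Atom` and `Rule` and the `variables` helper are assumptions made for this sketch and do not belong to any SWRL tool or API.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class Atom:
    """One atom: a predicate applied to its arguments (hypothetical model)."""
    predicate: str
    args: Tuple[str, ...]

    def __str__(self) -> str:
        return f"{self.predicate}({','.join(self.args)})"


@dataclass(frozen=True)
class Rule:
    """A Horn-like rule: antecedent (body) atoms imply consequent (head) atoms."""
    antecedent: Tuple[Atom, ...]
    consequent: Tuple[Atom, ...]

    def variables(self) -> set:
        """Collect every argument written with the '?' variable prefix."""
        atoms = self.antecedent + self.consequent
        return {arg for atom in atoms for arg in atom.args if arg.startswith("?")}

    def __str__(self) -> str:
        body = " ∧ ".join(str(a) for a in self.antecedent)
        head = " ∧ ".join(str(a) for a in self.consequent)
        return f"{body} -> {head}"


# The example rule from the text, rebuilt from its atoms.
uncle_rule = Rule(
    antecedent=(Atom("parent", ("?x", "?y")), Atom("brother", ("?y", "?z"))),
    consequent=(Atom("uncle", ("?x", "?z")),),
)
print(uncle_rule)  # parent(?x,?y) ∧ brother(?y,?z) -> uncle(?x,?z)
```

Keeping atoms as separate objects, rather than storing the rule as one string, is what makes the per-atom and per-predicate counts discussed later in the paper straightforward to compute.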

3 RULE TOOLS

Nowadays, most enterprise systems include a

module or use a program to write business rules. A

rule, in general, is a declarative instruction that

expresses “what” and not “how” things must happen

(Braye et al., 2006). Below we list and cluster the
rule tools surveyed for this paper:

BRS RuleTrack, FICO Blaze Advisor, ILOG

JRules BRMS, Oracle Business Rules and Visual

Rules provide a complete environment for managing

business rules that allow the capture and

organization of business rule statements.

Drools introduces the “Business Logic

integration Platform” which provides a unified and

integrated platform for rules. The LibRT Rule

Management System enables business experts to

maintain and test rules in different representation

formats. OpenLexicon is a Business Rules Engine.

The SAP NetWeaver BRM component provides

support for the various phases of a rule system life

cycle: design, execution, modification and

optimization of business rules.

RuleXpress is a repository-based tool that can

be used to manage vocabulary and rules. This tool is

built for business people and business analysts using

their vocabulary. It uses the BRS RuleSpeak, which

is a set of practical guidelines mainly expressing

rules in clear, unambiguous, well-structured business

English.

The TRANSlator from LAnguage TO Rules

(TRANSLATOR) allows anyone, even non-experts,

to write facts and rules in a formal representation for

use in the Semantic Web. This is accomplished by

automatically translating natural language sentences

written in Attempto Controlled English.

CLIPS is an interpreted language and expert

system tool that allows the definition of facts and

rules to which functions are applied.

Also included are the Protégé tabs for SWRL,
Axiomé and SWRLTab. The first has some
functionality for visualization, acquisition, browsing
and exploration of SWRL rule bases and relationships.
The second supports editing and execution of SWRL
rules.

Apart from rule tools, there are also notations,
such as Object Rule Modeling (ORM), a graphic
notation that enables the creation of rules with
diagrams. Different tools exist to design,
maintain and execute rules written in ORM.

4 RULE INTERFACES

During the review of the rule tools, observations

were made regarding the interfaces, features and

resources used in these tools. In general, the

interfaces and interaction approaches are: Text

editor; Integrated Development Environment (IDE);

Descriptive user-friendly; Graphical editor; UML-

based; Spreadsheets; Tree editor; Combination with

ontologies; Automatic extraction from data

(Zacharias, 2008; SWRL Submission, 2004).

However, the survey also shows more specific

features and interfaces.

4.1 Interfaces

Decision tables are composed of rows and columns
of rules. They are used to display in tabular form all

possible situations that a decision rule might

encounter and to specify what actions to take in each

of these situations. The key point to keep in mind is

that, in a decision table, each row is a rule, and each

column in that row is either a condition or an action

for that rule.
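As a minimal sketch of the row-per-rule layout, the table below stores each rule as one row whose columns are two conditions and one action. The column names, the values and the first-match policy are hypothetical choices made for this illustration; the surveyed tools may resolve overlapping rows differently.

```python
from typing import Any, Dict, List, Optional

# Each row is one rule: condition columns followed by an action column.
# Column names and values are hypothetical.
DecisionRow = Dict[str, Any]

TABLE: List[DecisionRow] = [
    {"customer": "gold",   "min_total": 100, "action": "discount_15"},
    {"customer": "gold",   "min_total": 0,   "action": "discount_5"},
    {"customer": "silver", "min_total": 100, "action": "discount_5"},
]


def decide(customer: str, total: float) -> Optional[str]:
    """Return the action of the first row whose conditions all hold."""
    for row in TABLE:
        if row["customer"] == customer and total >= row["min_total"]:
            return row["action"]
    return None  # no row covers this situation


print(decide("gold", 150))    # discount_15
print(decide("silver", 50))   # None
```

A situation with no matching row returns `None`, which makes gaps in the table visible instead of silently choosing an action.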

Decision trees provide the same functionality as

decision tables, but are composed of nodes instead

of rows and columns. In a decision tree, each rule is

represented by the set of nodes going from the tree

root to each leaf.
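The root-to-leaf reading of a decision tree can be sketched by enumerating its paths. The nested-dictionary encoding and the condition labels are assumptions made for this example, not the format of any surveyed tool.

```python
from typing import Dict, List, Tuple, Union

# A node is either a leaf (an action string) or a dict that maps a
# condition label to a subtree. All labels are hypothetical.
Tree = Union[str, Dict[str, "Tree"]]

TREE: Tree = {
    "customer = gold": {
        "total >= 100": "discount_15",
        "total < 100": "discount_5",
    },
    "customer = silver": "no_discount",
}


def rules_from_tree(
    tree: Tree, path: Tuple[str, ...] = ()
) -> List[Tuple[Tuple[str, ...], str]]:
    """List every root-to-leaf path as a (conditions, action) pair."""
    if isinstance(tree, str):  # leaf: the path walked so far is one rule
        return [(path, tree)]
    rules = []
    for condition, subtree in tree.items():
        rules.extend(rules_from_tree(subtree, path + (condition,)))
    return rules


for conditions, action in rules_from_tree(TREE):
    print(" and ".join(conditions), "->", action)
```

The three paths printed here correspond to the three rows an equivalent decision table would contain.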

Rule templates allow rule designers to write

rule logic (or structure) once and reuse it many

times. The main use of this kind of interface is to

acquire new rules.
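A rule template can be approximated with plain string substitution, as in the sketch below. It uses Python's standard `string.Template`; the template text and the property names passed to it are hypothetical, and real tools typically add type checks on what may fill each slot.

```python
from string import Template

# Rule structure written once; the $-placeholders are filled in per rule.
# The property names used to fill it are hypothetical.
RELATIVE_TEMPLATE = Template(
    "parent(?x,?y) ∧ $sibling(?y,?z) -> $relative(?x,?z)"
)


def instantiate(sibling: str, relative: str) -> str:
    """Create a concrete rule from the shared template."""
    return RELATIVE_TEMPLATE.substitute(sibling=sibling, relative=relative)


print(instantiate("brother", "uncle"))  # parent(?x,?y) ∧ brother(?y,?z) -> uncle(?x,?z)
print(instantiate("sister", "aunt"))    # parent(?x,?y) ∧ sister(?y,?z) -> aunt(?x,?z)
```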

From the tool user viewpoint, natural language

is the simplest interface for rules. The use of natural

language is mainly restricted to rule visualization

and not acquisition.

Diagrams can represent rules and rule sets,

providing the users ways to create and view rules.

The graphic representation can be as expressive as


Entity Relationship (ER) and Unified Modelling

Language (UML) graphic notations.

A Rule Flow uses a Graphic Diagram to show a

Decision Tree. It uses geometric forms, colours and

symbols to represent parts of a rule.

Text editors are used by technical users who
write rules as programmers do; in other words, the
user has to know the rule language syntax. Some
editors resemble simple notepads, while others are
as sophisticated as an IDE.

A shell is a piece of software that provides a user

interface that accepts and executes commands. It

supplies a command line interface that may be used

interactively or non-interactively.

4.2 Components and Features

Rule grouping is common in most rule tools. It
contributes to the organization and classification of
rules. There are two basic types: manual groups

(defined by the rule creator) and automatic groups

(defined by an algorithm).

Some rule-based systems allow users to name rules
and attach textual descriptions to them, so that
rules can be managed more efficiently. These names
and descriptions can be used in lists, searches and
filters, which is especially helpful with a large rule
set. Breaking rules into parts (rule segments or
atoms) can ease visualization and acquisition.

Finally, icons and symbols are used to represent
tool features and assist rule development. Almost all
systems use icons and symbols in their interfaces as a
means to access their features and functions. Icons and
symbols are applied in two distinct ways: to
provide access to system functionality or specific
tasks; and to be used directly in rule visualization
and acquisition.

5 PRELIMINARY RESULTS

5.1 Rule Analysis

We chose the Autism Rules Phenolog ontology (Tu

et al., 2008) as a representative of a large and

complex SWRL rule system. This ontology of

autism extends some ontologies available in the

Open Biomedical Ontologies (OBO) Foundry using

a combination of description logic and rules. The

analysis of this rule system gave us a better

understanding of the structure of its parts, atoms and

variables. Table 1 presents the general data about the

156 rules analyzed.

Table 1: Overview of Autism Rules Phenolog.

Information                                 Total     Antecedent   Consequent
Number of atoms                             2094      994          1100
Number of distinct predicate atoms          155       147          9
Average of atoms by number of rules         13        6            7
Min and Max number of atoms                 10, 20    5, 15        4, 8
Number of arguments                         3826      1877         1949
Number of distinct arguments                1304      997          1007
Min and Max number of arguments             18, 46    25, 37       25, 20
Min and Max number of distinct arguments    7, 18     5, 16        4, 10

This analysis also includes the frequency and

distribution of predicates, atoms, atom types and

argument types in rule parts (antecedents and

consequents) as well as in rules as a whole, which

allowed us to discover features of this rule set:

• A few predicate atoms occur in many rules, while
the vast majority occur in fewer than 10 rules;
• Only three predicate atoms occur more than
once in the same rule;
• Only one predicate appears on both sides of a
rule;
• No analyzed rule contains same/different or
data range atoms; individual property atoms do
not appear in antecedents and built-in atoms do
not appear in consequents;
• Every rule in the rule set contains at least two
class atoms, one individual property atom in the
consequent, five data valued property atoms and
one built-in atom;
• The main argument types used in the rules are
individual variables and data variables.

This data is being used to direct some of the

interfaces discussed in the next section.

5.2 Techniques, Strategies and Services

Although this work is still in progress, we have
achieved some results that are being carefully
evaluated and will be implemented in SWRL rule
tools. The SWRLTab and Axiomé tabs are our
starting point for proposing improvements, new
techniques and features to support the SWRL
language.

A simple and useful feature, absent from both
tabs, is SWRL Highlights. Currently, SWRL rules


are presented in simple text format; however, the use
of distinct colors to represent variables, data values
and distinct types of atoms has been shown to be
very useful.

We also propose an auto-suggest feature for
the rule composition process. For that, we developed
an algorithm to determine the number of times each
predicate is related to another in a large rule set.
Each predicate is mapped to a node in a
graph, with edges connecting the predicates that
appear in the same rule. For each rule, all its
predicates are connected to one another and each
connection is counted. The auto-suggest feature is
based on the frequency with which predicates are
related in the rules. When a user adds atoms to a rule
under construction, related atoms are suggested
based on this algorithm.
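The co-occurrence graph described above can be sketched as follows. The function names, the tie-breaking by predicate name and the toy rule set are assumptions made for illustration; the paper does not specify how suggestions are ranked beyond co-occurrence frequency.

```python
from collections import Counter, defaultdict
from itertools import combinations
from typing import Dict, Iterable, List, Set


def build_cooccurrence(rules: Iterable[Set[str]]) -> Dict[str, Counter]:
    """Weighted graph: graph[p][q] = number of rules containing both p and q."""
    graph: Dict[str, Counter] = defaultdict(Counter)
    for predicates in rules:
        # Connect every pair of predicates that share this rule.
        for p, q in combinations(sorted(predicates), 2):
            graph[p][q] += 1
            graph[q][p] += 1
    return graph


def suggest(graph: Dict[str, Counter], current: Set[str], limit: int = 3) -> List[str]:
    """Rank predicates by how often they co-occur with those already in the rule."""
    scores: Counter = Counter()
    for p in current:
        for q, weight in graph.get(p, {}).items():
            if q not in current:
                scores[q] += weight
    # Highest score first; ties broken alphabetically (an arbitrary choice).
    ranked = sorted(scores.items(), key=lambda item: (-item[1], item[0]))
    return [q for q, _ in ranked[:limit]]


# Hypothetical rule set, reduced to the predicates each rule uses.
RULES = [
    {"Person", "hasAge", "Adult"},
    {"Person", "hasAge", "Minor"},
    {"Person", "hasParent", "hasUncle"},
]
graph = build_cooccurrence(RULES)
print(suggest(graph, {"Person"}))  # ['hasAge', 'Adult', 'Minor']
```

Because the graph stores only pairwise counts, it can be updated incrementally when a rule is added, without rescanning the whole rule set.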

We are experimentally using Euclidean and
Manhattan distances (Salzberg, 1991) between
rules to measure rule similarity and
then group the rules based on it. We are testing distinct
scenarios: using only the antecedent or only the
consequent rule parts; using both at the same time;
and switching between atoms and predicates
(without variables).

The atoms/predicates form the columns in a

feature array and the rules/rule parts are the rows.

This feature array indicates how many times an

atom/predicate occurs in a rule/rule part. This

technique proved very efficient and useful for the

tested rule-based systems. It allows the discovery of

very similar or identical rules in a rule set and it also

finds rules similar to a given rule.
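A minimal sketch of the feature array and the two distances follows. Rules are reduced to lists of predicate names (one of the scenarios mentioned above); the function names and the toy rules are assumptions made for this illustration.

```python
import math
from typing import List, Sequence, Tuple


def feature_array(rules: Sequence[Sequence[str]]) -> Tuple[List[str], List[List[int]]]:
    """Rows are rules, columns are predicates, cells are occurrence counts."""
    columns = sorted({p for rule in rules for p in rule})
    index = {p: i for i, p in enumerate(columns)}
    rows = []
    for rule in rules:
        row = [0] * len(columns)
        for p in rule:
            row[index[p]] += 1
        rows.append(row)
    return columns, rows


def euclidean(a: Sequence[float], b: Sequence[float]) -> float:
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def manhattan(a: Sequence[float], b: Sequence[float]) -> float:
    return sum(abs(x - y) for x, y in zip(a, b))


# Hypothetical rules, given as the predicates of their atoms.
RULES = [
    ["Person", "hasAge", "hasAge", "Adult"],
    ["Person", "hasAge", "Adult"],
    ["Person", "hasParent", "hasUncle"],
]
columns, rows = feature_array(RULES)
print(columns)                      # ['Adult', 'Person', 'hasAge', 'hasParent', 'hasUncle']
print(manhattan(rows[0], rows[1]))  # 1
print(euclidean(rows[1], rows[2]))  # 2.0
```

A distance of zero between two rows flags rules that use exactly the same predicates the same number of times, which is how identical or near-identical rules would surface in this representation.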

Finally, the rule similarity values were applied in
the development of a K-means (Jain, Murty and
Flynn, 1999) clustering method in order to group the
rules by similarity. With it, it is possible to
choose the number of groups and subdivide them
to get more closely related rule groups. Initial tests
demonstrate that the formed groups contain rules of
different sizes and with different atoms, which is
desirable. However, the K-means method can classify
the same rule into different groups, and the insertion of
a new rule in the rule set requires a new
classification and therefore a rearrangement. To
remedy this problem, we are studying the
development of other clustering methods.
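For reference, a plain K-means loop over rule feature vectors can be sketched as below. This is a generic textbook version, not the authors' implementation: seeding the centroids with the first k points, the fixed iteration count and the toy vectors are all assumptions made for this example.

```python
from typing import List, Sequence

Vector = Sequence[float]


def squared_distance(a: Vector, b: Vector) -> float:
    return sum((x - y) ** 2 for x, y in zip(a, b))


def k_means(points: List[Vector], k: int, iterations: int = 20) -> List[int]:
    """Return one cluster label per point; the first k points seed the centroids."""
    centroids = [list(p) for p in points[:k]]
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: each point joins its nearest centroid.
        labels = [
            min(range(k), key=lambda c: squared_distance(p, centroids[c]))
            for p in points
        ]
        # Update step: each centroid moves to the mean of its members.
        for c in range(k):
            members = [p for p, label in zip(points, labels) if label == c]
            if members:
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels


# Hypothetical feature vectors (one row per rule, one column per predicate).
VECTORS = [
    [1, 1, 2, 0, 0],
    [1, 1, 1, 0, 0],
    [0, 1, 0, 1, 1],
    [0, 1, 0, 1, 2],
]
print(k_means(VECTORS, k=2))  # [0, 0, 1, 1]
```

The result depends on how the centroids are seeded, so a different seed can place the same rule in a different group, and adding a rule means running the loop again.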

6 CONCLUSIONS

The results obtained so far are good and they have
shown promising improvements in the creation,
visualization and maintenance of SWRL rules. We
have been conducting studies on the use of
restricted natural language. The next steps are the
development of tools that integrate these new
interfaces in a SWRL tab for Protégé.

ACKNOWLEDGEMENTS

This work has been funded by a grant from CNPq-

Brazil.

REFERENCES

Braye, L., Ramel, S., Grégoire, B., Leidner, S., Schmitt,
M., 2006. State of the Art Business Rules Languages.
Public Research Centre Henri Tudor,
http://efficient.citi.tudor.lu/cms/efficient/content.nsf/0/4A938852840437F2C12573950056F7A9/$file/BusinessRulesLanguages_D3.1.pdf.

Hassanpour, S., O’Connor, M. J., Das, A. K., 2009.
Exploration of SWRL Rule Bases through
Visualization, Paraphrasing, and Categorization of
Rules. In RuleML, pp. 246–261, doi: 10.1007/978-3-642-04985-9_23.

Jain, A. K., Murty, M. N., Flynn, P. J., 1999. Data
clustering: A review. In ACM Computing Surveys 31, 3,
pp. 264-323, doi: 10.1145/331499.331.

O’Connor, M. J., Musen, M. A., Das, A. K., 2009. Using

the Semantic Web Rule Language in the Development

of Ontology-Driven Applications. In Handbook of

Research on Emerging Rule-Based Languages and

Technologies: Open Solutions and Approaches, ch.

XXII, pp. 525-539.

Rubin, D. L., Noy, N. F., Musen, M. A., 2007. Protégé: A

Tool for Managing and Using Terminology in

Radiology Applications. In Journal of Digital

Imaging, pp 34–46, doi: 10.1007/s10278-007-9065-0.

Salzberg, S., 1991. Distance Metrics for Instance-Based
Learning. In Proceedings of ISMIS'91, 6th International
Symposium, Methodologies for Intelligent Systems,
pp. 399-408.

SWRL Submission, 2004. http://www.w3.org/Submission/SWRL.

Tu, S., Tennakoon, L., O’Connor, M. J., Shankar, R., Das,

A. K., 2008. Using an integrated ontology and

information model for querying and reasoning about

phenotypes: the case of autism. In Proceedings of the

American Medical Informatics Association, pp. 727–

731.

Zacharias, V., 2008. Development and verification of rule

based systems – a survey of developers. In Rule

Representation, Interchange and Reasoning on the

Web: International Symposium, pp. 6-16, doi:

10.1007/978-3-540-88808-6_4.
