are presented in simple text format; however, the use
of distinct colors to represent variables, data values
and distinct types of atoms has been shown to be
very useful.
We also propose an auto-suggest feature during
the rule composition process. For that, we developed
an algorithm to determine the number of times each
predicate is related to another in a large rule set. To
do this, each predicate is mapped to a node in a
graph with edges connecting the predicates that
appear in the same rules. For each rule, all its
predicates are connected and counted. The
auto-suggest is based on how frequently predicates are
related in the rules. When a user adds atoms to a rule
under construction, related atoms are suggested
based on this algorithm.
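The co-occurrence counting described above can be illustrated with a minimal sketch. It is not the authors' implementation: the function names `build_cooccurrence` and `suggest`, the sample rule set, and the choice to count a predicate once per rule are all assumptions made for illustration.

```python
from collections import Counter, defaultdict
from itertools import combinations


def build_cooccurrence(rules):
    """Return graph[a][b] = number of rules in which predicates a and b co-occur."""
    graph = defaultdict(Counter)
    for predicates in rules:
        # Assumption: a predicate repeated inside one rule counts once for that rule.
        for a, b in combinations(sorted(set(predicates)), 2):
            graph[a][b] += 1
            graph[b][a] += 1
    return graph


def suggest(graph, current, limit=3):
    """Rank predicates absent from the rule under construction by summed edge weight."""
    scores = Counter()
    for predicate in current:
        for other, weight in graph.get(predicate, {}).items():
            if other not in current:
                scores[other] += weight
    # Highest weight first; ties broken alphabetically so the order is stable.
    ranked = sorted(scores.items(), key=lambda item: (-item[1], item[0]))
    return [name for name, _ in ranked[:limit]]


# Hypothetical rule set: each rule is reduced to the list of its predicate names.
rules = [
    ["Person", "hasAge", "Adult"],
    ["Person", "hasAge", "Minor"],
    ["Person", "hasParent", "hasAge"],
]
graph = build_cooccurrence(rules)
suggestions = suggest(graph, {"Person"})
```

In this sample, `hasAge` appears with `Person` in all three rules, so it is ranked first among the suggestions for a rule that so far contains only `Person`.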
We are experimenting with Euclidean and
Manhattan distances (Salzberg, 1991) between
rules, with the aim of measuring rule similarity and
then grouping the rules based on it. We are testing
distinct scenarios: using the antecedent or the
consequent rule part; using both at the same time;
and switching between atoms and predicates
(without variables).
The atoms/predicates form the columns in a
feature array and the rules/rule parts are the rows.
This feature array indicates how many times an
atom/predicate occurs in a rule/rule part. This
technique proved very efficient and useful for the
tested rule-based systems. It allows the discovery of
very similar or identical rules in a rule set and it also
finds rules similar to a given rule.
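As a rough illustration of the feature array and the two distances, the sketch below builds one count vector per rule and compares them. The rule names, predicate lists and helper functions (`feature_rows`, `most_similar`) are hypothetical and only serve the example; the paper does not specify its data structures.

```python
import math
from collections import Counter


def feature_rows(rules):
    """Build the feature array: one column per predicate, one row of counts per rule."""
    columns = sorted({p for predicates in rules.values() for p in predicates})
    rows = {}
    for name, predicates in rules.items():
        counts = Counter(predicates)
        rows[name] = [counts[c] for c in columns]
    return columns, rows


def euclidean(u, v):
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def manhattan(u, v):
    return sum(abs(a - b) for a, b in zip(u, v))


def most_similar(rows, target, distance):
    """Return the name of the rule closest to `target` under the given distance."""
    candidates = [(distance(rows[target], vec), name)
                  for name, vec in rows.items() if name != target]
    return min(candidates)[1]


# Hypothetical rules, each reduced to the predicates it contains.
rules = {
    "r1": ["Person", "hasAge", "greaterThan", "Adult"],
    "r2": ["Person", "hasAge", "greaterThan", "Adult"],
    "r3": ["Person", "hasAge", "lessThan", "Minor"],
    "r4": ["Drug", "hasDose", "Drug"],
}
columns, rows = feature_rows(rules)
closest_to_r1 = most_similar(rows, "r1", manhattan)
```

A distance of zero between two rows, as for `r1` and `r2` here, flags rules that are identical with respect to the chosen features, while small non-zero distances point to near-duplicates.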
Finally, the rule similarity values were applied in
the development of a K-means (Jain, Murty and
Flynn, 1999) clustering method in order to group the
rules by similarity. With it, it is possible to
determine the number of groups and subdivide them
to get more closely related rule groups. Initial tests
demonstrate that the formed groups contain rules of
different sizes and with different atoms, which is
desirable. However, the K-means method can assign
the same rule to different groups, and the insertion of
a new rule into the rule set requires a new
classification and therefore a rearrangement. To
remedy this problem, we are studying the
development of other clustering methods.
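For reference, a bare-bones K-means over rule feature vectors might look like the sketch below. It is a generic textbook version under stated assumptions, not the method developed in this work: the centroids are seeded with the first k rows, Euclidean distance is used, and the sample `points` are invented.

```python
import math


def kmeans(points, k, iterations=20):
    """Plain K-means; returns one cluster label per point."""
    # Assumption: seed the centroids with the first k points. Other seeding
    # choices can lead to different final groupings.
    centroids = [list(p) for p in points[:k]]
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: attach each point to its nearest centroid.
        labels = [min(range(k), key=lambda c: math.dist(p, centroids[c]))
                  for p in points]
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, label in zip(points, labels) if label == c]
            if members:
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels


# Hypothetical feature rows: each list holds predicate counts for one rule.
points = [
    [1, 1, 0, 0],
    [0, 0, 1, 1],
    [1, 1, 0, 1],
    [0, 0, 2, 1],
]
labels = kmeans(points, k=2)
```

Because the result depends on the initial centroids, and because adding a row changes the means, a sketch like this also makes the two limitations noted above easy to reproduce.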
6 CONCLUSIONS
The results obtained so far are promising and show
improvements in the creation,
visualization and maintenance of SWRL rules. We
have been conducting studies on the use of
restricted natural language, and the next step is the
development of tools that integrate these new
interfaces into a SWRL tab for Protégé.
ACKNOWLEDGEMENTS
This work has been funded by a grant from CNPq-
Brazil.
REFERENCES
Braye, L., Ramel, S., Grégoire, B., Leidner, S., Schmitt,
M., 2006. State of the Art Business Rules Languages.
Public Research Centre Henri Tudor,
http://efficient.citi.tudor.lu/cms/efficient/content.nsf/0/4A938852840437F2C12573950056F7A9/$file/BusinessRulesLanguages_D3.1.pdf.
Hassanpour, S., O’Connor, M. J., Das, A. K., 2009.
Exploration of SWRL Rule Bases through
Visualization, Paraphrasing, and Categorization of
Rules. In RuleML, pp. 246–261,
doi: 10.1007/978-3-642-04985-9_23.
Jain, A. K., Murty, M. N., Flynn, P. J., 1999. Data
clustering: A review. In ACM Computing Surveys 31, 3,
pp. 264-323, doi: 10.1145/331499.331.
O’Connor, M. J., Musen, M. A., Das, A. K., 2009. Using
the Semantic Web Rule Language in the Development
of Ontology-Driven Applications. In Handbook of
Research on Emerging Rule-Based Languages and
Technologies: Open Solutions and Approaches, ch.
XXII, pp. 525-539.
Rubin, D. L., Noy, N. F., Musen, M. A., 2007. Protégé: A
Tool for Managing and Using Terminology in
Radiology Applications. In Journal of Digital
Imaging, pp. 34–46, doi: 10.1007/s10278-007-9065-0.
Salzberg, S., 1991. Distance Metrics for Instance-Based
Learning. In Proceedings of ISMIS'91, 6th International
Symposium, Methodologies for Intelligent Systems,
pp. 399-408.
SWRL Submission, 2004.
http://www.w3.org/Submission/SWRL.
Tu, S., Tennakoon, L., O’Connor, M. J., Shankar, R., Das,
A. K., 2008. Using an integrated ontology and
information model for querying and reasoning about
phenotypes: the case of autism. In Proceedings of the
American Medical Informatics Association, pp. 727–
731.
Zacharias, V., 2008. Development and verification of rule
based systems – a survey of developers. In Rule
Representation, Interchange and Reasoning on the
Web: International Symposium, pp. 6-16, doi:
10.1007/978-3-540-88808-6_4.
ICEIS 2011 - 13th International Conference on Enterprise Information Systems