model. Metadata about original scientific papers are
directly stored in this entity. Data obtained from the
analysis of the papers are organized around
the entity patient. Therefore, there is a relationship
between the entities paper and patient (a paper de-
scribes N patients and a patient is described in 1 pa-
per). Thousands of scientific papers about the muta-
tions associated with the Hypertrophic Cardiomyopa-
thy (HCM) have been published in journals, confer-
ences, etc. Moreover there are many internal docu-
ments in cardiology departments of the hospitals that
describe patients with these mutations. Both types of
data sources describe patients and information about
their relatives to a greater or lesser extent. Informa-
tion about these families can be entered in the system
in a very intuitive, progressive, and simple way (the
carefully designed user interfaces can be seen in Sec-
tion 3). Information about patients and their relatives
can be categorized in the following types:
• Identification Data. Both application internal
identifiers (patient identifier, family identifier,
etc.) and external domain identifiers (position in
the pedigree) are included in this category.
• Demographic Variables. Data about the sex, eth-
nicity, age at the diagnosis of the disease, etc. are
included here.
• Genotype. This category includes the results of
the genotype study. These results present the re-
lationship between the patients and the mutations
associated with the paper where the patient is de-
scribed. Obligate carrier, homozygous carrier,
normal carrier, and not carrier are the possible
values for this relationship. However, not all the
scientific papers describe this relationship for all
the mutations and patients. Therefore, an un-
known value is available for the relationship. This
approach applies to most of the variables
managed by the system.
• Phenotype. This category includes the results of
the different clinical tests that can be done in or-
der to determine the appearance of a patient re-
sulting from the interaction of the genotype and
the environment. There are several subcategories
according to their nature. First, results about
the clinical diagnosis are collected. These results
determine whether the patient is affected or not
by some phenotypes, which phenotypes, etc. The
second group includes environmental factors or
triggers (alcohol, hypertension, tobacco, obesity,
etc.). There are many variables that can be determined
in an echocardiography, MRI, or autopsy.
These variables constitute the third group. Hyper-
trophy, dilatation, systolic and diastolic dysfunc-
tions are some examples of these variables. The
fourth group includes symptoms and risk factors
(dyspnea, chest pain, abnormal blood pressure
response, etc.). Variables of the ECG (rhythm,
pre-excitation, abnormal voltage or repolariza-
tion) constitute the fifth group. The sixth group
includes data about the electrophysiological study
(inducibility of malignant arrhythmias, conduc-
tion disturbance, etc.). Finally, the last two groups
include data about the treatment (medical treat-
ment, surgery, etc.) and the events (death, cere-
brovascular accident, etc.).
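The handling of the carrier relationship described in the Genotype category can be illustrated with a short sketch. This is only a minimal illustration under our own assumptions, not the code of the system: the enum CarrierStatus, its labels, and the method parse are hypothetical names chosen for this example. It shows how a value missing from a paper falls back to an explicit unknown value instead of being left empty.

```java
import java.util.Locale;

// Hypothetical enum for the relationship between a patient and a mutation.
// The names and labels are assumptions made for this sketch.
enum CarrierStatus {
    OBLIGATE_CARRIER("obligate carrier"),
    HOMOZYGOUS_CARRIER("homozygous carrier"),
    NORMAL_CARRIER("normal carrier"),
    NOT_CARRIER("not carrier"),
    UNKNOWN("unknown"); // used when the paper does not report the relationship

    private final String label;

    CarrierStatus(String label) {
        this.label = label;
    }

    String label() {
        return label;
    }

    // Maps free text to a status; missing or unrecognized text becomes UNKNOWN.
    static CarrierStatus parse(String text) {
        if (text == null) {
            return UNKNOWN;
        }
        String normalized = text.trim().toLowerCase(Locale.ROOT);
        for (CarrierStatus status : values()) {
            if (status.label.equals(normalized)) {
                return status;
            }
        }
        return UNKNOWN;
    }
}

class CarrierStatusDemo {
    public static void main(String[] args) {
        System.out.println(CarrierStatus.parse("Homozygous carrier")); // HOMOZYGOUS_CARRIER
        System.out.println(CarrierStatus.parse(null));                 // UNKNOWN
    }
}
```

The same fallback could be applied to any other variable whose value is not reported in every paper.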
In brief, more than 200 variables are currently col-
lected about each patient. However, new variables of
interest can be easily introduced into the system.
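The relationship between the entities paper and patient stated at the beginning of this description (a paper describes N patients and a patient is described in 1 paper) can be sketched as follows. The class names, fields, and methods are hypothetical and serve only to illustrate the cardinality; the actual system stores these entities in its own data model, which is not reproduced here.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical entity: one scientific paper that describes N patients.
class Paper {
    private final String title;
    private final List<Patient> patients = new ArrayList<>();

    Paper(String title) {
        this.title = title;
    }

    String getTitle() {
        return title;
    }

    // Creates a patient that belongs to this paper, so both sides stay consistent.
    Patient addPatient(String patientId) {
        Patient patient = new Patient(patientId, this);
        patients.add(patient);
        return patient;
    }

    // Read-only view of the patients described in this paper.
    List<Patient> getPatients() {
        return Collections.unmodifiableList(patients);
    }
}

// Hypothetical entity: a patient is described in exactly one paper.
class Patient {
    private final String id;
    private final Paper paper;

    Patient(String id, Paper paper) {
        this.id = id;
        this.paper = paper;
    }

    String getId() {
        return id;
    }

    Paper getPaper() {
        return paper;
    }
}
```

Creating patients only through addPatient is a design choice of this sketch: it keeps the list held by the paper and the back-reference held by each patient in agreement.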
2.3 Technology
This section briefly describes the most important tech-
nologies used in the development of the system. First,
Java 2 Platform, Enterprise Edition (J2EE) (Perrone
and Chaganti, 2003; Bodoff, 2004) was the selected
development platform. J2EE is a widely-used plat-
form for server programming. This platform allows
developers to create portable and scalable applica-
tions. J2EE provides a set of technologies that make
the development process easier. JDBC (an API to
access relational databases), JavaServer Pages (JSP,
a technology to dynamically generate HTML), and the
JavaServer Pages Standard Tag Library (JSTL, a tag
library for JSP) are several examples of such tech-
nologies provided by J2EE and used in this project.
Furthermore, other technologies can be easily inte-
grated with this platform. For example, Jakarta Struts
(Holmes, 2006), a framework that allows software en-
gineers to develop applications following the archi-
tectural patterns Model-View-Controller and Layers,
has been used. CSS (Shafer, 2003) is the technology
used to enhance the user interface. Finally, we have
used JavaScript (Flanagan, 2006) extensively to improve
the dynamism and interactivity of the user interface.
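The Model-View-Controller separation mentioned above can be illustrated with a small plain-Java sketch. It deliberately does not use the Struts API; in a Struts application the controller role is typically played by Action classes and the view by JSP pages. All class and method names below are hypothetical and chosen only for this example, and HTML escaping is omitted for brevity.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Model: holds the data and knows nothing about presentation.
class PatientModel {
    private final Map<String, String> sexByPatientId = new LinkedHashMap<>();

    void save(String patientId, String sex) {
        sexByPatientId.put(patientId, sex);
    }

    // Returns "unknown" when no value has been stored for the patient.
    String findSex(String patientId) {
        return sexByPatientId.getOrDefault(patientId, "unknown");
    }
}

// View: turns data into HTML and contains no lookup logic.
class PatientView {
    String render(String patientId, String sex) {
        return "<p>Patient " + patientId + ": " + sex + "</p>";
    }
}

// Controller: receives a request, queries the model, and delegates to the view.
class PatientController {
    private final PatientModel model;
    private final PatientView view;

    PatientController(PatientModel model, PatientView view) {
        this.model = model;
        this.view = view;
    }

    String show(String patientId) {
        return view.render(patientId, model.findSex(patientId));
    }
}

class MvcDemo {
    public static void main(String[] args) {
        PatientModel model = new PatientModel();
        model.save("P1", "female");
        PatientController controller = new PatientController(model, new PatientView());
        System.out.println(controller.show("P1")); // <p>Patient P1: female</p>
    }
}
```

Because the view only receives values and the model only stores them, either one can be replaced (for example, by a JSP page or a JDBC-backed store) without changing the other.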
Technologies employed in the development of the
report generation module deserve special mention.
eXtensible Stylesheet Language Formatting Objects
(XSL-FO) (W3C Recommendation, 2006) is the most
important technology used in this module. XSL-FO
is a mark-up language for XML document format-
ting which is most often used to generate reports. An
XSL-FO document is an XML document where the
format of a dataset is defined. This format defines the
presentation of these data on paper, on screen, or in other
media. The XSL-FO document does not describe the
layout of the text on various pages. Instead, it de-
scribes what the pages look like and where the various
A DOCUMENT MANAGEMENT SYSTEM AND WORKFLOW TO HELP AT THE DIAGNOSIS OF
HYPERTROPHIC CARDIOMYOPATHY