could increase medical knowledge by allowing
more efficient data analysis across trials (meta-
analysis), it could prevent duplicating the execution
of trials (duplicate detection) and it could aid a
clinician in finding relevant treatment options for a
patient (clinical decision support). Therefore, an
effective solution is required to represent, structure
and store this information. We name the repository
storing this information the trial metadata repository.
By convention we call this information the trial
metadata to differentiate it from the term “trial data”,
which typically refers to the patient data collected
for a clinical trial.
1.1 State of Practice
The realization that information about clinical trials
should be publicly available is nowadays common
ground. The World Health Organization (WHO)
publishes the WHO Trial Registration Data Set
(International Clinical Trials Registry Platform,
2013) which specifies the minimum amount of trial
information that must appear in a trials registry. The
WHO site also contains a collection of links to trial
registries (which are typically organized on a
geographical level).
In addition, various countries have enacted
legislation requiring companies to publish clinical
trial information.
In the current state of practice, information on many
trials is published on clinicaltrials.gov (e.g.
disease, intervention and eligibility criteria).
Unfortunately, these initiatives are focused on a
textual distribution of the information. We argue that
the information should be made accessible in a
machine-interpretable way, allowing for
contextualization of the information given and
enabling a wide range of applications that rely on
access to structured trial information, such as
clinical decision support, trial recruitment, meta-
analysis of trial results and duplicate trial design
detection.
Existing initiatives like linkedct.org (which aims
at publishing an open Semantic Web data source for
clinical trials data) are of limited use, as the
information is post-processed from clinicaltrials.gov
and is rather coarse-grained. To illustrate this, the
following eligibility criteria excerpt has been retrieved
from linkedct.org, where it is available only as a
text blob (i.e. not structured):
“DISEASE CHARACTERISTICS: - Histologically
proven metastatic renal cell carcinoma not
amenable to complete surgical resection and
progressive despite immunotherapy -
Bidimensionally evaluable clinically or
radiographically - HLA 6/6 or 5/6 matched family
donor available - No CNS metastases PATIENT
CHARACTERISTICS: Age: - 18 to 80 Performance
status: - ECOG 0 or 1 Life expectancy: - At least 3
months Hematopoietic: - Not specified Hepatic: -
Bilirubin no greater than 4 mg/dL - Transaminases
no greater than 3 times upper limit of normal Renal:
- Creatinine no greater than 2.5 mg/dL - No
malignancy-associated hypercalcemia (< 2.5
mmol/L) Cardiovascular: - Left ventricular ejection
fraction greater than 40% Pulmonary: - DLCO
greater than 65% of predicted Other: - Not pregnant
- HIV negative - No major organ dysfunction that
would preclude transplantation - No other
malignancies except basal cell or squamous cell skin
cancer - No psychiatric disorder or mental
deficiency that would preclude study participation
PRIOR CONCURRENT THERAPY: Biologic
therapy - See Disease Characteristics Chemotherapy
- Not specified Endocrine therapy - Not specified
Radiotherapy - Not specified Surgery - Not specified
Other - At least 1 month since prior treatment for
renal cell carcinoma.” (Hassanzadeh, 2013). As free
text, this unfortunately does not allow for automated
contextualization or processing.
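To make the contrast concrete, the following is a minimal sketch (in Python) of how a few of the criteria in the excerpt above could be expressed in a machine-interpretable form. The `Criterion` class, the attribute names (e.g. `ecog_performance_status`, `lvef`) and the choice to treat a missing observation as "not met" are assumptions made purely for illustration; they are not taken from linkedct.org, clinicaltrials.gov or the BRIDG model. Only the thresholds come from the excerpt.

```python
import operator
from dataclasses import dataclass
from typing import Callable, Dict, List

# Comparison operators supported by this sketch.
OPS: Dict[str, Callable[[float, float], bool]] = {
    "<=": operator.le,
    ">=": operator.ge,
    "<": operator.lt,
    ">": operator.gt,
}


@dataclass(frozen=True)
class Criterion:
    """One structured eligibility criterion (hypothetical representation)."""
    attribute: str  # illustrative attribute name, e.g. "age"
    op: str         # one of the keys in OPS
    value: float    # threshold taken from the criteria text
    unit: str       # unit of the threshold, kept for contextualization

    def is_met_by(self, patient: Dict[str, float]) -> bool:
        # Assumption: a missing observation counts as "not met".
        # A real system would likely distinguish "unknown" from "not met".
        if self.attribute not in patient:
            return False
        return OPS[self.op](patient[self.attribute], self.value)


# A subset of the criteria from the excerpt, with thresholds as stated there.
CRITERIA: List[Criterion] = [
    Criterion("age", ">=", 18, "years"),
    Criterion("age", "<=", 80, "years"),
    Criterion("ecog_performance_status", "<=", 1, "score"),
    Criterion("bilirubin", "<=", 4.0, "mg/dL"),
    Criterion("creatinine", "<=", 2.5, "mg/dL"),
    Criterion("lvef", ">", 40, "%"),
]


def unmet_criteria(patient: Dict[str, float]) -> List[Criterion]:
    """Return the criteria that the given patient record does not satisfy."""
    return [c for c in CRITERIA if not c.is_met_by(patient)]


def is_eligible(patient: Dict[str, float]) -> bool:
    """A patient is eligible in this sketch when no criterion is unmet."""
    return not unmet_criteria(patient)


if __name__ == "__main__":
    patient = {
        "age": 55,
        "ecog_performance_status": 1,
        "bilirubin": 1.2,
        "creatinine": 1.0,
        "lvef": 55,
    }
    print(is_eligible(patient))
```

In this form each criterion can be evaluated, filtered or compared across trials individually, which is not possible with the text blob; free-text conditions such as "No major organ dysfunction that would preclude transplantation" would still need a richer representation than this sketch offers.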
At the same time, the Biomedical Research
Integrated Domain Group (BRIDG) Model
initiative (Biomedical Research Integrated Domain
Group Model, 2013) is gaining traction. The BRIDG
model is a domain analysis model which aims to
provide a shared view of the dynamic and static
semantics for the domain of protocol-driven research
and its associated regulatory artifacts. The BRIDG
model is a collaborative effort spanning important
and relevant standardization bodies like the Clinical
Data Interchange Standards Consortium (CDISC),
the HL7 Regulated Clinical Research Information
Management Technical Committee (RCRIM) Work
Group, the US National Cancer Institute (NCI), and
the US Food and Drug Administration (FDA). This
collection of stakeholders ensures a wide variety of
viewpoints on the model, which increases the
potential for stability of the model. In addition, the
BRIDG model has the promise of easing future
interoperability as the various standardization bodies
are defining their new standards based on the BRIDG
model. As the BRIDG model is a domain analysis
model (and a conceptual model for clinical
research), it cannot be used “as is” to implement a
physical design or to generate code. Rather, it can be
leveraged to further build out detailed logical models
and physical designs.
BRIDG currently spans the following specialized
HEALTHINF 2014 - International Conference on Health Informatics