5 CONCLUSION
Improvements in healthcare IT infrastructure have led
to the creation of multiple applications that aim to
provide physicians and healthcare institutions with
the tools they need to improve their individual
performance and level of care. These systems are
highly heterogeneous and create large pockets of data
that end up scattered throughout the healthcare
infrastructure.
Such pockets of data contain very valuable
information that could be put to further use. For
example, they could be employed to assess the
performance of each healthcare infrastructure at
different levels, ranging from the institution as a
whole to each individual healthcare professional. How
far this can go depends on the quality and detail of
the data produced by, and flowing within, the
institution's information systems. However helpful
this information may be, very few hospitals are really
prepared to take advantage of every potential source
of information that the IT infrastructure produces. As
a result, valuable information is lost every day
before it can be properly analysed and integrated into
a useful business metric.
We believe that our system goes one step further: it
allows healthcare institutions to recover such pockets
of data and put them to good use by producing overall
statistics about the daily activities of the
institution as a whole that would otherwise be very
difficult to determine.
We have described and implemented an architecture for
a system capable of incrementally building a knowledge
database for a healthcare facility, based on standard
protocol messages transmitted through the network. We
were able to efficiently extract HL7 messages directly
from the network without depending on physical memory
to reorder out-of-order packets. Instead, we use the
information contained in the TCP headers to calculate
the precise position of each piece of data within the
stream, and write the collected packet data directly
to the hard disk at that position. This allows us to
process very long data streams efficiently.
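The disk-based reconstruction described above can be illustrated with a minimal sketch. It is not the implementation used in our system: it assumes that the connection handshake was observed, so that the initial sequence number is known, and that a stream is shorter than 4 GiB. The names `stream_offset`, `write_segment` and `reassemble` are hypothetical.

```python
SEQ_MODULUS = 2 ** 32  # TCP sequence numbers are 32 bits wide and wrap around


def stream_offset(seq: int, initial_seq: int) -> int:
    """Byte offset of a segment's payload within the reconstructed stream.

    The SYN consumes one sequence number, so the first payload byte
    carries sequence number initial_seq + 1 (assumption: the handshake
    was captured and initial_seq is known).
    """
    return (seq - initial_seq - 1) % SEQ_MODULUS


def write_segment(fh, seq: int, initial_seq: int, payload: bytes) -> None:
    """Write one segment's payload at its final position in the file."""
    fh.seek(stream_offset(seq, initial_seq))
    fh.write(payload)


def reassemble(path: str, initial_seq: int, segments) -> None:
    """Write (seq, payload) pairs to disk in whatever order they arrive."""
    with open(path, "wb") as fh:
        for seq, payload in segments:
            write_segment(fh, seq, initial_seq, payload)
```

Because each payload is written at its own offset, no segment has to be buffered in memory while waiting for earlier ones, and a retransmitted segment simply overwrites the same bytes.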
We have been able to use the collected data for two
main goals. From a monitoring point of view, the data
gathered can be used to establish the normal activity
levels of a given healthcare facility; with that
baseline, one can detect outliers that result from
malfunctioning sections of the healthcare
infrastructure. A deeper analysis of this data can
also support decision making from an administrative
point of view.
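As an illustration of the monitoring use case, the sketch below flags unusual activity levels in a series of per-interval message counts. The text does not prescribe a detection method; the modified z-score based on the median absolute deviation, the threshold of 3.5, and the name `find_outliers` are assumptions made for this example only.

```python
from statistics import median


def find_outliers(counts, threshold=3.5):
    """Return the indices of counts that deviate strongly from the median.

    Uses the modified z-score 0.6745 * |x - median| / MAD, a robust
    alternative to the mean and standard deviation (assumed method).
    """
    med = median(counts)
    mad = median(abs(c - med) for c in counts)
    if mad == 0:
        # All typical values are identical: flag anything that differs.
        return [i for i, c in enumerate(counts) if c != med]
    return [
        i for i, c in enumerate(counts)
        if 0.6745 * abs(c - med) / mad > threshold
    ]
```

For example, given hourly message counts for one department, an hour with almost no traffic would be flagged, which may point to a stalled interface or a system that stopped sending messages.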
We have also described a set of other uses for our
system architecture. Namely, once messages have been
extracted from the network, one can build a network of
systems that receive anonymized HL7 messages and offer
new services based on the data received, such as HL7
semantic and syntactic quality assessment.
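A syntactic quality assessment service of this kind could start from checks such as the ones sketched below. This is only an illustration under stated assumptions: it targets pipe-delimited HL7 v2 messages with carriage-return segment terminators, the selection of checks is ours, and `check_hl7_syntax` is a hypothetical name.

```python
import re

SEGMENT_ID = re.compile(r"[A-Z][A-Z0-9]{2}")


def check_hl7_syntax(message: str) -> list:
    """Return a list of syntactic problems found in an HL7 v2 message."""
    segments = [s for s in message.split("\r") if s]
    if not segments or not segments[0].startswith("MSH"):
        return ["message does not start with an MSH segment"]
    header = segments[0]
    if len(header) < 8:
        return ["MSH segment is too short to declare its delimiters"]

    problems = []
    sep = header[3]  # MSH-1: the field separator, normally "|"
    fields = header.split(sep)
    # fields[0] is "MSH" and fields[1] is MSH-2, so MSH-n is at index n - 1.
    required = (("MSH-9 message type", 9),
                ("MSH-10 message control ID", 10),
                ("MSH-12 version ID", 12))
    for name, n in required:
        if len(fields) < n or not fields[n - 1]:
            problems.append("missing " + name)

    for seg in segments[1:]:
        seg_id = seg.split(sep)[0]
        if not SEGMENT_ID.fullmatch(seg_id):
            problems.append("invalid segment identifier: " + repr(seg_id))
    return problems
```

An empty list means that none of these checks failed; it does not imply that the message conforms to the standard as a whole.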
5.1 Current Limitations
In terms of hardware, the system is heavily limited by
the processing capabilities of the Sniffer node at
several levels. Starting with the Network Interface
Card (NIC), we believe our overall system would
greatly benefit from hardware capable of automatically
associating each packet with an extraction timestamp
calculated directly in hardware (Agarwal et al.,
2003). This would greatly reduce the cost of
associating a timestamp with each message since, in
our current implementation, the timestamp can only be
calculated in user space. Apart from the NIC, the
Sniffer node could also benefit from a CPU offering
more computational power, which would reduce the
amount of time each packet needs to remain in user
space to be analysed. Also from a hardware point of
view, the use of Solid-State Drives (SSDs) could
improve the overall performance of our Sniffer node,
since the TCP stream reconstruction is performed
directly on the drive in order to reduce the amount of
physical memory needed.
From the software point of view, the currently
deployed version of our system is unable to deal with
fragmented IP packets. For now, our system simply
discards any packet fragmented at the network layer.
Tests have already been made in order to provide the
Sniffer node with the capability to reconstruct
fragmented packets; however, the reconstruction of
such packets proved to be too slow when using the hard
disk.
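For reference, deciding whether a packet was fragmented at the network layer only requires the flags and fragment offset fields of the IPv4 header. The sketch below shows one way to make that decision; it assumes IPv4 and a raw header passed as bytes, and `is_fragment` is a hypothetical name rather than a function of our system.

```python
import struct

MORE_FRAGMENTS = 0x2000        # MF flag in the 16-bit flags/offset field
FRAGMENT_OFFSET_MASK = 0x1FFF  # lower 13 bits hold the fragment offset


def is_fragment(ip_header: bytes) -> bool:
    """Return True if the IPv4 header belongs to a fragmented datagram.

    A packet is a fragment if the More Fragments flag is set (first or
    middle fragment) or the fragment offset is non-zero (middle or last).
    """
    if len(ip_header) < 20:
        raise ValueError("an IPv4 header is at least 20 bytes long")
    (flags_and_offset,) = struct.unpack_from("!H", ip_header, 6)
    return bool(flags_and_offset & (MORE_FRAGMENTS | FRAGMENT_OFFSET_MASK))
```

A packet that carries only the Don't Fragment flag is not a fragment and would be processed normally.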
Finally, regarding the quality of the information that
our system extracts directly from the network of the
healthcare institution, the accuracy of the extracted
metrics may be compromised if many correction messages
are exchanged through the hospital network.
Nevertheless, if this happens, our system should be
able to detect this behaviour and flag it as a
suboptimal way for the institution to work.
5.2 Future Work
As future work, we want to concentrate our efforts
on supporting more healthcare standards and be able