MEDICAL IMAGING IN A CLOUD COMPUTING
ENVIRONMENT
Louis Parsonson, Li Bai
School of Computer Science, University of Nottingham, Jubilee Campus, Wollaton Road, Nottingham, NG8 1BB, U.K.
Laurence Bourn, Atif Bajwa, Soeren Grimm
Biotronics3D, 16 Heron Quays, Canary Wharf, London, E14 4JB, U.K.
Keywords: Medical imaging, Volume rendering, Cloud computing.
Abstract: In this paper we present a cloud computing environment for medical imaging which deals with the issues of
scaling a traditionally single-user solution to a software-as-a-service solution. We will first introduce
volume rendering for medical imaging, and the issues with volume rendering of medical images on the
cloud. We will then describe our method for accelerating CPU based volume rendering on the cloud and for
scaling the system to a software-as-a-service solution.
1 INTRODUCTION
Over the past few years, there has been a change in
the way software solutions are delivered. Provision
of software has moved from locally installed
systems to remotely invoked virtual instances,
accessed through a web browser or alike. The
motivation is a paradigm shift due to increased
prevalence of the Internet, an increase in the
commoditization of IT hardware, and pressure to cut
IT budgets. Consequently, software is no longer
delivered as a product, but offered as a service, i.e.
Software as a Service (SaaS) (Jaekel et al, 2010).
The removal of specialised hardware
requirements and its associated capital expenditure
are both instrumental to the success of this strategy.
SaaS requires service providers to assume risks
associated with start-up costs, while facilitating
consumers with immediate access to purchased
software solutions: first contact with the product
involves gaining access to the services as opposed to
cumbersome, failure prone, invasive installation
procedures. Successful examples of the SaaS
paradigm include Google Docs, a web-based text
editing solution freely available over the internet,
VPS NET, a virtual private server hosting company,
and Facebook, a social networking website. For
many of these services, access only requires account
creation and access to an internet-enabled computer
Centralized information can be manipulated and
shared in a flexible and transparent manner.
Additionally, the service provider can update and
modify the service without requiring interruption or
active participation of the user. SaaS solutions are
also suitable for the ‘pay-as-you-go’ charging
scheme, which allows users to create a subscription
which can be cancelled and/or upgraded at any time,
without incurring administrative costs to the solution
provider.
SaaS has interesting implications for medical
imaging applications. State-of-the-art systems are
currently installed on high-end, standalone
workstations, often requiring bespoke hardware.
This entails an overburden in administration of load
balancing and redundancy control. These standalone
workstations are often idle for large periods of time,
rendering them costly, inefficient and promptly
outdated. Additionally, due to their cost, they are
often shared, causing scheduling conflicts, and
reducing overall efficiency.
A medical imaging cloud offers an alternative
solution. It removes the need for expensive
workstations, although presents difficulties of its
own. The processing power required for a medical
imaging application is more than a cloud solutions
have typically provided. In this paper we show how
327
Parsonson L., Bai L., Bourn L., Bajwa A. and Grimm S..
MEDICAL IMAGING IN A CLOUD COMPUTING ENVIRONMENT.
DOI: 10.5220/0003383803270332
In Proceedings of the 1st International Conference on Cloud Computing and Services Science (CLOSER-2011), pages 327-332
ISBN: 978-989-8425-52-2
Copyright
c
2011 SCITEPRESS (Science and Technology Publications, Lda.)
to effectively apply the cloud computing model to a
medical imaging platform.
2 MEDICAL IMAGING
Medical imaging deals with the capture and analysis
of images of the body, taken using equipment
utilising electromagnetic radiation. This covers
many techniques including x-ray based methods
such as radiography and Computed Tomography
(CT), ultrasound, Magnetic Resonance Imaging
(MRI), and more. Images in this context are often
referred to as slices since they describe a cross-
section slice of a particular part of the body. While
some imaging techniques result in a single image,
for instance an x-ray of a broken bone, others can
produce multiple images, such as an MRI scan. If a
scan acquires multiple images, the resulting
collection is referred to as a series. In some cases,
multiple series of images are taken over a period of
time. In all of these cases, the collected scan data of
a single patient is called a study.
These studies are then sent for analysis by a
specialist. This specialist will have a set of such
studies to analyse, and after completing analysis will
send the results to the patients physician for further
treatment. Traditionally, studies are viewed as single
images or as a set of consecutive slices (Figure 1.).
However, a single study can comprise of as many as
a thousand images, resulting in more than 1.5
gigabytes of data. Because of this, difficulties can
arise when processing scans on a slice by slice basis.
The use of volumes is therefore desirable. By
assigning a thickness to each slice in a scan (which
can be determined by the interval at which the
patient was scanned) these slices can be composited
Figure 1: Display showing multiple images in a series.
to a volume representation of the patient’s body.
This volume can then be used to produce a three-
dimensional visualisation of the patient through
volume rendering.
3 VOLUME RENDERING
Volume rendering is the method by which
projections of 3D volumes are displayed as 2D
images. Early implementations of volume rendering
techniques focused on rendering of texture data,
initially as a set of blended 2D textures and later, as
the hardware permitted, utilising 3D textures
(Dachille et al, 1998), before ray casting was
implemented (Kruger et al, 2003).
3.1 Direct Volume Rendering
Direct Volume Rendering (DVR) generates images
without the need to create an intermediate polygonal
representation of a volume. Instead, the volume data
set is projected onto an image plane. In image-space
oriented ray casting approaches, rays are cast from
the view-point through the view-plane into the
volume, see Figure 2. The volume is equidistantly
sampled along the ray and the volume integral is
computed by repeated accumulation of colours and
opacities.
Figure 2: Ray casting.
At every sampling position a scalar value is
interpolated between the corresponding surrounding
eight voxels, that is, in the logical three dimensional
extension of a pixel. This value is then classified
according to a transfer function. If the sample is
non-transparent, a gradient is computed from the
surrounding voxels, in order to apply shading.
Finally, the sample is composited with the previous
samples of the ray. Figure 3 shows a typical example
of a volume render: (a) shows a render of the full
CLOSER 2011 - International Conference on Cloud Computing and Services Science
328
volume; (b) and (c) show renders containing
segmentation information where a second transfer
function has been applied.
Figure 3: Volume render.
Three main volume rendering approaches can be
distinguished. Two of them are hardware based; the
first one utilizes high-end Graphics Processing Units
(GPUs) (Heng et al, 2005); the second requires
special purpose hardware (Shen et al, 2007); the
third is CPU based volume rendering.
3.2 Volume Rendering on the GPU
GPUs are highly parallel optimisations of processor
architecture, and ray casting has been shown to work
well on GPUs thanks to this parallelism. In hand
with this are recent advances in hardware which
have made modification of the graphics pipeline
commonplace.
In a GPU based implementation of a volume
renderer, entry points of rays intersecting a volume
are computed by rendering the front faces of the
bounding cube of the volume as a colour map. The
back faces are then rendered in a similar manner to
calculate the direction of each ray through the
volume. Using this information the process then
steps through the volume along the direction of the
ray, interpolating tri-linearly to calculate colour
values (Kruger et al, 2003).
Thanks to GPU architecture being aimed at
floating point mathematics, these operations can be
somewhat faster than they would be on the CPU. It
does, however, involve transfer of large amounts of
data from main memory into video memory, which
can cause significant slow-down. In general,
graphics hardware often has access to less memory
than the CPU. Today’s best cards have a maximum
of 6GB (for example: Nvidia Tesla M2070) in
comparison with multi-core CPU servers where
256GB is not uncommon. Rendering of many data
sets would therefore require frequent swapping of
data between main and video memory. In addition,
most GPU APIs are focused on computer game
technologies, where precision and concurrency are
rarely an issue.
3.3 Volume Rendering in Hardware
Operations which would take time to emulate in
software or on a GPU can be specifically mapped
onto the hardware to increase performance
(Meissner et al, 2001). Volume rendering hardware
is specifically optimised for the task at hand.
Pure hardware based volume rendering solutions
provide real-time performance and high quality
rendering. Consequently they are the most applied
approach in practice. Current high-end solutions
offer high performance on volumes of 512
3
voxels,
with as much as four gigabytes of dedicated memory
and options for clustering machines together. These
can, however, involve using imaging applications
specific to the hardware manufacturer, tying users to
a single vendor.
3.4 Volume Rendering on the CPU
Rendering on the CPU is the obvious choice in some
applications, and in fact was the only choice for
some time, originating before rendering on the GPU
was even possible (Roth, 1982). Since this first
implementation many algorithmic advances have
been made.
One such advance is the shear warp algorithm
(Lacroute et al, 1994), which accelerates rendering
by modifying the shape of the volume in such a way
that voxels are aligned in parallel to the image plane.
Rays cast no longer need to worry about
interpolation since each one passes perfectly though
the centre of a line of voxels, with each voxel being
a single step along the ray. This leads to very good
cache coherency during volume traversal greatly
improving the speed. However, because only one
sample is made for each voxel the resulting image
quality is low and insufficient for medical analysis.
There are many advantages to rendering on the
CPU. Advanced visualization systems provide pre-
processing features such as filtering, segmentation,
and morphological operations, among others. If such
operations are not supported by the hardware, they
have to be performed on the CPU and data must then
be transferred back to the hardware. This transfer is
very time consuming, thus interactive feedback
becomes problematic. In contrast, within a pure CPU
based solution this transfer is unnecessary allowing
MEDICAL IMAGING IN A CLOUD COMPUTING ENVIRONMENT
329
more efficient processing of data (Grimm et al,
2004).
4 EFFICIENT RENDERING ON
THE CLOUD
4.1 Volume Rendering on the Cloud
Volume rendering offers a number of challenges,
and this is reflected when scaling to a large multi-
user solution such as a cloud. Memory is an
important consideration. In a cloud environment
each user needs access to enough memory to ensure
that the system continues to run smoothly. A typical
study can contain multiple series, of which an
average of three hundred images per series is
common. This results in each user requiring, on
average, four gigabytes of memory in order to work.
Loading of large data sets takes time, an issue which
needs to be addressed in a cloud system. In addition,
rendering of large data sets can take time without an
appropriate acceleration structure. Furthermore, in a
cloud environment dedicated hardware-based
solutions become prohibitively expensive: setup
cost, maintenance, and even scalability become
limited due to hardware constraints. Finally, it is
important to remember that while GPU architecture
is highly parallelised, it still does not support
multiple concurrent user access, so image requests
would have to be served sequentially. Thus, a pure
CPU based solution is by far the most suitable, and
probably the only truly viable solution, for cloud
based rendering.
4.2 Accelerating CPU Rendering
To accelerate CPU based rendering and image
processing, the underlying memory management has
to be modified. In this case we utilise a bricked
memory layout. Cross-sectional data, e.g. CT and
MRI, are large sets of individual images which
combined form a volume in space. Physical memory
is typically constructed in a sequential way,
therefore the straightforward approach to loading
these images into memory is to put them one after
the other using a linear layout (Figure 4a):
This layout has several disadvantages. In a typical
set of cross-sectional images, an average 30 percent
of the data actually does not contain any useful
information. This comes from the fact that the
human body consists of a set of tubular structures
(e.g. arms, legs, and torso). A cross- sectional cut
through a tubular structure using rectangular images
(a) Linear layout (b) Block volume layout
Figure 4: Memory layout.
does not contain the data well, leaving vast amounts
of data to represent empty space around the body.
Furthermore, in the case of an advanced medical
imaging application data often needs to be processed
in a non-sequential way. Volume ray casting has a
strong view-dependent data access pattern, and
consequently, taking a look at the typical cache
hierarchy of today’s CPU (L1, L2, L3) it becomes
clear that storing images linearly in memory would
cause complete cache thrashing.
In order to address the aforementioned issues a
significant improvement is gained if the cross-
sectional data is arranged in a blocked manner. In
this case we subdivide and reorganize the entire
volume (one 3D-image) into smaller contiguous
lightweight bricks, obtaining a structure analogous
to a Rubik’s cube (Figure 4b). A blocked memory
layout exhibits a variety of advantages, the first of
which is saving memory. Data which contains no
information can be merged into a single block -
blocks are implemented in a reference counted
manner. Furthermore, data can be processed in a
brick-wise manner.
Figure 5: Volume ray casting system exploiting thread-
level parallelism speedup: two physical CPUs, each with
two cores.
CLOSER 2011 - International Conference on Cloud Computing and Services Science
330
Figure 5 shows how volume ray casting can be
significantly accelerated by employing a brick-wise
processing scheme. Not only is this scheme and
memory layout considerably more cache friendly, it
also has an inherent multi-threadability, all of which
are essential to a successful cloud implementation of
the rendering strategy. Additionally, considering
modern hyper-threading technology in which there
is a duplicated ALU but a shared cache, it becomes
evident that in order to benefit from multiple cores
constant re-fetching of data from physical memory
has to be avoided.
5 SCALING TO A CLOUD BASED
SAAS SOLUTION
One of the main problems faced when sharing
hardware and software resources between multiple
users in an arbitrary manner is the robust and
efficient administration of the hardware available.
Not only must each user have secured data storage
and privacy protection, but it must also be able to
exploit its resources without having to directly
control how the underlying hardware and software
resources are being utilized. Cloud systems are an
example of a technology that requires a managing
entity that ‘virtualizes’ the usage of hardware and
software in a way that each user has a direct and
transparent interaction with the system. The
challenge is to build a lightweight instrument that
allows for seamless interaction efficiently. A dual
strategy that not only permits automatic
virtualization of the resources but also specializes in
distributing them in a coherent manner by
performing a ‘load balancing’ of the tasks on the
available resources is required.
Virtualization, as described here, is instrumental
to managing secure user sessions and is fundamental
to the efficient distributed rendering required to
perform advanced imaging applications. In
particular, virtualization is achieved by creating
‘sand-boxes’, a concept that provides restricted
resource sets to individual users including controlled
access to data storage, hardware resources and
networking privileges. This creates a local, virtual
machine for each user and removes the burden of
requiring them to manage how their tasks are
processed by the system.
In order to achieve this securely and efficiently,
the cloud system is split into the following sections:
a Global Session Manager (GSM) responsible for
managing user specific session sandboxes and a
View Session Manager (VSM) responsible for
managing viewing session sandboxes and load-
balancing. The load-balancing itself is done by the
Rendering Resource Load Balancer (RRLB), which
is part of the VSM. Both the GSM and VSM are
deployed as web services and can be mirrored for
redundancy.
Figure 6: Virtualisation in the cloud.
A typical user interaction with the system is as
follows: as a user successfully logs in at which point
they are assigned a global session unique identifier
(GSUID). This GSUID allows the user to request
within the GSM a user specific session sandbox (see
Figure 6). The session sandbox holds all permissions
and settings for that user within the entire system
(data- and hardware-wise). It can only be accessed
by that user. Furthermore, this session sandbox is
controlled by a configurable sliding expiration. By
default, user inactivity for more than 30 minutes will
immediately remove the session sandbox, effectively
logging the user out. When the user wishes to view a
study, assuming they have the correct permission
level, the VSM requests that the RRLB return a
suitable rendering node resource. A new rendering
session is created and added to the user session
sandbox. The rendering resource starts one or more
rendering processes based on the study data. The
RRLB decides on which rendering node the image
generator is created based on permissions of which
node can be used by the user and the current load on
nodes (number of users, available memory, CPU
MEDICAL IMAGING IN A CLOUD COMPUTING ENVIRONMENT
331
utilization, etc.). The viewing session sandbox is
also controlled by the configurable sliding
expiration.
It is important, also, to consider the bandwidth
needs of a cloud system dealing with large amounts
of data. Such a system needs to be able to serve
multiple users concurrently, as well as transfer data
between internal components quickly. Due to the
nature of the application, much of the traffic is in the
format of images, whether this is renders being sent
from the server to the client, or scan data being
uploaded from the client to the server. Even in
compressed formats, image data takes a large
amount of bandwidth to transmit quickly, which can
have a significant impact on performance.
6 RESULTS AND CONCLUSIONS
This solution was implemented in the Biotronics3D
cloud, and is currently running as 3dnetmedical. A
single high-end server in the cloud can serve as
many as 64 users concurrently, showing just how
successful this solution is. Being a cloud, this
solution is scalable, so any combination of servers
can be combined for greater effect. The scalability of
the cloud is an important feature, since it inherently
implies a cost effective solution. At any time
additional nodes can be added to the cloud to make
it more powerful and the cost per user is much
reduced compared to that of buying individual
workstations.
Figure 7: Overview of cloud infrastructure.
The infrastructure on which the system was
implemented was comprised primarily of a firewall,
for security purposes, an IIS server, a rendering
cluster and a storage cluster. Both the rendering
cluster and the storage cluster can be expanded at
any time to cope with an increased load of users or
data. Both the rendering and storage clusters accept
service requests from the IIS server, since each
cluster is specifically optimised for the task it
performs (for instance series uploads go straight to
the storage cluster, and not through the rendering
cluster) (Figure 7).
Users can be classified as one of three types:
casual users, active users, and power users. Whilst a
power user may be using computationally expensive
features of the system, e.g., choosing
transformations and transfer functions, invoking the
rendering cluster, casual users could be simply
viewing an image already rendered to the screen.
Thus, while a 32-core machine with 64 users would
imply less than a single core per user, in reality this
is not the case. Memory is in fact the limiting factor.
REFERENCES
Jaekel M., Pott H., 2010, Cloud Computing – Software as
a Service in Practice, Siemens
Jaekel M., Luhn A., 2009, Cloud Computing – Business
Models, Value Creation Dynamics and Advantages for
Customers, Siemens.
Shen, R., Boulanger, P., 2007, Hardware-accelerated
volume rendering for real-time medical data
visualization, Lecture Notes in Computer Science,
Volume 4842/2007, 801-810.
Heng, Y., Gu, L., 2005, GPU-based Volume Rendering for
Medical Image Visualization, Engineering in Medicine
and Biology Society, IEEE-EMBS 2005. pp. 5145-
5148.
Grimm S., Bruckner S., Kanitsar A., Gröller E., 2004, A
refined data addressing and processing scheme to
accelerate volume raycasting, Institute of Computer
Graphics and Algorithms, Vienna University of
Technology, Computers & Graphics 28, 2004, pp 719-
729
Kruger, J., Westermann, R., 2003, Acceleration
Techniques for GPU-based Volume Rendering,
Computer Graphics and Visualisation Group,
Technical University Munich.
Meissner M., Grimm S., Strasser W., Packer J., Latimer
D., Parallel volume rendering on a single-chip SIMD
architecture, IEEE 2001 symposium on parallel and
large-data visualization and graphics, San Diego,
California, USA.
Dachille F., Kreeger K., Baoquan C., Bitter I., Kaufman
A., 1998, High-Quality Volume Rendering Using
Texture Mapping Hardware, ACM SIGGRAPH/
EUROGRAPHICS workshop on Graphics hardware,
Lisbon, Portugal, 1998.
Drebin, R., Carpenter, L., Hanrahan, P., 1988, Volume
rendering, SIGGRAPH '88 Proceedings of the 15th
annual conference on Computer graphics and
interactive techniques, Atlanta, Georgia, 1988.
Roth S., 1982, Ray Casting for Modelling Solids,
Computer Graphics and Image Processing, Volume
18, pp. 109-144.
CLOSER 2011 - International Conference on Cloud Computing and Services Science
332