gorithms for this purpose. Eclipse Memory Analyzer
or MAT (The Eclipse Foundation, 2010) is an example
of production-quality heap dump analysis software
that is freely available. However, such offline analysis
has several problems: heap dumps can be expensive
to acquire in a production environment (because
generating a dump file requires freezing the application)
and heap dump files can be very large (up to several
gigabytes, depending on the memory configuration).
Because of the file size, it can be hard to run
analysis software on a regular development machine.
Another drawback is the static nature of the memory
dump: it contains no information about where the
objects were allocated, so finding the code responsible
for a memory leak is a separate task from just
finding the leaked objects.
Another approach is to monitor certain collection
classes for unbounded growth. This approach relies
on bytecode instrumentation, and one possible solution
is described in (Xu and Rountev, 2008). This
technique is also used in several Application Performance
Monitoring (APM) suites, for example
CA Wily Introscope® LeakHunter™ (CA Wily Introscope,
2010) and AppDynamics (AppDynamics, 2010). Both
APM suites add some intelligence to ease
finding the cause of the memory leak. Unfortunately,
no information is available about the exact algorithms
used in them. Also, these APM suites target
Java Enterprise applications and are not applicable to,
for example, desktop GUI applications.
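The idea of watching a collection for unbounded growth can be illustrated with a minimal sketch. It is not the algorithm of (Xu and Rountev, 2008) or of either APM suite, whose details are not given here. The class name GrowthMonitor, the window parameter and the rule "the size grew in each of the last N samples" are assumptions made only for illustration, and a real tool would take its samples through bytecode instrumentation rather than explicit calls.

```java
import java.lang.ref.WeakReference;
import java.util.Collection;

/** Hypothetical helper: flags a collection whose size keeps growing across samples. */
final class GrowthMonitor {
    // Weak reference, so that the monitor itself does not keep the collection alive.
    private final WeakReference<Collection<?>> target;
    private final int window;      // assumed threshold: consecutive growing samples
    private int lastSize;
    private int growingSamples;

    GrowthMonitor(Collection<?> target, int window) {
        this.target = new WeakReference<>(target);
        this.window = window;
        this.lastSize = target.size();
    }

    /** Takes one sample, e.g. from a periodic timer or after a garbage collection. */
    void sample() {
        Collection<?> c = target.get();
        if (c == null) {           // collection was reclaimed, nothing to report
            growingSamples = 0;
            return;
        }
        int size = c.size();
        if (size > lastSize) {
            growingSamples++;      // grew since the previous sample
        } else {
            growingSamples = 0;    // shrank or stayed the same: reset the streak
        }
        lastSize = size;
    }

    /** True once the collection has grown in `window` consecutive samples. */
    boolean isSuspect() {
        return growingSamples >= window;
    }
}
```

A monitor of this kind only reports that a collection grows; it does not say which code adds the elements, which is the gap the rest of this section discusses.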
As an alternative to direct bytecode instrumentation,
aspect-oriented instrumentation may be used to collect
the metrics needed for memory leak detection. The FindLeaks
tool uses AspectJ pointcuts for this purpose
to analyze references between objects and to identify
leaking objects together with their allocation sites
(Chen and Chen, 2007). In that paper, only GUI applications
were used for testing.
Profilers are often used in development for finding
memory leaks. Different profilers gather
different metrics that may help in finding memory leaks.
For example, the profiler of the NetBeans IDE can record
object age, which a human operator can then use
to apply a statistical method (Sedlacek, 2010). This
data can be collected during object allocation profiling.
The major disadvantage of profilers is the need for a
qualified and experienced operator who can find the
actual leaking code. An inexperienced developer, given
a profiler, the fact of a memory leak and a reasonably
big code base, would be unlikely to succeed in this
process.
In addition to the various instrumentation and bytecode
modification techniques, there are several research
projects that apply statistical methods
to analyze unnecessary references between objects
(Cork; Jump and McKinley, 2007) and stale objects
(SWAT; Chilimbi and Hauswirth, 2004). Cork
implements a statistical memory leak detection algorithm
directly in the virtual machine by integrating
the method into the garbage collector itself. Cork
achieves a small performance penalty of
only 2% and good results in memory leak detection
(Jump and McKinley, 2007). The only problem is
that this project is implemented as a module in the
Jikes Research Virtual Machine (RVM) (The Jikes RVM
Project, 2010), which makes it usable mostly in the
research community, as industry is not very keen
on adopting a research VM.
The biggest disadvantage of these methods is the need
for a qualified human operator to analyze the gathered
data and find the actual place in the source code responsible
for the memory leak. We think that this manual
decision and search process can also be automated.
3 STATISTICAL APPROACH TO
MEMORY LEAK DETECTION
Based on the review of related work, we noted that
there is still room for an automated end-to-end memory
leak detection solution that would work on the
HotSpot or OpenJDK Java Virtual Machines, would
assist the developer as much as possible by pinpointing both
the allocation and reference points of leaking objects, and
would also do so in distributed and cloud environments
with a performance penalty small enough to
be usable in production systems. A similar idea for
memory leak detection with a statistical method is described
in (Formanek and Sporar, 2006) as an example
of an application of dynamic Java bytecode instrumentation.
However, so far it has not been implemented
end-to-end in any known profiler or scientific
publication.
There are several challenges in implementing
this approach using standard tools:
• Gathering the data with low overhead at runtime,
as the number of objects created while an application
runs is huge (for example, the SPECjvm2008 benchmark,
which we used to test the performance impact, created
877 958 317 objects during its 2-hour run).
• Applying the statistical method in real time
to detect classes suspected of leaking.
• Applying dynamic bytecode instrumentation to find
the allocation sites and, most importantly, the objects
referencing the leaking ones (as the objects referencing
leaking ones, rather than those instantiating them,
are the actual sources of the leaks).
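To give a sense of scale for the first challenge: 877 958 317 objects in a 2-hour (7200 s) run is roughly 120 000 allocations per second on average, so any per-object bookkeeping must be very cheap. The second challenge can be sketched as follows, under a loudly stated assumption: that the statistical criterion is based on object age measured in garbage-collection generations, with a class becoming suspect when its live instances span many distinct generations. This section does not spell out the exact criterion, so the class GenerationStats, its callbacks and the threshold are hypothetical and serve only as an illustration.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical bookkeeping: per class, live instance counts keyed by allocation generation. */
final class GenerationStats {
    // class name -> (generation of allocation -> number of instances still alive)
    private final Map<String, Map<Integer, Integer>> live = new HashMap<>();
    private int currentGeneration = 0;

    /** Assumed to be called by instrumentation when an instance is allocated. */
    void onAllocate(String className) {
        live.computeIfAbsent(className, k -> new HashMap<>())
            .merge(currentGeneration, 1, Integer::sum);
    }

    /** Assumed to be called when an instance allocated in `generation` is reclaimed. */
    void onFree(String className, int generation) {
        Map<Integer, Integer> perGen = live.get(className);
        if (perGen != null) {
            // Decrement the count; returning null removes an emptied generation.
            perGen.computeIfPresent(generation, (g, n) -> n > 1 ? n - 1 : null);
        }
    }

    /** Assumed to be called after each garbage collection: a new generation starts. */
    void onGarbageCollection() {
        currentGeneration++;
    }

    int currentGeneration() {
        return currentGeneration;
    }

    /** Number of distinct generations that still have live instances of the class. */
    int survivingGenerations(String className) {
        Map<Integer, Integer> perGen = live.get(className);
        return perGen == null ? 0 : perGen.size();
    }

    /** Classes whose live instances span at least `threshold` generations, sorted by name. */
    List<String> suspects(int threshold) {
        List<String> result = new ArrayList<>();
        for (Map.Entry<String, Map<Integer, Integer>> e : live.entrySet()) {
            if (e.getValue().size() >= threshold) {
                result.add(e.getKey());
            }
        }
        Collections.sort(result);
        return result;
    }
}
```

Keeping one counter per class and generation, rather than one record per object, is one way to keep the bookkeeping proportional to the number of classes instead of the number of allocations.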
CLOSER 2011 - International Conference on Cloud Computing and Services Science