was used to count the number of edits necessary to
transform the final output string into an exact match.
On average, 2.48 edits per sample were needed to
reach an exact match.
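The edit count described above behaves like an edit distance. As a minimal sketch, assuming the metric is Levenshtein distance (single-character insertions, deletions, and substitutions), the per-sample count and its average could be computed as follows. The function names and the imprint strings in the usage note are illustrative assumptions, not taken from the system described in this report.

```python
def levenshtein(a: str, b: str) -> int:
    """Minimum number of single-character insertions, deletions,
    and substitutions needed to turn string a into string b."""
    # prev[j] holds the distance between the processed prefix of a and b[:j].
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]  # turning a[:i] into the empty string takes i deletions
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # delete ca
                            curr[j - 1] + 1,      # insert cb
                            prev[j - 1] + cost))  # substitute ca with cb
        prev = curr
    return prev[-1]


def mean_edits(pairs):
    """Average edit count over (ocr_output, true_imprint) pairs."""
    return sum(levenshtein(out, truth) for out, truth in pairs) / len(pairs)
```

For example, `mean_edits([("A8C 123", "ABC 123"), ("ABC123", "ABC 123")])` averages one substitution and one insertion over two hypothetical samples.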
7 CONCLUSIONS
This report outlines a pill identification system that
achieves a higher degree of automatic identification
than previously reported. Further improvement is
needed prior to practical application.
7.1 More Image Testing
Images in other databases, especially those taken in
the field, have variable lighting and focus. Our
segmentation, which achieves a median error of 2.2%
on the current images, is likely to degrade when the
algorithms are applied to such images. Other
algorithms and additional color space dimensions,
such as the b* dimension of the L*a*b* color space,
will be tested.
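To illustrate what the b* dimension could contribute, the sketch below converts an sRGB pixel to its CIE L*a*b* b* value (blue-yellow axis, D65 white point) and marks pixels whose b* lies far from neutral. The thresholding rule, the threshold value, and the assumption of a neutral gray background are ours for illustration only; they are not the segmentation method of this report.

```python
def _linear(c):
    """Undo the sRGB gamma curve for one 8-bit channel value."""
    c = c / 255.0
    return c / 12.92 if c <= 0.04045 else ((c + 0.055) / 1.055) ** 2.4


def _f(t):
    """CIE L*a*b* companding function."""
    delta = 6 / 29
    return t ** (1 / 3) if t > delta ** 3 else t / (3 * delta ** 2) + 4 / 29


def b_star(rgb):
    """b* of an (R, G, B) pixel with 8-bit channels, D65 white point."""
    r, g, b = (_linear(c) for c in rgb)
    y = 0.2126729 * r + 0.7151522 * g + 0.0721750 * b
    z = 0.0193339 * r + 0.1191920 * g + 0.9503041 * b
    return 200.0 * (_f(y / 1.0) - _f(z / 1.08883))


def segment_by_b_star(image, threshold=20.0):
    """Boolean mask: True where |b*| exceeds the threshold.

    Assumes (hypothetically) a neutral background, whose b* is near zero.
    """
    return [[abs(b_star(p)) > threshold for p in row] for row in image]
```

On a tiny test image with gray, yellow, and blue pixels, the gray pixels stay below the threshold while the yellow (positive b*) and blue (negative b*) pixels are marked as foreground.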
7.2 Color Recognition Improvements
Color recognition accuracy, measured by logistic
regression with a current error of 1.9%, is expected
to degrade with other images. Future steps to improve
color recognition include stronger image blurring,
RGB histogram normalization before processing, and
adding L*a*b* to the current list of channels. We
will also explore other adaptive methods to ensure
that data is not lost in the averaging step.
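RGB histogram normalization can be read in several ways. One simple reading, sketched below as an assumption rather than as the planned method, is per-channel contrast stretching: each channel is rescaled so that its observed minimum maps to 0 and its maximum to 255, which reduces the effect of a global color cast before color features are computed.

```python
def stretch_channels(image):
    """Per-channel min-max stretch of an image given as rows of (R, G, B) tuples."""
    pixels = [p for row in image for p in row]
    lo = [min(p[c] for p in pixels) for c in range(3)]
    hi = [max(p[c] for p in pixels) for c in range(3)]

    def scale(value, c):
        if hi[c] == lo[c]:
            return value  # flat channel: nothing to stretch
        return round((value - lo[c]) * 255 / (hi[c] - lo[c]))

    return [[tuple(scale(p[c], c) for c in range(3)) for p in row]
            for row in image]
```

Histogram equalization would be an alternative reading; it redistributes intensities more aggressively and may distort the hues that color recognition depends on.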
7.3 Shape Recognition Improvements
A secondary problem concerns shape recognition.
Of the twelve shape types, the three most common are
so prevalent that uncommon shapes, e.g. teardrop or
pentagonal, are under-selected. A special function
for these uncommonly shaped pills is needed.
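The report does not specify what this special function would be. One common way to counter class imbalance, shown here purely as an assumed option, is to weight each shape class inversely to its frequency so that errors on rare shapes cost more during classifier training. The shape counts in the usage note are invented for illustration.

```python
from collections import Counter


def inverse_frequency_weights(labels):
    """Weight per class: n_samples / (n_classes * class_count).

    Rare classes receive weights above 1, common classes below 1.
    """
    counts = Counter(labels)
    n_samples = len(labels)
    n_classes = len(counts)
    return {shape: n_samples / (n_classes * count)
            for shape, count in counts.items()}
```

With the hypothetical labels `["round"] * 6 + ["oval"] * 3 + ["teardrop"]`, the teardrop class receives the largest weight and the round class the smallest.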
7.4 Imprint Recognition Improvements
The imprint extraction algorithm outlined above
suggests a two-part system. First, the image should
be processed to improve the raw OCR results. Second,
the OCR output string should be analyzed to limit the
final output to a finite vocabulary. Preliminary
efforts have been inconclusive. The number of
mathematical morphology operations, such as repeated
dilation or erosion, that produces the best results
for a given image is not known, and choosing it
currently relies largely on human input. The string
matching techniques could also be improved so that
they return only relevant information and exclude
words of little value.
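The search for a suitable number of morphology operations could in principle be automated. The sketch below implements binary dilation and erosion with a 3x3 square structuring element on a mask of 0/1 values and tries each iteration count against a caller-supplied scoring function. The `score` callback is hypothetical: in practice it would stand in for some measure of OCR quality, which this report leaves open.

```python
def _apply(mask, combine):
    """Apply a 3x3 neighborhood rule; windows are clipped at the image border."""
    h, w = len(mask), len(mask[0])
    out = []
    for y in range(h):
        row = []
        for x in range(w):
            window = [mask[ny][nx]
                      for ny in range(max(0, y - 1), min(h, y + 2))
                      for nx in range(max(0, x - 1), min(w, x + 2))]
            row.append(int(combine(window)))
        out.append(row)
    return out


def dilate(mask, iterations=1):
    """A pixel becomes 1 if any pixel in its 3x3 neighborhood is 1."""
    for _ in range(iterations):
        mask = _apply(mask, any)
    return mask


def erode(mask, iterations=1):
    """A pixel stays 1 only if every pixel in its 3x3 neighborhood is 1."""
    for _ in range(iterations):
        mask = _apply(mask, all)
    return mask


def best_dilation_count(mask, score, max_iterations=5):
    """Return the dilation count in 0..max_iterations with the highest score."""
    return max(range(max_iterations + 1),
               key=lambda n: score(dilate(mask, n)))
```

The same sweep could be run over erosion counts or over combinations of both, at the cost of one scoring pass per candidate setting.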
7.5 Improvements for a Practical System
Multiple challenges must be met to complete a
working system. Fusion of the information from
shape, color, and character determination will be
needed. The images in the Pillbox database are of
higher quality than can be obtained with a
smartphone under real-life conditions. Non-ideal
lighting, irregular positioning, and limited
resolution are additional challenges that must be
overcome before a practical system is available for
health and law enforcement use.
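How the shape, color, and character information should be fused is left open here. One simple possibility, given only as an assumed sketch, is a weighted sum of per-feature match scores for each candidate pill, followed by ranking. The candidate names, scores, and weights are invented for illustration.

```python
def fuse_scores(candidates, weights):
    """Rank candidates by a weighted sum of per-feature scores in [0, 1].

    candidates: {name: {"shape": s, "color": c, "imprint": i}}
    weights:    {"shape": w1, "color": w2, "imprint": w3}
    A feature missing for a candidate contributes a score of 0.
    """
    fused = {name: sum(weights[f] * scores.get(f, 0.0) for f in weights)
             for name, scores in candidates.items()}
    return sorted(fused.items(), key=lambda item: item[1], reverse=True)


# Hypothetical example: the imprint is weighted most heavily.
candidates = {
    "pill_A": {"shape": 0.9, "color": 0.8, "imprint": 0.2},
    "pill_B": {"shape": 0.6, "color": 0.7, "imprint": 0.95},
}
weights = {"shape": 0.2, "color": 0.2, "imprint": 0.6}
ranked = fuse_scores(candidates, weights)
```

A weighted sum keeps each feature's contribution easy to inspect; the weights themselves would need to be tuned on labeled data.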