can be seen that the “compound sentence” is
responsible for most of the errors.
4 CONCLUSION
Two new algorithms for identifying context in free-text
medical narratives have been presented. The new
algorithms were shown to be superior to traditional
text classification algorithms for common medical
terms such as nausea, abdominal pain, and weight
loss. Further research is needed to test the suggested
algorithms on other medical concepts. The Profile
Based Learning Algorithm is also very simple, yet it
still outperforms other, more complicated methods.
Figure 1: Distribution of Errors for the Profile Based
Learning Algorithm (compound sentence 58%, positive
adjective 23%, reference to the future 10%, wrong
sentence boundaries 6%, negation indicating existence 3%).
ICEIS 2006 - ARTIFICIAL INTELLIGENCE AND DECISION SUPPORT SYSTEMS