Feldman, C. H., Hiraki, L. T., Liu, J., et al. (2013).
Epidemiology and sociodemographics of systemic
lupus erythematosus and lupus nephritis among US
adults with Medicaid coverage, 2000–2004. Arthritis &
Rheumatism, 65, 753–763.
Feldman, C. H., Hiraki, L. T., Winkelmayer, W. C., et al.
(2015). Serious infections among adult Medicaid
beneficiaries with systemic lupus erythematosus and
lupus nephritis. Arthritis & Rheumatology, 67, 1577–
1585.
Hak, A. E., Karlson, E. W., Feskanich, D., Stampfer, M. J.,
& Costenbader, K. H. (2009). Systemic lupus
erythematosus and the risk of cardiovascular disease:
Results from the Nurses' Health Study. Arthritis &
Rheumatism, 61, 1396–1402.
Hintenberger, R., Falkinger, A., Danninger, K., &
Pieringer, H. (2018). Cardiovascular disease in patients
with autoinflammatory syndromes. Rheumatology
International, 38, 37–50.
Kim, S. C., Solomon, D. H., Rogers, J. R., et al. (2017).
Cardiovascular safety of tocilizumab versus tumor
necrosis factor inhibitors in patients with rheumatoid
arthritis: A multi-database cohort study. Arthritis &
Rheumatology, 69, 1154–1164.
Kim, S. Y., Servi, A., Polinski, J. M., et al. (2011). Validation
of rheumatoid arthritis diagnoses in health care utilization
data. Arthritis Research & Therapy, 13, R32.
Lenert, A., Oh, G., Ombrello, M. J., & Kim, S. (2020a).
Clinical characteristics and comorbidities in adult-onset
Still’s disease using a large US administrative claims
database. Rheumatology (Oxford).
Lenert, A., Russell, M. J., Segerstrom, S., & Kim, S.
(2020b). Accuracy of US administrative claims codes
for the diagnosis of autoinflammatory syndromes.
Journal of Clinical Rheumatology.
Liao, K. P., Ananthakrishnan, A. N., Kumar, V., et al.
(2015). Methods to develop an electronic medical
record phenotype algorithm to compare the risk of
coronary artery disease across 3 chronic disease
cohorts. PLoS One, 10, e0136651.
Liao, K. P., Cai, T., Gainer, V., et al. (2010). Electronic
medical records for discovery research in rheumatoid
arthritis. Arthritis Care & Research (Hoboken), 62,
1120–1127.
Liao, K. P., Cai, T., Savova, G. K., et al. (2015).
Development of phenotype algorithms using electronic
medical records and incorporating natural language
processing. BMJ, 350, h1885.
Liao, K. P., Diogo, D., Cui, J., et al. (2014). Association
between low density lipoprotein and rheumatoid
arthritis genetic factors with low density lipoprotein
levels in rheumatoid arthritis and non-rheumatoid
arthritis controls. Annals of the Rheumatic Diseases,
73, 1170–1175.
Liao, K. P., Sparks, J. A., Hejblum, B. P., et al. (2017).
Phenome-wide association study of autoantibodies to
citrullinated and noncitrullinated epitopes in
rheumatoid arthritis. Arthritis & Rheumatology, 69,
742–749.
McGonagle, D., & McDermott, M. F. (2006). A proposed
classification of the immunological diseases. PLoS
Medicine, 3, e297.
Ramirez, A. H., Shi, Y., Schildcrout, J. S., et al. (2012).
Predicting warfarin dosage in European-Americans and
African-Americans using DNA samples linked to an
electronic health record. Pharmacogenomics, 13, 407–
418.
Ridker, P. M. (2016). From C-reactive protein to
interleukin-6 to interleukin-1: Moving upstream to
identify novel targets for atheroprotection. Circulation
Research, 118, 145–156.
Yu, S., Chakrabortty, A., Liao, K. P., Cai, T.,
Ananthakrishnan, A. N., Gainer, V. S., et al. (2017).
Surrogate-assisted feature extraction for high-
throughput phenotyping. Journal of the American
Medical Informatics Association, 24(e1), e143–e149.
Zhang, Y., Cai, T., Yu, S., Cho, K., Hong, C., Sun, J., et al.
(2019). High-throughput phenotyping with electronic
medical record data using a common semi-supervised
approach (PheCAP). Nature Protocols, 14(12), 3426–
3444.
Zheng, C., Rashid, N., Wu, Y. L., Koblick, R., Lin, A. T.,
Levy, G. D., & Cheetham, T. C. (2014). Using natural
language processing and machine learning to identify
gout flares from electronic clinical notes. Arthritis Care
& Research, 66(11), 1740–1748.
APPENDIX
The table below lists the features used in all three
final training algorithms, along with the gold-standard
labels. Features with non-zero beta coefficients in the
ALASSO model are highlighted in bold.
AIS features extracted from SAFE

Claims codes:
M06.1, 714.20, 714.30, 136.1, M35.2, M04.2, 277.31, M04.1, M04.8, M04.9

UMLS features (CUI: Concept Names):
C0040423: tonsillectomy
C003864: anakinra
C0042164: uveitis
C0151281: genital ulcers
C0009262: colchicine
C0031350: pharyngitis
C0009763: conjunctivitis
C0031154: peritonitis
C0031046: pericarditis
C0152031: swollen joints
C0149745: oral ulcers
C0037198: sinus thrombosis
C0010592: cyclosporine
C1609165: tocilizumab
C0027059: myocarditis
C0015974: periodic fever
C0149744: oral lesions
C2718773: canakinumab
C0031069: familial Mediterranean fever
C0001416: adenitis
C0152026: retinal vasculitis
C2343589: rilonacept
C0277799: episodic fever
C0038363: aphthous stomatitis
C0343068: familial cold autoinflammatory syndrome
C1510431: superficial thrombophlebitis
C3161802: pathergy test
C0018784: sensorineural deafness
C1096155: macrophage activation syndrome
C0268390: muckle wells syndrome
C0847014: fever rash
C0020641: hypopyon
C0424781: fever spikes
C0742540: colchicine treatment
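
For readers who want to reuse the UMLS features listed above, the following is a minimal sketch of how a "CUI: concept name" list might be parsed and sanity-checked before use. It is not part of the study's pipeline. The names `RAW_FEATURES`, `parse_features`, and `malformed_cuis` are hypothetical, the input is only a short excerpt of the list, and the check assumes that a well-formed CUI is the letter C followed by seven digits.

```python
import re

# Hypothetical excerpt of the UMLS feature list above, in "CUI: concept name" form.
RAW_FEATURES = (
    "C0040423: tonsillectomy, C003864: anakinra, "
    "C0042164: uveitis, C0009262: colchicine"
)

# Assumption: a well-formed UMLS CUI is the letter C followed by seven digits.
CUI_PATTERN = re.compile(r"C\d{7}")


def parse_features(raw):
    """Split a comma-separated 'CUI: name' string into (cui, name) pairs."""
    pairs = []
    for item in raw.split(","):
        cui, name = item.split(":", 1)
        pairs.append((cui.strip(), name.strip()))
    return pairs


def malformed_cuis(pairs):
    """Return the CUIs that do not match the assumed seven-digit format."""
    return [cui for cui, _ in pairs if not CUI_PATTERN.fullmatch(cui)]


features = parse_features(RAW_FEATURES)
print(malformed_cuis(features))
```

A check of this kind may help catch identifiers that were truncated during transcription before they are matched against a terminology source.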
Identifying an Autoinflammatory Syndrome Cohort Using Natural Language Processing with Electronic Medical Record Data