Epidemiol Health : Epidemiology and Health

Original Article
Predicting over-the-counter antibiotic use in rural Pune, India, using machine learning methods
Pravin Arun Sawant, Sakshi Shantanu Hiralkar, Yogita Purushottam Hulsurkar, Mugdha Sharad Phutane, Uma Satish Mahajan, Abhay Machindra Kudale
Epidemiol Health. 2024;46:e2024044.   Published online April 13, 2024
DOI: https://doi.org/10.4178/epih.e2024044
Abstract
OBJECTIVES
Over-the-counter (OTC) antibiotic use can cause antibiotic resistance, threatening global public health gains. To counter OTC use, this study used machine learning (ML) methods to identify predictors of OTC antibiotic use in rural Pune, India.
METHODS
The features of OTC antibiotic use were selected using stepwise logistic, lasso, random forest, XGBoost, and Boruta algorithms. Regression and tree-based models with all confirmed and tentatively important features were built to predict the use of OTC antibiotics. Five-fold cross-validation was used to tune the models’ hyperparameters. The final model was selected based on the highest area under the receiver operating characteristic curve (AUROC) with a 95% confidence interval (CI) and the lowest log-loss.
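The two selection metrics named above can be sketched in plain Python. This is a minimal illustration, not the authors' code: the function names, the toy labels and scores, and the tie-breaking rule in `pick_best` (highest AUROC first, then lowest log-loss) are assumptions made for the example.

```python
import math

def auroc(y_true, y_score):
    """AUROC as the share of positive/negative pairs ranked correctly (ties count 0.5)."""
    pos = [s for y, s in zip(y_true, y_score) if y == 1]
    neg = [s for y, s in zip(y_true, y_score) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))

def log_loss(y_true, y_prob, eps=1e-15):
    """Mean negative log-likelihood of binary labels under predicted probabilities."""
    total = 0.0
    for y, p in zip(y_true, y_prob):
        p = min(max(p, eps), 1 - eps)  # clip to avoid log(0)
        total -= y * math.log(p) + (1 - y) * math.log(1 - p)
    return total / len(y_true)

def pick_best(models):
    """models maps a name to (auroc, log_loss); assumed rule: max AUROC, then min log-loss."""
    return max(models, key=lambda name: (models[name][0], -models[name][1]))

# Toy data, invented for illustration only.
labels = [1, 0, 1, 0]
scores = [0.9, 0.2, 0.6, 0.7]
toy_auroc = auroc(labels, scores)
toy_loss = log_loss(labels, scores)
```

The pairwise AUROC is quadratic in sample size, which is acceptable for a sketch; a rank-based formula would be preferable on large data sets.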
RESULTS
In rural Pune, the prevalence of OTC antibiotic use was 35.9% (95% CI, 31.6 to 40.5). The perception that buying medicines directly from a medicine shop/pharmacy is useful, using antibiotics for eye-related complaints, more household members consuming antibiotics, and longer duration and higher doses of antibiotic consumption in rural blocks and other social groups were confirmed as important features by the Boruta algorithm. The final model was the XGBoost+Boruta model with 7 predictors (AUROC, 0.934; 95% CI, 0.891 to 0.978; log-loss, 0.279).
CONCLUSIONS
XGBoost+Boruta, with 7 predictors, was the most accurate model for predicting OTC antibiotic use in rural Pune. Using OTC antibiotics for eye-related complaints, higher consumption of antibiotics, and the perception that buying antibiotics directly from a medicine shop/pharmacy is useful were identified as key factors for planning interventions to improve awareness about proper antibiotic use.
Special Article
Identification of acute myocardial infarction and stroke events using the National Health Insurance Service database in Korea
Minsung Cho, Hyeok-Hee Lee, Jang-Hyun Baek, Kyu Sun Yum, Min Kim, Jang-Whan Bae, Seung-Jun Lee, Byeong-Keuk Kim, Young Ah Kim, JiHyun Yang, Dong Wook Kim, Young Dae Kim, Haeyong Pak, Kyung Won Kim, Sohee Park, Seng Chan You, Hokyou Lee, Hyeon Chang Kim
Epidemiol Health. 2024;46:e2024001.   Published online December 26, 2023
DOI: https://doi.org/10.4178/epih.e2024001
Abstract
OBJECTIVES
The escalating burden of cardiovascular disease (CVD) is a critical public health issue worldwide. CVD, especially acute myocardial infarction (AMI) and stroke, is the leading contributor to morbidity and mortality in Korea. We aimed to develop algorithms for identifying AMI and stroke events from the National Health Insurance Service (NHIS) database and validate these algorithms through medical record review.
METHODS
We first established a concept and definition of “hospitalization episode,” taking into account the unique features of the health claims-based NHIS database. We then developed first and recurrent event identification algorithms, separately for AMI and stroke, to determine whether each hospitalization episode represents a true incident case of AMI or stroke. Finally, we assessed our algorithms’ accuracy by calculating their positive predictive values (PPVs) based on medical records of algorithm-identified events.
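The abstract does not spell out how a hospitalization episode is defined. As a rough illustration of the general idea of collapsing several claims into one episode, the sketch below merges claims whose admission falls within a short gap of the previous discharge. The function name `build_episodes`, the `max_gap_days` parameter, its default of 1 day, and the sample dates are all hypothetical and are not taken from the study.

```python
from datetime import date, timedelta

def build_episodes(claims, max_gap_days=1):
    """Merge (admission, discharge) claims into episodes.

    Assumption for illustration: a claim joins the current episode when its
    admission date is no later than the episode's end plus max_gap_days.
    """
    episodes = []
    for admit, discharge in sorted(claims):
        if episodes and admit <= episodes[-1][1] + timedelta(days=max_gap_days):
            start, end = episodes[-1]
            episodes[-1] = (start, max(end, discharge))  # extend the open episode
        else:
            episodes.append((admit, discharge))  # start a new episode
    return episodes

# Invented claims for one patient: two back-to-back stays and one later stay.
claims = [
    (date(2020, 1, 6), date(2020, 1, 10)),
    (date(2020, 1, 1), date(2020, 1, 5)),
    (date(2020, 3, 1), date(2020, 3, 3)),
]
episodes = build_episodes(claims)
```

With these invented dates, the two January claims collapse into one episode and the March claim stays separate.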
RESULTS
We developed identification algorithms for both AMI and stroke. To validate them, we conducted a retrospective review of medical records for 3,140 algorithm-identified events (1,399 AMI and 1,741 stroke events) across 24 hospitals throughout Korea. The overall PPVs for the first and recurrent AMI events were around 92% and 78%, respectively, while those for the first and recurrent stroke events were around 88% and 81%, respectively.
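A PPV is the share of algorithm-identified events that medical record review confirms as true events. The sketch below computes it together with a Wilson score interval; the choice of the Wilson method and the counts used (90 confirmed out of 100 reviewed) are assumptions for illustration, since the abstract reports neither per-group counts nor an interval method.

```python
import math

def ppv(confirmed, flagged):
    """Positive predictive value: confirmed true events / algorithm-identified events."""
    if flagged <= 0:
        raise ValueError("flagged must be positive")
    return confirmed / flagged

def wilson_interval(successes, n, z=1.96):
    """Wilson score interval for a binomial proportion (z=1.96 gives about 95%)."""
    p = successes / n
    denom = 1 + z * z / n
    centre = (p + z * z / (2 * n)) / denom
    half = z * math.sqrt(p * (1 - p) / n + z * z / (4 * n * n)) / denom
    return centre - half, centre + half

# Hypothetical counts, not from the study.
example_ppv = ppv(90, 100)
low, high = wilson_interval(90, 100)
```

The Wilson interval is used here because it stays inside [0, 1] even when the proportion is close to 1, which is the typical situation for a well-performing identification algorithm.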
CONCLUSIONS
We successfully developed algorithms for identifying AMI and stroke events. The algorithms demonstrated high accuracy, with PPVs of approximately 90% for first events and 80% for recurrent events. These findings indicate that our algorithms hold promise as an instrumental tool for the consistent and reliable production of national CVD statistics in Korea.
Summary
Key Message
In this study, we developed algorithms to identify acute myocardial infarction (AMI) and stroke events from the Korean National Health Insurance Service database. To validate them, we conducted a retrospective review of medical records across 24 hospitals throughout Korea. The overall positive predictive values for the first and recurrent AMI events were around 92% and 78%, respectively, while those for the first and recurrent stroke events were around 88% and 81%, respectively.

Original Article
Identifying pregnancy episodes and estimating the last menstrual period using an administrative database in Korea: an application to patients with systemic lupus erythematosus
Yu-Seon Jung, Yeo-Jin Song, Jihyun Keum, Ju Won Lee, Eun Jin Jang, Soo-Kyung Cho, Yoon-Kyoung Sung, Sun-Young Jung
Epidemiol Health. 2024;46:e2024012.   Published online December 19, 2023
DOI: https://doi.org/10.4178/epih.e2024012
Abstract
OBJECTIVES
This study developed an algorithm for identifying pregnancy episodes and estimating the last menstrual period (LMP) in an administrative claims database and applied it to investigate the use of pregnancy-incompatible immunosuppressants among pregnant women with systemic lupus erythematosus (SLE).
METHODS
An algorithm was developed and applied to a nationwide claims database in Korea. Pregnancy episodes were identified using a hierarchy of pregnancy outcomes and clinically plausible periods for subsequent episodes. The LMP was estimated using preterm delivery, sonography, and abortion procedure codes. Otherwise, outcome-specific estimates were applied, assigning a fixed gestational age to the corresponding pregnancy outcome. The algorithm was used to examine the prevalence of pregnancies and utilization of pregnancy-incompatible immunosuppressants (cyclophosphamide [CYC]/mycophenolate mofetil [MMF]/methotrexate [MTX]) and non-steroidal anti-inflammatory drugs (NSAIDs) during pregnancy in SLE patients.
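The LMP estimation step described above can be sketched as a two-level fallback: use a gestational age derived from a claim code when one is available, otherwise subtract a fixed outcome-specific gestational age from the outcome date. The function name, the `code_based_ga_weeks` parameter, and the week values in `ASSUMED_GA_WEEKS` are placeholders invented for this example; the abstract does not report the values the study used.

```python
from datetime import date, timedelta

# Placeholder gestational ages in weeks; NOT the study's values.
ASSUMED_GA_WEEKS = {"live_birth": 39, "stillbirth": 28, "abortion": 10}

def estimate_lmp(outcome, outcome_date, code_based_ga_weeks=None):
    """Estimate the last menstrual period for one pregnancy episode.

    If a gestational age could be derived from a claim code (for example a
    preterm delivery or sonography code), it takes priority; otherwise the
    fixed outcome-specific value is applied.
    """
    if code_based_ga_weeks is not None:
        weeks = code_based_ga_weeks
    else:
        weeks = ASSUMED_GA_WEEKS[outcome]
    return outcome_date - timedelta(weeks=weeks)

# Invented example dates.
lmp_default = estimate_lmp("live_birth", date(2020, 10, 1))
lmp_preterm = estimate_lmp("live_birth", date(2020, 10, 1), code_based_ga_weeks=34)
```

Keeping the fixed values in a single lookup table makes it easy to swap in validated estimates once they are known.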
RESULTS
The pregnancy outcomes identified in SLE patients included live births (67%), stillbirths (2%), and abortions (31%). The LMP was mostly estimated with outcome-specific estimates for full-term births (92.3%) and using sonography procedure codes (54.7%) and preterm delivery diagnosis codes (37.9%) for preterm births. The use of CYC/MMF/MTX decreased from 7.6% during preconception to 0.2% at the end of pregnancy. CYC/MMF/MTX use was observed in 3.6% of women within 3 months preconception and 2.5% during 0-7 weeks of pregnancy.
CONCLUSIONS
This study presents the first pregnancy algorithm using a Korean administrative claims database. Although further validation is necessary, this study provides a foundation for evaluating the safety of medications during pregnancy using secondary databases in Korea, especially for rare diseases.
Summary
Korean summary
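Claims-based studies that evaluate the safety of drug treatment during pregnancy in real-world populations are important for providing evidence on the safety of medication use in pregnant women. In this study, we developed an algorithm, applicable to Korean claims data, for defining pregnancy and operationally defining pregnancy outcomes. The algorithm uses a hierarchical structure that accounts for the priority among pregnancy outcomes, and it estimates the last menstrual period using codes such as preterm delivery and sonography codes. We also applied the algorithm to patients with systemic lupus erythematosus to calculate the prevalence of outcomes such as abortion and stillbirth and to identify potentially inappropriate immunosuppressant use during pregnancy, laying the groundwork for studies of medication use during pregnancy that account for the characteristics of Korean claims data.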
Key Message
Limited safety data for pregnant women prompted recent studies on medication during pregnancy using real-world databases. This study developed a tailored algorithm for a Korean healthcare claims database, employing a hierarchy of pregnancy outcomes and incorporating preterm delivery and sonography codes for last menstrual period estimation. Applied to systemic lupus erythematosus (SLE) patients, this study presented the prevalence and drug utilization pattern of pregnancy-incompatible immunosuppressants from preconception to pregnancy end, laying a foundation for further claims database studies on medication safety during pregnancy.
