Reading Tools

How to cite item

Machine learning-based gastric cancer risk prediction in an asymptomatic screening population: a retrospective cohort study

@article{JGO120153,
	author = {Ji Hyun Song and San Wang and Young Sun Kim and Sun Young Yang and Hae Yeon Kang and Marharyta Kurban and Sanghee Lim and Jeong Yoon Yim},
	title = {Machine learning-based gastric cancer risk prediction in an asymptomatic screening population: a retrospective cohort study},
	journal = {Journal of Gastrointestinal Oncology},
	volume = {17},
	number = {3},
	year = {2026},
	keywords = {},
	abstract = {Background: Early detection of gastric cancer is critical for improving survival. Although established risk factors are known, few studies comprehensively assess individual gastric cancer risk across diverse clinical features, particularly within a general screening population. Existing machine learning (ML) models often rely on high-risk clinical cohorts or utilize classification methods that fail to account for the essential time-to-event nature of survival data, thus limiting their utility for developing personalized, long-term screening strategies in asymptomatic individuals. This study aimed to develop and evaluate clinically explainable survival-based ML models for personalized gastric cancer risk stratification during longitudinal follow-up using data from a large cohort of asymptomatic individuals undergoing routine health screening. Methods: Comprehensive medical annual check-up data, including endoscopic findings and blood test results, were collected from 129,223 patients who visited one of the largest medical screening facilities in South Korea between 2007 and 2020. We trained several survival-based ML models [e.g., Extreme Gradient Boosting (XGBoost) Survival, DeepSurv, Random Survival Forest] as well as a conventional Cox Proportional Hazards (CPH) regression model. Model behavior was interpreted using SHapley Additive exPlanations (SHAP). Results: Survival-based ML models demonstrated comparable discrimination performance, with the XGBoost Survival model achieving an average concordance index of 0.78. Helicobacter pylori infection, chronic atrophic gastritis, and intestinal metaplasia were the most significant risk factors contributing to cancer development. Model explanations derived from SHAP analysis were well aligned with established clinical knowledge. Conclusions: This study demonstrates the feasibility of applying explainable survival-based ML approaches for gastric cancer risk stratification in a general screening population. While the findings suggest potential clinical relevance, further calibration assessment, external validation, and prospective evaluation are required before translation into clinical decision-making or surveillance strategy development.},
	issn = {2219-679X},
	url = {https://jgo.amegroups.org/article/view/120153}
}
