Automated machine learning predicts liver metastases in patients with early-onset gastroenteropancreatic neuroendocrine tumors

Fuli Gao; Jian Chen; Xiaodan Xu

doi:10.21037/jgo-2024-946

Original Article

Department of Gastroenterology, Changshu Hospital Affiliated to Soochow University, First People’s Hospital of Changshu City, Changshu, China

Contributions: (I) Conception and design: F Gao; (II) Administrative support: X Xu; (III) Provision of study materials or patients: F Gao; (IV) Collection and assembly of data: F Gao; (V) Data analysis and interpretation: F Gao, J Chen; (VI) Manuscript writing: All authors; (VII) Final approval of manuscript: All authors.

Correspondence to: Xiaodan Xu, MD, PhD. Department of Gastroenterology, Changshu Hospital Affiliated to Soochow University, First People’s Hospital of Changshu City, No. 1 Shuyuan Street, Changshu 215500, China. Email: xuxiaodan20@126.com.

Background: The incidence of early-onset gastroenteropancreatic neuroendocrine tumors (GEP-NETs) is increasing, with liver metastases often occurring early and adversely affecting prognosis. This study aimed to develop a predictive model for liver metastases detection in patients with early-onset GEP-NETs (<50 years) using an automated machine learning (AutoML) approach.

Methods: A retrospective analysis was conducted on patients diagnosed with early-onset GEP-NETs [2000–2021] using data from the Surveillance, Epidemiology, and End Results (SEER) database. Patients were randomly divided into a training set (n=8,983) and a validation set (n=3,819) in a 7:3 ratio. A nomogram-based scoring system was constructed using least absolute shrinkage and selection operator (LASSO) and logistic regression. AutoML was applied to build predictive models using gradient boosting machine (GBM), generalized linear model (GLM), deep learning (DL), and distributed random forest (DRF) algorithms. Model performance was assessed using receiver operating characteristic (ROC), calibration, decision curve analysis (DCA), and interpretability tools including SHapley Additive exPlanations (SHAP), partial dependence plots (PDPs), and locally interpretable model-agnostic explanations (LIME) plots.

Results: A total of 12,802 patients were included, of whom 1,187 (9.3%) developed liver metastases, comprising 851 (9.5%) and 336 (8.8%) cases in the training and validation sets, respectively. Comparative analyses demonstrated that the AutoML models outperformed traditional logistic regression models, with the GBM algorithm achieving the highest performance. The GBM model achieved an area under the curve (AUC) of 0.961 in the training set and 0.953 in the validation set. Tumor location was identified as the most important predictor in the GBM model, followed by surgery, tumor size, chemotherapy, and T-staging.

Conclusions: The AutoML model leveraging the GBM algorithm provides a robust and clinically valuable tool for the early prediction of liver metastases in patients with early-onset GEP-NETs.

Keywords: Automated machine learning (AutoML); predictive modeling; gastroenteropancreatic neuroendocrine tumors (GEP-NETs); age; liver metastasis

Submitted Dec 03, 2024. Accepted for publication Mar 19, 2025. Published online Jun 18, 2025.

Highlight box

Key findings

• This study developed a predictive model for liver metastases in early-onset gastroenteropancreatic neuroendocrine tumors (GEP-NETs) using an automated machine learning (AutoML) approach. The model, particularly based on the gradient boosting machine (GBM), outperformed traditional logistic regression models. Tumor location was the most important predictor.

What is known and what is new?

• Early-onset GEP-NETs are associated with early liver metastases, impacting prognosis.

• The AutoML-based GBM model provides more accurate predictions of liver metastases, surpassing traditional models in performance.

What is the implication, and what should change now?

• The AutoML model offers a clinically valuable tool for early prediction of liver metastases in early-onset GEP-NETs, aiding in timely intervention and potentially improving patient outcomes.

Introduction

Gastroenteropancreatic neuroendocrine tumors (GEP-NETs) are highly heterogeneous neoplasms originating from neuroendocrine cells within the gastrointestinal tract and pancreas (1). Liver metastases are a common complication, with 12–74% of patients presenting with metastatic disease at diagnosis, particularly those with pancreatic neuroendocrine tumors and small intestinal neuroendocrine tumors (2-4). Liver metastasis is a critical determinant of poor prognosis in GEP-NETs. Patients without liver metastases have a 5-year survival rate exceeding 70%; however, this rate decreases dramatically in the presence of liver metastases, especially in cases of high tumor burden or poorly differentiated tumors with a high proliferation index (5). Consequently, the early prediction of liver metastasis risk is vital for optimizing clinical management and improving patient outcomes.

Although GEP-NETs are more frequently diagnosed in older individuals, typically between 60 and 70 years of age, an increasing number of cases are being identified in younger patients due to advancements in endoscopic diagnostic techniques (6,7). This upward trend aligns with global epidemiological findings that show a rising incidence of early-onset cancers (8). Early-onset GEP-NETs, defined as neoplasms occurring in patients under 50 years of age, are often associated with more aggressive subtypes, such as Grade 3 tumors or those with a Ki-67 index exceeding 20%. These characteristics contribute to an elevated risk of liver metastases and poorer clinical outcomes (9-11). These observations underscore the need for robust predictive models to assess liver metastasis risk in early-onset GEP-NETs, which could inform treatment strategies and improve survival rates. Previous studies have attempted to predict liver metastasis in specific NET subtypes. For instance, Li et al. (12) developed a nomogram incorporating variables such as tumor differentiation, tumor size, N-stage, surgery, and bone metastases for pancreatic neuroendocrine tumors. Similarly, Ding et al. (13) constructed a predictive model for liver metastasis in colorectal neuroendocrine tumors. However, these traditional nomogram-based models rely on linear regression methods, which are limited in their ability to capture complex, nonlinear relationships and optimize feature selection.

Machine learning (ML) has emerged as a powerful analytical tool for extracting insights from multidimensional medical datasets. It has increasingly been applied across various domains, including disease prevention, treatment planning, and patient monitoring, driven by advancements in big data and artificial intelligence (AI) technologies (14,15). Automated machine learning (AutoML), a subfield of ML, further simplifies and enhances the modeling process by automating tasks such as feature engineering and hyperparameter tuning, thereby improving predictive accuracy and reducing the need for expert intervention (16). AutoML frameworks enable clinicians to efficiently develop accurate models for predicting liver metastases in early-onset GEP-NETs, potentially transforming clinical decision-making. To date, no predictive model specifically addressing liver metastasis in early-onset GEP-NETs has been reported. To address this gap, this study utilized the Surveillance, Epidemiology, and End Results (SEER) database and employed the AutoML framework on the H2O platform to develop a robust prediction model. This approach aims to provide new insights and tools for the early identification of liver metastases in patients with early-onset GEP-NETs, ultimately improving patient outcomes. We present this article in accordance with the TRIPOD reporting checklist (available at https://jgo.amegroups.com/article/view/10.21037/jgo-2024-946/rc).

Methods

Study population

Patients diagnosed with GEP-NETs between January 1, 2000, and December 31, 2021, were identified using SEER*Stat 8.4.3 software and data from the SEER database {Research Data, 17 Registries, Nov 2023 Sub [2000–2021]} (https://seer.cancer.gov/). Inclusion criteria were as follows: (I) ICD-O-3 histological codes 8013, 8150–8157, 8240–8246, 8249, 8574, and 9091 (malignant); (II) tumor site codes for the esophagus (C15.0–C15.9), stomach (C16.0–C16.9), duodenum (C17.0 and C24.1), small intestine (C17.1–C17.9), colon (C18.0–C19.9), and rectum (C20.0–C20.9); and (III) age younger than 50 years. Exclusion criteria included (I) missing or unknown liver metastasis status; (II) age ≥50 years; and (III) unknown pathology. The study was conducted in accordance with the Declaration of Helsinki and its subsequent amendments.

Data collected included patient demographics (age, gender, race, marital status, and year of diagnosis), socioeconomic indicators [median household income categorized as low-income (<$50,000), middle-income ($50,000–$100,000), and high-income (>$100,000)], residential classifications (rural: adjacent or not adjacent to metropolitan areas; urban: categorized by population size), and tumor characteristics (liver metastasis, T-stage, N-stage, tumor location, differentiation, histology, size, first malignant primary, surgery, radiation therapy, and chemotherapy). Liver metastasis data have been available in the SEER database since 2010 under the variable “SEER Combined Mets at DX-liver (2010+)”. T-staging and N-staging were based on the AJCC 7th edition. A detailed study design flowchart is presented in Figure 1.

Figure 1 The flowchart of this study. DL, deep learning; DRF, distributed random forest; GBM, gradient boosting machine; GEP-NENs, gastroenteropancreatic neuroendocrine neoplasms; GLM, generalized linear model; LASSO, least absolute shrinkage and selection operator.

Multiple imputation

To address missing data, we utilized the “mice” package in R for multiple imputation. The missing rates for variables were as follows: race (3.3%), marital status (9.3%), surgery (0.3%), tumor size (16.0%), tumor grade (41.4%), rural/urban classification (0.2%), T-stage (2.4%), and N-stage (2.3%). Polyreg imputation was applied to race, grade, surgery, marital status, and T- and N-stage, while predictive mean matching (PMM) was used for tumor size, and logistic regression was applied to rural/urban classification. The consistency of the data after imputation was validated, enhancing the robustness of the analysis.
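The idea behind predictive mean matching, as applied here to tumor size, can be sketched in a few lines of Python. This is a simplified, single-predictor illustration under stated assumptions, not the procedure run in the study: the function name `pmm_impute`, the donor count `k`, the predictor, and the toy data are all hypothetical, and the “mice” implementation (which also draws regression parameters and produces several imputed datasets) is not reproduced.

```python
import random


def pmm_impute(x, y, k=3, seed=0):
    """Fill missing y values (None) by predictive mean matching on one predictor x.

    Simplified sketch: a single least-squares fit on the observed cases,
    then each missing case borrows the observed y of one of its k nearest
    donors, where distance is measured between predicted means.
    """
    rng = random.Random(seed)
    observed = [(xi, yi) for xi, yi in zip(x, y) if yi is not None]
    n = len(observed)

    # Ordinary least squares for y ~ x, fitted on the observed cases only.
    mean_x = sum(xi for xi, _ in observed) / n
    mean_y = sum(yi for _, yi in observed) / n
    sxx = sum((xi - mean_x) ** 2 for xi, _ in observed)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in observed)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x

    def predict(xi):
        return intercept + slope * xi

    completed = []
    for xi, yi in zip(x, y):
        if yi is not None:
            completed.append(yi)  # observed values are never altered
            continue
        target = predict(xi)
        donors = sorted(observed, key=lambda d: abs(predict(d[0]) - target))[:k]
        completed.append(rng.choice(donors)[1])  # copy a real observed value
    return completed


# Hypothetical toy data: an ordinal predictor and tumor size in mm (None = missing).
predictor = [1, 1, 1, 3, 4, 4, 2]
size_mm = [5, 8, None, 30, 45, None, 12]
imputed = pmm_impute(predictor, size_mm)
```

Because every imputed value is copied from an observed case, the completed variable cannot contain implausible sizes (negative or fractional values), which is the usual reason for preferring this method for skewed continuous variables.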

Logistic regression analysis

To address potential multicollinearity among variables, the least absolute shrinkage and selection operator (LASSO) was used for variable selection. A 10-fold cross-validation approach was employed, guided by the “λ_1se” criterion. Subsequently, binary logistic regression with backward stepwise selection was performed to further refine the variables. Independent risk factors identified in the multivariate analysis were incorporated into a nomogram to provide a user-friendly visualization of the logistic regression model. The model’s performance was assessed using the area under the receiver operating characteristic curve (AUC), calibration curves, and decision curve analysis (DCA).
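The “λ_1se” criterion can be illustrated with a short Python sketch: among the candidate penalties, it selects the largest λ whose cross-validated error lies within one standard error of the minimum, which yields a sparser model than the error-minimizing λ. The function name `lambda_1se` and the cross-validation numbers below are hypothetical; the toy values were chosen only so that the rule lands on 0.022, the λ reported in the Results, and they are not the study's cross-validation output.

```python
def lambda_1se(lambdas, cv_mean, cv_se):
    """Return (lambda_min, lambda_1se) from cross-validation summaries.

    lambdas : candidate penalty values
    cv_mean : mean cross-validated error for each lambda
    cv_se   : standard error of that mean for each lambda
    """
    best = min(range(len(lambdas)), key=lambda i: cv_mean[i])
    threshold = cv_mean[best] + cv_se[best]
    # Largest penalty whose error is still within one SE of the minimum.
    eligible = [lam for lam, err in zip(lambdas, cv_mean) if err <= threshold]
    return lambdas[best], max(eligible)


# Invented illustration values (not taken from the study).
toy_lambdas = [0.001, 0.005, 0.010, 0.022, 0.050]
toy_cv_mean = [0.300, 0.296, 0.298, 0.301, 0.320]
toy_cv_se = [0.006, 0.006, 0.006, 0.006, 0.006]

lam_min, lam_1se = lambda_1se(toy_lambdas, toy_cv_mean, toy_cv_se)
```

In this toy example the error is lowest at λ = 0.005, but λ = 0.022 is still within one standard error of that minimum and is therefore preferred.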

AutoML

AutoML analyses were conducted using the H2O software package (h2o 3.46.0.4) on the H2O platform (www.h2o.ai). This platform integrates a variety of advanced ML algorithms, including gradient boosting machine (GBM), deep learning (DL), generalized linear model (GLM), distributed random forest (DRF), and stacked ensembles. Hyperparameter optimization was achieved through a 5-fold cross-validation grid search, with performance evaluated based on AUC metrics. To enhance the interpretability of the model and address the “black-box” nature of ML, we employed visualization techniques such as variable importance plots, SHapley Additive exPlanations (SHAP) values, partial dependence plots (PDPs), and locally interpretable model-agnostic explanations (LIME). These tools facilitated a clearer understanding of the contributions and significance of individual predictors, thereby improving the transparency and clinical utility of the model.
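For readers who wish to set up a comparable run, the following Python sketch shows how such a configuration might look with the H2O Python client. It is an assumption-laden illustration rather than the study's script: the analyses in this article were run in R, the file path, the target column name `liver_metastasis`, and the seed are hypothetical, and the 300-second cap is taken from the Results. Calling `run_automl` requires the `h2o` package and a Java runtime.

```python
# Settings mirroring what the article reports: 5-fold cross-validation,
# AUC as the ranking metric, and a 300-second modeling budget.
AUTOML_SETTINGS = {
    "nfolds": 5,
    "max_runtime_secs": 300,
    "sort_metric": "AUC",
    "include_algos": ["GBM", "GLM", "DRF", "DeepLearning", "StackedEnsemble"],
    "seed": 2024,  # hypothetical; the article does not report a seed
}


def run_automl(train_csv, target="liver_metastasis"):
    """Train H2O AutoML on a CSV file and return the fitted AutoML object.

    `train_csv` and `target` are placeholders for a locally prepared dataset.
    """
    # Imported lazily so the settings above can be inspected without h2o installed.
    import h2o
    from h2o.automl import H2OAutoML

    h2o.init()
    frame = h2o.import_file(train_csv)
    frame[target] = frame[target].asfactor()  # treat the outcome as binary
    predictors = [col for col in frame.columns if col != target]

    aml = H2OAutoML(**AUTOML_SETTINGS)
    aml.train(x=predictors, y=target, training_frame=frame)
    return aml
```

After training, `aml.leaderboard` lists the candidate models ranked by cross-validated AUC and `aml.leader` returns the top-ranked one.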

Statistical analysis

All statistical analyses were conducted using R software (version 4.4.2). Continuous variables were expressed as median and interquartile range [M (Q1, Q3)], and comparisons between groups were performed using the Mann-Whitney U test. Categorical variables were presented as counts (percentages), with comparisons performed using the χ² test or Fisher’s exact test. Model performance was evaluated using a confusion matrix, comprising true positive (TP), false positive (FP), false negative (FN), and true negative (TN). Metrics for model assessment included the AUC, accuracy, sensitivity, specificity, positive predictive value (PPV), negative predictive value (NPV), and likelihood ratio (LR). Statistical significance was defined as P<0.05.
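The metrics listed above follow directly from the four cells of the confusion matrix, as the Python sketch below shows. The function name `diagnostic_metrics` is hypothetical, and the example counts are not reported in the article: they were back-calculated for illustration so that they reproduce the rounded validation-set GBM row of Table 2 (336 patients with and 3,483 without liver metastases).

```python
def diagnostic_metrics(tp, fp, fn, tn):
    """Compute standard diagnostic metrics from confusion-matrix counts."""
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "accuracy": (tp + tn) / (tp + fp + fn + tn),
        "ppv": tp / (tp + fp),
        "npv": tn / (tn + fn),
        "lr_pos": sensitivity / (1 - specificity),  # positive likelihood ratio
        "lr_neg": (1 - sensitivity) / specificity,  # negative likelihood ratio
    }


# Illustrative counts only (back-calculated, not published by the authors).
example = diagnostic_metrics(tp=222, fp=145, fn=114, tn=3338)
rounded = {name: round(value, 3) for name, value in example.items()}
```

With these counts, `rounded` contains a sensitivity of 0.661, a specificity of 0.958, an accuracy of 0.932, a PPV of 0.605, an NPV of 0.967, an LR+ of 15.871, and an LR− of 0.354.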

Results

Baseline patient characteristics

This study included 12,802 patients with early-onset GEP-NETs, of whom 1,187 (9.3%) presented with liver metastases. The dataset was randomly divided into training (n=8,983) and validation (n=3,819) groups in a 7:3 ratio. Liver metastases were observed in 851 patients (9.5%) in the training set and 336 patients (8.8%) in the validation set. Table S1 summarizes baseline data comparisons between the training and validation groups, while Table 1 details the clinical characteristics of patients with and without liver metastases. Compared with patients without liver metastases, those with liver metastases were more likely to have larger tumor diameters, tumors located in the pancreas, deeper tumor stages (T3), lymph node involvement (N1), and neuroendocrine carcinoma histology, and they were more frequently treated with chemotherapy.

Table 1

Baseline characteristics of the patients

Variables	Training dataset (n=8,983)			Validation dataset (n=3,819)		
	No liver metastases (n=8,132)	Liver metastases (n=851)	P value	No liver metastases (n=3,483)	Liver metastases (n=336)	P value
Sex			<0.001			<0.001
Female	4,630 (56.9)	387 (45.5)		2,015 (57.9)	158 (47.0)
Male	3,502 (43.1)	464 (54.5)		1,468 (42.1)	178 (53.0)
Year of diagnosis			<0.001			<0.001
2010–2012	1,238 (15.2)	185 (21.7)		516 (14.8)	92 (27.4)
2013–2015	1,797 (22.1)	221 (26.0)		776 (22.3)	81 (24.1)
2016–2018	2,433 (29.9)	240 (28.2)		1,054 (30.3)	78 (23.2)
2019–2021	2,664 (32.8)	205 (24.1)		1,137 (32.6)	85 (25.3)
Race			0.17			0.60
American Indian/Alaska Native	70 (0.9)	6 (0.7)		36 (1.0)	1 (0.3)
Asian or Pacific Islander	645 (7.9)	67 (7.9)		266 (7.6)	28 (8.3)
Black	1,203 (14.8)	150 (17.6)		526 (15.1)	48 (14.3)
White	6,214 (76.4)	628 (73.8)		2,655 (76.2)	259 (77.1)
Site			<0.001			<0.001
Appendix	3,208 (39.4)	6 (0.7)		1,338 (38.4)	2 (0.6)
Colon	336 (4.1)	117 (13.7)		140 (4.0)	41 (12.2)
Duodenum	333 (4.1)	22 (2.6)		168 (4.8)	7 (2.1)
Esophagus	11 (0.1)	10 (1.2)		3 (0.1)	1 (0.3)
Pancreas	1,044 (12.8)	383 (45.0)		455 (13.1)	160 (47.6)
Rectum	1,807 (22.2)	58 (6.8)		739 (21.2)	13 (3.9)
Small intestine	740 (9.1)	217 (25.5)		350 (10.0)	89 (26.5)
Stomach	653 (8.0)	38 (4.5)		290 (8.3)	23 (6.8)
Grade			<0.001			<0.001
Well	6,828 (84.0)	412 (48.4)		2,948 (84.6)	174 (51.8)
Moderately	1,055 (13.0)	181 (21.3)		440 (12.6)	83 (24.7)
Poorly	249 (3.1)	258 (30.3)		95 (2.7)	79 (23.5)
Histology			<0.001			<0.001
Atypical carcinoid tumor	320 (3.9)	87 (10.2)		120 (3.4)	25 (7.4)
Carcinoid tumor	6,309 (77.6)	313 (36.8)		2,789 (80.1)	144 (42.9)
Goblet cell carcinoid	242 (3.0)	1 (0.1)		87 (2.5)	1 (0.3)
MANEC	88 (1.1)	13 (1.5)		30 (0.9)	4 (1.2)
Neuroendocrine carcinoma	1,072 (13.2)	388 (45.6)		426 (12.2)	151 (44.9)
Others	101 (1.2)	49 (5.8)		31 (0.9)	11 (3.3)
Surgery			<0.001			<0.001
No	834 (10.3)	507 (59.6)		381 (10.9)	181 (53.9)
Yes	7,298 (89.7)	344 (40.4)		3,102 (89.1)	155 (46.1)
Radiation			<0.001			<0.001
No/unknown	8,058 (99.1)	776 (91.2)		3,462 (99.4)	306 (91.1)
Yes	74 (0.9)	75 (8.8)		21 (0.6)	30 (8.9)
Chemotherapy			<0.001			<0.001
No/unknown	7,835 (96.3)	436 (51.2)		3,356 (96.4)	188 (56.0)
Yes	297 (3.7)	415 (48.8)		127 (3.6)	148 (44.0)
First malignant primary			0.16			0.98
No	530 (6.5)	45 (5.3)		240 (6.9)	23 (6.8)
Yes	7,602 (93.5)	806 (94.7)		3,243 (93.1)	313 (93.2)
Marital			<0.001			0.06
Divorced	460 (5.7)	56 (6.6)		188 (5.4)	18 (5.4)
Married	4,048 (49.8)	503 (59.1)		1,751 (50.3)	198 (58.9)
Separated	112 (1.4)	9 (1.1)		49 (1.4)	4 (1.2)
Single	3,383 (41.6)	265 (31.1)		1,440 (41.3)	114 (33.9)
Unmarried	86 (1.1)	11 (1.3)		37 (1.1)	1 (0.3)
Widowed	43 (0.5)	7 (0.8)		18 (0.5)	1 (0.3)
Rural/urban			0.64			0.59
Rural	815 (10.0)	81 (9.5)		371 (10.7)	39 (11.6)
Urban	7,317 (90.0)	770 (90.5)		3,112 (89.3)	297 (88.4)
Household income			0.09			0.85
Low	517 (6.4)	66 (7.8)		232 (6.7)	25 (7.4)
Middle	6,145 (75.6)	651 (76.5)		2,624 (75.3)	250 (74.4)
High	1,470 (18.1)	134 (15.7)		627 (18.0)	61 (18.2)
T			<0.001			<0.001
Tis	29 (0.4)	0 (0.0)		16 (0.5)	0 (0.0)
T0	6 (0.1)	7 (0.8)		3 (0.1)	4 (1.2)
T1	4,729 (58.2)	34 (4.0)		2,059 (59.1)	14 (4.2)
T2	1,009 (12.4)	169 (19.9)		420 (12.1)	63 (18.8)
T3	904 (11.1)	248 (29.1)		400 (11.5)	111 (33.0)
T4	349 (4.3)	174 (20.4)		157 (4.5)	68 (20.2)
Tx	1,106 (13.6)	219 (25.7)		428 (12.3)	76 (22.6)
N			<0.001			<0.001
N0	6,287 (77.3)	277 (32.5)		2,666 (76.5)	108 (32.1)
N1	1,061 (13.0)	443 (52.1)		488 (14.0)	179 (53.3)
Nx	784 (9.6)	131 (15.4)		329 (9.4)	49 (14.6)
Size (mm)	10.0 (5.0, 18.0)	40.0 (25.0, 60.0)	<0.001	10.0 (5.0, 18.0)	39.0 (25.0, 60.0)	<0.001

Data are presented as n (%) or median (interquartile range). MANEC, mixed adenoneuroendocrine carcinoma; N, node; T, tumor.

Logistic regression model

LASSO regression was used to address multicollinearity among 16 predictor variables, with a 5-fold cross-validation approach and “λ.1se (0.022)” as the selection criterion (Figure S1). After univariate analysis, five variables were selected: tumor grade, surgery, T-stage, N-stage, and tumor size. Binary logistic regression with backward stepwise selection excluded T-stage (P=0.28), resulting in four independent predictors: tumor grade, surgery, N-stage, and tumor size. These variables were incorporated into a nomogram (Figure 2). Model performance was robust, with AUCs of 0.919 and 0.904 for the training and validation sets, respectively. Figure 3 illustrates the ROC, calibration, and DCA results. Calibration curves demonstrated mean absolute errors of 0.013 and 0.040 for the training and validation sets, respectively, indicating a moderate discrepancy between predicted and actual risks (Hosmer-Lemeshow P<0.001). DCA curves showed that for threshold probabilities of liver metastasis between 2% and 70%, using the logistic regression model for intervention could achieve up to a 7% net clinical benefit.
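The net benefit plotted in a decision curve can be computed as in the Python sketch below, which uses the standard definition: true positives per patient minus false positives per patient, weighted by the odds of the threshold probability. The function names and the five-patient toy data are hypothetical and serve only to make the arithmetic visible; they are not study data.

```python
def net_benefit(y_true, risk, threshold):
    """Net benefit of intervening on patients whose predicted risk >= threshold."""
    n = len(y_true)
    tp = sum(1 for y, r in zip(y_true, risk) if r >= threshold and y == 1)
    fp = sum(1 for y, r in zip(y_true, risk) if r >= threshold and y == 0)
    return tp / n - fp / n * threshold / (1 - threshold)


def net_benefit_treat_all(y_true, threshold):
    """Net benefit of the reference strategy of intervening on every patient."""
    prevalence = sum(y_true) / len(y_true)
    return prevalence - (1 - prevalence) * threshold / (1 - threshold)


# Hypothetical toy cohort: 1 = liver metastasis, 0 = none.
toy_outcome = [1, 0, 1, 0, 0]
toy_risk = [0.9, 0.2, 0.6, 0.7, 0.1]

nb_model = net_benefit(toy_outcome, toy_risk, threshold=0.5)
nb_all = net_benefit_treat_all(toy_outcome, threshold=0.5)
```

At a threshold of 0.5, the toy model flags three patients (two true positives and one false positive), giving a net benefit of 2/5 − 1/5 = 0.2, whereas treating everyone gives 0.4 − 0.6 = −0.2. A model is useful at a given threshold when its curve lies above both the treat-all and treat-none (zero) lines.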

Figure 2 Nomogram of logistic regression model for predicting liver metastasis in patients with early-onset gastroenteropancreatic neuroendocrine tumors. N, node.

Figure 3 The ROC curves (A,D), calibration curves (B,E), and decision curves (C,F) of the logistic regression model in the training and validation sets. AUC, area under the curve; CI, confidence interval; ROC, receiver operating characteristic.

AutoML

All clinical data were preprocessed and analyzed using the H2O platform’s AutoML framework, which automated variable selection and model parameterization. A total of 61 models were generated, spanning five ML algorithms: GBM, DL, GLM, DRF, and Stacked Ensembles. Modeling time was capped at 300 seconds. The best-performing model was GBM (ID: GBM_grid_1_AutoML_1_20241028_102000_model_3), which achieved a Gini value of 0.922, an R² of 0.503, and a LogLoss of 0.140. In the training set, the GBM model had the highest AUC (0.961), followed by DL (0.956), DRF (0.954), and GLM (0.954). Similarly, in the validation set, GBM had the highest AUC (0.953), followed by DRF (0.948), DL (0.945), and GLM (0.943). As shown in Table 2, all AutoML-generated models (GBM, GLM, DRF, and DL) achieved higher AUC and accuracy than the logistic regression model in both the training and validation sets. The Stacked Ensemble model was excluded from the analysis because of its poor interpretability. These results indicate that ML models, particularly GBM, predicted the risk of liver metastasis in early-onset GEP-NET patients better than traditional logistic regression.
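The reported Gini value and the training AUC are two views of the same quantity if the usual definition, Gini = 2 × AUC − 1, is assumed. The Python sketch below computes AUC as the probability that a randomly chosen patient with liver metastasis receives a higher predicted risk than a randomly chosen patient without (ties count one half), and then converts it to Gini. The function names and the four-patient toy data are hypothetical.

```python
def auc_pairwise(labels, scores):
    """AUC as the fraction of positive-negative pairs ranked correctly."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


def gini_from_auc(auc):
    """Gini coefficient under the common definition 2 * AUC - 1."""
    return 2 * auc - 1


# Toy example: three of the four positive-negative pairs are ordered correctly.
toy_auc = auc_pairwise([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8])

# Consistency check against the figures reported for the GBM model.
reported_gini = round(gini_from_auc(0.961), 3)
```

Under this definition, an AUC of 0.961 corresponds to a Gini value of 0.922, which agrees with the value reported for the best GBM model.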

Table 2

Performance of the machine learning models in the training and validation group

Model	AUC	Sensitivity	Specificity	Accuracy	PPV	NPV	LR+	LR−
Train set
AutoML
GBM	0.961	0.689	0.966	0.940	0.679	0.967	20.216	0.322
DRF	0.954	0.718	0.958	0.935	0.642	0.970	17.122	0.294
GLM	0.954	0.725	0.954	0.932	0.621	0.971	15.681	0.288
DL	0.956	0.704	0.957	0.933	0.632	0.969	16.401	0.309
Logistic regression
LASSO	0.919	0.784	0.913	0.797	0.989	0.307	9.011	0.237
Validation set
AutoML
GBM	0.953	0.661	0.958	0.932	0.605	0.967	15.871	0.354
DRF	0.948	0.646	0.963	0.935	0.627	0.966	17.438	0.368
GLM	0.943	0.643	0.959	0.931	0.602	0.965	15.658	0.372
DL	0.945	0.652	0.959	0.927	0.605	0.966	15.875	0.363
Logistic regression
LASSO	0.904	0.769	0.887	0.779	0.986	0.270	7.000	0.260

AUC, area under the curve; AutoML, automated machine learning; DL, deep learning; DRF, distributed random forest; GBM, gradient boosting machine; GLM, generalized linear model; LASSO, least absolute shrinkage and selection operator; LR+, positive likelihood ratio; LR−, negative likelihood ratio; NPV, negative predictive value; PPV, positive predictive value.

Interpretability analysis based on the best model (GBM)

The GBM model identified tumor site as the most important variable influencing liver metastasis predictions, followed by surgery, tumor size, chemotherapy, T-stage, and N-stage (Figure 4A). The SHAP feature plots in Figure 4B confirm these findings, showing that tumor site, tumor size, T-stage, surgery, N-stage, and chemotherapy have substantial impacts on model predictions. Features ranked higher in the plot play a more significant role in predicting liver metastasis. For instance, the SHAP plot for tumor site reveals that higher normalized values (red dots) are associated with an increased risk of liver metastasis, while lower values (blue dots) correlate with reduced risk. Regarding surgery, the SHAP values indicate a protective effect: blue (low SHAP values) represents lower risk, reflecting the benefits of surgical intervention in reducing liver metastasis, while red (high SHAP values) corresponds to increased risk in the absence of surgery. PDPs further illuminate these relationships, showing a positive trend between tumor size and liver metastasis risk. Additionally, patients with specific tumor locations (e.g., small bowel or pancreatic neuroendocrine tumors), advanced T4 or N1 staging, no surgery, and chemotherapy are more likely to develop liver metastases (Figure S2). Figure 5 visualizes the interpretability of the GBM model using the LIME algorithm. For example, in a validation sample (P1 = liver metastasis, P0 = no liver metastasis), the GBM model predicted a 90% probability of liver metastasis for one patient, where chemotherapy emerged as the most influential predictor, followed by surgery, tumor size, and tumor site. In contrast, factors such as N-stage, radiotherapy, and tumor grade were associated with a reduced likelihood of metastasis.
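To make the meaning of a SHAP value concrete, the Python sketch below computes exact Shapley values for a single patient by averaging each feature's marginal contribution over all subsets of the other features. It is a didactic example under stated assumptions: the additive scoring function, its coefficients, the feature names, and the reference patient are invented, and the tree-specific algorithm that SHAP tooling applies to GBM models is not reproduced.

```python
from itertools import combinations
from math import factorial


def shapley_values(model, x, baseline):
    """Exact Shapley values of `model` at `x` relative to `baseline`.

    Features outside the coalition are set to their baseline value.
    Cost grows exponentially with the number of features, so this is
    practical only for small illustrations.
    """
    names = list(x)
    n = len(names)

    def value(coalition):
        mixed = {k: (x[k] if k in coalition else baseline[k]) for k in names}
        return model(mixed)

    phi = {}
    for name in names:
        others = [k for k in names if k != name]
        total = 0.0
        for size in range(n):
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for subset in combinations(others, size):
                coalition = set(subset)
                total += weight * (value(coalition | {name}) - value(coalition))
        phi[name] = total
    return phi


def toy_score(f):
    # Hypothetical additive risk score; coefficients are made up.
    return 1.5 * f["pancreatic_site"] + 1.0 * f["no_surgery"] + 0.02 * f["size_mm"]


patient = {"pancreatic_site": 1, "no_surgery": 1, "size_mm": 40}
reference = {"pancreatic_site": 0, "no_surgery": 0, "size_mm": 10}
phi = shapley_values(toy_score, patient, reference)
```

The contributions sum to the difference between the patient's score and the reference score, which is the additivity property that allows SHAP plots to decompose an individual prediction feature by feature.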

Figure 4 Variable importance and SHAP of the GBM model in the training cohort. Panel (A) shows that tumor location is the most important feature. As illustrated in (B), when the variable value approaches 1, the likelihood of liver metastasis increases for the patient. GBM, gradient boosting machine; N, node; SHAP, SHapley Additive exPlanations; T, tumor.

Figure 5 LIME visualization of variable importance in a random sample from the validation cohort (demonstrating the impact of key variables on individual predictions; p0 represents no liver metastasis, and p1 represents liver metastasis). DX, diagnosis; LIME, locally interpretable model-agnostic explanations; N, node; T, tumor.

Discussion

The rapid development of ML algorithms has transformed how AI models are built, but it also places significant demands on the modeler’s expertise and technical skills in areas such as model selection, feature extraction, and hyperparameter optimization (17). In response to these challenges, several technology companies have developed automated learning frameworks such as H2O’s AutoML and Google’s Cloud AutoML (18). These tools greatly simplify the preliminary stages of ML development, including data preprocessing, feature selection, and environment setup, and they automate algorithm selection, optimization, and hyperparameter tuning. This automation significantly enhances modeling efficiency. In this study, we employed both AutoML and traditional logistic regression approaches to develop a model for the early prediction of liver metastasis in patients with early-onset GEP-NETs. AutoML demonstrated superior efficiency and accuracy compared to univariate and multivariate logistic regression analyses. Specifically, the GBM, GLM, DRF, and DL models generated using AutoML outperformed traditional logistic regression in both AUC and accuracy. Among these, the GBM model exhibited the best results, achieving an AUC of 0.961 and accuracy of 0.940 in the training set, and an AUC of 0.953 and accuracy of 0.932 in the validation set.

Other studies corroborate the critical role of ML algorithms in predictive modeling. For instance, one research effort employed seven ML methods to extract gene features from RNA sequencing (RNA-Seq) datasets of pancreatic and small bowel NET tissue samples, achieving accuracies of 98.4% and 87.4% in the training and test cohorts, respectively (19). Another study combined computational pathology scores and DL radiomics to predict postoperative liver metastasis in pancreatic NET patients, demonstrating robust model performance (20). Traditional logistic regression has also been used to develop predictive models, such as the nomogram for liver metastasis in pancreatic NET patients created by Pan et al. using univariate and multivariate analyses (21). Nomograms are advantageous due to their straightforward visualization, allowing clinicians to calculate risk scores intuitively and facilitating communication with patients. However, logistic regression models are limited in their ability to handle high-dimensional data and complex covariate interactions. ML models, on the other hand, excel in managing high-dimensional datasets and capturing nonlinear relationships, making them adept at identifying complex feature interactions. Despite these strengths, ML approaches have limitations, particularly in interpretability. The “black-box” nature of these models can hinder clinical adoption, as the underlying basis of predictions is often not easily understood by clinicians.

To address the “black box” effect of ML, we utilized various visualization techniques to elucidate the role of key variables in both overall model predictions and individual prediction processes. Analysis of variable importance revealed that in the GBM model, the top three predictors were tumor site, surgery, and tumor size, with tumor site being the most critical factor in predicting liver metastasis in patients with early-onset GEP-NETs. Previous studies have demonstrated that tumors originating from different primary sites exhibit distinct biological behaviors and metastatic tendencies. Small bowel and pancreatic NETs have the highest risk of liver metastasis. For instance, a European epidemiological survey reported that small intestinal NETs accounted for 56% of liver metastases in neuroendocrine neoplasms (NENs) (2). Similarly, a national survey in Japan found that 23.2% of pancreatic NETs presented with distant metastases, including liver metastases, at diagnosis (22). In our study, pancreatic and small intestinal primaries likewise accounted for the largest shares of patients with liver metastases, at 45.7% and 25.8%, respectively. These findings emphasize the importance of heightened vigilance for liver metastasis in cases involving primary tumors in the pancreas and small intestine.

In addition to tumor site, tumor size and surgery emerged as significant predictors of liver metastasis, consistently identified by both the GBM model and logistic regression. Tumor size has been strongly associated with metastasis risk. For example, gastric NETs with diameters >2 cm, particularly G2 or G3 grade tumors, exhibit a markedly increased likelihood of liver metastasis (23). In small intestinal NETs, tumor size ≥1 cm is considered a high-risk factor for liver metastasis, with larger tumors often linked to higher rates of lymph node and distant metastases (24). For pancreatic NETs, the rate of liver metastasis increases from approximately 10% for tumors <2 cm in diameter to over 50% for those ≥4 cm (25). The impact of tumor size on metastasis is likely twofold: larger tumors may exhibit greater invasiveness, enabling them to breach tissue barriers and enter blood or lymphatic vessels, and they may possess enhanced proliferative capacity and angiogenesis, facilitating distant spread (26).

Surgery plays a dual role as both a predictor and a treatment modality for liver metastasis in patients with GEP-NENs. Studies indicate that surgical resection of small bowel NETs with tumor diameters <2 cm and no lymph node involvement significantly reduces the risk of liver metastasis (27). Similarly, resecting the primary tumor can slow disease progression and lower the risk of secondary metastases, particularly in pancreatic NETs (28). For limited GEP-NETs, timely removal of the primary tumor and regional lymph nodes reduces tumor burden, interrupts metastatic pathways, and improves overall survival (29). Surgery is also critical in managing patients with liver metastases, especially those with resectable primary and metastatic foci. It not only enhances survival outcomes but also improves quality of life by reducing tumor burden and alleviating symptoms. Evidence suggests that the 5-year survival rate for patients undergoing complete resection of primary and metastatic tumors can reach 50–80%, substantially higher than that of patients receiving only non-surgical treatments (30). While surgical intervention offers numerous benefits for patients with GEP-NETs, controversy persists regarding the resection of the primary tumor site, necessitating further investigation. For instance, a prospective, non-randomized, international, multicenter cohort study (NCT03084770) has shown that, for sporadic, asymptomatic non-functioning pancreatic NENs smaller than 2 cm, active surveillance is often favored over aggressive surgical management (31).

Despite the promising performance and interpretability of the AutoML-based prediction model, there are certain limitations in this study. First, as a retrospective analysis using the SEER database, issues such as missing data and potential bias are unavoidable, although multiple imputation was employed to mitigate these effects. Second, the study’s population is specific to the United States, and while a validation cohort was used, additional external validation is necessary to generalize the findings to other populations or regions. Third, the absence of key markers such as the Ki-67 proliferation index and mitotic rates in the SEER database may have impacted predictive accuracy. Lastly, approximately 40% of the cohort had missing data on tumor differentiation, potentially introducing bias despite imputation efforts. Future research should aim to integrate comprehensive molecular data and validate the model across multicenter cohorts.

In summary, this study employed AutoML to create an effective predictive model for the early detection of liver metastases in early-onset gastroenteropancreatic NENs. The GBM model outperformed traditional logistic regression and other AutoML models (e.g., GLM, DL, DRF), showcasing AutoML’s ability to overcome limitations of conventional methods.

Conclusions

The GBM-based AutoML model provides a reliable, user-friendly tool for the early detection of liver metastases in patients with early-onset GEP-NETs, with strong potential to enhance personalized healthcare and optimize resource use in clinical settings.

Acknowledgments

The authors express their gratitude for the valuable efforts undertaken by the SEER Program in establishing and maintaining the SEER database.

Footnote

Reporting Checklist: The authors have completed the TRIPOD reporting checklist. Available at https://jgo.amegroups.com/article/view/10.21037/jgo-2024-946/rc

Peer Review File: Available at https://jgo.amegroups.com/article/view/10.21037/jgo-2024-946/prf

Funding: This study was supported by the Science and Technology Project of Changshu Health Committee (No. CSWS202014) and the Suzhou 23rd Science and Technology Development Program Project (Clinical Trial Organization Capacity Enhancement) (No. SLT2023006).

Conflicts of Interest: All authors have completed the ICMJE uniform disclosure form (available at https://jgo.amegroups.com/article/view/10.21037/jgo-2024-946/coif). The authors have no conflicts of interest to declare.

Ethical Statement: The authors are accountable for all aspects of the work in ensuring that questions related to the accuracy or integrity of any part of the work are appropriately investigated and resolved. The study was conducted in accordance with the Declaration of Helsinki and its subsequent amendments.

Open Access Statement: This is an Open Access article distributed in accordance with the Creative Commons Attribution-NonCommercial-NoDerivs 4.0 International License (CC BY-NC-ND 4.0), which permits the non-commercial replication and distribution of the article with the strict proviso that no changes or edits are made and the original work is properly cited (including links to both the formal publication through the relevant DOI and the license). See: https://creativecommons.org/licenses/by-nc-nd/4.0/.

References

1. Fernandes CJ, Leung G, Eads JR, et al. Gastroenteropancreatic Neuroendocrine Tumors. Gastroenterol Clin North Am 2022;51:625-47.
2. Riihimäki M, Hemminki A, Sundquist K, et al. The epidemiology of metastases in neuroendocrine tumors. Int J Cancer 2016;139:2679-86.
3. Tran CG, Sherman SK, Chandrasekharan C, et al. Surgical Management of Neuroendocrine Tumor Liver Metastases. Hematol Oncol Clin North Am 2025;39:37-53.
4. Xu Q, Yin B, Han X, et al. Long-term Outcomes of Surgical Treatment for Pancreatic Neuroendocrine Neoplasm With Synchronous Hepatic Metastasis: A Multicenter Retrospective Cohort Study. Pancreas 2025;54:e179-87.
5. Sharma A, Muralitharan M, Ramage J, et al. Current Management of Neuroendocrine Tumour Liver Metastases. Curr Oncol Rep 2024;26:1070-84.
6. Nguyen AH, O'Leary MP, De Andrade JP, et al. Presentation and survival of gastro-entero-pancreatic neuroendocrine tumors in young adults versus older patients. Am J Surg 2022;223:939-44.
7. Lee MR, Harris C, Baeg KJ, et al. Incidence Trends of Gastroenteropancreatic Neuroendocrine Tumors in the United States. Clin Gastroenterol Hepatol 2019;17:2212-2217.e1.
8. Ugai T, Sasamoto N, Lee HY, et al. Is early-onset cancer an emerging global epidemic? Current evidence and future implications. Nat Rev Clin Oncol 2022;19:656-73.
9. Cazzato RL, Hubelé F, De Marini P, et al. Liver-Directed Therapy for Neuroendocrine Metastases: From Interventional Radiology to Nuclear Medicine Procedures. Cancers (Basel) 2021;13:6368.
10. Trikalinos NA, Tan BR, Amin M, et al. Effect of metastatic site on survival in patients with neuroendocrine neoplasms (NENs). An analysis of SEER data from 2010 to 2014. BMC Endocr Disord 2020;20:44.
11. Yao H, Hu G, Jiang C, et al. Epidemiologic trends and survival of early-onset gastroenteropancreatic neuroendocrine neoplasms. Front Endocrinol (Lausanne) 2023;14:1241724.
12. Li J, Huang L, Liao C, et al. Two machine learning-based nomogram to predict risk and prognostic factors for liver metastasis from pancreatic neuroendocrine tumors: a multicenter study. BMC Cancer 2023;23:529.
13. Ding X, Tian S, Hu J, et al. Risk and prognostic nomograms for colorectal neuroendocrine neoplasm with liver metastasis: a population-based study. Int J Colorectal Dis 2021;36:1915-27.
14. Leite D, Martins A Jr, Rativa D, et al. An Automated Machine Learning Approach for Real-Time Fault Detection and Diagnosis. Sensors (Basel) 2022;22:6138.
15. Puri M. Automated Machine Learning Diagnostic Support System as a Computational Biomarker for Detecting Drug-Induced Liver Injury Patterns in Whole Slide Liver Pathology Images. Assay Drug Dev Technol 2020;18:1-10.
16. LeDell E, Poirier S. editors. H2O AutoML: Scalable automatic machine learning. ICML San Diego, CA, USA: Proceedings of the AutoML Workshop at ICML; 2020.
17. Handelman GS, Kok HK, Chandra RV, et al. eDoctor: machine learning and the future of medicine. J Intern Med 2018;284:603-19.
18. Zeng Y, Zhang J. A machine learning model for detecting invasive ductal carcinoma with Google Cloud AutoML Vision. Comput Biol Med 2020;122:103861.
19. Padwal MK, Basu S, Basu B. Application of Machine Learning in Predicting Hepatic Metastasis or Primary Site in Gastroenteropancreatic Neuroendocrine Tumors. Curr Oncol 2023;30:9244-61.
20. Ma M, Gu W, Liang Y, et al. A novel model for predicting postoperative liver metastasis in R0 resected pancreatic neuroendocrine tumors: integrating computational pathology and deep learning-radiomics. J Transl Med 2024;22:768.
21. Pan M, Yang Y, Teng T, et al. Development and validation of a simple-to-use nomogram to predict liver metastasis in patients with pancreatic neuroendocrine neoplasms: a large cohort study. BMC Gastroenterol 2021;21:101.
22. Masui T, Ito T, Komoto I, et al. Recent epidemiology of patients with gastro-entero-pancreatic neuroendocrine neoplasms (GEP-NEN) in Japan: a population-based study. BMC Cancer 2020;20:1104.
23. Lamberti G, Panzuto F, Pavel M, et al. Gastric neuroendocrine neoplasms. Nat Rev Dis Primers 2024;10:25.
24. Ye X, Wang L, Xing Y, et al. Frequency, prognosis and treatment modalities of newly diagnosed small bowel cancer with liver metastases. BMC Gastroenterol 2020;20:342.
25. Howe JR, Merchant NB, Conrad C, et al. The North American Neuroendocrine Tumor Society Consensus Paper on the Surgical Management of Pancreatic Neuroendocrine Tumors. Pancreas 2020;49:1-33.
26. Harrelson A, Wang R, Stewart A, et al. Management of neuroendocrine tumor liver metastases. Am J Surg 2023;226:623-30.
27. Niederle B, Selberherr A, Niederle MB. How to Manage Small Intestine (Jejunal and Ileal) Neuroendocrine Neoplasms Presenting with Liver Metastases? Curr Oncol Rep 2021;23:85.
28. Butz F, Dukaczewska A, Jann H, et al. Surgical Approach to Liver Metastases in GEP-NET in a Tertiary Reference Center. Cancers (Basel) 2023;15:2048.
29. Zheng M, Li Y, Li T, et al. Resection of the primary tumor improves survival in patients with gastro-entero-pancreatic neuroendocrine neoplasms with liver metastases: A SEER-based analysis. Cancer Med 2019;8:5128-36.
30. Pu N, Habib JR, Bejjani M, et al. The effect of primary site, functional status and treatment modality on survival in gastroenteropancreatic neuroendocrine neoplasms with synchronous liver metastasis: a US population-based study. Ann Transl Med 2021;9:329.
31. Partelli S, Massironi S, Zerbi A, et al. Management of asymptomatic sporadic non-functioning pancreatic neuroendocrine neoplasms no larger than 2 cm: interim analysis of prospective ASPEN trial. Br J Surg 2022;109:1186-90.

Cite this article as: Gao F, Chen J, Xu X. Automated machine learning predicts liver metastases in patients with early-onset gastroenteropancreatic neuroendocrine tumors. J Gastrointest Oncol 2025;16(3):937-949. doi: 10.21037/jgo-2024-946
