Bioinformatics analysis of TCGA data identifies a taurine metabolism-related subtype classification for predicting prognosis in colon adenocarcinoma
Original Article

Xinping Sun#, Juhua Dai#, Bozhi Lin, Yujing Sun, Liyuan Chen

Department of Laboratory Medicine, Peking University International Hospital, Beijing, China

Contributions: (I) Conception and design: X Sun, L Chen; (II) Administrative support: L Chen; (III) Provision of study materials or patients: J Dai, B Lin, Y Sun; (IV) Collection and assembly of data: J Dai, B Lin; (V) Data analysis and interpretation: X Sun, L Chen; (VI) Manuscript writing: All authors; (VII) Final approval of manuscript: All authors.

#These authors contributed equally to this work.

Correspondence to: Liyuan Chen, MD. Department of Laboratory Medicine, Peking University International Hospital, No. 1 Life Park Road, Zhongguancun Life Science Park, Changping District, Beijing 102206, China. Email: chenliyuan@pkuih.edu.cn.

Background: The increasing incidence and mortality of colon adenocarcinoma (COAD) underscore the urgent clinical need for improved prognostic biomarkers. Current prognostic models often lack precision, highlighting the necessity for research focused on molecular signatures derived from The Cancer Genome Atlas (TCGA). This study aims to address this gap by utilizing TCGA data to identify a robust prognosis prediction model.

Methods: RNA-sequencing datasets and clinical information for COAD patients were sourced from the TCGA database. Non-negative matrix factorization (NMF) was applied to the TCGA-COAD cohort to identify molecular subtypes associated with taurine metabolism. A comparative analysis was conducted to evaluate immune infiltration and survival outcomes across the identified subtypes. Subsequently, we evaluated prognostic outcomes, specifically survival rates and recurrence, using Kaplan-Meier analysis. The prediction model was developed using a training set derived from TCGA data, employing least absolute shrinkage and selection operator (LASSO) regression and multivariate Cox regression techniques.

Results: The analysis identified a total of 597 genes with prognostic significance in COAD, among which several taurine metabolism-related genes were identified, including HSPB1, NOS2, LEP, KPNA2, SERPINA1, NR1H2, ENO2, HSPA1A, TRPV1, GSR, ALOX12, GABRD, TERT, CLCN3, AGMAT, NOTCH3, and MYB. Based on the expression profiles linked to the taurine metabolism-related genes, the NMF algorithm successfully classified patients from the TCGA-COAD cohort into two distinct expression clusters: cluster 1 (C1) and cluster 2 (C2). To examine the underlying mechanisms differentiating these two clusters, 199 differentially expressed genes (DEGs) were identified. A Gene Ontology (GO) analysis of these DEGs revealed that they were primarily engaged in biological processes such as extracellular matrix (ECM) organization, collagen fibril organization, and cell-substrate adhesion. Notably, disparities in immune activity were observed between the two taurine metabolism-related clusters in COAD. The cancer stem cell (CSC) scores of the patients in C1 in the TCGA-COAD cohort were significantly higher than those of the patients in C2. Further investigations using the LASSO and Cox regression methods led to the identification of 17 genes implicated in taurine metabolism associated with COAD. Subsequently, a prognostic model comprising nine genes (i.e., LEP, SERPINA1, ENO2, HSPA1A, GSR, GABRD, TERT, NOTCH3, and MYB) was developed to predict the prognosis of COAD patients. Furthermore, the efficacy of the prognostic model was evaluated via a receiver operating characteristic curve analysis, which revealed area under the curve values of 0.698 for 1 year, 0.699 for 3 years, and 0.73 for 5 years.

Conclusions: The findings of this study have significant clinical implications, suggesting that our nine-gene prognostic model could be integrated into routine clinical practice to enhance patient stratification and inform treatment decisions for COAD. Future research should focus on prospective validation and exploration of therapeutic targets within the identified genes.

Keywords: Colon adenocarcinoma (COAD); prognosis; taurine metabolism; taurine metabolism-related signature; biomarkers


Submitted Jul 29, 2025. Accepted for publication Sep 29, 2025. Published online Oct 24, 2025.

doi: 10.21037/jgo-2025-605


Highlight box

Key findings

• Taurine metabolism is associated with the progression of colon adenocarcinoma (COAD) and could therefore serve as a prognostic indicator.

What is known, and what is new?

• Evaluating the genes associated with taurine metabolism in COAD is worthwhile.

• This research delineated distinct subtypes of COAD and identified biomarkers linked to taurine metabolic processes.

What is the implication, and what should change now?

• Our predictive model for COAD, which is based on the expression of genes associated with taurine metabolism, could aid in the development of novel targeted therapies.


Introduction

Colorectal cancer (CRC) is the third most common cancer and the second leading cause of cancer-related deaths worldwide (1). Colon adenocarcinoma (COAD), a type of CRC, poses a major health risk due to its high incidence and mortality rates (1,2). Depending on the disease stage and characteristics, COAD treatment usually includes surgery, chemotherapy, and/or radiotherapy. Despite advances in the screening and treatment of CRC, patient prognosis remains poor, and about 50% of patients experience recurrence and metastasis (2). Clinical staging primarily guides treatment; however, genetic heterogeneity, marked by genomic instability, also affects outcomes and contributes to drug resistance (3). Given the significant genetic diversity of patients and the need for personalized treatments, CRC molecular subtypes need to be closely examined to identify new biomarkers to improve patient prognosis and optimize treatment.

Several prognostic biomarkers and prediction models have been proposed to augment conventional staging systems. For instance, gene expression signatures derived from pathways such as epithelial-mesenchymal transition (EMT), immune microenvironment, and metabolic reprogramming have shown promise in predicting clinical outcomes. Nomograms integrating molecular and clinical variables have also been developed to provide individualized survival estimates. Examples include the model by Zhu et al. (4) for early-onset stage II–III colon cancer and the nomogram by Zheng and Sun (5) predicting perineural invasion risk and its prognostic implications. Despite these advances, the accuracy and generalizability of existing models remain suboptimal, indicating a continued demand for more robust and biologically relevant prognostic biomarkers.

Taurine metabolism has been linked to the development of COAD (6). This sulfur-containing β-amino acid is involved in various cellular functions, such as osmoregulation and antioxidation (6). Recent research suggests that taurine may play a role in CRC, and its elevated levels in CRC patients indicate its potential as a diagnostic biomarker (6). Taurine levels can be used to distinguish between benign and malignant growths, enhancing screening accuracy. Additionally, in a colon cancer rat model, taurine was shown to improve the effectiveness of the chemotherapy drug 5-fluorouracil by reducing side effects and enhancing treatment outcomes (7).

Taurine metabolism is closely linked with other cancer-related metabolic pathways, and thus has diagnostic and therapeutic potential. Research on taurine synthesis in pancreatic cancer has revealed its role in recurrence and survival, and suggests that it may have similar mechanisms in COAD, in which it may affect tumor progression and prognosis (8). Additionally, the relationship between taurine and bile acid metabolism in COAD has led to the development of a prognostic model based on bile acid metabolism-related genes. This model emphasizes the complex metabolic interactions in cancer progression and the possibility of targeting these pathways for treatment (9). Research has examined the role of taurine in influencing sphingolipid metabolism, a key factor in CRC affecting cell survival and growth (10). The impact of taurine on this pathway may reveal new cancer treatment targets. Overall, research has highlighted the importance of taurine in COAD, suggesting that an understanding of its role could lead to innovative diagnostic and therapeutic approaches that enhance patient outcomes (6-8).

This study aimed to identify clinically relevant subtypes of COAD by examining the genes associated with taurine metabolism. Furthermore, it sought to characterize these subtypes based on their clinical prognostic outcomes, tumor microenvironment features, immune cell infiltration patterns, sensitivity to chemotherapy, and underlying functional mechanisms. We present this article in accordance with the TRIPOD reporting checklist (available at https://jgo.amegroups.com/article/view/10.21037/jgo-2025-605/rc).


Methods

Acquisition and preprocessing of publicly accessible cohort data

The gene sets associated with taurine metabolism were obtained from previous research (11). A total of 454 COAD samples were sourced from The Cancer Genome Atlas (TCGA) database (https://portal.gdc.cancer.gov/), which included transcripts per million (TPM) RNA-sequencing data alongside the relevant clinical metadata. The study was conducted in accordance with the Declaration of Helsinki and its subsequent amendments.

Clustering analysis of COAD based on taurine metabolism-related genes

Using the “non-negative matrix factorization” (“NMF”) package in R, the COAD patients were categorized into groups based on the expression profiles of the genes associated with taurine metabolism. Following the determination of the optimal number of clusters, designated as k, the clustering process was repeated 1,000 times to ensure the establishment of a stable and reliable consensus matrix. The silhouette width values ranged from −1 to 1, such that a higher value approaching 1 indicated enhanced separation and cohesion among the clusters. A principal component analysis (PCA) was conducted to examine the distributional differences among the various subtypes. This analysis was performed using the “limma” package, and the findings were visually represented using the “ggplot2” package.
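The silhouette width described above can be illustrated with a minimal sketch. It assumes Euclidean distance between expression vectors and uses hypothetical toy samples; it is not the computation performed by the “NMF” package, only an illustration of how a value in [−1, 1] arises from within-cluster cohesion and between-cluster separation.

```python
import math

def euclidean(a, b):
    """Euclidean distance between two equal-length expression vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def silhouette_widths(samples, labels):
    """Return one silhouette width per sample (assumes at least two clusters)."""
    widths = []
    for i, (point, own) in enumerate(zip(samples, labels)):
        # Group the distances from this sample to all other samples by cluster.
        by_cluster = {}
        for j, (other, lab) in enumerate(zip(samples, labels)):
            if i != j:
                by_cluster.setdefault(lab, []).append(euclidean(point, other))
        if own not in by_cluster:
            widths.append(0.0)  # singleton cluster: width is defined as 0
            continue
        a = sum(by_cluster[own]) / len(by_cluster[own])  # cohesion
        b = min(sum(d) / len(d)                          # separation
                for lab, d in by_cluster.items() if lab != own)
        denom = max(a, b)
        widths.append((b - a) / denom if denom > 0 else 0.0)
    return widths

# Hypothetical two-gene expression profiles for four patients in two clusters.
toy_samples = [[0.0, 0.0], [0.0, 1.0], [10.0, 0.0], [10.0, 1.0]]
toy_labels = [1, 1, 2, 2]
toy_widths = silhouette_widths(toy_samples, toy_labels)
```

In this toy case every width is close to 1 because the two groups are far apart; a sample assigned to the wrong group would receive a negative width.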

Identification of differentially expressed genes (DEGs) of the two taurine metabolism-related COAD subtypes

DEGs were identified through the application of the “limma” package in R based on a threshold of a |log2fold change (FC)| greater than 1 and an adjusted P value less than 0.05. Subsequently, Gene Ontology (GO) and Kyoto Encyclopedia of Genes and Genomes (KEGG) pathway enrichment analyses were performed on the identified DEGs using the “clusterProfiler” package in R to elucidate the molecular mechanisms.
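The DEG thresholds can be expressed as a simple filter. The sketch below assumes a table of log2FC values and adjusted P values such as “limma” would produce; the gene names and numbers are hypothetical placeholders, not results from this study.

```python
def classify_deg(log2_fc, adj_p, fc_cutoff=1.0, p_cutoff=0.05):
    """Label one gene as 'up', 'down', or 'ns' (not significant)."""
    if adj_p < p_cutoff and abs(log2_fc) > fc_cutoff:
        return "up" if log2_fc > 0 else "down"
    return "ns"

# Hypothetical results: gene -> (log2FC, adjusted P value)
toy_results = {
    "GENE_A": (1.8, 0.001),
    "GENE_B": (-2.3, 0.0004),
    "GENE_C": (0.6, 0.001),   # significant, but the effect is too small
    "GENE_D": (-1.5, 0.20),   # large effect, but not significant
}

calls = {gene: classify_deg(fc, p) for gene, (fc, p) in toy_results.items()}
n_up = sum(1 for c in calls.values() if c == "up")
n_down = sum(1 for c in calls.values() if c == "down")
```

Both conditions must hold for a gene to be counted as a DEG, which is why GENE_C and GENE_D are labeled "ns".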

Immune activity between the two taurine metabolism-related groups in COAD

The Tumor IMmune Estimation Resource (TIMER) methodology was employed to assess the immune score by using specific biomarkers that reflect the infiltration of immune cells in tumor specimens. The expression levels of these biomarkers were evaluated to gain more comprehensive insights into the functional dynamics of the immune checkpoints. The following key genes were found to be integral to immune regulation: ITPRIPL1, SIGLEC15, TIGIT, CD274, HAVCR2, PDCD1, CTLA4, LAG3, and PDCD1LG2. Statistical evaluations were performed using R software (version 4.0.3). A P value less than 0.05 was considered statistically significant.

The one-class logistic regression algorithm, as formulated by Malta et al., was applied to derive the messenger RNA expression-based stemness index (mRNAsi) (12). This algorithm uses features extracted from the gene expression data of 11,774 genes. The RNA expression data were processed via Spearman correlation analysis. Subsequently, the stemness index was normalized to a range of [0, 1] through linear transformation, which involved subtracting the minimum value and then dividing by the maximum value of the shifted scores. All the aforementioned analyses were performed using R software (version 4.0.3; R Foundation for Statistical Computing, 2020). A P value <0.05 was considered statistically significant.
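The linear transformation of the stemness index to [0, 1] can be sketched as follows. This assumes the divisor is the range (maximum minus minimum), which is what mapping onto [0, 1] requires; the input values are hypothetical.

```python
def min_max_scale(values):
    """Linearly rescale values to [0, 1] as (x - min) / (max - min)."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # All values identical: there is no spread to rescale, so return zeros.
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]

# Hypothetical raw stemness scores (Spearman correlations) for four samples.
raw_index = [-0.2, 0.1, 0.4, 0.7]
scaled = min_max_scale(raw_index)
```

The sample with the lowest raw score maps to 0 and the one with the highest maps to 1; all other samples keep their relative spacing.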

Creation of the taurine metabolism-related risk score

Initially, a multivariable Cox regression analysis was performed to narrow down the range of candidate genes, after which a stepwise iterative analysis was conducted to select the optimal model, which served as the final model. The Cox proportional hazards model was employed to identify the taurine metabolism-related genes significantly associated with survival. Subsequently, the integrated risk score was established using the regression coefficients obtained from the multivariable Cox regression analysis, focusing on signatures derived from the training dataset. The formula for calculating the risk score is expressed as follows: risk score = Σ [coefficient of gene (i) × expression level of gene (i)], where gene (i) signifies each gene retained in the final model.

Here, the coefficient of gene (i) refers to the regression coefficient linked to gene (i), while the expression of gene (i) denotes the expression level of each candidate taurine metabolism-related gene (i) in each individual patient. The risk score for each patient was calculated using the “predict” function of the “survival” package in R. Patients were stratified into high- and low-risk categories based on the median risk score. The performance of the regression model was evaluated using Harrell’s concordance index. Furthermore, the clinical applicability of the taurine metabolism-related genes was assessed through the visualization of risk-score distributions and survival curves in the TCGA cohort.
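The median-based stratification and Harrell’s concordance index can be sketched in a few lines. This is a simplified illustration with hypothetical survival data: it assumes that patients at exactly the median are assigned to the low-risk group and it ignores tied survival times, so it is not a substitute for the “survival” package.

```python
from statistics import median

def stratify_by_median(risk_scores):
    """Label a patient 'high' if the score exceeds the cohort median, else 'low'."""
    cutoff = median(risk_scores)
    return ["high" if s > cutoff else "low" for s in risk_scores]

def harrell_c_index(times, events, risk_scores):
    """Fraction of comparable pairs in which the higher-risk patient dies first.

    events[i] is 1 if death was observed and 0 if the patient was censored.
    """
    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        for j in range(n):
            # A pair is comparable when patient i has an observed event
            # strictly before the follow-up time of patient j.
            if events[i] == 1 and times[i] < times[j]:
                comparable += 1
                if risk_scores[i] > risk_scores[j]:
                    concordant += 1.0
                elif risk_scores[i] == risk_scores[j]:
                    concordant += 0.5  # tied risk scores count as half
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable

# Hypothetical follow-up times (months), event indicators, and risk scores.
toy_times = [5, 10, 15, 20]
toy_events = [1, 1, 0, 1]
toy_risk = [2.0, 1.5, 0.3, 0.9]

toy_groups = stratify_by_median(toy_risk)
toy_c = harrell_c_index(toy_times, toy_events, toy_risk)
```

A value of 0.5 corresponds to random ordering and 1.0 to perfect ordering; in the toy data the risk scores rank the patients perfectly.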

Statistical analysis

The bioinformatics analyses were conducted using R software (version 4.2.1). All P values were two-tailed, and P<0.05 was considered statistically significant. To assess survival differences among the groups, Kaplan-Meier curves were generated, and the differences in survival were analyzed using a two-tailed log-rank test. Additionally, the Wilcoxon test was applied to compare continuous variables between the two groups.
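The Kaplan-Meier estimate underlying the survival curves can be illustrated with a minimal product-limit sketch. The follow-up times and event indicators below are hypothetical, and the function omits confidence intervals and the log-rank comparison.

```python
def kaplan_meier(times, events):
    """Return [(time, survival probability)] at each time with an observed death.

    events[i] is 1 if death was observed and 0 if the patient was censored.
    """
    at_risk = len(times)
    survival = 1.0
    curve = []
    for t in sorted(set(times)):
        deaths = sum(1 for ti, ei in zip(times, events) if ti == t and ei == 1)
        leaving = sum(1 for ti in times if ti == t)  # deaths plus censored at t
        if deaths > 0:
            # Product-limit step: multiply by the conditional survival at t.
            survival *= 1.0 - deaths / at_risk
            curve.append((t, survival))
        at_risk -= leaving
    return curve

# Hypothetical cohort of five patients (time in months).
toy_times = [2, 3, 3, 5, 8]
toy_events = [1, 1, 0, 1, 0]
toy_curve = kaplan_meier(toy_times, toy_events)
```

Censored patients lower the number at risk without lowering the curve, which is why the estimate only drops at times 2, 3, and 5 in this toy cohort.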


Results

Identification of the prognostic taurine metabolism-related genes in COAD

To investigate the significance of the prognostic genes in COAD, a univariate Cox regression analysis was conducted to identify the genes associated with prognosis in patients from the TCGA-COAD cohort. This analysis identified a total of 597 genes with prognostic relevance in COAD; the top 20 prognostic genes are shown in Figure 1A. Additionally, we further examined the genes associated with taurine metabolism that showed prognostic potential in COAD. The taurine metabolism-related genes identified are presented in Figure 1B and include HSPB1, NOS2, LEP, KPNA2, SERPINA1, NR1H2, ENO2, HSPA1A, TRPV1, GSR, ALOX12, GABRD, TERT, CLCN3, AGMAT, NOTCH3, and MYB.

Figure 1 Identification of the prognostic taurine metabolism-related genes in COAD. (A) The top 20 prognostic genes in COAD were identified based on their P values. (B) The Venn analysis identified the prognostic taurine metabolism-related genes. CI, confidence interval; COAD, colon adenocarcinoma; HR, hazard ratio; SE, standard error.

NMF clustering identified two taurine metabolism-based subtypes

A total of 454 patients with complete clinical data were included in the NMF clustering analysis. Using the expression profiles of the taurine metabolism-related genes from TCGA, the NMF algorithm categorized the patients into two distinct expression patterns: cluster 1 (C1), which comprised 335 individuals, and cluster 2 (C2), which comprised 119 individuals. To assess the transcriptional profiles of these two taurine metabolism-related subtypes, a PCA was performed (Figure 2A). As Figure 2B shows, the clustering at k=2 exhibited a clear delineation between the two subtypes, characterized by high intra-cluster correlation and low inter-cluster correlation. Additionally, a heatmap revealed a distinct separation in the gene expression profiles of the 13 taurine metabolism-related genes between the two groups of patients from the TCGA-COAD cohort (Figure 2C). Notably, the patients in C1 had significantly better overall survival than those in C2 [hazard ratio (HR): 0.465; 95% confidence interval (CI): 0.312–0.692; P<0.001; Figure 2D].

Figure 2 Identification of the prognostic taurine metabolism-related genes in COAD. (A) A PCA was performed, which indicated the presence of two separate groupings among patients in TCGA-COAD dataset. (B) A consensus clustering matrix was generated that revealed two specific clusters in COAD. (C) A heatmap was constructed to visually depict the two separate clusters in COAD. (D) A Kaplan-Meier survival analysis was employed to assess the differences in overall survival between the two identified clusters. C1, cluster 1; C2, cluster 2; COAD, colon adenocarcinoma; HR, hazard ratio; PC, principal component; PCA, principal component analysis; TCGA, The Cancer Genome Atlas; var., variance.

Identification of the underlying mechanisms between the two clusters in COAD

To investigate the distinct mechanisms underlying the two clusters, we identified 199 DEGs based on a significance threshold of an adjusted P value <0.05 and |log2FC| >1. Of these DEGs, 19 were upregulated and 180 were downregulated, as depicted in the volcano plot (C1 vs. C2; Figure 3A). The expression patterns of the leading DEGs displayed contrasting trends between the two clusters in the heatmap (Figure 3B).

Figure 3 Identification of the prognostic taurine metabolism-related genes in COAD. (A) The volcano plot identified a total of 199 DEGs based on a significance threshold of an adjusted P value of less than 0.05 and a |log2FC| greater than 1 (C1 vs. C2). (B) A heatmap showing the expression profiles of these DEGs in both C1 and C2. (C) The GO enrichment analysis of the identified DEGs. (D) The enriched KEGG pathways related to these DEGs. C1, cluster 1; C2, cluster 2; COAD, colon adenocarcinoma; DEGs, differentially expressed genes; ECM, extracellular matrix; FC, fold change; GO, Gene Ontology; KEGG, Kyoto Encyclopedia of Genes and Genomes; NS, not significant.

To further examine the biological processes of the 199 DEGs, GO and KEGG enrichment analyses were conducted. The results of the GO analysis for the biological processes indicated that these DEGs were predominantly involved in extracellular matrix (ECM) organization, extracellular structure organization, ossification, collagen fibril organization, cell-substrate adhesion, regulation of cell-substrate adhesion, collagen metabolic processes, endodermal cell differentiation, and endoderm formation (Figure 3C).

Moreover, the KEGG analysis results highlighted that the DEGs were primarily associated with pathways including ECM-receptor interaction, protein digestion and absorption, focal adhesion, complement and coagulation cascades, phagosome, human papillomavirus infection, and the PI3K-Akt signaling pathway (Figure 3D).

Immune activity between the two taurine metabolism-related groups in COAD

A previous study showed that taurine metabolism is intricately linked to immune function across various cancer types (13). Initially, the relationship between the expression of the taurine metabolism-related genes and immune scores was assessed by Spearman correlation analysis. The genes related to taurine metabolism (i.e., HSPB1, NOS2, LEP, KPNA2, SERPINA1, NR1H2, ENO2, HSPA1A, TRPV1, GSR, ALOX12, GABRD, TERT, CLCN3, AGMAT, NOTCH3, and MYB) were found to be significantly correlated with a variety of immune cell types (Figure 4A).

Figure 4 Identification of the prognostic taurine metabolism-related genes in COAD. (A) An analysis was performed to compare the enrichment scores of six distinct immune cell types in the context of the two taurine metabolism-related clusters in COAD. (B) The expression levels of the genes responsible for coding immune checkpoint inhibitors were evaluated between the two identified clusters in COAD. *, P<0.05; **, P<0.01; ***, P<0.001; ****, P<0.0001; ns, not significant (P>0.05). C1, cluster 1; C2, cluster 2; COAD, colon adenocarcinoma; TIMER, Tumor IMmune Estimation Resource.

Subsequently, we evaluated the immune activity between the two distinct taurine metabolism-related clusters in COAD patients. The boxplots revealed a marked difference in the immune cell populations (specifically, CD4+ T cells, neutrophils, macrophages, and myeloid dendritic cells) between C1 and C2.

Moreover, the boxplots also showed that eight of the 10 immune checkpoint inhibitor-related genes (i.e., CTLA4, HAVCR2, IGSF8, LAG3, PDCD1, PDCD1LG2, SIGLEC15, and TIGIT) were expressed at lower levels in C1 than C2 (Figure 4B). These findings provide further evidence of a significant association between taurine metabolism and immune activity.

Comparison of tumor stemness between COAD subtypes

Cancer stem cells (CSCs) play a pivotal role in the processes of tumorigenesis, recurrence, and metastasis, and are key contributors to both resistance to chemotherapy and the re-emergence of cancer. The CSC scores of the patients classified as C1 in the TCGA-COAD cohort were markedly elevated compared to those classified as C2 (Figure 5).

Figure 5 Comparison of tumor stemness between the COAD subtypes. The CSC scores were higher in the C1 patients than the C2 patients. ****, P<0.0001. C1, cluster 1; C2, cluster 2; COAD, colon adenocarcinoma; CSC, cancer stem cell; mRNA, messenger RNA; mRNAsi, mRNA expression-based stemness index.

Construction of a prognostic model based on taurine metabolism-related genes

Subsequent analyses using least absolute shrinkage and selection operator (LASSO) and Cox regression were conducted, and 17 genes associated with taurine metabolism in COAD were identified. A signature comprising nine genes was established based on the optimal λ value. The risk score was calculated using the following formula: risk score = (0.2533) × LEP + (−0.1142) × SERPINA1 + (0.2828) × ENO2 + (0.189) × HSPA1A + (−0.4269) × GSR + (0.9964) × GABRD + (0.3864) × TERT + (−0.3476) × NOTCH3 + (−0.2033) × MYB.
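The risk-score formula above is a weighted sum and can be applied to a single patient as in the sketch below. The coefficients are those reported in the formula; the expression values are hypothetical placeholders, and the appropriate expression unit and scale are not specified here, so the sketch treats them as an assumption.

```python
# Coefficients as reported in the nine-gene risk-score formula.
COEFFICIENTS = {
    "LEP": 0.2533,
    "SERPINA1": -0.1142,
    "ENO2": 0.2828,
    "HSPA1A": 0.189,
    "GSR": -0.4269,
    "GABRD": 0.9964,
    "TERT": 0.3864,
    "NOTCH3": -0.3476,
    "MYB": -0.2033,
}

def risk_score(expression):
    """Weighted sum of gene expression values for one patient.

    `expression` maps gene symbol -> expression level (scale is an assumption).
    """
    missing = [gene for gene in COEFFICIENTS if gene not in expression]
    if missing:
        raise KeyError("missing expression values for: " + ", ".join(missing))
    return sum(coef * expression[gene] for gene, coef in COEFFICIENTS.items())

# Hypothetical patient with every signature gene expressed at 1.0.
toy_patient = {gene: 1.0 for gene in COEFFICIENTS}
toy_score = risk_score(toy_patient)
```

Genes with positive coefficients (for example, GABRD) raise the score as their expression increases, whereas genes with negative coefficients (for example, GSR) lower it.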

Using this gene signature, the patients from the TCGA-COAD cohort were stratified into low- and high-risk groups (Figure 6A). The overall survival analysis indicated that the patients in the low-risk group had superior survival outcomes compared to those in the high-risk group (HR: 3.489; 95% CI: 2.212–5.504; P=7.78e−08; Figure 6B). In addition, the efficacy of the prognostic model was evaluated via a receiver operating characteristic curve analysis, which revealed area under the curve values of 0.698 for 1 year, 0.699 for 3 years, and 0.73 for 5 years (Figure 6C). These findings indicate that the proposed model has moderate discriminative ability for predicting overall survival.

Figure 6 Construction of a prognostic model based on taurine metabolism-related genes. (A) The prognostic gene signature was assessed using TCGA-COAD cohort, resulting in the creation of a heatmap showing the expression patterns of the identified prognostic genes. Subsequently, the patients were stratified into low- and high-risk categories. The X-axis of the heatmap reflects the arrangement of samples according to their risk scores in ascending order. (B) A Kaplan-Meier survival analysis was performed to evaluate the efficacy of the prognostic signature. (C) A time-dependent ROC curve analysis was conducted to determine the performance of the gene signature over a specified period. AUC, area under the curve; CI, confidence interval; COAD, colon adenocarcinoma; HR, hazard ratio; ROC, receiver operating characteristic; TCGA, The Cancer Genome Atlas.

Discussion

This study identified a total of 597 genes with prognostic relevance in COAD. Among these, a subset of genes associated with taurine metabolism was identified, which included HSPB1, NOS2, LEP, KPNA2, SERPINA1, NR1H2, ENO2, HSPA1A, TRPV1, GSR, ALOX12, GABRD, TERT, CLCN3, AGMAT, NOTCH3, and MYB. Using the expression profiles related to the taurine metabolism genes sourced from TCGA, the NMF algorithm successfully categorized patients into two distinct expression groups: C1, which comprised 335 patients, and C2, which comprised 119 patients. To better understand the mechanisms differentiating these clusters, 199 DEGs were identified. A GO analysis of these DEGs indicated their primary involvement in biological processes, such as the organization of the ECM, extracellular structure organization, ossification, collagen fibril organization, and cell-substrate adhesion. In addition, the KEGG analysis results suggested that the DEGs were predominantly linked to pathways including ECM-receptor interaction, protein digestion and absorption, and focal adhesion. Importantly, differences in immune activity were noted between the two clusters related to taurine metabolism in COAD. The CSC scores of the patients classified as C1 in the TCGA-COAD cohort were significantly elevated compared to those classified as C2. Additional analyses employing LASSO and Cox regression techniques led to the identification of 17 genes associated with taurine metabolism in COAD. Consequently, a prognostic model was established that included nine genes (i.e., LEP, SERPINA1, ENO2, HSPA1A, GSR, GABRD, TERT, NOTCH3, and MYB) to predict patient outcomes in COAD.

Taurine metabolism is crucial in CRC. It influences cancer progression and thus has therapeutic potential. This sulfur-containing amino acid has anti-inflammatory and anti-cancer properties, making it a promising biomarker and treatment target. Taurine can regulate cancer cell growth, apoptosis, and metastasis as shown by its suppression of CRC cell proliferation and metastasis, and the induction of apoptosis through EMT gene regulation and ERK/RSK pathway inhibition. It also counteracts hypotaurine-induced CRC progression, and thus has therapeutic promise. Additionally, its anti-cancer effects were confirmed in an azoxymethane/dextran sulfate sodium-induced mouse model of colon cancer (14). Taurine was shown to significantly inhibit tumor growth in this model, which suggests that it could serve as a chemo-preventive agent against CRC (15). Taurine was also shown to increase apoptosis markers and tumor suppressor proteins, which further supports its role in cancer prevention. A systematic review and meta-analysis found that taurine levels are significantly associated with CRC (6). Such findings highlight its potential as a diagnostic metabolite for distinguishing between benign and malignant growths, and improving screening accuracy (6).

Recent studies have shown that SERPINA1, ENO2, and HSPA1A have crucial roles in CRC. SERPINA1, a serine protease inhibitor, is overexpressed in CRC and linked to poor outcomes by enhancing the STAT3 pathway (16). CEBPB binds to the SERPINA1 promoter, boosting its transcription and promoting tumor growth, making SERPINA1 a potential prognostic marker and treatment target (16). Similarly, ENO2, a glycolytic enzyme, is dysregulated in CRC and negatively impacts prognosis (17). It facilitates CRC cell migration and invasion through interaction with the long non-coding RNA CYTOR, affecting LATS1 and YAP1, and inducing EMT; thus, ENO2 could serve as a therapeutic target (17). HSPA1A plays a crucial role in CRC progression, as revealed by an analysis of ubiquitination-related genes (18). A gene signature based on these pathways has been shown to effectively predict patient survival by categorizing patients into high- or low-risk groups (18). Reducing HSPA1A levels significantly decreases CRC cell growth and spread, which suggests that HSPA1A could serve as a target for personalized treatments (18). These studies emphasize the need to understand the molecular mechanisms of CRC, and have identified SERPINA1, ENO2, and HSPA1A as promising targets for the development of targeted therapies to enhance patient outcomes.

This study has a number of limitations. The results stem from a retrospective analysis; thus, validation through prospective studies is essential. The dependence on historical data might have introduced biases that could affect the reproducibility of our findings. While we acknowledge that relying solely on TCGA data introduces certain challenges, we believe that this dataset provides a substantial foundation for our analysis due to its comprehensive nature and the quality of the data available. Additionally, subsequent studies should perform functional experiments on these genes to extend the understanding of their involvement in COAD.


Conclusions

The status of taurine metabolism-related genes is closely correlated with tumor classification and immunity in COAD. Our findings could inform the diagnosis and treatment of COAD.


Acknowledgments

None.


Footnote

Reporting Checklist: The authors have completed the TRIPOD reporting checklist. Available at https://jgo.amegroups.com/article/view/10.21037/jgo-2025-605/rc

Peer Review File: Available at https://jgo.amegroups.com/article/view/10.21037/jgo-2025-605/prf

Funding: None.

Conflicts of Interest: All authors have completed the ICMJE uniform disclosure form (available at https://jgo.amegroups.com/article/view/10.21037/jgo-2025-605/coif). The authors have no conflicts of interest to declare.

Ethical Statement: The authors are accountable for all aspects of the work in ensuring that questions related to the accuracy or integrity of any part of the work are appropriately investigated and resolved. The study was conducted in accordance with the Declaration of Helsinki and its subsequent amendments.

Open Access Statement: This is an Open Access article distributed in accordance with the Creative Commons Attribution-NonCommercial-NoDerivs 4.0 International License (CC BY-NC-ND 4.0), which permits the non-commercial replication and distribution of the article with the strict proviso that no changes or edits are made and the original work is properly cited (including links to both the formal publication through the relevant DOI and the license). See: https://creativecommons.org/licenses/by-nc-nd/4.0/.


References

  1. Siegel RL, Miller KD, Wagle NS, et al. Cancer statistics, 2023. CA Cancer J Clin 2023;73:17-48.
  2. Weitz J, Koch M, Debus J, et al. Colorectal cancer. Lancet 2005;365:153-65.
  3. Kawakami H, Zaanan A, Sinicrope FA. Microsatellite instability testing and its role in the management of colorectal cancer. Curr Treat Options Oncol 2015;16:30.
  4. Zhu S, Xing Y, Tu J, et al. Development and validation of predictive nomograms for survival in early-onset colon cancer patients with II-III stage across various tumor sites. Transl Cancer Res 2025;14:2233-49.
  5. Zheng Z, Sun X. Development of a nomogram predicting perineural invasion risk and assessment of the prognostic value of perineural invasion in colon cancer: a population study based on the Surveillance, Epidemiology, and End Results database. Transl Cancer Res 2025;14:141-58.
  6. Sinha A, Griffith L, Acharjee A. Systematic Review and Meta-Analysis: Taurine and Its Association With Colorectal Carcinoma. Cancer Med 2024;13:e70424.
  7. Jornada DH, Boreski D, Chiba DE, et al. Synergistic Enhancement of 5-Fluorouracil Chemotherapeutic Efficacy by Taurine in Colon Cancer Rat Model. Nutrients 2024;16:3047.
  8. Nam H, Lee W, Lee YJ, et al. Taurine Synthesis by 2-Aminoethanethiol Dioxygenase as a Vulnerable Metabolic Alteration in Pancreatic Cancer. Biomol Ther (Seoul) 2025;33:143-54.
  9. Luo Q, Zhou P, Chang S, et al. Construction and validation of a prognostic model for colon adenocarcinoma based on bile acid metabolism-related genes. Sci Rep 2023;13:12728.
  10. Machala M, Procházková J, Hofmanová J, et al. Colon Cancer and Perturbations of the Sphingolipid Metabolism. Int J Mol Sci 2019;20:6051.
  11. Cao S, Lun S, Duan L, et al. Harnessing Calmodulin-Related Genes to Build a Prognostic Model in Esophageal Squamous Cell Carcinoma for a Comprehensive Analysis of Single-Cell Immune Characteristics and Drug Efficacy. J Immunother 2025;48:244-57.
  12. Malta TM, Sokolov A, Gentles AJ, et al. Machine Learning Identifies Stemness Features Associated with Oncogenic Dedifferentiation. Cell 2018;173:338-354.e15.
  13. Qin Z, Huang G, Xu J, et al. Multidimensional transcriptomics based to illuminate the mechanisms of taurine metabolism in immune resistance of pancreatic cancer. Front Immunol 2025;16:1567805.
  14. Hou X, Hu J, Zhao X, et al. Taurine Attenuates the Hypotaurine-Induced Progression of CRC via ERK/RSK Signaling. Front Cell Dev Biol 2021;9:631163.
  15. Wang G, Ma N, He F, et al. Taurine Attenuates Carcinogenicity in Ulcerative Colitis-Colorectal Cancer Mouse Model. Oxid Med Cell Longev 2020;2020:7935917.
  16. Ma Y, Chen Y, Zhan L, et al. CEBPB-mediated upregulation of SERPINA1 promotes colorectal cancer progression by enhancing STAT3 signaling. Cell Death Discov 2024;10:219.
  17. Lv C, Yu H, Wang K, et al. ENO2 Promotes Colorectal Cancer Metastasis by Interacting with the LncRNA CYTOR and Activating YAP1-Induced EMT. Cells 2022;11:2363.
  18. Gao X, Yan T, Yu X, et al. Integrative analysis of ubiquitination-related genes identifies HSPA1A as a critical regulator in colorectal cancer progression. Med Oncol 2025;42:123.

(English Language Editor: L. Huleatt)

Cite this article as: Sun X, Dai J, Lin B, Sun Y, Chen L. Bioinformatics analysis of TCGA data identifies a taurine metabolism-related subtype classification for predicting prognosis in colon adenocarcinoma. J Gastrointest Oncol 2025;16(5):2127-2137. doi: 10.21037/jgo-2025-605
