Construction of a co-expression network and prediction of metastasis markers in colorectal cancer patients with liver metastasis
Original Article

Construction of a co-expression network and prediction of metastasis markers in colorectal cancer patients with liver metastasis

Lihong Lin1#, Xiuxiu Zeng1#, Shanyan Liang1, Yunzhi Wang2, Xiaoyu Dai1, Yuechao Sun1,3*, Zhou Wu1*

1Department of Anorectal Surgery, Hwa Mei Hospital, University of Chinese Academy of Sciences, Ningbo, China; 2School of Health Sciences, University of Sydney, Lidcombe, NSW, Australia; 3Ningbo Institute of Life and Health Industry, University of Chinese Academy of Sciences, Ningbo, China

Contributions: (I) Conception and design: Y Sun, Z Wu; (II) Administrative support: Y Sun, Z Wu; (III) Provision of study materials or patients: L Lin, X Zeng; (IV) Collection and assembly of data: S Liang, Y Wang; (V) Data analysis and interpretation: X Zeng, X Dai; (VI) Manuscript writing: All authors; (VII) Final approval of manuscript: All authors.

#These authors contributed equally to this work.

*These authors share co-corresponding authorship.

Correspondence to: Zhou Wu. Department of Anorectal Surgery, Hwa Mei Hospital, University of Chinese Academy of Sciences, 41 Xibei Street, Ningbo 315000, China. Email:; Yuechao Sun. Ningbo Institute of Life and Health Industry, University of Chinese Academy of Sciences, 499 Huancheng North Road, Ningbo 315000, China. Email:

Background: Colorectal cancer (CRC) is a common global malignancy associated with high invasiveness, high metastasis, and poor prognosis. CRC commonly metastasizes to the liver, where the treatment of metastasis is both difficult and an important topic in current CRC management.

Methods: Microarrays data of human CRC with liver metastasis (CRCLM) were downloaded from the National Center for Biotechnology Information (NCBI) Gene Expression Omnibus (GEO) database to identify potential key genes. Differentially expressed (DE) genes (DEGs) and DEmiRNAs of primary CRC tumor tissues and metastatic liver tissues were identified. Microenvironment Cell Populations (MCP)-counter was used to estimate the abundance of immune cells in the tumor micro-environment (TME), and weighted gene correlation network analysis (WGCNA) was used to construct the co-expression network analysis. Gene Ontology and Kyoto Encyclopaedia of Gene and Genome (KEGG) pathway enrichment analyses were conducted, and the protein-protein interaction (PPI) network for the DEGs were constructed and gene modules were screened.

Results: Thirty-five pairs of matched colorectal primary cancer and liver metastatic gene expression profiles were screened, and 610 DEGs (265 up-regulated and 345 down-regulated) and 284 DEmiRNAs were identified. The DEGs were mainly enriched in the complement and coagulation cascade pathways and renin secretion. Immune infiltrating cells including neutrophils, monocytic lineage, and cancer-associated fibroblasts (CAFs) differed significantly between primary tumor tissues and metastatic liver tissues. WGCN analysis obtained 12 modules and identified 62 genes with significant interactions which were mainly related to complement and coagulation cascade and the focal adhesion pathway. The best subset regression analysis and backward stepwise regression analysis were performed, and eight genes were determined, including F10, FGG, KNG1, MBL2, PROC, SERPINA1, CAV1, and SPP1. Further analysis showed four genes, including FGG, KNG1, CAV1, and SPP1 were significantly associated with CRCLM.

Conclusions: Our study implies complement and coagulation cascade and the focal adhesion pathway play a significant role in the development and progression of CRCLM, and FGG, KNG1, CAV1, and SPP1 may be metastatic markers for its early diagnosis.

Keywords: Colorectal liver metastasis; co-expression network; metastasis markers; complement system activation; tumor immune microenvironments

Submitted Sep 23, 2022. Accepted for publication Oct 18, 2022.

doi: 10.21037/jgo-22-965


Colorectal cancer (CRC) is one of the most common digestive tract malignancies worldwide, with the third highest incidence and second highest mortality (1). More than 2.2 million new cases of CRC are expected to occur globally by 2030, including more than 1.1 million deaths. Despite the continuous improvement of screening strategies and treatment methods, approximately 25% of CRC patients have already developed metastases at initial diagnosis and half of patients will develop metastatic disease (2,3). The 5-year survival rate of such advanced CRC patients is low, which seriously threatens human health (4). According to medical data, the liver is the most common target organ of metastasis from CRC (5,6).

Between 16% and 26% of patients have liver metastases at the time of diagnosis, and 18–25% will have liver metastases after primary radical resection (7). There has been a significant improvement in the survival of CRC including CRC with liver metastasis (CRCLM) in recent decades. Patients without any metastasis had a 5-year OS of 75.1% compared to 25.2%, 45.7%, and 12.7% respectively for patients with liver-only metastases, lung only metastases, and liver and lung metastases combined.

The key problems of liver metastasis in CRC currently are the lack of serum markers such as carcinoembryonic antigen (CEA), carbohydrate antigen (CA)199 which have high sensitivity and specificity in the diagnosis of CRCLM (8,9). Now the diagnosis of CRCLM still depends on imaging examination and focus biopsy, and the mechanism of CRCLM has not been fully elucidated (10). The treatment of liver metastasis is both difficult and an important topic in current CRC management. Several treatment options are available for patients with CRCLM, of which the surgical resection is the first choice. Other treatments include the chemotherapy, interventional therapy, chemoradiotherapy, targeted therapy and multidisciplinary comprehensive treatment (11-13).

The combination of systemic chemotherapy with resection has been shown to provide a 5-year survival of 50%, while only about 5% for patients treated with palliative intent (14,15). Therefore, there have been many attempts to explore the underlying mechanism of liver metastasis, and gene expression profiling has become a popular strategy to identify genes involved in the progression and the prognosis of different cancers including CRC (16-19).

Weighted gene co-expression network analysis (WGCNA) is a systematic biology method that describes the correlation patterns between genes in sequencing samples (20) by constructing weighted gene co-expression networks based on gene expression and dividing these into co-expression modules. Gene expression in the same co-expression module is similar and implies the similar functions. WGCNA has been widely used to identify key genes or therapeutic targets in many diseases, including various cancers (21-23).

In this study we used gene expression analysis in the diagnosis of liver metastasis from the aspects of gene molecules and signalling pathways to analyze the interaction between gene proteins for the diagnosis of CRCLM. Our results can be used to help the classification, screening, diagnosis, treatment, and prognosis of CRC. Further the results of potential biomarkers detection can be used to predict or diagnose whether CRC patients may suffer from liver metastasis, providing certain treatment options for subsequent targeted therapy. We present the following article in accordance with the STREGA reporting checklist (available at


Data extraction and integration

Microarray data of human CRC were downloaded from the Gene Expression Omnibus (GEO) ( containing 1,374 datasets. We screened the data of CRCLM, and finally filtered two databases, including GSE81558 (Affymetrix Human Gene Expression Array & Affymetrix Multispecies miRNA-3 Array, containing 19 primary CRC and paired liver metastases) and GSE35834 (Affymetrix Human Exon 1.0 ST Array & Affymetrix Multispecies miRNA-1 Array, containing 16 primary CRC and paired liver metastases). Among these 35 pairs of patients, 10 pairs were synchronous liver metastases, and 25 pairs were metachronous liver metastases. The study was conducted in accordance with the Declaration of Helsinki (as revised in 2013).

R package in SilicoMerging (24) was used to merge the two datasets and further removed batch effects (25) to obtain the final dataset, and a total of 35 cases of primary CRC and 35 paired liver metastases were included in the integrated analysis.

Identification of differentially expressed genes (DEGs)

To identify the features of CRCLM, the DEGs between primary CRC tissues and liver metastases tissues were analyzed. In addition, the DEGs between synchronous liver metastases and metachronous liver metastases were identified. Probes were converted into gene symbols, and the average expression value was used as the only value of the gene with multiple corresponding probes. Data were then transformed using a log2 transformation, and the DEGs were identified using limma package at a cut-off value |log2[fold change (FC)]| >1.2 and an adjusted P<0.05 (26). Kyoto Encyclopaedia of Gene and Genome (KEGG) pathway enrichment analysis of candidate DEGs was performed with the R-package clusterProfiler version 3.14.3. An adjusted P value <0.05 was considered statistically significant.

Construction of the competing endogenous RNA (ceRNA) network

The expression difference between the CRC and CRCLM was utilized to determine the DEmiRNAs and DEGs. The adjusted P value and the absolute log value of fold-change (log2|FC|) were calculated using R software with limma package. The criteria of P value <0.05 and log2|FC| >1.2 were adopted to select the DEmiRNAs. Targeted mRNAs of the collected DEmiRNAs were predicted using miRDB ( The selected targeted mRNAs were merged with DEGs. The overlapped gene sets were analyzed with Venn Plot, and the pairs of miRNAs and mRNAs were subsequently constructed.

Estimation of the tumor microenvironment (TME)

The Microenvironment Cell Populations (MCP)-counter (27) was used to estimate the abundance of T cells, CD8+ T cells, cytotoxic lymphocytes, B lymphocytes, natural killer (NK) cells, monocytic lineage, myeloid dendritic cells, neutrophils, endothelial cells (ECs), and fibroblasts in the TME for each sample. MCP-counter scores were defined as the log2 average expression of the TME for each population. The difference of the MCP-counter scores of the two groups was compared by Student’s t-test. A P value <0.05 was considered statistically significant. A higher score means the more significant ratio of the corresponding parts in the TME. In the subsequent WGCNA, these scores of immune infiltrating cells were further combined into module correlation analysis for mining immune-related modules and genes.

WGCNA for the determination of key genes

The WGCNA was used for constructing co-expression network analysis and mining modules related to clinical information (28). We screened genes with the top 25% by analysis of variance (ANOVA) and formed 12 effective modules by hierarchical clustering of the topological overlap matrix. Metastasis related genes were selected from the significant modules associated with metastasis and high level of immune infiltrations by correlation analysis and KEGG pathway enrichment analysis.

Regression models for predicting liver metastases

We used the STRING database to construct the protein-protein interaction (PPI) network of the above-mentioned genes and screened out hub genes with the most extensive associations with others. Cytoscape (Version 3.9.1) was used to map the network relationships of the above genes. The optimal subset regression (leaps, R) and the backward stepwise regression (MASS, R) were further used to select the core role genes in the process of CRCLM. Finally, the obtained Akaike Information Criterion (AIC) value reached the minimum value and a regression model for predicting liver metastasis in CRC was constructed.

Statistical analysis

To calculate the mean and standard deviation, the numerical data were analyzed by one-way ANOVA. A two-tailed Student’s t-test was used to verify the significance of the differences between the groups. P value <0.05 was considered as statistical significance.


Data pre-processing

To identify novel gene signatures in CRCLM patients, we analyzed gene expression profiles by microarray (available at, and after the batch effects across platforms were removed and adjusted (Figure S1), a total of 17,541 probes were selected for further analysis.

Through the following analysis (Figure 1), we constructed the co-expression network and best-fitting logistic regression models, and screened the potential biomarkers associated with CRCLM.

Figure 1 Flowchart of construction and analysis of co-expression networks. WGCNA, weighted gene correlation network analysis; MCP, Microenvironment Cell Populations; DEGs, differentially expressed genes; KEGG, Kyoto Encyclopaedia of Gene and Genome; PPI, protein‑protein interaction; GSEA, gene set enrichment analysis.

Identification of DEGs between the primary tumor and liver metastases, and establishment of the DEGs-DEmiRNAs regulatory network

To identify the DEGs associated with CRCLM, we performed differential expression analysis on the primary tumor tissues and metastatic liver tissues of the above 35 CRC patients. As shown in Figure 2A and vailable at, a total of 610 DEGs were identified, of which 265 were up-regulated and 345 were down-regulated. After KEGG functional enrichment analysis, the down-regulated DEGs showed significant enrichment in renin secretion, the cyclic adenosine monophosphate (cAMP) signalling pathway, and pathways in cancer (Figure 2B), while up-regulated DEGs were enriched in complement and coagulation cascades, chemical carcinogenesis, and retinol metabolism (Figure 2C).

Figure 2 Analysis of differential expression genes in CRCLM. (A) The volcano plot shows identification of the DEmRNAs in the primary tumor tissues and metastatic liver tissues. The red colour represents up-regulated genes, while the green colour indicates down-regulated genes. KEGG functional enrichment circle map of down-regulated (B) and up-regulated (C) genes. (D) Construction of the miRNA-mRNA regulatory network. CRCLM, colorectal cancer with liver metastasis; DE, differentially expressed; KEGG, Kyoto Encyclopaedia of Gene and Genome.

We further analyzed the mRNA-miRNA interaction to determine whether some DEGs regulated the liver metastasis by miRNA, and based on the aforementioned cut-off criteria, a set of 284 DEmiRNAs were identified (available at, among which only 11 DEmiRNAs have been reported in the miRDB database (available at The DEmiRNAs-mRNA regulatory network comprised of eight DEmiRNAs, 17 down-regulated DEGs, and six up-regulated DEGs (Figure 2D).

As the DEGs were enriched in the complement and coagulation cascade pathways and renin secretion, it was necessary to further explore the relationship between the immune infiltration environment and CRC metastasis. However, no DEGs related to these two pathways were identified in the DEmiRNAs-DEGs regulatory network, suggesting genes associated with these two pathways regulate CRCLM but not through miRNAs.

Differential expression analysis of CRC with synchronous and metachronous liver metastases

As synchronous metastases of CRC are considered to hold worse prognostic value compared with metachronous metastases (29,30), we further explored the potential factors responsible for the differences in tumor behaviour. We identified 186 DEGs in the colorectal tissues of synchronous liver metastases versus metachronous liver metastases from CRC, among which 99 DEGs were up-regulated and 87 were down-regulated (Figure 3A and table available at Moreover, 58 up-regulated DEGs and 44 down-regulated DEGs were identified in liver tissues between synchronous liver metastases and metachronous liver metastases of CRC (Figure 3B and Table S1).

Figure 3 Differential expression analysis of different liver metastasis types in CRC patients. Heat map of DEGs between synchronous and metachronous liver metastases in primary colorectal tissues (A) and metastatic liver tissues (B) (the top 60 DEGs are shown). (C) Venn diagram shows the up-regulated DEGs and down-regulated DEGs in colorectal tissues and metastatic liver tissues of CRCLM. CRC, colorectal cancer; DEGs, differentially expressed genes; CRCLM, colorectal cancer with liver metastasis.

A Venn diagram was then used to display the number of DEGs between the two tissues, and the overlapping DEGs were further analyzed (Figure 3C). Notably, RPL22L1, SNCAIP, SYNPR, and KRT40 were highly expressed in metachronous liver metastases, and CRISP1, PIWIL4, TRIM36, ATP12A, and ZDHHC11B were highly expressed in synchronous liver metastases.

Immune infiltrating cells may be associated with CRCLM

Immune cell infiltrations were then characterized in colorectal tissues and liver tissues. We first calculated the abundance of immune cells using the MCP-counter algorithm, and the results showed cancer-associated fibroblasts (CAFs) were more abundant than any other cell types in the TME (Figure 4) (Table S2). Moreover, there was a significant difference in the level of CAFs between primary tumor tissues and metastatic liver tissues (P=0.04).

Figure 4 MCP-counter scores of the infiltrating immune cells in primary colorectal tissues (A) and metastatic liver tissues (B). MCP, Microenvironment Cell Populations; NK, natural killer.

While neutrophils showed the most significant difference in MCP-counter score (P=0.0003) between the two tissues, other immune cells including monocytic lineage (P=0.02), ECs (P=0.05), and B lymphocytes (P=0.04) also differed significantly. This indicates there are significant changes in the immune environment between primary and metastatic lesions, although whether there is a causal relationship or correlation with metastasis needs to be further explored.

Construction of weighted gene co-expression network and PPI network

WGCNA can build gene co-expression networks to mine network modules closely related with clinical traits, and was used to determine the interactions of the correlation between immune infiltrating cells and genes. A total of 12 modules were obtained by clustering the dissimilarity based on the consensus topological overlap, and the clustering dendrograms and module-trait relationship are shown in Figure 5A and online (available at

Figure 5 Construction of WGCNA network and PPI network. (A) A gene dendrogram containing 12 modules was obtained by clustering the dissimilarity based on consensus topological overlap. Each module contains a set of highly connected genes and is represented by different colours. The correlation between consensus modules and the clinical indicators (including the abundance of immune factors) of CRCLM was constructed. The intensity and direction of correlation are indicated by different colours (red, positive; blue, negative), the numbers in the table represent the correlation coefficients, and P values are printed in parentheses below. PPI network of genes from the yellow (B) and brown (C) modules. The size of the ellipse corresponds to the combined score among the gene members in the module, and the magenta circles (B) and yellow circles (C) represent the gene sets (KEGG enrichment analysis) of the most enriched pathways in their respective module networks. DC, dendritic cell; WGCNA, weighted gene correlation network analysis; PPI, protein‑protein interaction; CRCLM, colorectal cancer with liver metastasis; ME, module eigengene; NK, natural killer; KEGG, Kyoto Encyclopaedia of Gene and Genome.

As there were 441 genes identified in the yellow module and it had the highest association (0.6) with liver metastases in the heat map of the module-trait relationship, this module was selected for further analysis. The results showed 441 genes were significantly enriched in complement and coagulation cascades (Figure S2A) and, interestingly, only neutrophils among the immune infiltration cell types had a strong correlation (0.79) with the module. Based on the STRING database, 145 of the 441 genes had high interactions (combined score >0.4), including 41 genes involved in complement and coagulation cascades (Figure 5B and Figure S2B). Collectively, we speculated the complement and coagulation cascade pathways might be the key pathways for the liver metastasis of CRC, and the differences may be caused by changes to the TME.

The brown module had a strong correlation with ECs (0.83) and CAFs (0.87), and KEGG analysis (Figure S2C) results demonstrated these genes were mainly involved in the focal adhesion and phosphoinositide 3-kinase (PI3K)-Akt signalling pathways. Further, 158 genes were identified in the brown module (Figure 5C), and 21 were involved in regulation of the focal adhesion pathway (Figure S2D).

In summary, we identified the yellow module and brown module as significantly related to CRCLM, and the 62 genes with significant interactions were mainly related to complement and coagulation cascade and focal adhesion pathways.

Construction of best-fitting logistic regression models and identification of potential metastasis markers

Best subset regression selects the best model from all possible subsets according to some goodness-of-fit criteria (31,32). To find the best fit model for CRCLM from all possible subset models, we first performed optimal subset regression analysis of 41 hub genes associated with complement and coagulation cascades in the yellow module and 10 genes were identified (adjusted R2=0.85, Figure S3A). Following the same method, eight genes (adjusted R2=0.75) were screened from the set of 21 hub genes related to the focal adhesion pathway in the brown module (Figure S3B). The best subset regression analysis and backward stepwise regression analysis were performed on the above 18 genes, and finally, eight genes (adjusted R2=0.9, AIC =−240.56) were determined, including F10, FGG, KNG1, MBL2, PROC, SERPINA1, CAV1, and SPP1 (Figure S3C,S3D).

The best-fitting logistic regression model was then established to calculate the probability of liver metastasis in CRC. When P value >0.5, CRC samples were deemed to be more prone to have liver metastases, while P value <0.5 indicated a low risk of developing liver metastases.

In addition, four genes including FGG, KNG1, CAV1, and SPP1 were significantly associated with CRCLM (P<0.0001, Figure S3D), and we hypothesized they might be potential metastatic markers for its early diagnosis.


CRC is characterized by an aggressive phenotype, poor clinical outcomes, and a high rate of mortality, especially when liver metastasis occurs (33,34). Accordingly, there is an urgent need to find potential molecular biomarkers for CRC that could help to enhance the diagnosis and treatment efficacy of CRCLM.

Recent studies have shown numerous molecular biomarkers in cancer (35-37). For example, KNG1 and FGG have been found to be correlated with the progression and prognosis of hepatocellular carcinoma (38,39), and SPP1 and CAV1 with overall survival in lung adenocarcinoma (40,41). Currently, there are no suitable serum markers with have high sensitivity and specificity in the diagnosis of CRCLM. Tumor markers such as CEA, CA19-9, CA242, CA72-4 in combination with biochemical markers such lactase dehydrogenase (LDH), gamma glutamyl transpeptidase (GGT) for liver function may improve the sensitivity and specificity for screening liver metastases in patients with CRC (42). Combining evaluation of SATB2 with CK20 and CDX2 to form a three-marker panel in liver biopsy tissues by immunohistochemistry could be used to detect the CRCLM (43). Our results revealed 62 genes with significant interactions which were mainly related to complement and coagulation cascade and the focal adhesion pathway, and the FGG, KNG1, CAV1, and SPP1 genes were significantly associated with liver metastasis in CRC. Taken together, these findings suggest these genes might be metastatic markers for the early diagnosis of CRCLM.

Gene expression can be influenced by different factors including miRNAs (44,45), and to explore the possible roles of miRNAs in the complement and coagulation cascade pathways and renin secretion, we further analyzed the mRNA-miRNA interaction in miRNA array. However, no DEGs related to these two pathways were identified in the DEmiRNAs-DEGs regulatory network, suggesting the genes of these two pathways may not regulate CRC metastasis through miRNAs.

Previous studies have revealed the relationships between immune cell infiltration, tumor purity, and biomarker gene expression to be quite valuable for developing appropriate immunotherapy (46,47). Studies have explored the correlations between tumor purity and marker genes expression to predict the clinical outcomes in different cancers (47,48). Liu et al. have assessed the immune cells in normal and cancerous human head and neck squamous cell carcinoma (HNSC) tissues using the CIBERSORT method (49) and found different cells, including neutrophil, B cells, CD4+ T cells, and CD8+ T cells were increased in the cancerous tissues compared with normal controls. Consistently, in our study, we also uncovered significant differences in neutrophils, monocytic lineage, and CAFs between primary tumor tissue and their liver metastases, which may help clinicians gain deeper insights into the TME landscape of CRCLM (Figure 6).

Figure 6 The complement and focal adhesion pathway in the TME. MBL, mannan-binding lectin; MASP1, MBL-associated serine protease 1; MASP2, MBL-associated serine protease 2; MAC, membrane attack complex; FAC, focal adhesion complex; IACs, integrin adhesion complexes; NK, natural killer; TME, tumor micro-environment.

TME is a population of cells constituted with tumor cells and stromal cells recruited by tumor cells including T cells, macrophages, fibroblasts and so on (50). In CRCLM, the liver-infiltrating immune cells include neutrophils, myeloid-derived suppressor cells (MDSCs), Monocyte, tumor-associated macrophages (TAMs), metastasis-associated macrophages (MAMs), NK cells and so on (51). Moreover, the infiltrating immune cells found in the primary tumor of CRC differs from those in liver metastases in terms of their spatial distribution. More immunosuppressive cells are present in the liver metastases, with the most pronounced differential distribution found for macrophages (52). The infiltrated immune cells in the TME are closely associated with tumorigenesis and metastasis, as well as the cellular response to therapy. CD3+ and CD8+ T cell density correlates with better prognosis and high regulatory T cell (Treg cell) infiltrate correlates with shorter overall survival (53,54). Both the density and morphology of TAMs correlate with survival in patients with CRCLM (55,56). Lymphocyte infiltration and plasma cell infiltration are linked to prolonged patient survival in CRCLM (57).

The composition of the TME includes complex immune factors which play an important role in liver metastasis of CRC. The lectin pathway is one of three pathways of complement pathway and sees MBL and MASP2 form a complex, which cleaves C4 and C2 and generates C3 convertase. C3 convertase promotes the proliferative activation of complement and promotes the formation of C5 convertase on the cell surface, which binds to C6, C7, and C8, forming a C5b-8 complex, which polymerizes several C9 molecules, forming the membrane attack complex (MAC). Complement system activation and generation of anaphylatoxins orchestrate an inflammatory and immune responses. Anaphylatoxins activate macrophages, neutrophils, and monocytic lineages, resulting in the production of cytokines which increase vascular permeability and enhance neutrophil extravasation and chemotaxis (58-60). ECs and fibroblasts are also key players in the TME and produce complement proteins, express regulators, and a lower level of complement receptors.

Force sensing between cells and cell matrix via integrin and their associated integrin adhesion complexes (IACs) is involved in cell migration, matrix remodelling, and mechanosensing (61). In addition, focal adhesion kinase (FAK), regulated by integrin signalling, plays an important role in promoting TME remodelling, and activated FAK can promote angiogenesis in ECs (62,63) and recruit immune cells such as fibroblasts (64). However, direct evidence on how genes including FGG, KNG1, CAV1, and SPP1 regulate liver metastasis and immune cells infiltration in CRC is lacking, and the precise pathways and mechanisms requires further study.


Funding: This work was supported by the National Natural Science Foundation of China (No. 42006093), the Medical Health Science and Technology project of Ningbo (No. 2021Y31), the Natural Science Foundation of Ningbo (No. 2021J323), the Research Initiation Project of the Ningbo Institute of Life and Health Industry, the Chinese Academy of Sciences (Nos. 2021YJY1010 and 2022YJY0204), the Ningbo Medical Key Discipline (No. 2022-B11), and the Public Welfare Foundation of Ningbo (No. 2021S10).


Reporting Checklist: The authors have completed the STREGA reporting checklist. Available at

Conflicts of Interest: All authors have completed the ICMJE uniform disclosure form (available at The authors have no conflicts of interest to declare.

Ethical Statement: The authors are accountable for all aspects of the work in ensuring that questions related to the accuracy or integrity of any part of the work are appropriately investigated and resolved. The study was conducted in accordance with the Declaration of Helsinki (as revised in 2013).

Open Access Statement: This is an Open Access article distributed in accordance with the Creative Commons Attribution-NonCommercial-NoDerivs 4.0 International License (CC BY-NC-ND 4.0), which permits the non-commercial replication and distribution of the article with the strict proviso that no changes or edits are made and the original work is properly cited (including links to both the formal publication through the relevant DOI and the license). See:


  1. Sung H, Ferlay J, Siegel RL, et al. Global Cancer Statistics 2020: GLOBOCAN Estimates of Incidence and Mortality Worldwide for 36 Cancers in 185 Countries. CA Cancer J Clin 2021;71:209-49. [Crossref] [PubMed]
  2. Carlomagno C, De Stefano A, Rosanova M, et al. Multiple treatment lines and prognosis in metastatic colorectal cancer patients. Cancer Metastasis Rev 2019;38:307-13. [Crossref] [PubMed]
  3. Van Cutsem E, Oliveira JESMO Guidelines Working Group. Advanced colorectal cancer: ESMO clinical recommendations for diagnosis, treatment and follow-up. Ann Oncol 2009;20:61-3. [Crossref] [PubMed]
  4. Li N, Lu B, Luo C, et al. Incidence, mortality, survival, risk factor and screening of colorectal cancer: A comparison among China, Europe, and northern America. Cancer Lett 2021;522:255-68. [Crossref] [PubMed]
  5. Siegel RL, Miller KD, Fuchs HE, et al. Cancer Statistics, 2021. CA Cancer J Clin 2021;71:7-33. [Crossref] [PubMed]
  6. Engstrand J, Nilsson H, Strömberg C, et al. Colorectal cancer liver metastases - a population-based study on incidence, management and survival. BMC Cancer 2018;18:78. [Crossref] [PubMed]
  7. Horn SR, Stoltzfus KC, Lehrer EJ, et al. Epidemiology of liver metastases. Cancer Epidemiol 2020;67:101760. [Crossref] [PubMed]
  8. Michl M, Taverna F, Kumbrink J, et al. Biomarker alterations associated with distinct patterns of metastatic spread in colorectal cancer. Virchows Arch 2021;478:695-705. [Crossref] [PubMed]
  9. Bai R, Shi Z, Li D, et al. Gene expression profile of human colorectal cancer identified NKTR as a biomarker for liver metastasis. Aging (Albany NY) 2022;14:6656-67. [Crossref] [PubMed]
  10. Zhou H, Liu Z, Wang Y, et al. Colorectal liver metastasis: molecular mechanism and interventional therapy. Signal Transduct Target Ther 2022;7:70. [Crossref] [PubMed]
  11. Zheng W, Wu F, Fu K, et al. Emerging Mechanisms and Treatment Progress on Liver Metastasis of Colorectal Cancer. Onco Targets Ther 2021;14:3013-36. [Crossref] [PubMed]
  12. Alhumaid A, AlYousef Z, Bakhsh HA, et al. Emerging paradigms in the treatment of liver metastases in colorectal cancer. Crit Rev Oncol Hematol 2018;132:39-50. [Crossref] [PubMed]
  13. Van Cutsem E, Cervantes A, Adam R, et al. ESMO consensus guidelines for the management of patients with metastatic colorectal cancer. Ann Oncol 2016;27:1386-422. [Crossref] [PubMed]
  14. Nordlinger B, Sorbye H, Glimelius B, et al. Perioperative FOLFOX4 chemotherapy and surgery versus surgery alone for resectable liver metastases from colorectal cancer (EORTC 40983): long-term results of a randomised, controlled, phase 3 trial. Lancet Oncol 2013;14:1208-15. [Crossref] [PubMed]
  15. Xu YXX, Ye YB, Chen T, et al. Role of hepatic surgery in colorectal cancer multiple liver metastasis. Zhonghua Wai Ke Za Zhi 2021;59:816-20. [PubMed]
  16. Chakravarty D, Solit DB. Clinical cancer genomic profiling. Nat Rev Genet 2021;22:483-501. [Crossref] [PubMed]
  17. Moncada R, Barkley D, Wagner F, et al. Integrating microarray-based spatial transcriptomics and single-cell RNA-seq reveals tissue architecture in pancreatic ductal adenocarcinomas. Nat Biotechnol 2020;38:333-42. [Crossref] [PubMed]
  18. Latha NR, Rajan A, Nadhan R, et al. Gene expression signatures: A tool for analysis of breast cancer prognosis and therapy. Crit Rev Oncol Hematol 2020;151:102964. [Crossref] [PubMed]
  19. Mousavi M, Goodarzi MT, Kassaee SM, et al. Clinicopathological, Immunohistochemical, and PMS2 Gene Expression Profiling of Patients with Sporadic Colorectal Cancer. Arch Iran Med 2021;24:86-93. [Crossref] [PubMed]
  20. Langfelder P, Horvath S. WGCNA: an R package for weighted correlation network analysis. BMC Bioinformatics 2008;9:559. [Crossref] [PubMed]
  21. Zhang Y, Luo J, Liu Z, et al. Identification of hub genes in colorectal cancer based on weighted gene co-expression network analysis and clinical data from The Cancer Genome Atlas. Biosci Rep 2021;41:BSR20211280. [Crossref] [PubMed]
  22. Zheng H, Liu H, Li H, et al. Weighted Gene Co-expression Network Analysis Identifies a Cancer-Associated Fibroblast Signature for Predicting Prognosis and Therapeutic Responses in Gastric Cancer. Front Mol Biosci 2021;8:744677. [Crossref] [PubMed]
  23. Chen M, Chen S, Yang D, et al. Weighted Gene Co-expression Network Analysis Identifies Crucial Genes Mediating Progression of Carotid Plaque. Front Physiol 2021;12:601952. [Crossref] [PubMed]
  24. Taminau J, Meganck S, Lazar C, et al. Unlocking the potential of publicly available microarray data using inSilicoDb and inSilicoMerging R/Bioconductor packages. BMC Bioinformatics 2012;13:335. [Crossref] [PubMed]
  25. Johnson WE, Li C, Rabinovic A. Adjusting batch effects in microarray expression data using empirical Bayes methods. Biostatistics 2007;8:118-27. [Crossref] [PubMed]
  26. Ritchie ME, Phipson B, Wu D, et al. limma powers differential expression analyses for RNA-sequencing and microarray studies. Nucleic Acids Res 2015;43:e47. [Crossref] [PubMed]
  27. Becht E, Giraldo NA, Lacroix L, et al. Estimating the population abundance of tissue-infiltrating immune and stromal cell populations using gene expression. Genome Biol 2016;17:218. [Crossref] [PubMed]
  28. Su R, Jin C, Zhou L, et al. Construction of a ceRNA network of hub genes affecting immune infiltration in ovarian cancer identified by WGCNA. BMC Cancer 2021;21:970. [Crossref] [PubMed]
  29. van Rooijen KL, Shi Q, Goey KKH, et al. Prognostic value of primary tumour resection in synchronous metastatic colorectal cancer: Individual patient data analysis of first-line randomised trials from the ARCAD database. Eur J Cancer 2018;91:99-106. [Crossref] [PubMed]
  30. Bakkers C, Lurvink RJ, Rijken A, et al. Treatment Strategies and Prognosis of Patients With Synchronous or Metachronous Colorectal Peritoneal Metastases: A Population-Based Study. Ann Surg Oncol 2021;28:9073-83. [Crossref] [PubMed]
  31. Chowdhury MZI, Turin TC. Variable selection strategies and its importance in clinical prediction modelling. Fam Med Community Health 2020;8:e000262. [Crossref] [PubMed]
  32. Zhang Z. Variable selection with stepwise and best subset approaches. Ann Transl Med 2016;4:136. [Crossref] [PubMed]
  33. Liu W, Zhang W, Xu Y, et al. A Prognostic Scoring System to Predict Survival Outcome of Resectable Colorectal Liver Metastases in this Modern Era. Ann Surg Oncol 2021;28:7709-18. [Crossref] [PubMed]
  34. Nishioka Y, Kawaguchi Y, Kothari AN, et al. Prognostic and Therapeutic Implications of Tumor Biology, Including Gene Alterations, in Colorectal Liver Metastases. J Gastrointest Surg 2021;25:1591-600. [Crossref] [PubMed]
  35. Su CY, Huang GC, Chang YC, et al. Analyzing the Expression of Biomarkers in Prostate Cancer Cell Lines. In Vivo 2021;35:1545-8. [Crossref] [PubMed]
  36. Nunes-Xavier CE, Zaldumbide L, Mosteiro L, et al. Protein Tyrosine Phosphatases in Neuroblastoma: Emerging Roles as Biomarkers and Therapeutic Targets. Front Cell Dev Biol 2021;9:811297. [Crossref] [PubMed]
  37. Lou J, Zhang L, Lv S, et al. Biomarkers for Hepatocellular Carcinoma. Biomark Cancer 2017;9:1-9. [Crossref] [PubMed]
  38. Bai Q, Liu H, Guo H, et al. Identification of Hub Genes Associated With Development and Microenvironment of Hepatocellular Carcinoma by Weighted Gene Co-expression Network Analysis and Differential Gene Expression Analysis. Front Genet 2020;11:615308. [Crossref] [PubMed]
  39. Gu Y, Li J, Guo D, et al. Identification of 13 Key Genes Correlated With Progression and Prognosis in Hepatocellular Carcinoma by Weighted Gene Co-expression Network Analysis. Front Genet 2020;11:153. [Crossref] [PubMed]
  40. Luo X, Feng L, Xu W, et al. Weighted gene co-expression network analysis of hub genes in lung adenocarcinoma. Evol Bioinform Online 2021;17:11769343211009898. [Crossref] [PubMed]
  41. Wang H, Zhang Z, Xu K, et al. Exploration of estrogen receptor-associated hub genes and potential molecular mechanisms in non-smoking females with lung adenocarcinoma using integrated bioinformatics analysis. Oncol Lett 2019;18:4605-12. [Crossref] [PubMed]
  42. Wu XZ, Ma F, Wang XL. Serological diagnostic factors for liver metastasis in patients with colorectal cancer. World J Gastroenterol 2010;16:4084-8. [Crossref] [PubMed]
  43. Zhang YJ, Chen JW, He XS, et al. SATB2 is a Promising Biomarker for Identifying a Colorectal Origin for Liver Metastatic Adenocarcinomas. EBioMedicine 2018;28:62-9. [Crossref] [PubMed]
  44. Ali Syeda Z, Langden SSS, Munkhzul C, et al. Regulatory Mechanism of MicroRNA Expression in Cancer. Int J Mol Sci 2020;21:1723. [Crossref] [PubMed]
  45. Mollaei H, Safaralizadeh R, Rostami Z. MicroRNA replacement therapy in cancer. J Cell Physiol 2019;234:12369-84. [Crossref] [PubMed]
  46. Lawal B, Lin LC, Lee JC, et al. Multi-Omics Data Analysis of Gene Expressions and Alterations, Cancer-Associated Fibroblast and Immune Infiltrations, Reveals the Onco-Immune Prognostic Relevance of STAT3/CDK2/4/6 in Human Malignancies. Cancers (Basel) 2021;13:954. [Crossref] [PubMed]
  47. Gong Z, Zhang J, Guo W. Tumor purity as a prognosis and immunotherapy relevant feature in gastric cancer. Cancer Med 2020;9:9052-63. [Crossref] [PubMed]
  48. Deng Y, Song Z, Huang L, et al. Tumor purity as a prognosis and immunotherapy relevant feature in cervical cancer. Aging (Albany NY) 2021;13:24768-85. [Crossref] [PubMed]
  49. Liu G, Yuan C, Ma J, et al. Influence of Immune Microenvironment on Diagnosis and Prognosis of Head and Neck Squamous Cell Carcinoma. Front Oncol 2021;11:604784. [Crossref] [PubMed]
  50. Baghban R, Roshangar L, Jahanban-Esfahlan R, et al. Tumor microenvironment complexity and therapeutic implications at a glance. Cell Commun Signal 2020;18:59. [Crossref] [PubMed]
  51. Zeng X, Ward SE, Zhou J, et al. Liver Immune Microenvironment and Metastasis from Colorectal Cancer-Pathogenesis and Therapeutic Perspectives. Cancers (Basel) 2021;13:2418. [Crossref] [PubMed]
  52. He Y, Han Y, Fan AH, et al. Multi-perspective comparison of the immune microenvironment of primary colorectal cancer and liver metastases. J Transl Med 2022;20:454. [Crossref] [PubMed]
  53. Wang Y, Lin HC, Huang MY, et al. The Immunoscore system predicts prognosis after liver metastasectomy in colorectal cancer liver metastases. Cancer Immunol Immunother 2018;67:435-44. [Crossref] [PubMed]
  54. Katz SC, Bamboat ZM, Maker AV, et al. Regulatory T cell infiltration predicts outcome following resection of colorectal cancer liver metastases. Ann Surg Oncol 2013;20:946-55. [Crossref] [PubMed]
  55. Donadon M, Torzilli G, Cortese N, et al. Macrophage morphology correlates with single-cell diversity and prognosis in colorectal liver metastasis. J Exp Med 2020;217:e20191847. [Crossref] [PubMed]
  56. Cavnar MJ, Turcotte S, Katz SC, et al. Tumor-Associated Macrophage Infiltration in Colorectal Cancer Liver Metastases is Associated With Better Outcome. Ann Surg Oncol 2017;24:1835-42. [Crossref] [PubMed]
  57. Kurebayashi Y, Kubota N, Sakamoto M. Immune microenvironment of hepatocellular carcinoma, intrahepatic cholangiocarcinoma and liver metastasis of colorectal adenocarcinoma: Relationship with histopathological and molecular classifications. Hepatol Res 2021;51:5-18. [Crossref] [PubMed]
  58. Holers VM. Complement and its receptors: new insights into human disease. Annu Rev Immunol 2014;32:433-59. [Crossref] [PubMed]
  59. Afshar-Kharghan V. The role of the complement system in cancer. J Clin Invest 2017;127:780-9. [Crossref] [PubMed]
  60. Lu P, Ma Y, Wei S, et al. The dual role of complement in cancers, from destroying tumors to promoting tumor development. Cytokine 2021;143:155522. [Crossref] [PubMed]
  61. Humphries JD, Chastney MR, Askari JA, et al. Signal transduction via integrin adhesion complexes. Curr Opin Cell Biol 2019;56:14-21. [Crossref] [PubMed]
  62. Wu HJ, Hao M, Yeo SK, et al. FAK signaling in cancer-associated fibroblasts promotes breast cancer cell migration and metastasis by exosomal miRNAs-mediated intercellular communication. Oncogene 2020;39:2539-49. [Crossref] [PubMed]
  63. Chen XL, Nam JO, Jean C, et al. VEGF-induced vascular permeability is mediated by FAK. Dev Cell 2012;22:146-57. [Crossref] [PubMed]
  64. Wang X, Zhou Q, Yu Z, et al. Cancer-associated fibroblast-derived Lumican promotes gastric cancer progression via the integrin β1-FAK signaling pathway. Int J Cancer 2017;141:998-1010. [Crossref] [PubMed]

(English Language Editor: B. Draper)

Cite this article as: Lin L, Zeng X, Liang S, Wang Y, Dai X, Sun Y, Wu Z. Construction of a co-expression network and prediction of metastasis markers in colorectal cancer patients with liver metastasis. J Gastrointest Oncol 2022;13(5):2426-2438. doi: 10.21037/jgo-22-965

Download Citation