Abstract: Background and Objective: Lung adenocarcinoma (ADC) is a main subtype of lung cancer, comprising nearly 50% of all cases, with a high rate of drug resistance and poor treatment outcomes. However, the underlying mechanism remains unclear. In this study, bioinformatics was used to explore hub genes closely associated with the pathogenesis of lung ADC in order to identify novel diagnostic biomarkers and therapeutic targets. Materials and Methods: The GEO database was used to screen differentially expressed genes (DEGs), and hub genes were then clustered with Cytoscape. The biological functions of the hub genes were analyzed with databases including DAVID and GEPIA. Furthermore, the TCGA database was used to confirm the expression of the hub genes against relevant clinical data. Moreover, the Connectivity Map (CMap) was used to explore potential small-molecule compounds that may reverse the expression of the hub genes. Results: Hub genes including ASPM, CENPF and TRIP13 were overexpressed in primary solid tumors and recurrent tumors of lung ADC compared with normal tissues. Notably, these three genes showed high mutation rates in lung ADC. They were also overexpressed in the fourth stage of lung ADC, and their expression was associated with overall survival and disease-free survival in lung ADC patients. Small molecules including phenoxybenzamine, adiphenine and resveratrol were predicted to reverse the expression of ASPM, CENPF and TRIP13. Conclusion: ASPM, CENPF and TRIP13 may be potential diagnostic biomarkers and therapeutic targets for lung ADC.
INTRODUCTION
Lung cancer is the leading cause of cancer-related death, with an increasing burden globally1. Approximately 75-80% of lung cancers are Non-Small Cell Lung Cancer (NSCLC)2,3. According to the WHO classification, NSCLC is generally subcategorized into adenocarcinoma (ADC), squamous cell carcinoma and large cell carcinoma4,5, of which ADC is a main subtype comprising nearly 50% of all lung cancer cases6.
To date, targeting several molecules, such as Epidermal Growth Factor Receptor (EGFR)7, anaplastic lymphoma kinase (ALK)8-13 and ROS1 proto-oncogene (ROS1)14, has been proven to improve therapeutic efficacy in lung ADC. However, such treatment has only a marginal effect in certain types of lung ADC. Tyrosine Kinase Inhibitor (TKI)-targeted treatments work efficiently in patients with EGFR mutations, but few of these patients escape drug resistance15. At present, half of the patients with lung ADC die within one year of diagnosis and the 5-year survival rate is below 20%, yet the molecular mechanisms of the occurrence, development, invasion and metastasis of lung ADC remain unclear16. Therefore, identifying new molecular mechanisms of lung ADC is of great significance for the diagnosis and therapy of lung ADC patients.
In recent years, bioinformatics-based gene screening has been applied to search for differentially expressed genes (DEGs) and their biological functions in lung ADC17-20. In this context, open-source databases such as the Gene Expression Omnibus (GEO) have been critical in pooling collective knowledge, forming testable hypotheses and advancing translational breakthroughs21,22. Therefore, in this study, bioinformatics based on GEO was used to explore novel molecular diagnostic biomarkers and treatment targets for lung ADC. See the flow chart for details (Fig. 1).
MATERIALS AND METHODS
Study area: The study was carried out at the First Affiliated Hospital of Hainan Medical University and the First Hospital of China Medical University from October, 2019 to July, 2021.
Microarray data: The microarray datasets GSE31210 and GSE32863 were downloaded from GEO (https://www.ncbi.nlm.nih.gov/geo/).
Identification of DEGs and hub genes: GEO2R was used to identify the DEGs. Probe sets without exact gene symbols were removed and genes with two or more probe sets were averaged. The cutoff criteria were set to p<0.05 and log FC (fold change) >1. The interaction network was then visualized with Cytoscape (version 3.4.0) and the Molecular Complex Detection (MCODE) plug-in was applied to screen hub genes with degree cutoff = 2, node score cutoff = 0.2, k-core = 2 and max. depth = 100.
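The filtering step can be illustrated with a minimal sketch. It assumes that probes lacking a gene symbol are discarded, that log FC and p-values of probes mapping to the same gene are averaged (a simplification), and that the rows resemble a GEO2R export; the function name `select_degs` and all input values are hypothetical and are not taken from the study data.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical probe-level rows: (probe_id, gene_symbol, log_fc, p_value).
# These values are invented for illustration only.
rows = [
    ("p1", "GENE_A", 1.8, 0.001),
    ("p2", "GENE_A", 1.4, 0.004),
    ("p3", "GENE_B", 0.6, 0.020),   # fails the log FC cutoff
    ("p4", "", 2.5, 0.0005),        # no gene symbol: discarded
    ("p5", "GENE_C", 1.2, 0.200),   # fails the p-value cutoff
    ("p6", "GENE_D", 1.1, 0.030),
]

def select_degs(rows, p_cutoff=0.05, logfc_cutoff=1.0):
    """Average probes per gene, then keep genes with p < p_cutoff and log FC > logfc_cutoff."""
    by_gene = defaultdict(list)
    for _probe, symbol, log_fc, p_value in rows:
        if symbol:  # skip probes without a gene symbol
            by_gene[symbol].append((log_fc, p_value))
    degs = {}
    for symbol, values in by_gene.items():
        avg_fc = mean(v[0] for v in values)
        avg_p = mean(v[1] for v in values)  # simplification: averaged p-value
        if avg_p < p_cutoff and avg_fc > logfc_cutoff:
            degs[symbol] = (avg_fc, avg_p)
    return degs

degs = select_degs(rows)
print(sorted(degs))
```

In this toy input only GENE_A and GENE_D satisfy both cutoffs.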
Enrichment and visualization of hub genes: Hub genes with a degree of connectivity of >40 were further stratified. Gene Ontology (GO) enrichment analysis of those hub genes was conducted with Cytoscape’s Biological Networks Gene Ontology (BiNGO) tool. A p-value below 0.05 was considered statistically significant and the results were visualized with Cytoscape. Furthermore, Kyoto Encyclopedia of Genes and Genomes (KEGG) pathway enrichment analysis was performed with DAVID (http://david.ncifcrf.gov), an online biological information database that integrates biological data and provides a comprehensive set of functional annotation information for genes.
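Over-representation analyses of this kind are commonly based on a hypergeometric tail probability. The sketch below illustrates that calculation only; the exact statistic and multiple-testing settings used in BiNGO and DAVID for this study are not stated, and the function name `hypergeom_pvalue` and the counts in the example are hypothetical.

```python
from math import comb

def hypergeom_pvalue(k, n, K, N):
    """P(X >= k): probability of seeing at least k annotated genes among n
    selected genes, when K of the N background genes carry the annotation."""
    upper = min(n, K)
    total = comb(N, n)
    tail = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1))
    return tail / total

# Hypothetical counts: 35 hub genes, 10 of them in a term that annotates
# 500 of 20000 background genes.
p = hypergeom_pvalue(k=10, n=35, K=500, N=20000)
print(p < 0.05)
```

A term would be called enriched under the p<0.05 threshold when this tail probability falls below the cutoff.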
cBioPortal database analysis: Mutations, copy number variation (CNV) and gene co-occurrence of hub genes with degree >10 were analyzed with the cBio Cancer Genomics Portal (http://www.cbioportal.org/). The lung ADC datasets from TCGA, comprising 887 cases, were selected for further analysis.
Fig. 1: | Flow chart of this study. The datasets GSE31210 and GSE32863 were used to screen DEGs. MCODE was then used to identify hub genes, followed by enrichment and pathway analysis with cBioPortal and DAVID. Finally, the clinical significance of candidate genes was verified with GEPIA |
GEPIA database analysis: The Gene Expression Profiling Interactive Analysis (GEPIA) database (http://gepia.cancer-pku.cn) is an interactive web server that includes 9,736 tumor and 8,587 normal samples from the TCGA and GTEx projects. GEPIA was used to generate survival curves for lung ADC, including Overall Survival (OS) and Recurrence Free Survival (RFS), based on gene expression and compared with the log-rank (Mantel-Cox) test.
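GEPIA performs the survival comparison internally. For orientation only, the following is a minimal sketch of a two-group log-rank test under the usual chi-square approximation with one degree of freedom; the function name `logrank_test`, the grouping into high and low expression, and all follow-up times are hypothetical.

```python
from math import erfc, sqrt

def logrank_test(group_a, group_b):
    """Two-sample log-rank test. Each group is a list of (time, event) pairs,
    where event is 1 for an observed event and 0 for censoring."""
    event_times = sorted({t for t, e in group_a + group_b if e == 1})
    observed_a = expected_a = variance = 0.0
    for t in event_times:
        # Subjects still at risk just before time t in each group
        n_a = sum(1 for time, _ in group_a if time >= t)
        n_b = sum(1 for time, _ in group_b if time >= t)
        # Events occurring exactly at time t in each group
        d_a = sum(1 for time, e in group_a if time == t and e == 1)
        d_b = sum(1 for time, e in group_b if time == t and e == 1)
        n, d = n_a + n_b, d_a + d_b
        observed_a += d_a
        expected_a += d * n_a / n
        if n > 1:
            variance += d * (n_a / n) * (n_b / n) * (n - d) / (n - 1)
    chi2 = (observed_a - expected_a) ** 2 / variance
    p_value = erfc(sqrt(chi2 / 2))  # chi-square tail probability, 1 d.f.
    return chi2, p_value

# Hypothetical follow-up data in months: (time, event)
high_expr = [(5, 1), (8, 1), (12, 1), (14, 0), (20, 1), (22, 1)]
low_expr = [(10, 1), (18, 0), (25, 1), (30, 0), (34, 1), (40, 0)]
chi2, p_value = logrank_test(high_expr, low_expr)
print(round(chi2, 3), round(p_value, 4))
```

The sketch does not handle the degenerate case in which the variance is zero, for example when one group is empty.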
Drug discovery in the Connectivity Map (CMap): CMap is an open resource that relates diseases, genes and drugs through similar or opposite gene expression profiles (https://portals.broadinstitute.org/cmap). In this study, CMap was used to search for potential small-molecule compounds that could reverse the expression of the hub genes. The link between the chemicals and the query genes was measured via a connectivity score, with mean <-0.4 and p<0.05 as thresholds.
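As an illustration of the stated thresholds, the sketch below applies mean <-0.4 and p<0.05 to the mean scores and p-values as printed in Table 2. The function name `reversing_candidates` is hypothetical, and the snippet is a simple filter, not a reproduction of the CMap scoring procedure.

```python
# (CMap name, mean connectivity score, p-value), copied from Table 2 as printed
table2 = [
    ("Phenoxybenzamine", -0.776, 0),
    ("Adiphenine", 0.782, 0),
    ("Resveratrol", -0.759, 0),
    ("Trichostatin A", -0.412, 0),
    ("Podophyllotoxin", 0.692, 0.00004),
    ("3-acetamidocoumarin", 0.655, 0.00006),
    ("Antimycin A", -0.665, 0.00008),
    ("Trifluoperazine", -0.446, 0.00034),
    ("Monobenzone", -0.698, 0.0004),
    ("Prochlorperazine", -0.488, 0.00042),
]

def reversing_candidates(rows, mean_cutoff=-0.4, p_cutoff=0.05):
    """Keep compounds whose mean score is below mean_cutoff and whose p-value is below p_cutoff."""
    return [name for name, mean_score, p in rows
            if mean_score < mean_cutoff and p < p_cutoff]

candidates = reversing_candidates(table2)
print(candidates)
```

With the values as printed, only rows with a negative mean score pass this filter, so the sign of each mean is worth checking against the original CMap output.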
RESULTS
Identification of hub genes in lung ADC: In total, 2032 DEGs were retrieved from GSE31210 and 211 DEGs from GSE32863 (Table 1), of which 118 DEGs were common to both datasets (Fig. 2a). From these 118 DEGs, 35 hub genes with a degree of connectivity above 40 were selected (Fig. 2b). In particular, ASPM, CENPF and TRIP13 were hub genes, suggesting that they may be ideal genes for targeting deleterious behavior (Fig. 2c).
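The two selection steps, intersecting the DEG lists and ranking genes by degree of connectivity, can be sketched as follows. The gene sets, the edge list and the degree cutoff in this toy example are hypothetical (GENE_X, GENE_Y and GENE_Z are placeholders), and the helper names are illustrative rather than part of Cytoscape or MCODE.

```python
from collections import Counter

def common_degs(*deg_sets):
    """Genes present in every DEG set."""
    return set.intersection(*(set(s) for s in deg_sets))

def hub_genes(edges, nodes, min_degree):
    """Genes in `nodes` whose degree, counted over edges within `nodes`, exceeds min_degree."""
    degree = Counter()
    for u, v in edges:
        if u in nodes and v in nodes and u != v:
            degree[u] += 1
            degree[v] += 1
    return sorted(g for g in nodes if degree[g] > min_degree)

# Hypothetical toy inputs
degs_1 = {"ASPM", "CENPF", "TRIP13", "GENE_X", "GENE_Y"}
degs_2 = {"ASPM", "CENPF", "TRIP13", "GENE_Y", "GENE_Z"}
edges = [
    ("ASPM", "CENPF"), ("ASPM", "TRIP13"), ("CENPF", "TRIP13"),
    ("ASPM", "GENE_Y"), ("GENE_X", "ASPM"),  # GENE_X is not a common DEG
]

shared = common_degs(degs_1, degs_2)
hubs = hub_genes(edges, shared, min_degree=1)
print(sorted(shared), hubs)
```

In the study the corresponding cutoff was a degree of connectivity above 40; the value of 1 here only suits the small toy network.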
Table 1: Properties of the gene expression data sets used for analysis (cases, n = 284)
GEO number | Date | Stage I cases | Stage II-IV cases | DEGs
GSE31210 | 2011 | 168 | 58 | 2032
GSE32863 | 2012 | 34 | 24 | 211
Fig. 2(a-c): | Identification of hub genes in lung ADC, (a) Common genes (118 DEGs) from GSE31210 and GSE32863, (b) Interaction network of module genes (82 genes) from MCODE analysis, where circles filled with yellow represent highly connected genes and (c) Interaction of 35 hub genes whose degree of connectivity was above 40 |
Fig. 3(a-b): | Enrichment and pathway analysis of 35 hub genes, (a) Molecular function of hub genes and (b) Pathways of hub genes |
Fig. 4(a-b): | Mutations and differential expression of 15 hub genes between lung ADC and normal tissues, (a) Multiple mutation types of 15 hub genes and (b) Differential expression of 15 hub genes |
Enrichment and pathway analysis of 35 hub genes: In terms of molecular function, these common DEGs were enriched in cell cycle regulation (Fig. 3a), while cell cycle, oocyte meiosis and progesterone-mediated oocyte maturation were the most enriched pathways (Fig. 3b).
Mutations and differential expression of ASPM, CENPF and TRIP13 between lung ADC and normal tissue: To verify the vital role of ASPM, CENPF and TRIP13, cBioPortal was used to screen The Cancer Genome Atlas (TCGA, provisional) lung adenocarcinoma samples. In the TCGA database, 15 significant genes with degree >10 were further analyzed. The mutation rates of ASPM, CENPF and TRIP13 were 17, 10 and 15%, respectively (Fig. 4a). Among 887 patients with lung ADC from TCGA, the expression of ASPM, CENPF and TRIP13 in primary and recurrent lung ADC was twice as high as that in normal solid tissue (Fig. 4b).
In detail, three isoforms of ASPM, CENPF and TRIP13 were highly expressed (log2 = 2.5 transcripts per million, TPM) in lung ADC compared with normal tissue (Fig. 5a). Moreover, alterations including missense mutation, diploid, gain, shallow deletion and amplification were present in the lung ADC samples (Fig. 5b). These results suggested that the expression of these three genes may serve as a prognostic marker in progressive lung ADC.
Expression of ASPM, CENPF and TRIP13 related to the severity of lung ADC: Using GEPIA, overexpression of ASPM, CENPF and TRIP13 was demonstrated in lung ADC tissues compared with normal tissues (Fig. 6a). Meanwhile, the expression of ASPM, CENPF and TRIP13 was high in all four stages of lung ADC and was higher in patients diagnosed at stages 2, 3 and 4 than at stage 1. Patients diagnosed at the fourth stage had the highest expression among the four stages (Fig. 6b). These results suggested that the expression of ASPM, CENPF and TRIP13 could be related to the stage of lung ADC.
Based on data from 478 lung ADC patients, patients with increased expression of ASPM, CENPF and TRIP13 had shorter Overall Survival (OS) and Disease-Free Survival (DFS) than those with decreased expression (Fig. 6c and d).
Fig. 5(a-b): | Mutations and differential expression of ASPM, CENPF and TRIP13 between lung ADC and normal tissues, (a) Differential expression of three isoforms of ASPM, CENPF and TRIP13 between lung ADC and normal tissues and (b) Putative copy-number alterations from GISTIC |
Fig. 6(a-d): | Relationship between expression of ASPM, CENPF and TRIP13 and the severity of lung ADC, (a) Verification of differential expression of ASPM, CENPF and TRIP13 between lung ADC (red box) and normal tissue (grey box), (b) Association of expression of ASPM, CENPF and TRIP13 with the four stages of lung ADC, (c) Relationship between high expression of ASPM, CENPF and TRIP13 and shorter overall survival in lung ADC patients and (d) Relationship between high expression of ASPM, CENPF and TRIP13 and shorter disease-free survival in lung ADC patients |
Table 2: Potential small-molecule compounds predicted to reverse the altered expression of DEGs
Rank | CMap name | Mean | N | Enrichment | p-value | Specificity | Percent
1 | Phenoxybenzamine | -0.776 | 4 | -0.951 | 0 | 0.0091 | 100
2 | Adiphenine | 0.782 | 5 | 0.945 | 0 | 0 | 100
3 | Resveratrol | -0.759 | 9 | -0.805 | 0 | 0 | 100
4 | Trichostatin A | -0.412 | 182 | -0.327 | 0 | 0.4753 | 84
5 | Podophyllotoxin | 0.692 | 4 | 0.932 | 0.00004 | 0.0048 | 100
6 | 3-acetamidocoumarin | 0.655 | 4 | 0.912 | 0.00006 | 0 | 100
7 | Antimycin A | -0.665 | 5 | -0.869 | 0.00008 | 0 | 100
8 | Trifluoperazine | -0.446 | 16 | -0.495 | 0.00034 | 0.125 | 75
9 | Monobenzone | -0.698 | 4 | -0.884 | 0.0004 | 0 | 100
10 | Prochlorperazine | -0.488 | 16 | -0.492 | 0.00042 | 0.0472 | 93
Drug discovery for ASPM, CENPF and TRIP13: CMap was used to explore candidate compounds for ASPM, CENPF and TRIP13. Several small-molecule compounds were predicted to reverse the altered expression of ASPM, CENPF and TRIP13. Of these, phenoxybenzamine, adiphenine and resveratrol showed the greatest potential (Table 2).
DISCUSSION
With extensive bioinformatics research on the molecular pathogenesis of lung ADC, more and more gene pathways and their regulatory mechanisms have been discovered23-25. This study further illustrated that high expression and high mutation rates of ASPM, CENPF and TRIP13 were associated with lung ADC. In particular, these three hub genes were highly expressed in all four stages of lung ADC, with the highest expression in the fourth stage.
Abnormal spindle-like microcephaly-associated protein (ASPM) is essential for mitotic spindle function during cell replication. Studies have confirmed that ASPM plays an important role in the progression of different kinds of cancer, such as pancreatic ductal adenocarcinoma (PDAC)26. In addition, it is related to the prognosis of hepatocellular carcinoma27, ovarian cancer28 and pancreatic cancer29.
Centromere Protein F (CENPF) is a cell cycle-associated nuclear antigen whose expression is down-regulated in G0/G1 cells, which accumulates in the nuclear matrix during S-phase and reaches maximal expression in G2/M cells. Growing evidence shows that CENPF plays an important role in the progression of different tumors. On the one hand, CENPF could be a synergistic master regulator of prostate cancer malignancy and a prognostic indicator of poor survival and metastasis30. On the other hand, CENPF could promote breast cancer (BC) bone metastasis by activating PI3K-AKT-mTORC1 signaling and may be a novel therapeutic target for BC treatment31.
However, the role of ASPM and CENPF in lung cancer is still unclear. In this study, using bioinformatic and experimental analysis, ASPM and CENPF were found to play a vital role in lung adenocarcinoma. These two genes were frequently upregulated and mutated in lung adenocarcinoma tissue. Kaplan-Meier analyses showed that CENPF and ASPM were prognostic markers for clinical outcomes, similar to EGFR or KRAS32. These results suggested that they may be promoters and predictors of lung ADC.
The Thyroid Hormone Receptor Interacting Protein 13 (TRIP13) gene has been widely studied in human cancer. It is involved in regulating mitotic processes, including the spindle assembly checkpoint and DNA repair pathways, and participates in chromosome instability (CIN)33,34. Amplification of TRIP13 has been observed in various human cancers and is implicated in several aspects of malignant transformation, including cancer cell proliferation and drug resistance35. A previous study demonstrated that the expression of TRIP13 is positively correlated with tumor size, T-stage and N-stage in lung cancer36. In the present study, overexpression of TRIP13 was closely related to the progression of lung ADC, in agreement with recent studies in lung cell lines37,38, which suggests that TRIP13 may be a potential predictor for lung ADC.
Through CMap, phenoxybenzamine, adiphenine and resveratrol were identified as candidate compounds that may reverse the overexpression of ASPM, CENPF and TRIP13. Of these drugs, resveratrol has commonly been proposed as a putative cancer therapeutic agent. It has been shown that resveratrol can suppress VEGF-mediated angiogenesis during cancer development39. Furthermore, in human colon cancer, resveratrol may suppress invasion and metastasis through reversal of EMT via the AKT/GSK-3β/Snail signaling pathway40. However, little research is available on the other drugs, which need further study.
CONCLUSION
In conclusion, CENPF, ASPM and TRIP13 could be potential predictors and targets for lung ADC. Furthermore, phenoxybenzamine, adiphenine and resveratrol could be potential drugs for lung ADC patients with overexpression of CENPF, ASPM and TRIP13.