Research | Open | Published:
Identification and validation of potential prognostic lncRNA biomarkers for predicting survival in patients with multiple myeloma
Journal of Experimental & Clinical Cancer Researchvolume 34, Article number: 102 (2015)
Dysregulated long non-coding RNAs (lncRNAs) have been found to have oncogenic and/or tumor suppressive roles in the development and progression of cancer, implying their potentials as novel independent biomarkers for cancer diagnosis and prognosis. However, the prognostic significance of expression profile-based lncRNA signature for outcome prediction in patients with multiple myeloma (MM) has not yet been investigated.
LncRNA expression profiles of a large cohort of patients with MM were obtained and analyzed by repurposing the publically available microarray data. An lncRNA-focus risk score model was developed from the training dataset, and then validated in the testing and another two independent external datasets. The time-dependent receiver operating characteristic (ROC) curve was used to evaluate the prognostic performance for survival prediction. The biological function of prognostic lncRNAs was predicted using bioinformatics analysis.
Four lncRNAs were identified to be significantly associated with overall survival (OS) of patients with MM in the training dataset, and were combined to develop a four-lncRNA prognostic signature to stratify patients into high-risk and low-risk groups. Patients of training dataset in the high-risk group exhibited shorter OS than those in the low-risk group (HR = 2.718, 95 % CI = 1.937-3.815, p <0.001). The similar prognostic values of four-lncRNA signature were observed in the testing dataset, entire GSE24080 dataset and another two independent external datasets. Multivariate Cox regression and stratified analysis showed that the prognostic power of four-lncRNA signature was independent of clinical features, including serum beta 2-microglobulin (Sβ2M), serum albumin (ALB) and lactate dehydrogenase (LDH). ROC analysis also demonstrated the better performance for predicting 3-year OS. Functional enrichment analysis suggested that these four lncRNAs may be involved in known genetic and epigenetic events linked to MM.
Our results demonstrated potential application of lncRNAs as novel independent biomarkers for diagnosis and prognosis in MM. These lncRNA biomarkers may contribute to the understanding of underlying molecular basis of MM.
Multiple myeloma (MM) is an incurable cancer of plasma cells caused by abnormal accumulation of monoclonal plasma cells in bone marrow . MM is one of the most common blood cancers and is characterized by wide clinical and pathophysiologic heterogeneities leading to fatal outcome. The survival periods of patients with MM varied significantly, ranging from a few weeks to more than 10 years, and the five-year survival rate is nearly 40 % . Identifying patients who are at the high risk may help optimize the choice of personalized treatment and improve clinical outcomes.
It is well known that the vast majority (>90 %) of the human genome sequence can be actively transcribed, while less than 2 % of transcripts serve as mRNA to encode protein [3, 4]. A substantial fraction of transcripts is non-coding RNA (ncRNA) with no or limited protein coding capacity, including short ncRNA and long ncRNA. Long non-coding RNA (lncRNA), constituting an important class of ncRNA, are mRNA-like transcripts which are transcribed by RNA polymerase II and are longer than 200 nucleotides in length [5, 6]. Accumulating evidence indicates that lncRNA function as important regulators involved in diverse aspects of gene regulation at transcriptional, posttranscriptional and epigenetic levels [5, 7], and participate in a variety of biological processes [8, 9]. The aberrant lncRNA expression has also been observed in many complex human diseases, especially in cancers [10–12]. Similar to mRNA and miRNA, these dysregulated lncRNAs can play oncogenic and/or tumor suppressive roles in the development and progression of cancer. Some well-characterized lncRNAs, such as MALAT1, HOTAIR and SRA, were found to be highly up-regulated in lung cancer, breast cancer, hepatocellular cancer and so on [13–15], while MEG3, GAS5 and LincRNA-p21 have been shown to be tumor suppressors [16–18]. These cancer-associated lncRNAs displayed aberrant expression patterns in tissue- or cancer-type specific manner [19, 20], suggesting their potentials as novel independent biomarkers for cancer diagnosis and prognosis. Several expression-based lncRNA signatures have been established in glioblastoma multiforme , breast cancer , oesophageal squamous cell carcinoma , colorectal cancer  and lung cancer . For multiple myeloma, recent studies have also found that lncRNAs MALAT1 and MEG3 are overexpressed in patients with MM compared to healthy individuals by real-time quantitative reverse transcription polymerase chain reactions (RT-PCR) analysis [26, 27]. However, the prognostic significance of lncRNA signature in patients with MM remains unknown.
In the present study, by integrating lncRNA expression profiles and matched clinical information in a large cohort of patients with MM, we identified four prognostic lncRNA biomarkers associated with overall survival of patients with MM and established a four-lncRNA-focus prognostic risk model that can effectively predict clinical survival. The significant prognostic power of four-lncRNA-focus prognostic risk model was further validated in testing dataset and another two independent external patient datasets.
GEO datasets and clinical information of patients with MM
The gene microarray expression data and corresponding clinical information of a large number of patients with MM used in this study were obtained from publicly available Gene Expression Omnibus (GEO) database, including 559 patients from GSE24080 (Affymetrix HG-U133_Plus_2.0 array) (www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSE24080) , 55 patients from GSE57317 (Affymetrix HG-U133_Plus_2.0 array) (www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSE57317)  and 264 patients from GSE9782 (Affymetrix HG-U133A array) (www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSE9782) . Detailed clinical information of patients with MM used in this study was shown in Additional file 1.
Microarray analysis and lncRNA re-annotation
The probe ID-centric gene expression data was normalized using the MAS5 algorithm and log2 transformed. To obtain lncRNA expression profiles of patients with MM, the microarray probes were re-annotated as previously described . Briefly, the probes (probe sets) from Affymetrix HG-U133_Plus_2.0 array and Affymetrix HG-U133A array were re-mapped to the human genome (GRCh38) using SeqMap tool . Then those probes (probe sets) that were uniquely mapped to the human genome with no mismatch were retained for further analysis. By matching the chromosomal position of retaining probes (probe sets) to the chromosomal position of lncRNA from the GENCODE project (http://www.gencodegenes.org, release 22) , we obtained 3215 probes (probe sets) covering 2330 lncRNAs for Affymetrix HG-U133_Plus_2.0 array and 855 probes (probe sets) covering 663 lncRNAs for Affymetrix HG-U133A array, respectively. The expression data of multiple probes (probe sets) mapping to the same lncRNA were integrated by using the arithmetic mean to represent the expression level of single lncRNA.
Identification of potential prognostic lncRNA biomarkers associated with OS in patients with MM
A univariate Cox regression analysis was carried out to evaluate the association between expression levels of lncRNAs and patients’ OS. Those lncRNAs whose expression levels were significantly associated with patients’ OS were fitted in a multivariate Cox regression analysis in the training dataset by using OS as the dependent variable and other clinical information as the covariables. We kept those lncRNAs with p value <0.01 to develop a risk score model for predicting OS in patients with MM. The lncRNA-based risk score model was defined as the linear combination of the expression values of the prognostic lncRNAs and the multivariable Cox regression coefficient as the weight. The patients with MM in each dataset were classified into high-risk group and low-risk group according to the median risk score derived from the training dataset.
Differences in patients’ OS between high-risk group and low-risk group were demonstrated using the Kaplan-Meier survival curves, and the statistical significance was obtained using the two-sided log-rank test. Univariate and multivariate analyses with Cox proportional hazards regression were carried out with OS as the dependent variable and lncRNA risk score and other individual clinical features as explanatory variables in each dataset. Hazard ratios (HR) and 95 % confidence intervals (CI) were calculated. The time-dependent receiver operating characteristic (ROC) curve was used to evaluate the prognostic performance for survival prediction of the lncRNA risk score and the area under the ROC curves (AUC) value were calculated. All the analysis was performed using the R/Bio-Conductor (version 3.1.1).
Functional enrichment analysis
The Pearson correlation coefficient was utilized to evaluate co-expression relationship between lncRNA and mRNA. The functional enrichment analysis of co-expressed mRNAs was performed to predict biological function of lncRNA using the DAVID Bioinformatics Tool (http://david.abcc.ncifcrf.gov/, version 6.7), which is widely used bioinformatics resources [34, 35]. The enriched results was reported limited to Gene Ontology (GO) terms in the “Biological Process”(GOTERM-BP-FAT) and Kyoto encyclopedia of genes and genomes (KEGG) pathway categories using the functional annotation clustering and functional annotation chart options. The GO terms and KEGG pathways with p value of <0.05 and Enrichment score > 2 was considered as significantly enriched function annotations.
Identification of prognostic lncRNA biomarkers associated with patients’ OS from the training dataset
The 559 patients with MM from GSE24080 were randomly split into the training dataset (n = 280) and the testing dataset (n = 279). We first conducted a univariate Cox regression analysis for expression data of each lncRNA with OS as a dependent variable to measure the relationship between lncRNA expression and patients’ OS. A total of 59 lncRNAs, whose expression levels were significantly associated with patients’ OS (p < 0.01), were identified and ranked according to their univariate z scores (Fig. 1). Of these, the high expression levels of 40 lncRNAs with negative z scores were associated with longer OS, and high expression levels of the remaining 19 lncRNAs with positive z scores were associated with shorter OS. In order to evaluate whether these lncRNAs have independently predictive power to predict patients’ OS when considering the mutual effect among 59 lncRNAs and clinical features, a multivariate regression analysis was further performed on the expression levels of 59 lncRNAs with OS as a dependent variable and other individual clinical features as explanatory variables in the training dataset. When considering the mutual effect among 59 lncRNAs and clinical features, only 4 of 59 lncRNAs (RP4-803 J11.2, RP1-43E13.2, RP11-553 L6.5, ZFY-AS1) showed predictive power and were able to independently predict patients’ OS at a statistically significant level of 0.01 (Fig. 1). The detailed information of these four lncRNAs was summarized in Table 1. To build a predictive model that should be independent of other factors (such as clinical features or other lncRNAs), we only used these four lncRNAs to construct risk score model.
Construction and validation of lncRNA-focus risk score model for predicting OS in the training dataset
To construct a predictive model, these four lncRNAs were fitted in a multivariate Cox regression model with OS as a dependent variable to measure relative contributions for survival prediction. Then a lncRNA-focus risk score model for OS prediction was developed by integrating the expression data of these four lncRNA and corresponding coefficient derived from above multivariate regression analysis, as follows: Risk score = (0.3016 × expression value of RP4-803 J11.2) + (−0.2989 × expression value of ZFY-AS1) + (0.3191 × expression value of RP1-43E13.2) + (−0.1445 × expression value of RP11-553 L6.5). The risk score of each patient in the training dataset was calculated according to the lncRNA-focus risk score model. Then 280 patients of training dataset were assigned to a high-risk group (n = 140) or a low-risk group (n = 140) using the median risk score as the cutoff point. The result of Kaplan-Meier analysis showed significant differences in patients’ OS between high-risk group and low-risk group (log-rank test p < 0.001, Fig. 2a). Patients in the high-risk group had significantly shorter OS (mean 58.7 months) than those in the low-risk group (mean 104.6 months). The univariate Cox regression analysis also demonstrated that the risk scores derived from the four-lncRNA signature was significantly correlated with patients’ OS with risk scores as a continuous variable (p < 0.001, HR = 2.718, 95 % CI = 1.937–3.815) (Table 2). The expression of lncRNAs RP4-803 J11.2 and RP1-43E13.2 tended to be up-regulated and the remaining two lncRNAs (ZFY-AS1 and RP11-553 L6.5) were down-regulated for patients in high-risk group (Fig. 2b).
Performance evaluation of lncRNA-focus risk score model for survival prediction in the testing and entire GSE24080 datasets
To evaluate the prognostic power of lncRNA-focus risk score model for patients’ OS prediction, this risk score model and cutoff point derived from the training dataset was applied to patients with MM in the testing dataset and the entire GSE24080 dataset. The 279 patients of the testing dataset were classified into either high-risk group (n = 119) or low-risk group (n = 160). Kaplan-Meier curves for the two groups within the testing dataset is shown in Fig. 3a, demonstrating a significant difference in OS between high-risk group and low-risk group (log-rank test p = 0.054). Patients in the high-risk group exhibited poorer OS (mean 68.5 months) than those in the low-risk group (mean 78.3 months). The significant association between risk score and OS has also been observed in the testing dataset with risk scores as a continuous variable in the univariate Cox regression analysis (p = 0.014, HR = 1.579, 95 % CI = 1.099–2.270) (Table 2). The distribution of risk score, survival status and lncRNA expression in the testing dataset of 279 patients is shown in Fig. 3b. Patients in the high-risk group tended to express risky lncRNAs (RP4-803 J11.2 and RP1-43E13.2) at higher level than those in the low-risk group, whereas patients in the low-risk group tended to express protective lncRNAs (ZFY-AS1 and RP11-553 L6.5) at higher level than those in the high-risk group. In consistent with the finding in the training dataset and testing dataset, Kaplan-Meier and univariate Cox regression analysis showed that this lncRNA-focus risk score model was able to separate 559 patients in the entire GSE24080 dataset into two groups with significantly different OS (mean 63.3 months versus 100.1 months, HR = 2.099, 95 % CI = 1.638–2.688; p < 0.001, log-rank test) (Fig. 3c). The distribution of risk score, survival status and lncRNA expression also yielded similar results (Fig. 3d).
Further validation of lncRNA-focus risk score model for survival prediction in another two independent external patient datasets
To further examine the robustness and practical application of the four-lncRNA risk score model, we validated the prognostic power of the risk score model using lncRNA expression values and survival information of patients with MM in another two independent external datasets (GSE57317 and GSE9782). As shown in Fig. 4a, the lncRNA-focus risk score model could effectively predict OS in patients with MM from GSE57317 (log-rank test p = 0.053). All 55 patients in the GSE57317 dataset were divided into the high-risk group (n = 26) and the low-risk group (n = 29) with significant different OS according to the same risk score cutoff point derived from the training dataset (mean 31.3 months versus 37.5 months, HR = 2.64, 95 % CI = 1.013–6.879, p = 0.047). Another external patient dataset (GSE9782) was based on the Affymetrix U133A array platform. After probe re-annotating, we found that only 3 lncRNA (RP1-43E13.2, RP11-553 L6.5, ZFY-AS1) of four prognostic lncRNAs from the training dataset were covered on the Affymetrix U133A array. So, the risk score model only based on these three lncRNAs without re-estimating parameters was used to predict OS for GSE9782 dataset, which perhaps reduce the predictive power. The median risk score cutoff point obtained from GSE9782 dataset classified 264 patients into the high-risk group (n = 132) and the low-risk group (n = 132). The Kaplan-Meier curves for the high-risk group and the low-risk group in the independent external GSE9782 dataset are shown in Fig. 4b. Patients assigned into high-risk group tended to have shorter OS than those in the low-risk group (mean OS 18.4 months vs. 22.4 months, log-rank test p = 0.016). The univariate Cox regression analysis also showed that the risk scores were significantly associated with OS in patients with MM in the GSE9782 dataset (HR = 1.637, 95 % CI = 1.107–2.42, p = 0.014). The results of risk score distribution, survival status and lncRNA expression for GSE57317 and GSE9782 were consistent with those observed in the training dataset (Fig. 4c and d).
Independence of lncRNA-focus risk score model for survival prediction from clinical features
To assess whether the prognostic values of lncRNA-focus risk score model is independent of other important clinical features of patients with MM, the multivariate Cox regression analyses were performed with OS as the dependent variable and lncRNA risk score and other clinical features as explanatory variables in each dataset. The multivariate Cox regression analyses showed that lncRNA-focus risk score was significantly correlated with OS of patients with MM after adjusting for various clinical features in the training dataset (HR = 2.066, CI = 1.395–3.06, p < 0.001), testing dataset (HR = 1.726, CI = 1.113–2.675, p = 0.015), GSE24080 dataset (HR = 1.905, CI = 1.434–2.53, p < 0.001) and another independent external patient dataset GSE9782 (HR = 1.909, CI = 1.269–2.87, P = 0.002; Table 2) (There was no available clinical features in GSE57317 dataset). We also found that higher levels of serum beta 2-microglobulin (Sβ2M), serum albumin (ALB) and lactate dehydrogenase (LDH) were significant in the multivariate analysis. However, the estimation of hazard ratios of lncRNA-focus risk score for OS is 1.905 (p < 0.001) is higher than that of Sβ2M (HR = 1.574, p = 0.011), ALB (HR = 1.519, p = 0.036) and LDH (HR = 1.645, p = 0.003) (Table 2), suggesting that lncRNA-focus risk score model may be more powerful prognostic factor than established laboratory prognostic parameters. Next, data stratification analysis was then performed according to these three significant clinical features. All patients of GSE24080 were stratified into patient group with higher Sβ2M level (≥3.5 mg/L) or patient group with lower Sβ2M level (<3.5 mg/L). All 239 patients with higher Sβ2M level were divided into the high-risk group (n = 119) with shorter OS or the low-risk group (n = 120) with longer OS (mean 44.9 vs. 93.4 months, log-rank test p < 0.001) (Fig. 5a). For the patient with lower Sβ2M level, patients with low-risk scores (n = 180) also had longer OS (mean 97.1 months) than those with high-risk scores (n = 140) (mean 75.2 months), although the p value of 0.068 was slightly above the 0.05 significance level (Fig. 5b). Another clinical feature, ALB, stratified the entire GSE24080 patients into two subgroups with higher (≥35 g/L) or lower (<35 g/L) levels of ALB. The lncRNA-focus risk score model could effectively classify patients into high-risk group and low-risk group with significantly different OS for both two subgroups with higher or lower levels of ALB (mean 66.4 vs. 100.8 months, log-rank test p < 0.001 for 482 patients with higher level of ALB, and mean 44 vs. 74.1 months, log-rank test p = 0.001 for 77 patients with lower level of ALB) (Fig. 5c and d). Significant differences for OS between high-risk group and low-risk group also were observed for stratified subgroup by LDH level (mean 51.5 vs. 67.9 months, log-rank test p = 0.037 for 168 patients with LDH > 190 U/L, and mean 69.1 vs. 104.4 months, log-rank test p < 0.001 for 391 patients with LDH ≤ 190 U/L) (Fig. 5e and f). Taken together, the results of multivariate Cox regression analyses and stratification analysis suggested that the predictive power of lncRNA-focus risk score is independent of other clinical features for OS of patients with MM.
Performance comparison by time-dependent ROC curve analysis
We performed the time-dependent ROC curve analysis to compare sensitivity and specificity for survival prediction between lncRNA-focus risk score model and an established UAMS 17-gene prognostic model  in the GSE24080 dataset, GSE57317 dataset and GSE9782. The AUC value was obtained from ROC analysis and compared between these two predictive models. In the GSE24080 dataset and GSE9782 dataset, the lncRNA-focus risk score model achieved AUC value of 0.682 and 0.595, which is higher than those (AUC = 0.666 and 0.572) derived from UAMS 17-gene prognostic model (Fig. 6), indicating that the predictive ability of lncRNA-focus risk score model was better than UAMS 17-gene prognostic model in GSE24080 and GSE9782 datasets. However, in the GSE57317 dataset, established UAMS 17-gene prognostic model had a higher AUC value than our lncRNA-focus risk score model (0.937 vs. 0.656, Fig. 6).
Functional prediction of prognostic lncRNA biomarkers
To explore the functional implication of four prognostic lncRNA biomarkers in MM tumorigenesis and development, we performed bioinformatics analysis to predict lncRNA functions. We first calculated the Pearson correlation coefficient between lncRNA and mRNA by examining the paired lncRNA and mRNA expression profiles of 559 patients with MM in the GSE24080 dataset. The top 1 % mRNA was selected as co-expressed mRNAs with prognostic lncRNA biomarkers. A total of 789 mRNAs were positively or negatively correlated with at least one of the four prognostic lncRNAs (see Additional file 2). Functional enrichment analysis showed that these co-expressed mRNAs with prognostic lncRNAs were significantly enriched in 104 GO terms and 9 KEGG pathways (p < 0.05 and Fold Enrichment > 2) (see Additional file 3), which are mainly involved in six functional clusters including cell cycle, chromatin modification, DNA replication, microtubule-based process, DNA repair and RNA processing (Table 3). We further examined whether there were any important genes of interest identified from integrative analysis of lncRNA-mRNA. We found that 135 of 789 co-expressed mRNAs (corresponding to 62 genes) with four prognostic lncRNAs are known cancer genes recorded in NCG database (http://ncg.kcl.ac.uk/index.php, version 4.0)  (see Additional file 4), which is a manually curated cancer gene repository. Especially, gene NRAS has been verified experimentally to be associated with MM .
During the past years, great progress in our understanding of the initiation and progression of multiple myeloma has been witnessed [39, 40]. However, the clinical outcome of patients with MM still remains highly heterogeneous. Traditional laboratory parameters Sβ2M and serum albumin, referred to the International Staging System (ISS), have been used as an objective staging system . Subsequent cytogenetic studies found that cytogenetic abnormalities, such as 13q14 deletion and t(4;14) translocation, also can provide valuable prognostic information [42, 43]. However, both ISS and cytogenetic abnormalities demonstrated limited ability for therapeutic risk stratification. With the development of high-throughput techniques, expression profiles-based molecular signatures have been reported in various types of cancers and have become more powerful prognostic tool to predict patient outcomes [44, 45]. Several multigene-expression signatures, including UAMS 17-gene , IFM 15-gene  and EMC 92-gene  models, have been developed to predict survival in patients with MM. More recently, dysregulation of lncRNA expression were observed in the newly diagnosed patients with MM, indicating their potentials as biomarkers for diagnosis and prognosis in MM . However, the prognostic significance of expression profile-based lncRNA signature for outcome prediction in patients with MM has not yet been investigated.
In this study, we have investigated the lncRNA expression profiles of a large cohort of patients with MM by repurposing the publically available microarray data. Through integrative analysis of lncRNA expression data with clinical features, 59 lncRNAs were found to be significantly associated with patients’ OS in MM. After considering interrelation among 59 lncRNAs and clinical features using multivariate analysis, we identified four prognostic lncRNAs that were able to independently predict patients’ OS. Two (RP4-803 J11.2 and RP1-43E13.2) of four lncRNAs are located chromosome 1 that has been proven to be key players in MM progression , and their expression correlated with shortened survival. Then an lncRNA-focus risk model was developed by incorporating expression patterns of four prognostic lncRNAs and their relative contributions in the multivariate analysis in the training dataset. By applying this four-lncRNA-based risk model to the patients of training dataset, a better risk stratification for patients’ outcome was observed between survival curves of patients with high-risk or low-risk scores. Patients in the high-risk group had significant shorter OS than those in the low-risk group. Further validation of the four-lncRNA-based risk model constructed in the training dataset showed similar prognostic power in the testing dataset and another two independent external patient datasets. These analyses suggested that the prognostic value of the four-lncRNA-based risk model is robust and reliable for survival prediction in MM.
We next performed multivariate analysis to test whether the prognostic power of the four-lncRNA-based risk model for survival prediction is independent of known prognostic variables and other clinical features. The estimations of HR for OS were 2.066, 1.726, 1.905 and 1.909 in the training, testing, entire GSE24080 and GSE9782 datasets, respectively. Also, some known prognostic variables, including Sβ2M, serum albumin and LDH revealed significant correlation with patients’ OS. So we carried out stratification analysis for Sβ2M, ALB and LDH to further evaluate the independence of the four-lncRNA-based risk model for survival prediction. The results of stratification analysis suggested that the four-lncRNA-based risk model was able to effectively classify patients into high-risk group and low-risk group with significantly different OS for both two subgroups stratified by three clinical prognostic variables. These results of multivariate analysis, taken together with stratification analysis, demonstrated that the four-lncRNA-based risk model was an independent prognostic factor for survival prediction in MM.
Although substantial computational evidence for the prognostic significance of the lncRNA signature in MM has been revealed, the underlying mechanisms of these four prognostic lncRNAs in the development of MM were still unclear. So we performed an integrative analysis of lncRNA-mRNA by utilizing the matched lncRNA and mRNA expression profiles to infer functional implication of these four prognostic lncRNAs. The functional enrichment analysis of mRNA co-expressed with lncRNAs revealed that the biological functions annotated to the four prognostic lncRNAs mainly involve cell cycle, chromatin modification, DNA replication, microtubule-based process, DNA repair and RNA processing. These functions are all of essential significance contributing to the initiation and progression of MM . Although bioinformatics analysis indicated that these four prognostic lncRNAs may play significant role in the initiation and progression of MM through associations with known genetic and epigenetic events linked to MM, further experimental validation of these four prognostic lncRNAs is necessary for understanding their functional roles in MM.
In summary, we identified four prognostic lncRNA biomarkers that are significantly associated with OS of patients with MM and developed an lncRNA-focus risk model for survival prediction by integrating lncRNA expression profiles with clinical features of a large cohort of patients with MM. The four-lncRNA signature could robustly predict OS of patients with MM. The prognostic power of the four-lncRNA signature was independent of known laboratory prognostic factors and other clinical features, and exhibited superior performance compared to known traditional clinical parameters and multigene signature to some extent. These results demonstrated potential application of lncRNAs as novel independent biomarkers for diagnosis and prognosis in MM. Moreover, identification of lncRNA biomarkers perhaps brings novel insights into the understanding of underlying molecular basis of MM.
Area under the ROC curve
Gene Expression Omnibus
International Staging System
Kyoto Encyclopedia of Genes and Genomes
Long non-coding RNAs
Reverse transcription polymerase chain reactions
Receiver operating characteristic
Serum beta 2-microglobulin
Palumbo A, Anderson K. Multiple myeloma. N Enql J Med. 2011;364(11):1046–60. doi:10.1038/nsmb.3001.
Decaux O, Lode L, Magrangeas F, Charbonnel C, Gouraud W, Jezequel P, et al. Prediction of survival in multiple myeloma based on gene expression profiles reveals cell cycle and chromosomal instability signatures in high-risk patients and hyperdiploid signatures in low-risk patients: a study of the Intergroupe Francophone du Myelome. J Clin Oncol. 2008;26(29):4798–805. doi:10.1200/JCO.2007.13.8545.
Consortium EP, Birney E, Stamatoyannopoulos JA, Dutta A, Guigo R, Gingeras TR, et al. Identification and analysis of functional elements in 1% of the human genome by the ENCODE pilot project. Nature. 2007;447(7146):799–816. doi:10.1038/nature05874.
International Human Genome Sequencing C. Finishing the euchromatic sequence of the human genome. Nature. 2004;431(7011):931–45. doi:10.1038/nature03001.
Mercer TR, Dinger ME, Mattick JS. Long non-coding RNAs: insights into functions. Nat Rev Genet. 2009;10(3):155–9. doi:10.1038/nrg2521.
Ponting CP, Oliver PL, Reik W. Evolution and functions of long noncoding RNAs. Cell. 2009;136(4):629–41. doi:10.1016/j.cell.2009.02.006.
Kornienko AE, Guenzl PM, Barlow DP, Pauler FM. Gene regulation by the act of long non-coding RNA transcription. BMC Biol. 2013;11:59. doi:10.1186/1741-7007-11-59.
Fatica A, Bozzoni I. Long non-coding RNAs: new players in cell differentiation and development. Nat Rev Genet. 2014;15(1):7–21. doi:10.1038/nrg3606.
Geisler S, Coller J. RNA in unexpected places: long non-coding RNA functions in diverse cellular contexts. Nat Rev Mol Cell Biol. 2013;14(11):699–712. doi:10.1038/nrm3679.
Tsai MC, Spitale RC, Chang HY. Long intergenic noncoding RNAs: new links in cancer progression. Cancer Res. 2011;71(1):3–7. doi:10.1158/0008-5472.CAN-10-2483.
Uchida S, Dimmeler S. Long noncoding RNAs in cardiovascular diseases. Circ Res. 2015;116(4):737–50. doi:10.1161/CIRCRESAHA.116.302521.
Kumar V, Westra HJ, Karjalainen J, Zhernakova DV, Esko T, Hrdlickova B, et al. Human disease-associated genetic variation impacts large intergenic non-coding RNA expression. PLoS Genet. 2013;9(1):e1003201. doi:10.1371/journal.pgen.1003201.
Ji P, Diederichs S, Wang W, Boing S, Metzger R, Schneider PM, et al. MALAT-1, a novel noncoding RNA, and thymosin beta4 predict metastasis and survival in early-stage non-small cell lung cancer. Oncogene. 2003;22(39):8031–41. doi:10.1038/sj.onc.1206928.
Gupta RA, Shah N, Wang KC, Kim J, Horlings HM, Wong DJ, et al. Long non-coding RNA HOTAIR reprograms chromatin state to promote cancer metastasis. Nature. 2010;464(7291):1071–6. doi:10.1038/nature08975.
Hauptman N, Glavac D. Long non-coding RNA in cancer. Int J Mol Sci. 2013;14(3):4655–69. doi:10.3390/ijms14034655.
Zhou Y, Zhang X, Klibanski A. MEG3 noncoding RNA: a tumor suppressor. J Mol Endocrinol. 2012;48(3):R45–53. doi:10.1530/JME-12-0008.
Mourtada-Maarabouni M, Pickard MR, Hedge VL, Farzaneh F, Williams GT. GAS5, a non-protein-coding RNA, controls apoptosis and is downregulated in breast cancer. Oncogene. 2009;28(2):195–208. doi:10.1038/onc.2008.373.
Huarte M, Guttman M, Feldser D, Garber M, Koziol MJ, Kenzelmann-Broz D, et al. A large intergenic noncoding RNA induced by p53 mediates global gene repression in the p53 response. Cell. 2010;142(3):409–19. doi:10.1016/j.cell.2010.06.040.
Cheetham SW, Gruhl F, Mattick JS, Dinger ME. Long noncoding RNAs and the genetics of cancer. Br J Cancer. 2013;108(12):2419–25. doi:10.1038/bjc.2013.233.
Cabili MN, Trapnell C, Goff L, Koziol M, Tazon-Vega B, Regev A, et al. Integrative annotation of human large intergenic noncoding RNAs reveals global properties and specific subclasses. Genes Dev. 2011;25(18):1915–27. doi:10.1101/gad.17446611.
Zhang XQ, Sun S, Lam KF, Kiang KM, Pu JK, Ho AS, et al. A long non-coding RNA signature in glioblastoma multiforme predicts survival. Neurobiol Dis. 2013;58:123–31. doi:10.1016/j.nbd.2013.05.011.
Meng J, Li P, Zhang Q, Yang Z, Fu S. A four-long non-coding RNA signature in predicting breast cancer survival. J Exp Clin Cancer Res. 2014;33:84. doi:10.1186/s13046-014-0084-7.
Li J, Chen Z, Tian L, Zhou C, He MY, Gao Y, et al. LncRNA profile study reveals a three-lncRNA signature associated with the survival of patients with oesophageal squamous cell carcinoma. Gut. 2014;63(11):1700–10. doi:10.1136/gutjnl-2013-305806.
Hu Y, Chen HY, Yu CY, Xu J, Wang JL, Qian J, et al. A long non-coding RNA signature to improve prognosis prediction of colorectal cancer. Oncotarget. 2014;5(8):2230–42.
Zhou M, Guo M, He D, Wang X, Cui Y, Yang H, et al. A potential signature of eight long non-coding RNAs predicts survival in patients with non-small cell lung cancer. J Transl Med. 2015;13:231. doi:10.1186/s12967-015-0556-3.
Zhuang W, Ge X, Yang S, Huang M, Zhuang W, Chen P, et al. Upregulation of lncRNA MEG3 Promotes Osteogenic Differentiation of Mesenchymal Stem Cells From Multiple Myeloma Patients By Targeting BMP4 Transcription. Stem Cells. 2015;33(6):1985–97. doi:10.1002/stem.1989.
Cho SF, Chang YC, Chang CS, Lin SF, Liu YC, Hsiao HH, et al. MALAT1 long non-coding RNA is overexpressed in multiple myeloma and may serve as a marker to predict disease progression. BMC Cancer. 2014;14:809. doi:10.1186/1471-2407-14-809.
Shi L, Campbell G, Jones WD, Campagne F, Wen Z, Walker SJ, et al. The MicroArray Quality Control (MAQC)-II study of common practices for the development and validation of microarray-based predictive models. Nat Biotechnol. 2010;28(8):827–38. doi:10.1038/nbt.1665.
Heuck CJ, Qu P, van Rhee F, Waheed S, Usmani SZ, Epstein J, et al. Five gene probes carry most of the discriminatory power of the 70-gene risk model in multiple myeloma. Leukemia. 2014;28(12):2410–3. doi:10.1038/leu.2014.232.
Mulligan G, Mitsiades C, Bryant B, Zhan F, Chng WJ, Roels S, et al. Gene expression profiling and correlation with outcome in clinical trials of the proteasome inhibitor bortezomib. Blood. 2007;109(8):3177–88. doi:10.1182/blood-2006-09-044974.
Du Z, Fei T, Verhaak RG, Su Z, Zhang Y, Brown M, et al. Integrative genomic analyses reveal clinically relevant long noncoding RNAs in human cancer. Nat Struct Mol Biol. 2013;20(7):908–13. doi:10.1038/nsmb.2591.
Jiang H, Wong WH. SeqMap: mapping massive amount of oligonucleotides to the genome. Bioinformatics. 2008;24(20):2395–6. doi:10.1093/bioinformatics/btn429.
Harrow J, Frankish A, Gonzalez JM, Tapanari E, Diekhans M, Kokocinski F, et al. GENCODE: the reference human genome annotation for The ENCODE Project. Genome Res. 2012;22(9):1760–74. doi:10.1101/gr.135350.111.
da Huang W, Sherman BT, Lempicki RA. Bioinformatics enrichment tools: paths toward the comprehensive functional analysis of large gene lists. Nucleic Acids Res. 2009;37(1):1–13. doi:10.1093/nar/gkn923.
da Huang W, Sherman BT, Lempicki RA. Systematic and integrative analysis of large gene lists using DAVID bioinformatics resources. Nat Protoc. 2009;4(1):44–57. doi:10.1038/nprot.2008.211.
Shaughnessy Jr JD, Zhan F, Burington BE, Huang Y, Colla S, Hanamura I, et al. A validated gene expression model of high-risk multiple myeloma is defined by deregulated expression of genes mapping to chromosome 1. Blood. 2007;109(6):2276–84. doi:10.1182/blood-2006-07-038430.
An O, Pendino V, D’Antonio M, Ratti E, Gentilini M, Ciccarelli FD. NCG 4.0: the network of cancer genes in the era of massive mutational screenings of cancer genomes. Database (Oxford). 2014;2014:bau015. doi:10.1093/database/bau015.
Bezieau S, Devilder MC, Avet-Loiseau H, Mellerin MP, Puthier D, Pennarun E, et al. High incidence of N and K-Ras activating mutations in multiple myeloma and primary plasma cell leukemia at diagnosis. Hum Mutat. 2001;18(3):212–24. doi:10.1002/humu.1177.
Prideaux SM, Conway O’Brien E, Chevassut TJ. The genetic architecture of multiple myeloma. Adv Hematol. 2014;2014:864058. doi:10.1155/2014/864058.
Morgan GJ, Walker BA, Davies FE. The genetic architecture of multiple myeloma. Nat Rev Cancer. 2012;12(5):335–48. doi:10.1038/nrc3257.
Greipp PR, San Miguel J, Durie BG, Crowley JJ, Barlogie B, Blade J, et al. International staging system for multiple myeloma. J Clin Oncol. 2005;23(15):3412–20. doi:10.1200/JCO.2005.04.242.
Shaughnessy J, Jacobson J, Sawyer J, McCoy J, Fassas A, Zhan F, et al. Continuous absence of metaphase-defined cytogenetic abnormalities, especially of chromosome 13 and hypodiploidy, ensures long-term survival in multiple myeloma treated with Total Therapy I: interpretation in the context of global gene expression. Blood. 2003;101(10):3849–56. doi:10.1182/blood-2002-09-2873.
Fonseca R, Blood E, Rue M, Harrington D, Oken MM, Kyle RA, et al. Clinical and biologic implications of recurrent genomic aberrations in myeloma. Blood. 2003;101(11):4569–75. doi:10.1182/blood-2002-10-3017.
Baxter C. Molecular signatures. Nat Rev Genet. 2003;4:84.
Sotiriou C, Piccart MJ. Taking gene-expression profiling to the clinic: when will molecular signatures become relevant to patient care? Nat Rev Cancer. 2007;7:545–53.
Kuiper R, Broyl A, de Knegt Y, van Vliet MH, van Beers EH, van der Holt B, et al. A gene expression signature for high-risk multiple myeloma. Leukemia. 2012;26(11):2406–13. doi:10.1038/leu.2012.127.
This work was supported by the National Natural Science Foundation of China (Grant No. 61403111), China Postdoctoral Science Foundation (Grant No. 2014 M551268) and Postdoctoral Foundation of Heilongjiang Province (Grant No. LBH-Z14212).
The authors declare that they have no competing interests.
JS and MZ conceived and designed the experiments. MZ, HZ, ZW, LC, LY, HS and HY analyzed data. MZ and JS wrote this manuscript. All authors read and approved the final manuscript.
Meng Zhou, Hengqiang Zhao and Zhenzhen Wang contributed equally to this work.
Clinical and pathological characteristics of patients with MM in our study. (DOC 36 kb)
List of mRNAs co-expressed with at least one of the four prognostic lncRNAs. (XLS 131 kb)
List of enriched GO terms and KEGG pathways of mRNAs co-expressed with prognostic lncRNAs. (XLS 72 kb)
List of known cancer genes recorded in NCG database which were co-expressed with four prognostic lncRNAs. (XLSX 13 kb)