Molecular markers associated with outcome and metastasis in human pancreatic cancer

Background Pancreatic ductal adenocarcinoma (PDAC) is a heterogeneous cancer in which differences in survival rates might be related to a variety in gene expression profiles. Although the molecular biology of PDAC begins to be revealed, genes or pathways that specifically drive tumour progression or metastasis are not well understood. Methods We performed microarray analyses on whole-tumour samples of 2 human PDAC subpopulations with similar clinicopathological features, but extremely distinct survival rates after potentially curative surgery, i.e. good outcome (OS and DFS > 50 months, n = 7) versus bad outcome (OS < 19 months and DFS < 7 months, n = 10). Additionally, liver- and peritoneal metastases were analysed and compared to primary cancer tissue (n = 11). Results The integrin and ephrin receptor families were upregulated in all PDAC samples, irrespective of outcome, supporting an important role of the interaction between pancreatic cancer cells and the surrounding desmoplastic reaction in tumorigenesis and cancer progression. Moreover, some components such as ITGB1 and EPHA2 were upregulated in PDAC samples with a poor outcome, Additionally, overexpression of the non-canonical Wnt/β-catenin pathway and EMT genes in PDAC samples with bad versus good outcome suggests their contribution to the invasiveness of pancreatic cancer, with β-catenin being also highly upregulated in metastatic tissue. Conclusions Components of the integrin and ephrin pathways and EMT related genes, might serve as molecular markers in pancreatic cancer as their expression seems to be related with prognosis.


Background
Pancreatic ductal adenocarcinoma (PDAC) remains a major cause of cancer related death, despite advances in surgical and medical care [1]. The majority of patients present with locally advanced or metastatic disease and die within 6-12 months. Even in the selected group of prognostic favourable localized and resectable PDAC, the 5-year overall survival (OS) is only 10-25% as the majority of patients develop disease relapse within two years after potentially curative treatment [2]. Additionally, the effect of systemic chemotherapy, either in adjuvant or in palliative setting, is low [3].
Although some parameters are described to be prognostic factors after curative surgery, such as lymph node and resection margin status, none has been consistently related to overall survival [4,5]. Moreover, even in patients with similar clinicopathological parameters, a wide range of survival rates is observed postoperatively [2]. This heterogeneous biology of pancreatic cancer and possibly related diverse response to treatment might be explained by differences in gene expression profiles. At present, molecular characteristics of PDAC carcinogenesis become gradually unravelled, but genes or pathways that specifically drive tumour progression or metastasis are not well understood [6,7]. Some studies have already linked gene expression profiles with lymph node status or advanced PDAC stage, but results are inconsistent [8][9][10]. Recently, a gene signature that subdivides PDAC in 3 subtypes was developed based on gene expression from microdissected PDAC material and cell lines. This signature would have a prognostic value and would be predictive for drug responses [11]. Microdissected material and cell lines however do not comprise the complexity of pancreatic cancer. PDAC is characterized by an abundant desmoplastic reaction that has long been ignored, but is now known to play an important role in PDAC tumorigenesis and progression [12,13].
Therefore, the aim of the present study was to define molecular characteristics related to pancreatic cancer progression, based on whole genome expression profiling of 2 human PDAC subgroups with similar clinicopathological features, but with extremely distinct survival rates after curative surgery. Additionally, we tried to gain more insight in the metastatic process of PDAC by comparing gene expression profiles of liver-and peritoneal metastases with that of primary tumour samples.

Primary PDAC and metastatic samples
Patients who underwent surgical treatment for PDAC between 1998 and 2008 were studied. Immediately after surgical removal of the resection specimen, a small part of the tissue was snap-frozen in liquid nitrogen and stored at −80°C; the other part was fixed in 6% formol and embedded in paraffin for histological examination. From patients with metastatic disease undergoing palliative surgery, core biopsies of the primary tumour and of liver (LM)/peritoneal (PM) metastases were taken and processed in a similar way. Haematoxylin-Eosin (H&E) staining was performed on each sample for histopathological confirmation according to the World Health Organization criteria. The study was approved by the KU Leuven ethical committee prior to patient recruitment, and received the study number ML3452.
Clinical and histopathological data from all patients were registered in a prospective database. Disease recurrence was defined as local or distant recurrence, diagnosed on follow-up imaging, performed routinely or because of elevated serum tumour markers.

Classification of PDAC with good or bad outcome
One hundred fifty-five patients suffering from PDAC were operated with curative intent. Postoperative followup was complete and closed in December 2011. Survival curves were determined using the Kaplan-Meier lifetable technique. The median overall (OS) and diseasefree survival (DFS) was respectively 22.3 months (95% confidence interval (CI) 18.7-29.0 m) and 12.0 months (CI: 9.0-13.3 m). None of these patients received preoperative or neo-adjuvant treatment. Postoperative chemotherapy (n = 69) or chemoradiation (n = 29) did not influence OS or DFS in this patient group. Based on cumulative OS and DFS probability plots ( Figure 1A), we defined two patient subgroups: one group with an exceptional good outcome (defined as 'Good': OS and DFS > 50 months, n = 17), and one group with an exceptional poor outcome (defined as 'Bad': OS < 19.5 months and DFS < 7 months, n = 47) ( Figure 1B).

Whole-genome expression analysis
Only representative snap-frozen PDAC material-defined as a minimum of 30% cancer cells on H&E stainingwas used for RNA extraction. In order not to exclude tumour microenvironment for gene expression analysis, samples were used without microdissection. Total RNA was extracted using Trizol (Invitrogen, Grand Island, NY) and the RNeasy mini kit (Qiagen, Venlo, The Netherlands) according to the manufacturer's guidelines. RNA concentration and purity were determined spectrophotometrically using the Nanodrop ND-1000 (Nanodrop Technologies, Wilmington, DE) and RNA integrity was assessed using a Bioanalyser 2100 (Agilent Technologies, Santa Clara, CA). Only samples with a RIN of at least 7.1 were used for further microarray analysis at the VIB Nucleomics Core (www.nucleomics.be).
Per sample, an amount of 100 ng of total RNA spiked with bacterial RNA transcript positive controls (Affymetrix) was amplified and labelled using the GeneChip 3' IVT express kit (Affymetrix). All steps were carried out according to the manufacturer's protocol. A mixture of purified and fragmented biotinylated aRNA and hybridisation controls (Affymetrix) was hybridised on Affymetrix HG U133 Plus 2.0 arrays followed by staining and washing in a GeneChip W fluidics station 450 (Affymetrix) according to the manufacturer's procedures. To assess the raw probe signal intensities, chips were scanned using a GeneChip W scanner 3000 (Affymetrix).
The RMA procedure was used to normalize data within arrays (background correction and log 2 -transformation) and between arrays (quintile normalization) (affy_1.22.0 package of Bioconductor) [14,15]. The MAS 5.0 algorithm (Microarray suite user guide, version 5; Affymetrix 2001) was used to assess detection above background. All probesets had a good signal and were used for further analysis. Four experimental designs were analysed: the effect of PDAC patients with a good outcome ('Good') versus surrounding pancreatic tissue (defined as 'control'), the effect of PDAC patients with a poor outcome ('Bad') versus surrounding pancreas, the effect of 'Bad' versus 'Good' and the effect of all PDAC samples, irrespective of outcome, versus metastatic disease in the liver or peritoneum . The limma package from Bioconductor was used to assess the contrast in each experiment [16]. Statistical significance of this contrast was tested with a moderated t-test (implemented in limma). Differentially expressed genes were defined as genes with an uncorrected p-value of p < 0.001 in combination with >2 fold-change. Classical schemes to adjust for multiple testing can result in low statistical power for microarray studies . The stringent cut-off of p < 0.001 was used as an alternative, pragmatic approach to balance the number of false positives and false negatives [17].
Metastatic samples (LM and PM) were contaminated with respectively normal liver and peritoneal tissue, reflecting in upregulation of liver-and peritoneal specific genes. Therefore only genes that were not differentially expressed between LM and PM samples, considered as metastatic specific genes, were used for analysis between primary tumour and metastatic tissue.
All gene expression data will be available from the Gene Expression Omnibus (GEO, http://www.ncbi.nlm. nih.gov/projects/geo/). Functional pathway analysis on differentially expressed probe sets was done with the Ingenuity Pathway Analysis (IPA) program (Ingenuity Systems, www.ingenuity. com; Redwood City, CA). For each experiment, probe sets with a corrected p-value <0.001 and a >2 fold change were used as input. If multiple probes referred to the same molecule, the average of the log-ratio values was taken for further analysis. Generated networks were ordered by a score meaning significance, estimated as the ratio of the number of input probes that map to the pathway divided by the total number of pathway probes. Significance of biological functions and canonical pathways were tested by the Fisher's exact test p-value after application of Benjamini-Hochberg method of multiple testing correction. Significant pathways were chosen as p < 0.05, except for the significant canonical pathways in the 'Good' versus control experiment where a more stringent p-value (p < 0.01) was chosen to eliminate possible false-positive results due to the large number of differentially expressed probe sets.
For each experiment, additional KEGG (Kyoto Encyclopedia of Genes and Genomes) pathway analysis was performed on up-or downregulated genes (corrected p-value <0.001 and a fold change of respectively >2 and <2) using GENECODIS, a web-based tool for enrichment analysis (http://genecodis.dacya.ucm.es ) using the NCBI Entrez Gene database [18]. Two statistical tests are implemented: the hypergeometric distribution and the χ 2 test of independence. A stimulation-based correction approach is used to adjust for multiple testing.

Sample selection
Based on the definition of the 2 diverse survival outcome groups and the required RIN values above 7.1, finally 7 'Good' and 10 'Bad' patient samples with similar pathological characteristics remained available for gene expression analysis (Table 1, Figure 2). The median age was 61 and 67 years, respectively. All patients had negative resection margins on histopathological examination.
Additionally, 6 surrounding non-tumoural pancreatic control samples, 7 LM and 4 PM fulfilled the quality criteria and were used for microarray analysis.

Gene expression profiling of 'Good' PDAC versus control
Analysis of 'Good' versus control samples revealed 3265 differentially expressed probe sets, of which 2806 could be mapped to genes in the Ingenuity Knowledge Base. IPA analysis generated networks, including 'Cell morphology' , with TGFβ1 (fold 2.6, p < 0.001) central to this network. 'Cancer' , 'Cellular growth and proliferation' , 'DNA repair' , and 'Cellular movement' were differentially expressed functions. Differentially expressed canonical pathways (p < 0.01) are shown in Table 2 (Table 2).

Gene expression profiling of 'Bad' PDAC versus control
Microarray analysis comparing 'Bad' versus control samples defined 1905 differentially expressed genes. IPA analysis on 1692 mapped genes generated networks, such as the network related to 'Drug metabolism' , including TGFβ1 (fold 2.4) and LOXL2 (fold 3.9), (p < 0.001). Similar to the 'Good' versus control comparison, the functions 'Cancer' , 'Cellular growth and proliferation' and 'Cellular movement' were differentially expressed, but with even higher fold changes.   (Table 2). Genes involved in the p53 signalling pathway, the Wnt/ β-catenin and the Notch signalling were highly upregulated (Table 2) in 'Bad' PDAC samples (KEGG analysis, GENECODIS).

Molecular characteristics of 'Bad' versus 'Good' PDAC
To study gene expression profiling related to poor outcome, we first studied differentially expressed genes between 'Bad' and 'Good' PDAC samples ( Figure 3A). A total of 131 genes were differentially expressed, i.e. 69 upregulated and 62 downregulated genes in 'Bad' PDAC ( Table 3). The networks 'Cell morphology' (including SNAI2 (fold 2.9) and TGFβR1 (fold 3.3); p < 0.001), 'Cell signalling' and 'Cellular movement' were generated from differentially expressed genes (IPA). No cancer-related canonical pathways or KEGG pathways were differentially expressed between both PDAC groups. Next, we analysed differentially expressed genes between the 'Good' versus control and the 'Bad' versus control experimental designs to exclude pancreas-related genes ( Figure 3B). Only genes from the MAPK and Hedgehog signalling pathways were strongly expressed in the 'Good' samples (GENECODIS). Genes involved in Pancreatic cancer signalling pathway, p53 signalling, Wnt/β-catenin and Notch signalling were expressed in

Identification of metastasis-associated genes
After excluding liver-and peritoneum specific genes, 358 genes were differentially expressed between the primary tumour and the metastatic samples. Of these genes, 278 were upregulated in primary PDAC and 80 were upregulated in metastatic tissue. Multiple networks and functions were generated from differentially expressed genes (IPA), including 'Cancer' , 'Cell signalling' , and 'Cell cycle'. The 'Human embryonic stem cell pluripotency' and Wnt/β-catenin canonical pathways were significant. KEGG pathway analysis (GENECODIS) revealed expression of genes from the TGFβ and Wnt/βcatenin pathways in primary PDAC and expression of the TGFβ pathway-related genes in metastatic tissue (Table 4). To discover pathways potentially contributing to the metastatic process, we looked for genes upregulated in the PDAC versus control experiments ('Good' versus control and 'Bad' versus control) and in the Metastases versus PDAC comparison. In total 29 genes met these criteria, including β-catenin, ANP32A, HPGD, SET and SP1 (fold change between metastases versus PDAC respectively 3.0, 3.4, 2.5, 3.6 and 2.0; all p < 0.001) (Additional file 1: Table S1).

Discussion
Unravelling the molecular characteristics of pancreatic cancer is crucial for a better understanding of the tumour biology in order to develop novel therapeutic strategies. Correlation of gene expression profiles with patient survival might detect genes and pathways that drive PDAC invasiveness as clinicopathological parameters alone seem not sufficient to explain the variability in survival after curative resection. Therefore, in the present study, we performed whole genome expression analysis of 2 subgroups of patients with extremely diverging overall and disease-free survival rates, despite having similar clinicopathological features.
In contrast to previous studies that used microdissection or fine needle aspiration techniques to enrich the samples for neoplastic cells [11,19,20], we used wholetumour samples with the aim not to exclude the tumour micro-environment even though discrimination between tumoural and environmental RNA is technically impossible in whole-tumour samples. On the other hand, PDAC is characterized by an abundant desmoplastic stromal reaction, which plays an important role in tumorigenesis, tumour progression, and therapy resistance [12,13]. Indeed, increasingly new therapeutic regimens are studying agents that aim to target the desmoplastic stromal reaction [21][22][23]. Therefore, in order to keep the molecular information of the microenvironment but to reduce background RNA contamination, we used high-quality snap-frozen samples with a pathologically proven minimum of 30% cancer cells. This approach led to a small but still representative sample size for microarray analysis.
In our study, the Integrin and Ephrin pathways were upregulated in all PDAC samples, irrespective of outcome. These pathways were not highlighted in studies on microdissected PDAC [11]. Both pathways appear to play an important role in the interaction between cancer cells and the surrounding stroma. The Integrin family of cell adhesion receptors has been implicated in tumour progression as they contribute to the interplay between tumour and micro-environment by binding directly to components of the extracellular matrix (ECM) [24]. Due to the abundance of ECM, the integrin-mediated cell adhesion signalling may play an important role in PDAC tumour growth, migration and even in therapy resistance [25,26]. Various integrins, such as ITGA6, ITGB4 and ITGB5, are upregulated in 'Good' and/or 'Bad' PDAC samples. In cell culture studies, ITGB1 has been shown to play a critical role in pancreatic cancer progression and in metastasis in particular [27,28]. Upregulation of ITGB1 in 'Bad' PDAC, might highlight its potential therapeutical impact.
Ephrin receptors are similarly promising therapeutical targets as they mediate cell-cell interactions both in tumour cells and in the tumour micro-environment, and thereby may affect tumour growth, invasiveness, angiogenesis, and metastasis [29]. EPHA2, related to poor clinical outcome in PDAC, has already been successfully investigated as target in PDAC cell lines [30,31]. Indeed, in our study, EPHA2 was highly upregulated as PDAC with poor outcome, supporting its potential clinical relevance. Embryonic signalling pathways are known to play a role in both the tumoural and the stromal compartment and in different stages of PDAC [32]. Hedgehog signalling (Shh) e.g. has been implicated in the initiation of PDAC, and was overexpressed in PDAC samples with good overall survival in our series [33,34]. The Wnt/βcatenin pathway seems to be involved in a later stage of PDAC tumorigenesis [9,34,35]. In our study, elements from the canonical Wnt/β-catenin pathway were upregulated in all PDAC samples. However, in patients with poor survival, genes from both the canonical and noncanonical pathway, including Wnt5A and DVL1, were upregulated [35,36]. The expression of Wnt5A has already been shown to be induced in PSC [35]. Upregulation of DKK1, a Wnt/β-catenin pathway antagonist, may promote tumour invasiveness though the exact mechanism is yet unknown [37].
Overexpression of Notch signalling in PDAC correlates with tumour proliferation and migration [38]. Notch has been shown to regulate pancreatic cancer stem cells and would have a role in the acquisition of epithelial-mesenchymal transition (EMT) by inducing SNAI2 expression due to JAG1 overexpression [39,40]. Although JAG1 was upregulated in all our PDAC samples irrespective of survival, SNAI2 was upregulated in the 'Bad' versus 'Good' PDAC samples. The upregulation of many EMT-related genes, such as TGFβR1, FGFBP1, TGFβ1, LOXL2, TWIST1 and Wnt5A, and the downregulation of FOXA1 in the 'Bad' PDAC samples might support the role of EMT in the aggressiveness of PDAC [41]. Additionally, upregulation of MALAT1 in the 'Bad' samples may suggest this gene to be further explored as it is upregulated in many other tumours too and associated with cancer metastasis and recurrence [42,43].
Finally we identified a PDAC metastasis-related genetic profile containing 358 differentially expressed genes between the primary tumour and metastatic tissue. Molecular knowledge on the metastatic process in PDAC is currently lacking and the published data are inconsistent [9,[44][45][46]. Moreover, the majority of studies are based on cell lines, xenograft models and rapid autopsy material. In the current study, we used fresh human samples of both liver and peritoneal metastases. In order to focus on metastasis-specific genes, we excluded tissue-associated genes, i.e. genes that were differentially expressed between liver and peritoneal tissue samples. However, in this way, we might also have excluded metastasis-specific genes. In our study, 358 genes were differentially expressed, including genes related to the Wnt/β-catenin pathway and the TGFβ pathway. Comparing our differentially expressed genes with metastatic genes described in other studies, only 7 genes overlapped (COMP, PCDH7, PTP4A1, CXCR4, NR4A3, ANGPT1 and TIMP3) [9,[44][45][46][47]. A total of 29 genes were upregulated in metastases as compared to primary PDAC and control samples. One of these genes, β-catenin, may deserve further study because of several reasons. β-catenin has a role in tumorigenesis as an essential transcriptional co-activator in the canonical Wnt pathway, but it also plays a critical role in cadherinbased cell-cell adhesion [48]. β-catenin seems also to be a major determinant in EMT and in the reverse mesenchymal to epithelial transition (MET), necessary for cells to home in distant organs. Furthermore, β-catenin mediates transcription of MMP that degrade the ECM [49]. Our results support further investigation of its role in PDAC progression. Another gene, SP1 is linked with STAT3 and hence would regulate metastasis [50].
Limitations of the current study are the rather small sample size and the lack of clinical validation of our findings. These 2 concerns however, seem hard to overcome since PDAC is a rare disease of which good quality tissue is difficult to obtain. Additionally, PDAC has an abundant desmoplastic reaction that is overwhelmingly represented as compared to cancer cells, making many human tissue samples not representative. Microdissection of cancer cells might be an alternative to study PDAC, although this technique has its own inherent limitations, such as its technical difficulty and consequently its time-consuming activity, and the problem of RNA degradation [51]. Moreover, we believe that the only way to study human PDAC as a whole entity is to include its microenvironment in the analyses, especially since the latter has been shown to play a crucial role in tumour invasiveness and progression. The data from our current study might therefore provide valuable results with respect to gene expression and pathways involved in PDAC. Nonetheless, before these genes or pathways might be used as potential therapeutic targets in clinical setting, they need to be validated first either in a large number of human PDAC samples or in preclinical animal experiments.

Conclusion
The Integrin and Ephrin pathways seem to play an important role in pancreatic carcinogenesis and progression, including ITGB1 and EPHA2 as most important players. The Wnt/β-catenin pathway and EMT might additionally contribute to PDAC progression and metastasis, with β-catenin as a central mediator. Further validation of the role of these genes and pathways is needed.