Genome wide analysis and clinical correlation of chromosomal and transcriptional mutations in cancers of the biliary tract

Background: The pathogenesis of biliary cancers is ill-defined. This study investigates changes in gene expression and copy number in biliary cancers and correlates these changes with anatomical site of origin, histopathology, and outcome.
Methods: We performed gene expression and CGH analysis on 34 biliary tract cancer specimens. Results were confirmed by RT-PCR. Clinical-pathologic correlation was made using functional over-representation analysis of the top 100 genes with altered expression associated with each variable.
Results: There were 545 genes with altered expression in extrahepatic cholangiocarcinoma, 2,354 in intrahepatic cholangiocarcinoma, and 1,281 in gallbladder cancer. Unsupervised hierarchical clustering analysis indicated there was no difference in the global gene expression patterns among the biliary cancer subgroups. CGH analysis revealed that short segments of chromosomes 1p, 3p, 6q, 8p, 9p, and 14q were commonly deleted across all cancer subtypes. Commonly amplified regions included segments of 1q, 3q, 5p, 7p, 7q, 8q, and 20q. Over-representation analysis revealed an association between altered expression of functional gene groupings and pathologic features.
Conclusion: This study defined regions of the genome associated with changes in DNA copy number and gene expression in specific subtypes of biliary cancers. The findings have implications for identification of therapeutic targets, screening, and prognostication.


Background
Biliary tract cancers account for approximately 10-20% of hepatobiliary neoplasms. Approximately 9,000 cases of biliary tumors are diagnosed in the USA each year. Gallbladder carcinoma (GBC) is the most common, accounting for 60% of cases [1]. The remaining 40% are cholangiocarcinomas, which are further sub-classified as intrahepatic (IHC) when they arise from intrahepatic biliary radicles or extrahepatic (EHC) when they arise from the confluence of the main left and right hepatic ducts or more distally in the bile ducts. The classification of biliary tract cancers into these anatomically based subtypes has substantial clinical relevance, as risk factors, presentation, staging, and treatment vary for each [2,3]. Regardless of subtype, most patients with carcinoma of the biliary tract present with advanced disease, with median survival of approximately one to two years from the time of diagnosis [4][5][6].
Little is known regarding the genetic alterations in the biliary epithelium that lead to cancer. Studies have shown that biliary carcinogenesis may be related in part to loss of heterozygosity at loci on chromosomes 1p, 6q, 9p, 16q, and 17p, and to point mutations in the K-ras oncogene and the p53 tumor suppressor gene [7,8]. Enhanced expression of VEGF in cholangiocarcinoma cells and localization of VEGF receptor-1 and receptor-2 in endothelial cells are thought to play a crucial role in tumor progression [9]. Cyclooxygenase-2 and c-erbB-2 are also overexpressed in cholangiocarcinoma [10]. In addition, interleukin-6 is important in the proliferation of malignant biliary epithelial cells [11,12]. Our recent work examining cell cycle-regulatory protein expression in biliary tract cancers revealed differentially expressed cell cycle-regulatory proteins based on tumor location and morphology, and suggested an overlap in the pathogenesis of GBC and EHC [13].
The present study investigates alterations in gene expression and gene copy number in frozen tumor specimens from patients with GBC, IHC, and EHC. Gene expression results were correlated with comparative genomic hybridization (CGH) data by identifying transcriptional changes in the most highly unstable genomic regions. Additionally, the genetic findings were correlated with clinical disease characteristics and pathologic features.

Patients and specimens
Biliary tract cancers from 34 patients (13 IHC, 12 EHC, 9 GBC) were snap-frozen and stored at -80°C. In addition, 9 non-cancerous gallbladders and 9 non-cancerous bile duct controls were obtained from patients who had resections for diseases not involving the gallbladder or bile duct (in these patients the gallbladder or bile duct was removed for surgical access to other hepatobiliary or pancreatic structures). Each sample was re-examined histologically using H&E-stained cryostat sections. Surrounding non-neoplastic tissue was dissected from the frozen block under 10× magnification, and care was taken that at least 90% of the remaining cells were cancerous. All studies were approved by the Memorial Sloan-Kettering IRB.

RNA isolation, probe preparation, and expression microarray hybridization
Total RNA was isolated from tissue using the DNA/RNA all prep kit (Qiagen, Germantown, Maryland, USA). RNA quality was assessed before labeling by analyzing 20-50 ng of each sample using the RNA 6000 NanoAssay and a Bioanalyzer 2100 (Agilent, Santa Clara, California, USA). Samples with a 28S/18S ribosomal peak ratio of 1.8-2.0 and an RNA integrity number (RIN) >7.0 were considered suitable for labeling. RNA from one IHC specimen, two EHC specimens, and three GBC specimens failed to meet this standard and was excluded from the gene expression analysis. For the remaining samples, 2 μg of total RNA was used for cDNA synthesis using an oligo-dT-T7 primer and the SuperScript Double-Stranded cDNA Synthesis Kit (Invitrogen, Carlsbad, California, USA). Synthesis, linear amplification, and labeling of cRNA were accomplished by in-vitro transcription using the MessageAmp aRNA Kit (Ambion, Austin, Texas, USA) and biotinylated nucleotides (Enzo Diagnostics, New York, USA). Ten micrograms of labeled and fragmented cRNA were then hybridized to the Human HG-U133A GeneChip (Affymetrix, Santa Clara, California, USA) at 45°C for 16 hours. Post-hybridization staining and washing were performed according to the manufacturer's instructions. Finally, chips were scanned with a high-numerical aperture and flying objective lens in the GS3000 scanner (Affymetrix). The image was quantified using GeneChip Operating Software (GCOS) 1.4 (Affymetrix).

Array CGH profiling
Genomic DNA was extracted using the DNA/RNA prep kit (Qiagen). DNA integrity was checked on a 1% agarose gel and was intact in all specimens except one case of EHC. Three micrograms of DNA were then digested and labeled by random priming using RadPrime (Invitrogen) and Cy3 or Cy5-dUTP. Labeled DNA was hybridized to 244 K CGH arrays (Agilent) for 40 hours at 60°C. Slides were scanned and images quantified using Feature Extraction 9.1 (Agilent).
[...] 15 sec, 60°C for 1 min). To calculate the efficiency of the PCR reaction and to assess the sensitivity of each assay, we also performed a 7-point standard curve (5, 1.7, 0.56, 0.19, 0.062, 0.021, and 0.0069 ng). Amounts of target were interpolated from the standard curves and normalized to HPRT (Hs99999909_m1).
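The standard-curve calculation can be illustrated with a minimal sketch. It assumes the conventional approach of regressing threshold cycle (Ct) on log10 of input amount and deriving efficiency as 10^(-1/slope) - 1; the text does not state the exact formula used. The Ct values are synthetic and generated for an ideal assay, the function names are hypothetical, and a single curve is reused for target and reference purely for brevity (in practice each assay would have its own curve).

```python
import math

# Input amounts (ng) of the 7-point standard curve described in the text.
AMOUNTS_NG = [5, 1.7, 0.56, 0.19, 0.062, 0.021, 0.0069]

def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

def standard_curve(amounts_ng, ct_values):
    """Fit Ct against log10(amount); return slope, intercept, efficiency."""
    logs = [math.log10(a) for a in amounts_ng]
    slope, intercept = fit_line(logs, ct_values)
    # Assumed convention: efficiency of 1.0 means perfect doubling per cycle.
    efficiency = 10 ** (-1.0 / slope) - 1.0
    return slope, intercept, efficiency

def interpolate_amount(ct, slope, intercept):
    """Read an input amount (ng) off the fitted standard curve."""
    return 10 ** ((ct - intercept) / slope)

# Synthetic Ct values for an ideal assay (hypothetical, not study data).
IDEAL_SLOPE = -1.0 / math.log10(2)
synthetic_ct = [25.0 + IDEAL_SLOPE * math.log10(a) for a in AMOUNTS_NG]

slope, intercept, efficiency = standard_curve(AMOUNTS_NG, synthetic_ct)

# Hypothetical Ct readings for a target gene and for the HPRT reference.
target_ng = interpolate_amount(27.0, slope, intercept)
hprt_ng = interpolate_amount(26.0, slope, intercept)
normalized = target_ng / hprt_ng  # target amount relative to HPRT
```

A slope near -3.32 corresponds to an efficiency near 100%; a markedly shallower or steeper slope would indicate a poorly performing assay.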

Data Analysis
Image files were quantified using GCOS 1.1 to generate the CEL files. These were normalized using the GC-RMA package from the Bioconductor toolkit (Bioconductor, Seattle, Washington State, USA). Expression values were log (base 2) transformed for all subsequent analyses. Unsupervised hierarchical clustering was done using a distance measure derived from the Pearson correlation (distance = (1-ρ)/2, where ρ is the correlation coefficient) and average linkage. To determine differentially expressed genes, variants of the t- and F-tests were used as implemented in the LIMMA toolkit (Bioconductor). To account for multiple testing, the False Discovery Rate (FDR) method was used. An FDR < 0.01 was considered statistically significant. For clinicopathologic correlation, a functional over-representation analysis was done on the top 100 genes; p < 0.001 was considered significant.
For the array-CGH data, the raw images were quantified with the Agilent Feature Extraction program and normalized using a combination of intensity-dependent and GC-content-dependent non-linear normalization procedures. To determine significant changes in copy number, the Circular Binary Segmentation algorithm [14] was used with alpha set to 0.001. Segments that had a log2 ratio of intensity greater than a sample-dependent threshold and a signal-to-noise ratio greater than 0.5 were considered either amplified or deleted.
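The clustering distance can be made concrete with a short sketch. It implements the stated measure, distance = (1-ρ)/2, together with a plain average-linkage agglomeration in pure Python. The expression profiles are hypothetical toy vectors, and the function names are illustrative; the actual analysis software used for clustering is not specified here.

```python
import math

def pearson(x, y):
    """Pearson correlation coefficient between two equal-length profiles."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / math.sqrt(vx * vy)

def correlation_distance(x, y):
    """Distance = (1 - rho) / 2: 0 for rho = 1, 1 for rho = -1."""
    return (1.0 - pearson(x, y)) / 2.0

def average_linkage(profiles):
    """Agglomerative clustering; returns merges as (cluster_a, cluster_b, distance)."""
    n = len(profiles)
    pair = {(i, j): correlation_distance(profiles[i], profiles[j])
            for i in range(n) for j in range(i + 1, n)}

    def cluster_distance(a, b):
        # Average linkage: mean of all pairwise distances between members.
        dists = [pair[(min(i, j), max(i, j))] for i in a for j in b]
        return sum(dists) / len(dists)

    clusters = [(i,) for i in range(n)]
    merges = []
    while len(clusters) > 1:
        candidates = [(cluster_distance(a, b), a, b)
                      for idx, a in enumerate(clusters)
                      for b in clusters[idx + 1:]]
        dist, a, b = min(candidates, key=lambda t: t[0])
        clusters = [c for c in clusters if c not in (a, b)]
        clusters.append(tuple(sorted(a + b)))
        merges.append((a, b, dist))
    return merges

# Hypothetical log2 expression profiles for four samples (toy data).
toy_profiles = [
    [1.0, 2.0, 3.0, 4.0],
    [2.0, 4.0, 6.0, 8.5],
    [4.0, 3.0, 2.0, 1.0],
    [8.0, 6.0, 4.5, 2.0],
]
merges = average_linkage(toy_profiles)
```

Because the distance depends only on correlation, two samples with the same expression pattern but different overall magnitude are treated as close, which suits comparisons of relative expression patterns across tumors.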
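The segment-calling rule can be sketched as follows. This is an illustration under explicit assumptions: the text does not define how the sample-dependent threshold or the signal-to-noise ratio was computed, so the threshold is passed in as a parameter and the signal-to-noise ratio is assumed to be the absolute segment mean divided by the standard deviation of the probe log2 ratios within the segment. Segmentation itself (Circular Binary Segmentation) is not reimplemented; the input is taken to be the probes of one already-defined segment.

```python
from statistics import mean, stdev

def classify_segment(log2_ratios, threshold, min_snr=0.5):
    """Label one segment as 'amplified', 'deleted', or 'unchanged'.

    log2_ratios: probe-level log2 ratios within the segment.
    threshold:   sample-dependent cutoff on the absolute segment mean
                 (how it is derived is an assumption left to the caller).
    min_snr:     minimum signal-to-noise ratio, 0.5 as in the text.
    """
    seg_mean = mean(log2_ratios)
    spread = stdev(log2_ratios) if len(log2_ratios) > 1 else 0.0
    # Assumed definition of signal-to-noise: |segment mean| / probe SD.
    snr = abs(seg_mean) / spread if spread > 0 else float("inf")
    if abs(seg_mean) > threshold and snr > min_snr:
        return "amplified" if seg_mean > 0 else "deleted"
    return "unchanged"

# Hypothetical segments (toy values, not study data).
gain_call = classify_segment([0.60, 0.70, 0.65], threshold=0.3)
loss_call = classify_segment([-0.9, -1.1, -1.0], threshold=0.3)
flat_call = classify_segment([0.05, -0.04, 0.02], threshold=0.3)
noisy_call = classify_segment([2.0, -1.5, 0.7], threshold=0.3)
```

The noisy example has a mean above the threshold but is rejected because its probe-level scatter is large relative to the mean, which is the purpose of the signal-to-noise filter.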

Clinicopathologic Data
Frozen tissue was analyzed from 34 patients who underwent surgery for biliary tract cancers between August 1993 and December 2005. Thirteen patients had IHC, 12 had EHC (either at the bile duct bifurcation or in the mid or distal bile duct), and 9 had tumors originating within the gallbladder. Selected clinicopathologic features are shown in Table 1. The median age of patients was 64 years (range 46-88) and 20 (59%) patients were female. Thirty-one (91%) patients had margin-negative resections, two (6%) had margin-positive resections, and one (3%) underwent biopsy only.

Gene Transcriptional Alterations in Biliary Carcinomas
We analyzed alterations in gene expression in EHC, IHC, and GBC compared with non-cancerous bile duct or gallbladder controls using the Human Genome U133A GeneChip. Figure 1 depicts the 40 top-ranking overexpressed and underexpressed genes for (a) EHC, (b) IHC, and (c) GBC. Ranking was based on FDR values. Table 2 summarizes the extent of gene expression alterations for each type of biliary tract cancer. In the EHC specimens, differential expression was noted in 545 genes, compared with 2,354 in IHC and 1,281 in GBC (see additional files 1, 2, and 3). There was a near-equal distribution of overexpressed and underexpressed genes for each tumor type. However, higher fold changes in expression levels were seen more commonly with underexpressed genes. In particular, depending on cancer subtype, 16-22% of genes with decreased expression had greater than 10-fold changes in expression levels compared with controls. Conversely, only 2-12% of genes with increased expression had alterations of 10-fold or greater (Table 2).

Comparative Analysis of Biliary Cancer Subtypes
Unsupervised hierarchical clustering analysis revealed that the three cancer subtypes did not cluster separately, implying that there was no difference in the global gene expression patterns between the biliary cancer subgroups. Figure 1d depicts the top 40 up-regulated and down-regulated genes for all cancers combined versus the 18 control specimens. However, while the individual cancer subtypes did not cluster separately, many genes were uniquely differentially expressed, compared with normal biliary epithelium, in each cancer subtype. The relationship of gene transcriptional changes among the three biliary cancer subtypes is depicted in a Venn diagram (Figure 1e). There was unique altered expression of 1,633, 80, and 790 genes in IHC, EHC, and GBC, respectively. Overall, 165 probe sets were commonly differentially expressed in all 3 cancer types (see additional file 4). Selected commonly differentially expressed genes are listed in Table 3.

Genomic Alterations in Biliary Carcinogenesis
To better understand the molecular pathogenesis of biliary tract cancers we used an array-based CGH analysis to detect chromosomal areas of DNA copy number gain (DNA copy number of 3 or greater) and loss (DNA copy number of 0 or 1) in the GBC, IHC, and EHC specimens. [...] other patients with the same tumor type had minimal structural changes in their entire genome (Figure 2a).
While the cumulative pattern of chromosomal alterations was highly variable, there appeared to be selected chromosomal regions that were commonly altered across all cancer subtypes. For example, a short segment of chromosome 1p was deleted in greater than 75% of patients with GBC and IHC and nearly 50% of patients with EHC. Similarly, segments of chromosomes 3p, 6q, 8p, 9p, and 14q were commonly deleted across subtypes of biliary cancers. Commonly amplified regions across cancer types include segments of 1q, 3q, 5p, 7p, 7q, 8q, and 20q (Figure 2a-e).

Analysis of Transcriptional Changes in Commonly Unstable Genomic Regions
To further elucidate the pathogenesis of biliary tract cancers, we integrated the array-based CGH data with our gene expression profiling by identifying gene expression alterations in regions of highest genomic instability. To this end, we investigated the gene expression changes in regions of the genome for which greater than 40% of patients had either chromosomal gains or losses in each cancer subtype (see additional files 5, 6, and 7). Selected alterations in gene expression within these unstable genomic regions are shown in Table 4. Analysis of these data reveals that, as expected, chromosomal deletion was positively correlated with loss of gene expression. Conversely, there were no instances of increased gene transcription in regions of chromosomal deletion. However, in regions of chromosomal amplification, both increased and decreased gene transcription were seen with similar frequency.
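The 40% recurrence criterion can be illustrated with a minimal sketch that tallies, for one cancer subtype, the fraction of patients carrying a gain or a loss in each region and keeps those exceeding the cutoff. The patient identifiers, region labels, and calls below are hypothetical, and the function name is illustrative only.

```python
def recurrent_regions(calls_by_patient, min_fraction=0.40):
    """Return {(region, state): fraction} for regions altered in more than
    min_fraction of patients. Gains and losses are counted separately."""
    n_patients = len(calls_by_patient)
    counts = {}
    for calls in calls_by_patient.values():
        for region, state in calls.items():
            if state in ("gain", "loss"):
                key = (region, state)
                counts[key] = counts.get(key, 0) + 1
    return {key: c / n_patients
            for key, c in counts.items()
            if c / n_patients > min_fraction}

# Hypothetical copy-number calls for five patients of one subtype.
example_calls = {
    "P1": {"1p": "loss", "8q": "gain", "20q": "gain"},
    "P2": {"1p": "loss", "20q": "gain"},
    "P3": {"1p": "loss", "8q": "gain"},
    "P4": {"1p": "loss", "20q": "gain"},
    "P5": {},
}
recurrent = recurrent_regions(example_calls)
```

In this toy example the 8q gain, present in exactly 40% of patients, is excluded because the criterion is "greater than 40%". Genes mapping to the retained regions would then be looked up in the differential expression results.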

Validation of Findings
The Affymetrix U133A gene expression array data were both internally and externally validated. First, a large number of gene transcripts were represented by more than one probe set in the array. In each case, the different probe sets for a given gene detected similar transcript levels (see additional files 1, 2, and 3). This includes genes with altered expression in EHC (CDKN1C, NR4A3, RBM5, SASH1), IHC (ADH1B, GREM1, MCM4, NR4A2), and GBC (HIST2H2AA, NUSAP1, RPS10, RPS19).
In addition, to externally validate our data, transcript levels of selected differentially expressed genes were measured in biliary carcinoma specimens and in normal biliary epithelial controls using quantitative reverse transcriptase PCR. We assayed 11 genes with differing biologic functions and involvement in diverse molecular pathways but with known importance in carcinogenesis. These included genes that were overexpressed in EHC (SRDA21, STAT1, UBD, TYMS), underexpressed in EHC (FOSB, CDKN1C, IL6), overexpressed in IHC (SRDA21, STAT1, UBD, TYMS), underexpressed in IHC (DLC1, NR4A2, IL6), and overexpressed in GBC (UBD, TYMS, CDC2, CCNB2). PCR data were normalized to HPRT, which was expressed at similar levels in both the cancerous and the control biliary epithelium (not shown). Results are shown in Figures 3a-f and 4g-k and, for each gene tested, confirm the Affymetrix U133A gene expression array data. The array-based CGH results were internally validated by correlation of the X chromosome copy number with patient gender.

Correlation of Gene Expression Profiles with Clinicopathologic Features
To determine whether certain clinicopathologic features are associated with specific gene expression changes in biliary carcinomas, we performed over-representation analyses by determining whether certain functional gene categories were over-represented among the top 100 ranking genes (by FDR) with altered expression in patients with specific clinicopathologic features. Altered expression of genes associated with functional categories related to ribosomal structure, cellular and protein biosynthesis, and cellular metabolism was significantly associated with high-grade tumors (see additional file 8). Similarly, vascular invasion was strongly correlated with altered expression of genes involved with electron transport and metabolism (see additional file 9). Perineural invasion was correlated with altered expression of genes in the functional categories associated with mitochondrial structure and electron transport (see additional file 10). There was no significant association between gene expression patterns and lymph node invasion. Similarly, we did not find a significant correlation between functional gene category over-representation and survival.
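The idea behind over-representation analysis can be illustrated with a small sketch. It assumes a one-sided hypergeometric test, a common choice for this kind of analysis; the text does not state which test was actually used. The gene-universe size, category size, and overlap count in the example are hypothetical.

```python
from math import comb

def overrepresentation_p(k_overlap, n_top, k_category, n_universe):
    """One-sided hypergeometric p-value, P(X >= k_overlap).

    k_overlap:  genes from the category found among the top-ranked genes
    n_top:      number of top-ranked genes examined (100 in the text)
    k_category: genes in the universe annotated to the category
    n_universe: total genes considered (assumed value supplied by caller)
    """
    total = comb(n_universe, n_top)
    upper = min(n_top, k_category)
    tail = sum(comb(k_category, i) * comb(n_universe - k_category, n_top - i)
               for i in range(k_overlap, upper + 1))
    return tail / total

# Small check: universe of 10 genes, 4 in the category, 3 drawn,
# probability that at least 2 belong to the category.
p_small = overrepresentation_p(2, 3, 4, 10)

# Hypothetical scenario: 12 of the top 100 genes fall in a category of
# 150 genes, out of an assumed universe of 12,000 genes.
p_example = overrepresentation_p(12, 100, 150, 12000)
significant = p_example < 0.001  # significance cutoff from the text
```

With roughly one category gene expected by chance among 100 top-ranked genes in this hypothetical scenario, an overlap of 12 falls well below the p < 0.001 cutoff.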

Discussion
The molecular pathogenesis of biliary tract cancers is poorly understood. By performing immunohistochemical analysis of more than 125 surgically resected cases of biliary tract carcinoma, we have previously shown altered cell cycle regulatory protein expression in biliary tract cancers [13]. Our current findings also show altered expression of a large number of cell cycle regulators, including UBD, BCL2L2, CDC2, MCM2, and CDKN1C, in all subtypes. Similarly, Kang et al. [15] found that expression of G1-S modulators was commonly altered in 42 cases of IHC. Total loss of p16, p27, and Rb was detected at rates of 36%, 31%, and 12%, respectively, in cancer specimens. Furthermore, in that study, abnormal expression of p53, cyclin D1, or p16 was detected even in 7 of 13 cases of biliary dysplasia without frank carcinoma. Kim et al. [16] reported that mutations of the p53, p16, and K-ras genes occurred at rates of 36%, 31%, and 20%, respectively, in GBC. A further finding of that study was that 100% of GBCs and 80% of adenomas displayed loss of heterozygosity at a minimum of one locus, which is consistent with our CGH results. Chang et al. [17] studied loss of heterozygosity in 32 cases of GBC and 11 cases of dysplasia. Loss of one allele was identified on chromosomes 5q (55%) and 17p (40%) in the dysplastic cases and on chromosomes 3p (52%), 5q (66%), 9p (52%), and 17p (58%) in the carcinomas. Loss of heterozygosity on multiple chromosomes was significantly more frequent in patients with metastatic disease than in cases without metastases. In the current report, we similarly found that segments of 3p and 9p were commonly deleted across all subtypes of biliary cancers. However, we additionally discovered that segments of 6q, 8p, and 14q were commonly deleted across subtypes of biliary cancers.

[Figure: Real-time PCR based validation of gene expression findings]
There is increasing evidence that overexpression of tyrosine kinase growth factor receptors such as ErbB-2, epidermal growth factor receptor (EGFR), and Met plays an important role in the development of biliary tract carcinomas. Nakasawa et al. [18] studied tyrosine kinase receptor protein expression in 221 biliary tract carcinomas and found overexpression of ErbB-2 in 16% of carcinomas of the gallbladder and in a slightly lower percentage of extrahepatic bile duct tumors. ErbB-2 gene amplification was present in 79% of cases. Overexpression of EGFR was found in 8% of tumors and was also associated with a high frequency of gene amplification (77%). Met overexpression was most frequent in IHC (21.4%) but was not associated with gene amplification.
Microsatellite instability also appears to be a critical factor in selected cases of biliary carcinogenesis. Roa et al. [19] performed microsatellite analysis on 59 frozen GBC specimens using 13 different markers. They found evidence of microsatellite instability in equal proportions in early and late cancers, and it was also found in premalignant lesions, indicating that inactivation of mismatch repair genes occurs early in gallbladder carcinogenesis.
In addition to finding that a large proportion of differentially expressed genes in this study were involved in cell cycle regulation and apoptosis, we also discovered a disproportionate number of differentially expressed genes that control transcriptional regulation, RNA processing, or cellular signaling, or are involved with cytoskeletal structure, extracellular matrix, and cellular adhesion. Differentially expressed genes involved with transcriptional regulation include STAT1, NARG1, HOXC6, and MMP11. Important genes involved with signal transduction with altered expression include CXCL5, ECT2, GPRC5A, MELK, and CKS2. Dysregulated genes involved with cytoskeleton, extracellular matrix, and cellular adhesion include ITGA7, LAMB3, CEACAM5, KRT6B, and CLDN18.
The findings of the present study will serve as a resource for other investigators in this area, as we have identified many potential targets for therapeutic intervention. As an example, we found that TYMS, which encodes thymidylate synthase, the enzyme targeted by 5-fluorouracil, was overexpressed 7.2- to 26.0-fold depending on biliary cancer subtype. TYMS expression is correlated inversely with clinical response to 5-fluorouracil-based chemotherapy, and the overexpression may explain the futility of 5-fluorouracil-based chemotherapy for biliary carcinomas [20].
We also found that a number of genes in the ubiquitin pathway had altered expression in each cancer subtype. For example, more than 20 ubiquitin-related genes had significantly altered expression in IHC. In GBC, UBD was overexpressed more than 200-fold and UBE2C was overexpressed nearly 15-fold. Ubiquitin and ubiquitin-like proteins are signaling messengers that regulate a variety of cellular processes including cell proliferation, cell cycle regulation, DNA repair, and apoptosis. There is accumulating evidence that deregulation of this pathway, as a result of mutations or altered expression of ubiquitylating or de-ubiquitylating enzymes as well as of Ub-binding proteins, affects crucial mediators of these functions and underlies the pathogenesis of several human malignancies [21]. A variety of inhibitors of the ubiquitin system are currently being tested in clinical trials with promising early results [22]. These data suggest that these inhibitors may have applicability as adjuvants in treating patients with biliary tract carcinomas.
Another promising target uncovered in this report is STAT1, which was overexpressed nearly 9-fold in cases of cholangiocarcinoma. The Signal Transducer and Activator of Transcription (STAT) proteins regulate many aspects of cell growth, survival, and differentiation. The transcription factors of this family are activated by the Janus kinases (JAK); dysregulation of this pathway has been observed in primary tumors and leads to increased angiogenesis, metastases, enhanced survival of tumors, and immunosuppression [23,24]. A number of JAK/STAT pathway inhibitors are being tested in pre-clinical studies and their application to cancers of the biliary tract may prove promising [25].

Conclusion
Both gene expression and CGH data support an overlapping pathogenetic mechanism for all subsets of biliary tract cancers. However, exceptional diversity of genetic alterations between individual patient specimens is also apparent. Functional over-representation analysis revealed a significant association between high pathologic grade and altered expression of genes involved with regulation of cellular metabolism and biosynthesis. Vascular invasion was associated with altered expression of genes involved with electron transport and cellular metabolism. CGH analysis revealed that short segments of chromosomes 1p, 3p, 6q, 8p, 9p, and 14q were commonly deleted across all cancer subtypes, while commonly amplified regions included segments of 1q, 3q, 5p, 7p, 7q, 8q, and 20q. The data also offer opportunities to uncover potential targets for experimental therapeutics.