A SNP-mediated lncRNA (LOC146880) and microRNA (miR-539-5p) interaction and its potential impact on the NSCLC risk

Background Many cancer-associated single nucleotide polymorphisms (SNPs) are located in the genomic regions of long non-coding RNAs (lncRNAs). The mechanisms linking these SNPs to cancer risk are not fully understood. Methods The association of a SNP (rs140618127) in lncRNA LOC146880 with non-small cell lung cancer (NSCLC) was evaluated in a case-control study of 2707 individuals. The mechanism of the SNP's biological influence was explored with in vitro and in vivo experiments, including plasmid transfection, siRNA knockdown, flow cytometry assessment, and assays of cell proliferation, migration, invasion, and colony formation. Results Association analysis showed that the A allele of SNP rs140618127 was associated with a lower risk of NSCLC in the Chinese population. Laboratory experiments indicated that SNP rs140618127 contained a binding site for miR-539-5p and that the binding between miR-539-5p and LOC146880 resulted in reduced phosphorylation of an oncogene, ENO1. The reduced phosphorylation of ENO1 led to decreased phosphorylation of PI3K and Akt, which was further linked to a decline in cell proliferation and tumor progression. Conclusion The study demonstrates that SNP rs140618127 in lncRNA LOC146880 provides an alternate binding site for microRNA miR-539-5p, which affects the phosphorylation of ENO1 and activation of the PI3K and Akt pathway.


Background
Lung cancer is the most commonly diagnosed cancer (11.6% of the total cases) and the leading cause of cancer death (18.4% of the total cancer deaths) in the world [1]. The majority of lung cancer is non-small cell lung cancer (NSCLC), which accounts for around 85% of all lung cancer cases. Genetic factors may play an important role in an individual's susceptibility to NSCLC. Long non-coding RNAs (lncRNAs) are a class of non-coding transcripts of 200 nucleotides or more. Increasing evidence suggests that lncRNAs are involved in the occurrence of lung cancer due to their functions as oncogenes or tumor suppressors [2]. Our previous studies indicated that a lncRNA on chromosome 17q24.3, named LOC146880, was more highly expressed in tumor tissues than in adjacent normal tissues and that high expression was associated with poor prognosis of NSCLC [3].
Single-nucleotide polymorphisms (SNPs) in the noncoding regions of the genome have been shown to affect cancer risk by regulating the transcription and/or changing the structure of lncRNAs [4][5][6][7]. A previous study identified 495,729 SNPs in more than 30,000 human lncRNAs, and a large number of SNPs were predicted to have a potential impact on the microRNA (miRNA)-lncRNA interaction [8]. Here we report the identification of SNP rs140618127 in LOC146880 as a new susceptibility locus for NSCLC. Bioinformatics analysis predicts that variant rs140618127 (the 'A' allele) in LOC146880 provides an altered secondary structure which may create a binding site for microRNA miR-539-5p [8], sequestering its action on other molecules. Shiraishi et al. conducted a GWAS on lung adenocarcinoma and identified SNP rs7216064 in BPTF (17q24.3) in association with the cancer risk (OR = 1.20, p = 7e-11) [9]. Seow et al. confirmed that SNP rs7216064 was associated with the risk of lung cancer based on a GWAS of Asian female non-smokers [10]. We found that SNP rs140618127 was in strong linkage disequilibrium with SNP rs7216064 (LD; r² > 0.80), and this lncRNA SNP was associated with lung cancer risk (OR = 0.38, p = 0.007) in our case-control study of 2707 individuals. To explore the molecular mechanism of SNP rs140618127 in NSCLC development and progression, we evaluated the biological consequences of the interaction between LOC146880 and miR-539-5p and found that the microRNA behaved like a tumor suppressor [11], preventing LOC146880 from interacting with the protein ENO1, an oncogene product [12], and thereby reducing its phosphorylation. As a result, the phosphorylation of PI3K/AKT was also reduced after the suppression of ENO1 phosphorylation [13], which further inhibited tumor growth and metastasis, leading to a better prognosis of NSCLC.

Study populations
Individuals with suspected NSCLC had their diagnosis confirmed histopathologically or cytologically according to the World Health Organization classification. The study subjects, comprising those diagnosed with lung cancer and those found to be cancer-free, were recruited from the China Medical University (CMU). Distributions of the basic characteristics of the study subjects are provided in Table 1. At recruitment, informed consent was explained to each subject, and only those who agreed were included. This study was approved by the Institutional Review Board at CMU.

SNP selection and genotyping
SNPs with r² > 0.8 were considered to be in the same LD block. With this criterion, one SNP was selected in each LD block and genotyped using the TaqMan genotyping method in the ABI 7500 Real-Time PCR system (Applied Biosystems). For quality control, we implemented several measures in our genotyping assays, including 1) each plate contained both case and control samples, 2) technicians were blinded to the case/control status of the samples, 3) both positive- and negative-control (no DNA template) samples were included in each 384-well plate, and 4) nearly 8% of the samples were assayed in duplicate and the concordances were between 99.7 and 100%.
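The r² criterion can be illustrated with a minimal Python sketch. It assumes the standard population-genetics definition, in which D is the haplotype frequency minus the product of the two allele frequencies and r² is D² divided by the product of the four allele frequencies. The function names and the example frequencies are hypothetical and are not taken from the study; in practice r² would be obtained from reference-panel data with dedicated software.

```python
def ld_r_squared(p_ab: float, p_a: float, p_b: float) -> float:
    """Return r^2 between two biallelic loci.

    p_ab: frequency of the haplotype carrying allele A at locus 1 and allele B at locus 2
    p_a:  frequency of allele A at locus 1
    p_b:  frequency of allele B at locus 2
    """
    if not (0.0 < p_a < 1.0 and 0.0 < p_b < 1.0):
        raise ValueError("allele frequencies must lie strictly between 0 and 1")
    d = p_ab - p_a * p_b  # linkage disequilibrium coefficient D
    return d * d / (p_a * (1.0 - p_a) * p_b * (1.0 - p_b))


def same_ld_block(r2: float, threshold: float = 0.8) -> bool:
    """Apply the r^2 > 0.8 rule used in the text to decide LD-block membership."""
    return r2 > threshold


if __name__ == "__main__":
    # Hypothetical frequencies, for illustration only.
    r2 = ld_r_squared(p_ab=0.19, p_a=0.20, p_b=0.21)
    print(f"r^2 = {r2:.3f}, same block: {same_ld_block(r2)}")
```

Under this rule, only one tag SNP per block needs to be genotyped, because SNPs within a block carry largely redundant information.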

Cell lines
Human NSCLC cell lines (A549 and PC9) and human lung epithelial BEAS-2B cells were purchased from the Cell Bank of Type Culture Collection at the Chinese Academy of Sciences Shanghai Institute of Biochemistry and Cell Biology. These cell lines were passaged for fewer than 6 months. All the cells were tested for mycoplasma and were found to be free from infection. The cells were maintained in DMEM supplemented with 10% FBS and grown without antibiotics in an atmosphere of 5% CO2 and 99% relative humidity at 37°C.

5′ and 3′ RACE and coding prediction of LOC146880
We used 5′ and 3′ RACE to determine the transcriptional initiation and termination sites of LOC146880 with a SMARTer RACE cDNA Amplification kit (Clontech). The alignment file of the full-length sequence of LOC146880 obtained from 5′ and 3′ RACE is available upon request.

Construction of reporter plasmids, transient transfections and luciferase assays
A reporter plasmid was constructed in the psiCHECK-2 vector (Promega) containing a 1000-bp LOC146880 exon region flanking rs140618127 [G] or rs140618127 [A], cloned using the restriction enzymes XhoI and NotI (Fermentas). A549 and PC9 cells were seeded at 1 × 10⁵ cells per well in 24-well plates, and 800 ng of the reporter plasmid and 40 pmol of miR-539-5p mimic (Ambion) were cotransfected into the cells 16 h later using Lipofectamine 2000 (Invitrogen). These cells were collected 24 h after transfection. Renilla luciferase activity was measured and used to normalize the efficiency of transfection.

RNA extraction and qRT-PCR analysis
Total RNA from the NSCLC tissue specimens and cell lines used in this study was extracted using the TRIzol reagent. First-strand cDNA was synthesized using the SuperScript II reverse transcriptase kit (Invitrogen). Relative RNA levels determined by qPCR were measured on an ABI 7500 sequence detection system (Applied Biosystems) using the SYBR Green method. Beta-actin was employed as an internal control for the quantification of LOC146880 and the mRNA levels of other genes.
For miRNA quantification, small nuclear RNA U6 was used as an endogenous control. The relative expression of RNA was calculated using the comparative Ct method.
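The comparative Ct calculation can be sketched in Python as follows. This is an illustration that assumes the widely used 2^(-ΔΔCt) form with roughly 100% amplification efficiency for both the target and the reference gene; the text does not specify these details. The function name and the Ct values are hypothetical.

```python
def relative_expression(
    ct_target_sample: float,
    ct_reference_sample: float,
    ct_target_calibrator: float,
    ct_reference_calibrator: float,
) -> float:
    """Fold change of a target RNA in a sample relative to a calibrator.

    The reference is the internal control (for example beta-actin for
    lncRNA/mRNA or U6 for miRNA, as described in the text).
    Assumes about 100% PCR efficiency, i.e. a doubling per cycle.
    """
    delta_ct_sample = ct_target_sample - ct_reference_sample
    delta_ct_calibrator = ct_target_calibrator - ct_reference_calibrator
    delta_delta_ct = delta_ct_sample - delta_ct_calibrator
    return 2.0 ** (-delta_delta_ct)


if __name__ == "__main__":
    # Hypothetical Ct values, for illustration only.
    fold = relative_expression(24.0, 18.0, 26.0, 18.0)
    print(f"fold change = {fold:.2f}")
```

A lower normalized Ct in the sample than in the calibrator yields a fold change above 1, meaning higher relative expression.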

Subcellular fractionation
Cytosolic and nuclear fractions of A549 and BEAS-2B cells were prepared and collected according to the instructions of the Nuclear/Cytoplasmic Isolation kit (Biovision). LOC146880 was mainly detected in the nuclear fraction, although it was also present in the cytoplasm (Fig. S1).

RNA pulldown and mass spectrometry analysis
RNA pulldown assays were performed following the protocol described below. Briefly, biotinylated LOC146880 or antisense LOC146880 was incubated with cellular protein extracts from A549 cells, and streptavidin beads were then added. Recovered proteins associated with LOC146880 or antisense LOC146880 were excised, and proteomics screening was accomplished by mass spectrometry analysis on a MALDI-TOF instrument (Bruker Daltonics). In vitro transcription of LOC146880 and its deletion fragments was performed with primers containing the T7 promoter sequence.

RNA immunoprecipitation assays
RIP experiments were performed using the Magna RIP RNA-Binding Protein Immunoprecipitation kit (Millipore). Antibodies against ENO1 (Abcam) or control proteins were diluted at 1:50. Total RNA (input control) and precipitation with the isotype control (IgG) for each antibody were assayed simultaneously. The coprecipitated RNAs were detected by RT-qPCR.

Plasmid construction and transfection
To construct a lentiviral vector expressing human LOC146880 (NR_026899), a full-length LOC146880 cDNA containing rs140618127 [G] or rs140618127 [A] was commercially synthesized (GeneChem) and subcloned into the AgeI and NheI sites of the GV367-IRES-Puromycin lentiviral expression vector (GeneChem). To produce lentivirus containing LOC146880, 293T cells were cotransfected with the vector described above and a lentiviral vector packaging system (GeneChem) using Lipofectamine 2000. Infectious lentiviruses were collected at 48 h after transfection and filtered through 0.45-μm PVDF filters for analysis of genotype. After confirmation, these lentiviruses were designated LOC146880. We used the GV367-IRES-Puromycin empty vector as a negative control. The virus-containing pellet was dissolved in DMEM, and aliquots of the solution were stored at −80°C. A549 and PC9 cells were infected with concentrated virus in the presence of polybrene (Sigma-Aldrich). The supernatant was replaced with complete culture medium after 24 h, followed by selection with puromycin, and the expression of LOC146880 in infected cells was verified by qPCR.

Cell lysis and immunoprecipitation
Cells were homogenized in 1× RIPA buffer supplemented with Protease/Phosphatase Inhibitor Cocktail (Pierce). Cell lysates were centrifuged, and the supernatants were prepared for immunoblotting or immunoprecipitation with the antibodies described below. Immunoblot signal was detected using Clarity Western ECL Substrate (Thermo Fisher).
Coulter). For colony formation, 2000 cells were seeded in 65-mm culture dishes and allowed to grow in complete growth medium until visible colonies formed (2 weeks). Cell colonies were fixed with methanol, stained with crystal violet and counted. Invasion assays were performed in Millicell chambers in triplicate. The 8-μm pore inserts were coated with 30 μg of Matrigel (BD Biosciences). Cells (5 × 10⁴) were added to the coated filters in serum-free medium. RPMI-1640 medium containing 20% FBS was added to the lower chambers as a chemoattractant. After 24 h at 37°C in an incubator supplied with 5% CO2, cells that migrated through the filters were fixed with methanol and stained with crystal violet. Cell numbers in three random fields were counted. The migration assay was conducted in a similar fashion without coating the filters with Matrigel.

Experiments on xenograft animals
Ten male BALB/c mice (5 weeks old) were kept in a specific pathogen-free grade environment. Tumor size was measured once every 2 days using a Vernier caliper across its two perpendicular diameters, and tumor volume was calculated using the following formula: V = 1/2 × a × b², where V is the tumor volume, a is the largest diameter, and b is the smallest diameter. After 4 weeks of treatment, all mice were sacrificed, and their tumors were collected and weighed. Histological evaluation of the tumor samples was performed.
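As a worked illustration of the volume formula above, the following Python sketch computes V from two perpendicular caliper diameters. The function name and the example measurements are hypothetical; diameters in millimetres are assumed, so the result is in cubic millimetres.

```python
def tumor_volume(diameter_1_mm: float, diameter_2_mm: float) -> float:
    """Tumor volume V = 1/2 * a * b^2 from two perpendicular diameters.

    The larger value is taken as a and the smaller as b, so the order
    of the two arguments does not matter.
    """
    if diameter_1_mm < 0 or diameter_2_mm < 0:
        raise ValueError("diameters must be non-negative")
    a = max(diameter_1_mm, diameter_2_mm)  # largest diameter
    b = min(diameter_1_mm, diameter_2_mm)  # smallest diameter
    return 0.5 * a * b ** 2


if __name__ == "__main__":
    # Hypothetical caliper readings, for illustration only.
    print(tumor_volume(10.0, 6.0), "mm^3")
```

Because b is squared, the estimate is more sensitive to measurement error in the smaller diameter than in the larger one.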

Histopathological analyses
Tumor tissues from the animals were fixed in 4% paraformaldehyde (BOSTER, Wuhan, China) for 48 h at room temperature. The fixed tissues were then dehydrated in a graded series of alcohol, cleared in xylene, and embedded in paraffin. A rotary microtome was used to section the paraffin blocks into 4-μm thick sections. The sections were deparaffinized and stained with hematoxylin and eosin (H&E). A light microscope (Olympus) was used to examine the stained tissue sections.

Statistical analysis
The association between SNP rs140618127 and NSCLC risk was analyzed under an additive model using unconditional logistic regression adjusted for age, sex, and smoking status. Results of laboratory experiments were presented as means ± SD. Student's t test was used to compare means between two groups, and ANOVA was employed for comparison of more than two groups. Repeated-measures ANOVA was employed for comparison of more than two groups with repeated measurements. All the statistical analyses were performed using Statistical Product and Service Solutions (SPSS) software (version 19.0) and GraphPad Prism Version 8.0 (GraphPad Software, San Diego CA, USA).
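To make the odds ratio and its confidence interval concrete, the following Python sketch computes an unadjusted odds ratio with a Wald 95% confidence interval from a 2 × 2 table. This is a simplification: the analysis described above used logistic regression adjusted for age, sex, and smoking status under an additive model, which this sketch does not reproduce. The function name and the counts are hypothetical and are not the study data.

```python
import math


def odds_ratio_ci(
    exposed_cases: int,
    unexposed_cases: int,
    exposed_controls: int,
    unexposed_controls: int,
    z: float = 1.959964,  # standard normal quantile for a 95% interval
) -> tuple[float, float, float]:
    """Return (odds ratio, lower bound, upper bound) for a 2 x 2 table.

    Uses the Wald interval on the log odds ratio; all four cells must be
    positive for the standard error to be defined.
    """
    cells = (exposed_cases, unexposed_cases, exposed_controls, unexposed_controls)
    if min(cells) <= 0:
        raise ValueError("all four cell counts must be positive")
    odds_ratio = (exposed_cases * unexposed_controls) / (
        unexposed_cases * exposed_controls
    )
    standard_error = math.sqrt(sum(1.0 / count for count in cells))
    log_or = math.log(odds_ratio)
    lower = math.exp(log_or - z * standard_error)
    upper = math.exp(log_or + z * standard_error)
    return odds_ratio, lower, upper


if __name__ == "__main__":
    # Hypothetical counts, for illustration only (not the study data).
    estimate, lower, upper = odds_ratio_ci(20, 80, 40, 60)
    print(f"OR = {estimate:.2f} (95% CI {lower:.2f}-{upper:.2f})")
```

An odds ratio below 1 whose interval excludes 1 is read as a statistically significant protective association for the exposure (here, carrying the allele of interest).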

SNP rs140618127 (G > A) in LOC146880 and NSCLC risk
The SNP in LOC146880 (rs140618127) is in strong linkage disequilibrium (r² > 0.80) with SNP rs7216064, a GWAS-discovered risk locus for NSCLC. We found that SNP rs140618127 (G > A) in the exon of LOC146880 (chr17: 64758273) was associated with the risk of NSCLC; the 'A' allele, compared to 'G', had an adjusted odds ratio (OR) of 0.40 (0.18-0.86) in a case-control study of 2707 subjects (Table 1). Stratified analyses suggested that this effect was more evident in those who were ≥ 60 years old, female, and non-smokers (Supplemental Table S1). The minor allele frequency of SNP rs140618127 is low globally, < 1%, but can be as high as 20% in some American populations (see Suppl. 1).

Effects of LOC146880 with rs140618127 [A] on cell proliferation and behaviors
We examined the effects of LOC146880 on cell proliferation according to its allele at rs140618127, and found that overexpression of rs140618127 [A] in the NSCLC cell lines A549 and PC9 (both with the G allele at rs140618127) substantially reduced the rate of cell proliferation when compared with rs140618127 [G] (Fig. 1A). Colony formation ability in both A549 and PC9 cells was markedly suppressed by rs140618127 [A] when compared with rs140618127 [G] (Fig. 1B). Overexpression of rs140618127 [A] significantly suppressed the invasion and migration of NSCLC cells (Fig. 1C & 1D). Tumor size in a xenograft animal model of PC9 was decreased in both genotype groups, but the decline in tumor size was greater for rs140618127 [A] than for rs140618127 [G] (P < 0.05). There was no significant difference in tumor size between the vector control group and wild type, rs140618127 [G] (Fig. S2). H&E staining showed that tumors of rs140618127 [A] possessed less malignant morphology (Fig. 1E). Together, these results indicate that rs140618127 [A] inhibits the growth of lung cancer more strongly than rs140618127 [G], both in vitro and in vivo.

Interaction between LOC146880 and miR-539-5p
Evidence suggests that SNPs in lncRNAs may generate new interacting sites between lncRNAs and other transcripts, such as miRNAs [14]. Using the online tool lncRNASNP (http://bioinfo.life.hust.edu.cn/lncRNASNP) [15], we found that several SNPs in LOC146880 were predicted to have such a possibility, and SNP rs140618127 was indicated to lie within a putative binding site for miR-539-5p. The G > A mutation at rs140618127 was predicted to change the local folding structures and free energy of LOC146880, which might create a binding site for miR-539-5p. Following this prediction, we investigated whether miR-539-5p interacts with LOC146880 depending on its genotype at rs140618127. Luciferase reporter assays showed that, in comparison to the construct containing rs140618127 [G], the construct with the 'A' allele had significantly reduced luciferase activity in the presence of miR-539-5p, suggesting more interaction of miR-539-5p with LOC146880 [A] than with LOC146880 [G] (Fig. 2A). The interaction between miR-539-5p and LOC146880 [A] could be blocked by the miR-539-5p inhibitor; miR-539-5p is constitutively expressed in both A549 and PC9 cells. In cells stably overexpressing LOC146880, miR-539-5p decreased the levels of LOC146880 only with rs140618127 [A], not with allele G, indicating that allele A is a target of miR-539-5p (Fig. 2B).

Interaction between LOC146880 and ENO1
Using the RNA pulldown assay, we isolated a LOC146880 with rs140618127[G]-protein complex.
Mass spectrometry analysis showed that this complex contained three proteins, the most abundant of which (relative to the antisense control) was ENO1 (Fig. 3A). We then selected ENO1 for validation, detecting ENO1 in three independent RNA pulldown assays. RNA immunoprecipitation (RIP) assays also showed enrichment of LOC146880 in the complexes precipitated with ENO1 antibody as compared with IgG or another irrelevant antibody, indicating that ENO1 may be a key target protein of LOC146880 (Fig. 3B). Next, we evaluated the consequences of the interaction between LOC146880 and ENO1. We found that ENO1 mRNA expression and protein level were not significantly different (P > 0.05, see Fig. S3) between cells overexpressing LOC146880 with rs140618127 [A] and those with rs140618127 [G] in the presence of miR-539-5p (Fig. 3C & 3D). However, H&E staining of xenograft tumors in mice showed that phosphorylated ENO1 was higher in rs140618127 [G] than in [A] (Fig. 3E). The expression of C-MYC, a downstream target of ENO1, was decreased remarkably when the cells were transfected with a siRNA against ENO1 (Fig. 3F).

Regulation of PI3K/AKT signal by LOC146880 via ENO1 phosphorylation
We found that ENO1 phosphorylation was markedly decreased in cells overexpressing rs140618127 [A] as compared with those overexpressing rs140618127 [G] in the presence of miR-539-5p mimics (Fig. 4A, Fig. S4, and Fig. S5). Using in-silico prediction tools [16,17], we identified a phosphorylation site at Tyr44 in the protein based on the PDB database (Fig. S5) [18]. We next investigated the impact of altered LOC146880 levels on the downstream signal of ENO1. Since our results described above indicated that LOC146880 overexpression increased cell proliferation, migration, and invasion, we focused our investigation on the PI3K/AKT-NF-kB signaling. The total amount of PI3K and AKT proteins was not significantly different between cells overexpressing rs140618127 [A] and [G]. However, we observed that protein phosphorylation levels affected the expression of downstream molecules in the PI3K/AKT signaling in A549 (Fig. S4). Cells overexpressing LOC146880 with rs140618127 [A] showed substantial decreases in NF-kB, PCNA, Vimentin, and N-cadherin levels, while their β-catenin and E-cadherin levels were significantly increased when compared with the same cells overexpressing LOC146880 with rs140618127 [G] (Fig. 4B, Fig. 4C, Fig. S7, and Fig. S8). Immunohistochemical staining of xenograft tumors showed that p-PI3K, p-AKT, TWIST, N-Cadh, and SNAIL were all significantly higher in rs140618127 [G] than in [A] (Fig. 4D and Fig. S9).

Discussion
In this study, we found that SNP rs140618127 in LOC146880 contained a binding site for miR-539-5p, and the binding between miR-539-5p and LOC146880 resulted in reduced phosphorylation of an oncogene, ENO1, which was found to be a downstream target of LOC146880. Furthermore, the reduced phosphorylation of ENO1 led to decreased phosphorylation of PI3K and Akt, which was linked to the decline in tumor cell proliferation and progression. The entire process of how SNP rs140618127 influences NSCLC is depicted in Fig. 5. Our case-control study supports the notion that SNP rs140618127 genotype [A] may have a protective effect on NSCLC compared to genotype [G]. In a previous study, we found that LOC146880 expression was significantly higher in NSCLC tumors than in adjacent normal tissues, suggesting a possible oncogenic role for LOC146880 [3].
There has been an increasing interest in understanding the mechanisms of rare genetic variants in lncRNAs in relation to the complex traits and diseases [19,20]. Ingle et al. suggested that genetic polymorphisms in lncRNA MIR2052HG offer a pharmacogenomic basis for the response of breast cancer patients to aromatase inhibitor therapy [21]. Tang et al. indicated that SNP rs9839776 in SOX2OT was significantly associated with breast cancer possibly via influencing the expression of SOX2OT [22]. Redis et al. demonstrated that the GWAS-identified SNP rs6983267 on 8q24 is in a lncRNA gene called CCAT2 which regulates cancer cell metabolism in an allele-specific manner through binding to the cleavage factor I complex. This complex is implicated in an allele-specific regulatory mechanism of cancer metabolism orchestrated by alleles of the lncRNA [22,23]. Russell et al. identified a neuroblastoma susceptibility locus rs9295534 located in the upstream enhancer of a tumor suppressor CASC15-S. The SNP could decrease the transcriptional activity of CASC15-S and be associated with the disease outcome [24]. Wang et al. found that SNP rs965513, a locus on 9q22 in the FOXE1 gene and lnc-PTCSC2, was associated with the risk of papillary thyroid carcinoma [25].
In this study, we found that a lncRNA could regulate the function of a protein via its phosphorylation, with little influence on gene expression or protein concentration. Similar findings have been reported before in which the phosphorylation site of a protein can be blocked by a lncRNA, leading to decreased phosphorylation. For example, NF-kB can be inhibited by a long noncoding RNA which directly blocks IKB phosphorylation in breast cancer [26]. LncRNAs can also bind to proteins, increasing or decreasing their phosphorylation via another protein. The lncRNAs DANCR and PANDAR influence serine phosphorylation in RXRA and SFRS2 via GSK3β and P53 in breast and ovarian cancer, respectively [27,28]. GSK3β's phosphorylation in breast cancer was reported to be reduced by lncRNA NLIPMT [29]. The phosphorylation of ULK1 can be suppressed by lncRNA HOTAIR in NSCLC [30]. LINC00675 enhances the phosphorylation of vimentin on Ser83 to suppress gastric cancer progression [31]. Our finding of a lncRNA's impact on the phosphorylation of a protein is distinctive in that the effect is mediated by a microRNA through a polymorphic site in LOC146880.
In our study, we found that a G to A transition at rs140618127 in LOC146880 could create a binding site for a microRNA and that miR-539-5p was indeed the binding partner. Interestingly, the wild-type LOC146880 had no interaction with the microRNA at all. This SNP has not been reported before in any studies [32]. However, miR-539 is known to be a tumor suppressor [33,34]. Our finding of miR-539's binding to LOC146880 provides new insight into a possible mechanism that explains the biological function of miR-539 as a tumor suppressor. This effect takes place when a microRNA and a lncRNA interact through a polymorphic site, which results in changes in the phosphorylation of ENO1, a protein that the lncRNA may target. Low levels of LOC146880 did not influence the mRNA expression or protein levels of ENO1, but suppressed the phosphorylation of ENO1. ENO1 is a metabolic enzyme involved in the synthesis of pyruvate. It also acts as a plasminogen receptor and mediates the activation of plasmin and extracellular matrix degradation. In tumor cells, ENO1 is up-regulated and supports the Warburg effect. The protein is located on the cell surface, where it promotes cancer invasion, and is subject to substantial post-translational modifications, namely acetylation, methylation, and phosphorylation [35]. Reduced phosphorylation of ENO1 lowers the PI3K/Akt signal, which results in slower cell migration or invasion of NSCLC.
The SNP-based interaction between miRNA and lncRNA in the regulation of protein function has been hypothesized and predicted by Ya-Ru et al., but few studies have provided evidence [15]. Our study was the first to show the interaction between LOC146880 and miR-539-5p in NSCLC and to elucidate the downstream mechanism involving tumor growth and metastasis. The modulation model of the lncRNA and miRNA is not the classical competing endogenous RNA (ceRNA) model. How LOC146880 interacts with ENO1 to regulate its phosphorylation and downstream signals remains to be elucidated. Although the frequency of the 'A' allele of rs140618127 is low in general, some racial groups still have a relatively high frequency. In some Caucasian populations, the 'A' allele frequency is close to 20%.

Conclusions
We found in a case-control study of 2707 Chinese individuals that SNP rs140618127 in LOC146880 was associated with the risk of NSCLC. People with the G allele of rs140618127 had higher risk than those with the A allele. Our in vitro and in vivo experiments demonstrated that LOC146880 was an oncogene and that the G allele of rs140618127 had stronger oncogenic effects on lung cancer cells than the A allele in LOC146880. This differential effect appeared to come from the binding of a microRNA, miR-539, to the A allele, but not the G allele, at rs140618127. The microRNA binding prevented the lncRNA's interaction with its downstream target ENO1, which led to the reduction of ENO1 phosphorylation and suppression of the PI3K/AKT signaling, resulting in lower tumor cell proliferation and less aggressive cell behaviors.