Abstract
Introduction
Lung cancers develop spontaneously with an accumulation of genetic and epigenetic changes in response to environmental factors such as tobacco smoke and air pollution, but underlying genetic factors may also play a role in disease development and progression.1–4 While cigarette smoking significantly increases the risk of developing lung cancer, up to 25% of lung cancers arise in never-smokers.3,5,6 Regardless of causal origin, lung cancers commonly exhibit non-specific symptoms, and many patients are diagnosed with advanced disease or with metastases present. 7 Despite continued efforts for early diagnosis and treatment options for lung cancer patients, the widespread incidence, poor prognosis, and staggering mortality rate remain: lung cancer is the most prevalent malignancy with the highest mortality rate worldwide with an estimated 1.8 million new cases and 1.6 million deaths in 2012. 8 Over 35% of these new cases and deaths were in China alone, where lung cancer prevails as the leading cancer in both men and women. 8
Small-cell lung cancer (SCLC) and non-small-cell lung cancer (NSCLC) are the two major forms of lung cancer. Roughly 85% of all lung cancers are NSCLCs, which comprise three major histologic subtypes: squamous-cell carcinoma (SCC), adenocarcinoma (AC), and large-cell lung cancer. Tobacco smoke, which is strongly associated with SCLC and SCC, 7 contains more than 60 mutagens capable of binding to and chemically modifying DNA, and these changes leave characteristic mutational patterns seen in lung cancers.9,10 For example, distinctive point mutation patterns in KRAS and TP53 have been observed in lung cancer patients with a history of smoking versus their non-smoking counterparts.9,11 Compared to lung cancer in smokers, cases in never-smokers are more likely to be AC and develop in young women.12,13 Because lung cancer patients who smoke and those who have never smoked have distinct mutation patterns, certain drug treatments may be more effective in one group than in the other.
The various genetic and environmental factors that contribute to lung cancer vary widely, and the gene mutation profile of each tumor can be entirely unique. As such, the accumulating evidence suggests that generalized treatments for lung cancers are less effective, and individualized therapies targeting specific mutations are critical for effective treatment. Personalized treatments utilize drugs specifically designed to target particular gene mutations in an individual tumor, 14 and these observed mutations can determine which drug regimen to implement. For example, patients with EGFR mutations, particularly non-smoking women with advanced NSCLC, are commonly treated with erlotinib, which blocks EGFR signaling and slows lung cancer progression.15,16 Additionally, drugs have been developed to target VEGF mutations and an ALK/EML4 fusion. 17 Clinical trials have also shown that a combination of chemotherapeutics and drugs targeting specific mutations can work synergistically and specifically to provide patient benefits greater than any single treatment.14,18
A critical step in directing lung cancer treatments is identifying genetic alterations in the tumor. Currently, different clinical methods are used to detect gene mutations in lung cancer patients, including direct polymerase chain reaction (PCR), fluorescence in situ hybridization (FISH), and immunohistochemistry (IHC); none of these has been standardized in clinical diagnostics, and each has advantages and drawbacks.19,20 As an alternative to first-generation Sanger sequencing, next-generation sequencing (NGS) has become a more popular way to sequence the cancer genome of individual tumors, but the instruments and assays are costly with relatively lengthy run times, making these technologies impractical for widespread clinical use. Second- and third-generation sequencing platforms, such as Illumina HiSeq and MiSeq, 454 pyrosequencing, Helicos HeliScope, SOLiD sequencing, and Ion Torrent sequencing,21–23 are facilitating the advancement of personalized cancer treatments by allowing for cost- and time-effective high-throughput screening and sequencing.24,25 Specifically, the Ion Torrent platform has further revolutionized NGS through the use of post-light sequencing technology, which utilizes standard DNA polymerase sequencing with unmodified dNTPs and a hypersensitive ion sensor to detect hydrogen ions released as each nucleotide is incorporated into the growing complementary DNA strand. 26 This innovative method circumvents much of the cost and complexity associated with the four-color optical detection system used in the other aforementioned NGS platforms, helping to make personalized cancer sequencing and treatments a possibility in the near future.25,27
To investigate the feasibility of using Ion Torrent sequencing to reliably detect mutations in individual lung cancer samples of different types, we have used Ion Torrent sequencing with the Ion Personal Genome Machine (PGM) and Ion Torrent AmpliSeq Cancer Panel to analyze 48 lung cancer samples from Chinese patients and identify genetic mutations at 737 loci from 45 known cancer-related genes and oncogenes. This study demonstrates the feasibility of using the Ion AmpliSeq Cancer Panel to efficiently identify genetic mutations in individual tumors to potentially direct targeted therapies in lung cancer patients.
Materials and Methods
Ethics Statement
The study has been approved by the Human Research Ethics Committee of the China-Japan Friendship Hospital, China. For formalin-fixed, paraffin-embedded (FFPE) tumor samples from the tumor tissue bank at the Department of Pathology of the hospital, the Institutional Ethics Committee waived the need for IRB consent as all samples and medical data used in this study have been irreversibly anonymized.
Sample DNA Preparation
The 48 lung cancer samples used in the study were collected from the China-Japan Friendship Hospital, China. Paraffin sections (3–5 μm thick) extracted from FFPE samples were deparaffinized in xylene, and then DNA was isolated using the QIAamp DNA Mini Kit (Qiagen) following manufacturer's instructions.
Ion Torrent PGM Library Preparation and Sequencing
The Ion AmpliSeq Library Kit 2.0 (Life Technologies; Part #4475345 Rev. A) was used to construct an Ion Torrent adapter-ligated library as per manufacturer's instructions, and the Ion PGM Sequencing 200 Kit was used for sequencing reactions according to the recommended protocol (Part #4474004 Rev. B), detailed in our previous publications.28,29
Variant Calling
The Ion Torrent platform-specific pipeline software Torrent Suite was used to initially process data from the PGM runs and generate sequence reads, trim adapter sequences, filter, and remove poor signal profile reads. Torrent Suite Software v3.4 with a plug-in variant caller v3.4 generated initial variant calling from the Ion AmpliSeq sequencing data. Several subsequent filtering steps were used to eliminate erroneous base calling and to generate final variant calling: the first filter was fixed at an average total coverage depth >100, each variant coverage >20, a variant frequency of each sample >5%, and

Sequence read distribution across 189 amplicons generated from 48 FFPE specimens, normalized to 300,000 reads per sample. (
Somatic Mutations
To distinguish somatic and germline mutations, our detected mutations were compared to variants in the 1000 Genomes Project 30 and 6,500 exomes of the National Heart, Lung, and Blood Institute's Exome Sequencing Project. 31
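The filtering logic can be illustrated with a minimal sketch. It applies only the three thresholds stated explicitly under Variant Calling (total coverage depth >100, variant coverage >20, variant frequency >5%) and then discards any variant found in a set of known population variants. The `Variant` record, the function names, the use of per-variant depth in place of average depth, and the example data are all assumptions made for illustration; this is not the Torrent Suite implementation, and the small `germline` set merely stands in for lookups against the 1000 Genomes Project and Exome Sequencing Project data.

```python
from dataclasses import dataclass

# Thresholds quoted in the Variant Calling section.
MIN_TOTAL_DEPTH = 100     # total coverage depth must exceed this
MIN_VARIANT_READS = 20    # reads supporting the variant must exceed this
MIN_VARIANT_FREQ = 0.05   # variant frequency must exceed 5%


@dataclass(frozen=True)
class Variant:
    """Hypothetical record type; field names are illustrative only."""
    gene: str
    change: str
    total_depth: int
    variant_reads: int

    @property
    def frequency(self) -> float:
        # Fraction of reads at this position that carry the variant.
        return self.variant_reads / self.total_depth if self.total_depth else 0.0


def passes_quality_filter(v: Variant) -> bool:
    """Apply the three explicitly stated thresholds (all strict inequalities)."""
    return (
        v.total_depth > MIN_TOTAL_DEPTH
        and v.variant_reads > MIN_VARIANT_READS
        and v.frequency > MIN_VARIANT_FREQ
    )


def candidate_somatic(variants, known_germline):
    """Keep variants that pass the quality filter and are absent from a set
    of (gene, change) keys representing known population variants."""
    return [
        v for v in variants
        if passes_quality_filter(v) and (v.gene, v.change) not in known_germline
    ]


# Made-up example calls, not data from the study.
calls = [
    Variant("EGFR", "L858R", total_depth=1500, variant_reads=300),
    Variant("TP53", "P72R", total_depth=1200, variant_reads=600),   # listed as germline below
    Variant("KRAS", "G12C", total_depth=90, variant_reads=30),      # depth too low
    Variant("PIK3CA", "E545K", total_depth=2000, variant_reads=60), # 3% frequency
]
germline = {("TP53", "P72R")}  # stand-in for a population database lookup

kept = candidate_somatic(calls, germline)
print([f"{v.gene} {v.change}" for v in kept])
```

Keying the germline set on (gene, change) keeps the sketch short; a real comparison would more likely match on genomic coordinates and alleles.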
Bioinformatical and Experimental Validation
We used COSMIC 32 (version 64), My Cancer Genome database (http://www.mycancergenome.org/), and other publications to assess recurrent mutations in lung cancer (see Supplementary Table 1). Additionally, the accuracy of the Ion Torrent PGM was compared to the Sanger sequencing method. Because DNA from the 48 experimental samples was limited, the comparison was run on an additional 60 negative and 62 positive FFPE lung cancer samples that were obtained from the tumor tissue bank at the Department of Pathology of the China-Japan Friendship Hospital, China.
Clinical features of 48 lung cancer patients.
Statistical Analysis
Odds ratios (ORs) of samples with mutations and without mutations for smoking versus non-smoking patients were determined using 2 x 2 contingency tables, and the Fisher's exact test was used to calculate
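As a worked illustration of this type of analysis, the sketch below computes a sample odds ratio and a two-sided Fisher's exact p-value for a 2 x 2 table using only the Python standard library. The input counts are the EGFR figures reported in the Results (13 of 22 AC samples and 1 of 22 SCC samples mutated); the denominators of 22 are inferred from the reported percentages, and the function names are hypothetical. The two-sided rule used here (summing all tables no more probable than the observed one) is a common convention and is assumed rather than taken from the study, so the output should not be read as a reproduction of the published statistics.

```python
from math import comb


def odds_ratio(a, b, c, d):
    """Sample odds ratio for the table [[a, b], [c, d]]."""
    if b * c == 0:
        return float("inf")
    return (a * d) / (b * c)


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact p-value for the table [[a, b], [c, d]].

    Sums the hypergeometric probabilities of every table with the same
    margins whose probability does not exceed that of the observed table.
    """
    row1, row2, col1 = a + b, c + d, a + c
    denom = comb(row1 + row2, col1)

    def prob(x):
        # Probability that the top-left cell equals x given fixed margins.
        return comb(row1, x) * comb(row2, col1 - x) / denom

    lowest = max(0, col1 - row2)
    highest = min(row1, col1)
    p_obs = prob(a)
    # A small relative tolerance guards against floating-point ties.
    total = sum(
        prob(x) for x in range(lowest, highest + 1)
        if prob(x) <= p_obs * (1 + 1e-9)
    )
    return min(1.0, total)


# Rows: AC, SCC. Columns: EGFR-mutated, EGFR wild-type.
table = (13, 9, 1, 21)
print("OR =", round(odds_ratio(*table), 2))
print("p  =", fisher_exact_two_sided(*table))
```

For these counts the odds ratio is 273/9, and the p-value falls well below 0.05.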
Results and Discussion
Ion Torrent versus Sanger Sequencing Experimental Validation
For experimental validation of the Ion Torrent PGM, additional FFPE lung cancer samples were used, and only common mutations in exons 19 and 21 of EGFR were sequenced. All positive Sanger samples generated positive data from the Ion Torrent PGM, and only one sample generated negative data with Sanger sequencing and positive data from the Ion Torrent PGM for EGFR exon 21 mutations (Supplementary Figure 2 and Supplementary Table 2). This discrepant sample had a variant frequency of 5.59%, indicating that this may have actually been a false negative in Sanger sequencing as opposed to a false positive in Ion Torrent sequencing. Sanger sequencing has been shown to miss mutations when the allele frequency of the mutation is lower than 10%, 33 whereas the Ion Torrent PGM has been shown to be sensitive enough to detect variant frequencies of 5%. 34 The greater sensitivity has important clinical implications where tumor samples may be a heterogeneous mixture of normal and cancerous cells.
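The agreement between the two methods can be summarized with simple proportions, as in the sketch below. It assumes that the 62 positive and 60 negative samples are defined by their Sanger results and that the single discrepant sample belongs to the 60 Sanger-negative samples; the function and key names are hypothetical. Because Sanger sequencing may itself have missed the low-frequency variant, these figures describe agreement between methods rather than true sensitivity or specificity.

```python
def agreement_metrics(both_pos, sanger_only, pgm_only, both_neg):
    """Percent agreement between two assays, with Sanger as the comparator.

    both_pos    -- positive by Sanger and by Ion Torrent PGM
    sanger_only -- positive by Sanger only
    pgm_only    -- positive by Ion Torrent PGM only
    both_neg    -- negative by both methods
    """
    total = both_pos + sanger_only + pgm_only + both_neg
    return {
        "positive_agreement": both_pos / (both_pos + sanger_only),
        "negative_agreement": both_neg / (both_neg + pgm_only),
        "overall_agreement": (both_pos + both_neg) / total,
    }


# Counts assumed from the text: 62 Sanger-positive samples all confirmed,
# and 1 of 60 Sanger-negative samples called positive by the PGM.
metrics = agreement_metrics(both_pos=62, sanger_only=0, pgm_only=1, both_neg=59)
for name, value in metrics.items():
    print(f"{name}: {value:.1%}")
```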

Summary of mutated genes detected in 48 lung cancer samples. A total of 26 samples harbor mutations in EGFR, TP53, KRAS, PIK3CA, CDKN2A, and CTNNB1. Samples are classified by four methods: pathologic type (AC, SCC, others), differentiation (high, middle, low, unknown), smoking history (heavy smoker, light smoker, non-smoker), and sex (male or female). Frequencies of mutations per gene are represented by blue bar graphs.
Mutation frequencies in 48 lung cancer samples based on sex, pathologic type, and smoking history.
Sequence Coverage in 48 Lung Cancer Samples
The mean read length of each sequence read was 80 bp, and the average sequence per sample was approximately 23 Mb. With normalization to 300,000 reads per specimen, there was an average of 1,639 reads per amplicon (range: 59–3,504) (Fig. 1A), where 181/189 (95.8%) amplicons averaged at least 100 reads, and 171/189 (90.5%) amplicons averaged at least 300 reads (Fig. 1B).
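A minimal sketch of how such coverage figures might be derived is given below. It assumes that normalization is a simple proportional rescaling of each sample's per-amplicon read counts so that they sum to 300,000, which the text does not state explicitly; the function names and the example counts are hypothetical.

```python
TARGET_READS = 300_000  # normalization target quoted in the text


def normalize_counts(amplicon_counts, target=TARGET_READS):
    """Rescale one sample's per-amplicon read counts to sum to `target`."""
    total = sum(amplicon_counts.values())
    if total == 0:
        raise ValueError("sample has no mapped reads")
    scale = target / total
    return {amplicon: n * scale for amplicon, n in amplicon_counts.items()}


def fraction_at_least(mean_counts, threshold):
    """Fraction of amplicons whose mean normalized count meets the threshold."""
    counts = list(mean_counts)
    return sum(1 for n in counts if n >= threshold) / len(counts)


# Made-up raw counts for three amplicons in one sample.
sample = {"amp1": 500, "amp2": 1500, "amp3": 3000}
normalized = normalize_counts(sample)
print(normalized)

# The reported proportions follow directly from the amplicon tallies.
print(f"{181 / 189:.1%} of amplicons averaged at least 100 reads")
print(f"{171 / 189:.1%} of amplicons averaged at least 300 reads")
```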
Lung Cancer Patients
The average age of all 48 lung cancer patients included in the study was 62.7 years, with a range of 42–78 years (SD ±8.6 years). Lung cancer samples were divided into three pathologic subtypes: AC (
Gene Mutations in Lung Cancer Subtypes
From the 45 genes screened in our study, a total of 35 mutations were identified in EGFR, TP53, KRAS, PIK3CA, CDKN2A, and CTNNB1, and these were detected in 26 of the 48 samples (54.2%) (Fig. 2 and Tables 2 and 3). A total of 15 (68.2%) AC samples contained at least one mutation, and 13 (86.7%) of these AC samples with mutations were from never-smokers (OR: 0.115;
Single mutations and patient characteristics from 48 lung cancer samples.
Nonsense mutation resulting in a stop codon.
Combination mutations and patient characteristics from 48 lung cancer samples.
Nonsense mutation resulting in a stop codon.
Lung cancer, like other cancers, develops through an accumulation of genetic changes that affect different signaling pathways and hinder normal functions, including cell growth, survival, proliferation, and apoptosis. In our study, differences in signaling pathway disruption can be seen between AC and SCC in the EGFR pathway (EGFR, PIK3CA, and KRAS), tumor suppressor pathways (TP53 and CDKN2A), and Wnt pathway (CTNNB1) (Fig. 3). All the genes identified to be mutated in our study have previously been classified as driver genes, 35 because mutations in these genes can promote or drive tumorigenesis by conferring a selective growth advantage to the cells that carry them. The number of driver mutations differs by patient and cancer type; some tumors have few and others many. Tumors of various cancer types with only one driver mutation tend to have this mutation in an oncogene, while tumors with more driver mutations tend to have a combination of oncogene and tumor suppressor gene mutations. 35 Accordingly, in our study, the majority (72.2%) of samples with one mutation harbored the mutation in an oncogene (CTNNB1, EGFR, KRAS, or PIK3CA) (Table 3), and 37.5% of samples with two or more mutations revealed combination mutations in both oncogenes and the tumor suppressor gene TP53 (Table 4).

Mutated signaling pathways in SCC (
EGFR Mutations
Mutations in EGFR are among the most common genetic alterations found in NSCLCs. 36 Roughly 34% of all lung ACs contain EGFR mutations, 37 and these mutations are more common in non-smoking and Asian populations, with some studies reporting a frequency of 50% or higher.38–40 EGFR mutations are much less common in SCCs, and are found in only 6% of these tumors. 37 Accordingly, we identified 13 AC samples (59.1%) and 1 SCC sample (4.5%) with EGFR mutations, and 3 of the AC samples (13.6%) contained two EGFR mutations. We found EGFR mutations to be significantly associated with AC versus SCC (
An EGFR mutation in the tyrosine kinase domain leads to constitutive activation of kinase activity and downstream signaling pathway activation, which results in increased proliferation, angiogenesis, and metastasis and a decrease in apoptosis.41,42 All the EGFR mutations we identified were in the tyrosine kinase domain localized to exon 19 (E746_A750del, L747_P753>S, L747_A750>P, and A750P) and exon 21 (L858R and L861Q), areas that are known to harbor the majority of mutations. In fact, the L858R point mutation and the E746_A750del deletion together comprise nearly 90% of all EGFR mutations in NSCLCs. 43 Tumors with these two mutations are dependent on EGFR signaling, and are therefore sensitive to the EGFR tyrosine kinase inhibitors (TKIs) gefitinib and erlotinib. 44 Other common EGFR mutations not identified in our study, T790M and insertions in exon 20, have been found to be nonresponsive to these TKIs.45,46
The development of EGFR TKIs has significantly improved treatment and prolonged survival for some patients with NSCLCs, and the best responses are seen in those with AC subtype, nonsmokers, younger women, and those of Asian descent.46,47 However, typical response rates to gefitinib and erlotinib are only about 10% and 12%, respectively,48,49 and clinical data show that NSCLCs eventually develop drug resistance and progress despite such treatment usually from acquired secondary EGFR mutations or other mechanisms, including KRAS and PIK3CA mutations.44,46,50 Nevertheless, identifying EGFR mutations is critical in determining the most beneficial treatments for NSCLC patients.
TP53 Mutations
TP53 mutations are prevalent genetic alterations found in many lung cancers, with up to 43% of SCCs and 35% of ACs harboring mutations in this gene. 37 Our study detected TP53 mutations in seven samples (14.6%): one AC (4.5%), four SCCs (18.2%), and two in the other lung cancer types. The frequency of TP53 mutations in our study is somewhat lower than others have reported, which may be because of the small sample size and population variations, and also the variant filter process to select for mutations already identified in the COSMIC database (Supplementary Fig. 1). All the identified TP53 mutations were found at known hotspot locations within the DNA-binding domain, including two in exon 5 (V157F and R158L), three in exon 7 (G245V, R248W, R249S), and two in exon 8 (E285K and R306*). This is consistent with the observation that most TP53 mutations cluster in the DNA-binding domain, which is encompassed by exons 5 through 8 and spans approximately 180 codons. 51 Previous research has shown TP53 mutations in tobacco-associated lung cancers to have distinct profiles that consist of a high proportion of G to T transversions, particularly at codons 157, 158, 179, 248, and 273, and such mutations are rarely found in lung cancers from never-smokers.52–54 While four of the seven TP53 mutations detected in our study were in fact G to T transversions, including at codons 157 and 158 in samples from smokers, two of these transversion mutations occurred in never-smokers (R249S and G245V) (Table 5). Additionally, the mutation detected at codon 248 was a transition mutation that occurred in a never-smoker with AC.
Transversion versus transition mutations in TP53.
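The distinction between the two substitution classes can be made concrete with a short sketch. A transition exchanges a purine for a purine (A, G) or a pyrimidine for a pyrimidine (C, T), whereas a transversion exchanges a purine for a pyrimidine or vice versa. The function name is hypothetical, and the example calls use generic base changes rather than data from the study.

```python
PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def classify_substitution(ref, alt):
    """Classify a single-base substitution as a transition or transversion."""
    ref, alt = ref.upper(), alt.upper()
    bases = PURINES | PYRIMIDINES
    if ref not in bases or alt not in bases:
        raise ValueError(f"not a DNA base change: {ref}>{alt}")
    if ref == alt:
        raise ValueError("reference and alternate bases are identical")
    # Both bases in the same chemical class means a transition.
    same_class = {ref, alt} <= PURINES or {ref, alt} <= PYRIMIDINES
    return "transition" if same_class else "transversion"


for ref, alt in [("G", "T"), ("C", "T"), ("G", "A"), ("C", "A")]:
    print(f"{ref}>{alt}: {classify_substitution(ref, alt)}")
```

Under this definition, the G to T changes associated with tobacco exposure are transversions, while C to T and G to A changes are transitions.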
While many TP53 missense mutations can still result in the formation of a stable protein, the mutated protein lacks DNA-binding specificity and accumulates in the nucleus. Additionally, these mutant proteins lack the ability to trans-activate downstream target genes that regulate cell cycle and apoptosis. 55 Some TP53 mutations may lead to gain-of-function (GOF) activities in the mutant protein product, which can actively contribute to tumor progression and metastases, and can also result in increased drug resistance.56–58 Tumors containing mutant TP53 are more resistant to ionizing radiation than those with the wild-type TP53, and TP53 overexpression in NSCLCs has been associated with unresponsiveness to cisplatin-based therapies.53,59 Overall, TP53 overexpression is associated with increased tumor aggressiveness, poorer patient prognosis, and shorter overall survival in both AC and SCC patients.53,60,61
KRAS Mutations
Different rates of KRAS mutations have been found in lung cancer subtypes, where an estimated 19% of AC patients harbor KRAS mutations versus only 5% of SCC patients. 37 Our study detected KRAS mutations at equal rates in ACs and SCCs (9.1%), and these samples were all from male patients with a history of smoking. Nearly all (97%) of KRAS mutations are found in the GTP binding domain of exons 2 and 3, 36 and accordingly, all mutations in our study were located in exon 2, codon 12 (G12A and G12C). The intrinsic GTPase activity of RAS is impaired by these mutations, and resistance is conferred to GTPase activators; this causes accumulation of RAS in its active GTP-bound state, thereby sustaining the activation of RAS signaling and a disconnection from upstream EGFR signaling.36,62
Because KRAS is part of the EGFR signaling pathway, constitutive activation of KRAS leads to resistance to EGFR TKIs. 63 Patients with KRAS-mutant NSCLC also lack benefits from adjuvant chemotherapy in early stages of the disease and have shown poorer clinical outcomes when treated with erlotinib and chemotherapy.61,64 Overall, NSCLC patients with KRAS mutations have worse overall survival than those with wild-type KRAS, 18 regardless of the treatment method. As they are fairly prevalent, detecting KRAS mutations in lung cancer patients prior to treatment may prevent unnecessary drug toxicities from certain drug regimens.
PIK3CA Mutations
Phosphatidylinositol-3-kinases (PI3Ks), including the p110α catalytic subunit encoded by PIK3CA, are lipid kinases critical in regulating signaling pathways and cellular functions, including cell proliferation and survival. PIK3CA mutations result in constitutive activation in EGFR signaling, and subsequent activation of downstream Akt signaling caused by these mutations interferes with other signaling pathways and contributes to oncogenicity.65,66
Commonly found in many cancer types, PIK3CA mutations are present in roughly 6% of SCCs and 4% of ACs.37,65 In our study, 5 of the 48 samples (10.4%) harbored a PIK3CA mutation either in exon 1 (R88Q), in the helical domain of exon 9 (E542K and E545K), or in the kinase domain of exon 20 (H1047L). Four of these mutations occurred in male SCC patients with a history of smoking, whereas one mutation (H1047L) was from a female never-smoker with AC and a co-occurring EGFR mutation, which was most likely the driver mutation. Accordingly, others have found a higher rate of PIK3CA mutations in SCCs, and approximately two-thirds of all PIK3CA mutations are found primarily at codons 542, 545, and 1047.66,67
PIK3CA mutations have been associated with faster disease progression and worse overall survival, and some clinical studies have found PIK3CA mutations to lead to acquired resistance to EGFR TKIs.68,69 While PIK3CA mutations are found in a smaller subset of lung cancers compared to other genes and are therefore not routinely tested for, detection of these mutations may help guide patient treatment.
Less Frequent Mutations
One SCC sample from a male with a history of heavy smoking contained a mutation (E69*) in the tumor suppressor gene CDKN2A, which plays a critical role in regulating cell cycle and downstream TP53. 70 Between 7% and 8% of lung ACs and SCCs have been found to have CDKN2A mutations, but this specific point mutation is much less common. 37 As in the sample in our study, previous studies have found a positive relationship between smoking and CDKN2A mutations.71,72
One AC sample contained a mutation in CTNNB1 (S33F), which co-occurred with an EGFR mutation. While roughly 3% of ACs and 1% of SCCs harbor mutations in this gene, this specific point mutation in lung cancers is extremely rare. 37 The CTNNB1 gene encodes β-catenin, a ubiquitous intracellular protein that plays a vital role in the Wnt signaling pathway. Mutations in CTNNB1 can cause accumulation of β-catenin in the nucleus and downstream target gene activation that hinders cell growth regulation and contributes to tumorigenesis. 73 Studies have found that since both Wnt and EGFR signaling can act on β-catenin, these two signaling pathways work synergistically in the process of tumorigenesis. 74 Drugs targeting CTNNB1 are currently under testing, 75 and these may work in conjunction with EGFR TKIs to enhance treatment in patients with such combination mutations.
Conclusion
In the present study, we used the Ion Torrent AmpliSeq Cancer Panel to sequence 737 loci from 45 cancer-related genes, mainly oncogenes and tumor suppressor genes, in 48 lung cancer samples of different pathologic types. We identified frequent mutations in EGFR, TP53, KRAS, and PIK3CA and mutations in CDKN2A and CTNNB1 at lower frequencies, and we observed distinct mutation patterns in AC versus SCC samples. While this supports previous research that different lung cancer types have distinct molecular profiles, and thus potentially different prognoses and patient outcomes, our limited sample size and low TP53 mutation rate suggest that supplementary studies with larger sample sets may be beneficial. Additionally, because lung cancers may exhibit intratumor heterogeneity,76,77 additional studies utilizing multiregion sequencing may help to define the mutation profile more precisely for these cancers and for each patient. Fortunately, the affordable cost and time efficiency of Ion Torrent sequencing may facilitate such follow-up studies and increase the availability of personalized cancer sequencing and targeted therapies in the near future.
Author Contributions
Conceived and designed the experiments: HF, XW, ZZ, SYC, DL. Analyzed the data: CT, HY, FL, DZ, SJ, HS, HD, GZ, ZL, ZD, BG, HY, CY, LW, ZS, YL. Wrote the first draft of the manuscript: LJ, VN. Contributed to the writing of the manuscript: CT, LJ, VN, XFH, SYC. Agreed with manuscript results and conclusions: HF, XW, ZZ, CT, HY, LJ, VN, FL, DZ, SJ, HS, HD, GZ, ZL, ZD, BG, HY, CY, LW, ZS, YL, XFH, SYC, DL. Jointly developed the structure and arguments for the paper: HF, XW, ZZ, CT, LJ, SYC, DL. Made critical revisions and approved the final version: CT, LJ, VN, XFH, SYC. All authors reviewed and approved the final manuscript.
Supplementary Materials
Supplementary Table 1
Frequencies of recurrent mutations in lung cancer assessed with COSMIC (version 64), My Cancer Genome database (http://www.mycancergenome.org/), and additional publications.
Supplementary Table 2
Performance validation of Ion Torrent PGM compared to Sanger sequencing in 62 positive and 60 negative lung cancer samples.
Supplementary Figure 1
Filter process of variants. Note: (
Supplementary Figure 2
The accuracy trial of 60 negative samples and 62 positive lung cancer samples with Ion Torrent PGM and Sanger method for common mutations in EGFR exons 19 and 21. (
