Abstract
Introduction
An overwhelming amount of information is appearing in the literature on genetic alterations associated with invasive colorectal cancer.1 It is so far unclear to what extent such findings are primary causes of neoplastic transformation and tumor progression or rather represent events secondary to genetic instability. Unfortunately, it would require hundreds of thousands of patients with defined colon cancer disease and controlled follow-up to discriminate and validate both genetic and epigenetic information by traditional multivariate analyses.2–5 This fact became evident to us in previous evaluations of genome-wide DNA alterations in progressive colon cancer, based on BAC CGH analyses in patients with different survival,6 as also emphasized by others.7 Thus, it seems practically impossible to rank appearing DNA sequence alterations in relation to progressive disease and clinical outcome while accounting for defined and undefined standard elements including epigenetics.3,4 A major part of correlates and relationships may after all only represent indirect or secondary phenomena to underlying critical cellular events, despite sufficient statistical power or information on complete genome-wide alterations.8,9 Therefore, simplistic models are required as alternatives to traditional statistics in order to efficiently screen for and suggest candidate DNA regions of primary importance for the appearance of invasive growth and subsequent progression of colorectal cancer. In line with this speculation, we found it interesting to relate significant DNA copy number changes to either significantly changed gene expression or post-transcriptional control of RNA in tumor biopsies from colon cancer, all processed from the same patients.
The present study provides such in silico analyses on well-defined and quality-controlled tumor material from selected patients with colorectal cancer of Dukes A, B, C and D tumor stage, used as surrogate markers for clinical outcome, in order to filter genes within regions of copy number gain and loss by statistical modeling in a limited number of patients.1
Materials and Methods
Patients and clinical details
Intentionally, the patient material comprised a limited number of patients (n = 24) operated on for primary colon carcinoma at Uddevalla Hospital, Sweden, between 2001 and 2003 (Table 1). These patients were selected by chance from a cohort of 486 consecutive patients with colorectal cancer to represent 6 patients each with tumor stage Dukes A, B, C and D. (Modified Dukes A–D stages correspond to TNM I–IV in present histopathological evaluations.) Dukes D tumors were all diagnosed at operation and subsequent histopathological staging. Patient selection was also dependent on the presence of a particular surgeon, patient acceptance to take part in the study, quality control of tissue-extracted RNA and the absence of any pharmacological preoperative treatment deemed of importance for the investigation. Thus, none of the patients had received any specific treatment besides surgery at the time of operation. Patients with rectal and very low sigmoid tumors were not considered for inclusion. There was no overall difference between the patients when grouped according to Dukes A, B, C and D stages, considering gender and tumor location (Table 1), but Dukes D patients were younger, as also observed in the entire cohort of 486 patients (
Included patients operated on for primary colon carcinoma.
All patients were consecutively included from a large cohort selected by chance over time.
Tissue samples and extraction of DNA and RNA
Biopsies from primary tumors and normal colon tissue were collected from each patient at operation, snap frozen in liquid nitrogen and stored at –80°C. Tissue biopsies were crushed in a mortar and two aliquots of powdered tissue were used for DNA and total RNA extraction, respectively. Genomic DNA and total RNA were from the same tissue source in each patient. DNA was extracted with the QIAamp DNA mini kit (Qiagen) according to instructions and total RNA was extracted with the mirVana total RNA isolation kit (Ambion/Applied Biosystems). All material was quantified by NanoDrop ND-1000 spectrophotometry (NanoDrop Technologies) and total RNA samples were run in a Bioanalyzer (Agilent Technologies) to confirm appropriate quality. mRNA expression arrays and oligo CGH arrays (DNA) were run in triplicate; microRNA expression arrays were run in duplicate. From each patient, 167 or 307 ng DNA (depending on array format), 33 ng RNA and 20 ng microRNA were used. Tumor tissue comprised around 80% malignant cells.6
CGH analysis
Genomic DNA from tumor and normal colon tissue from the 24 patients was separately pooled for analyses, with 6 patients in each group according to Dukes A–D. Hybridization of tumor versus normal colon DNA was performed in competition on either 44 K Whole Human Genome oligo arrays (Design 013282, Agilent Technologies) or 4 × 44 K Whole Human Genome oligo arrays (Design 014950, Agilent Technologies). For 44 K arrays, pooled DNA (1.84 μg/array) was labeled with Agilent Genomic DNA Labeling Kit PLUS, then hybridized and washed using Agilent Human Genome CGH Microarray Kit 44B. For 4 × 44 K arrays, pooled DNA (1 μg/array) was labeled with Agilent Genomic DNA Labeling Kit PLUS, hybridized with Agilent Oligo aCGH Hybridization Kit and washed with Agilent Oligo Wash Buffer 1 and 2 set. All labeled samples were checked by NanoDrop spectrophotometry prior to hybridization and arrays were scanned (Agilent scanner G2565 AA, Agilent Technologies).
Analyses of scanned images from CGH two-color oligonucleotide arrays were performed in Feature Extraction 9.1.3.1 (Agilent Technologies). Feature Extraction result files were imported into the statistical language R 2.7.2,10 where both channels were normalized using median normalization implemented in the Bioconductor package LIMMA.11 The technical replicates were averaged and then segmented with the DNAcopy package using the CBS algorithm with default parameter values.12 Minimal common regions (MCRs, as defined in reference 13) between the different Dukes stages were identified using the cghMCR package.13 Briefly, gained and lost regions were defined as segments of contiguous probes that showed log2 values above or below a cut-off level, defined as one standard deviation of the probe variation calculated from all of the arrays. The cut-off magnitude for both gained and lost segments was estimated at 0.1 (log2), which corresponded approximately to the 80th and 20th percentiles of the segment alteration values, respectively.12
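The gain and loss calling described above can be illustrated with a minimal Python sketch. It is not the R/Bioconductor pipeline used in the study: the function names, the (start, end, mean log2) segment layout and the example values are assumptions made for illustration only, and the segment means are taken as already produced by a segmentation step such as CBS.

```python
from statistics import median


def median_normalize(log_ratios):
    """Center one array's log2 ratios on zero by subtracting their median.

    Simplified stand-in for median normalization; not the LIMMA implementation.
    """
    center = median(log_ratios)
    return [value - center for value in log_ratios]


def call_segment(segment_mean, cutoff=0.1):
    """Label a segment as gain, loss or neutral from its mean log2 ratio.

    cutoff=0.1 mirrors the log2 cut-off quoted in the text.
    """
    if segment_mean > cutoff:
        return "gain"
    if segment_mean < -cutoff:
        return "loss"
    return "neutral"


def altered_kb(segments, cutoff=0.1):
    """Sum segment lengths (kb) per call.

    segments: iterable of (start_bp, end_bp, mean_log2) tuples (hypothetical layout).
    """
    totals = {"gain": 0.0, "loss": 0.0, "neutral": 0.0}
    for start_bp, end_bp, segment_mean in segments:
        totals[call_segment(segment_mean, cutoff)] += (end_bp - start_bp) / 1000.0
    return totals


# Illustrative (made-up) segments on one chromosome.
example_segments = [
    (0, 2_000_000, 0.25),
    (2_000_000, 5_000_000, 0.02),
    (5_000_000, 6_000_000, -0.30),
]
print(altered_kb(example_segments))
```

Summing the gained and lost kilobases per chromosome in this way corresponds to the kind of per-stage summary reported for the Dukes groups.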
mRNA expression analysis
Total RNA from tumor and normal tissue was separately pooled as described for CGH analyses; 200 ng of pooled total RNA was labeled with Agilent Two-Color RNA Spike-In Kit (Agilent Technologies), linearly amplified and synthesized to cRNA. Labeled products were checked in a NanoDrop and then hybridized in competition to Agilent Whole Human Genome Oligo Microarrays (Design 014850) with Gene Expression Hybridization Kit (Agilent Technologies). Arrays were washed with Gene Expression Wash Buffer Kit (Agilent Technologies) and scanned (Agilent scanner, Agilent Technologies).
Analyses of scanned images from two-color mRNA expression were performed in Feature Extraction 9.1.3.1 (Agilent Technologies). Feature Extraction result files were imported into the statistical language R 2.7.2,10 where replicated probes were averaged. Each array was then normalized using Lowess normalization implemented in the Bioconductor package LIMMA.11,14 A moderated t-statistic, based on an empirical Bayes model, was calculated for each gene and the corresponding
microRNA expression analysis
Total RNA from tumor and normal colon tissue was separately pooled as described; 120 ng of pooled total RNA was labeled with Agilent Cyanine 3-pCp reagent for direct labeling by Agilent microRNA Labeling Reagent and Hybridization Kit (Agilent Technologies). Labeled products were hybridized to Agilent Human microRNA single color microarrays (G4470A, Agilent Technologies, with 470 human, 64 viral probes), washed and scanned on an Agilent scanner. Analyses of scanned images from single-color microRNA expression were performed in Feature Extraction 9.5 (Agilent Technologies). The one-channel Feature Extraction 9.5 result files were imported into R. Identical probes were averaged and the data normalized using quantile-quantile normalization implemented in the Bioconductor R-package Affy.17 As for the mRNA expression data, a moderated t-statistic was calculated for each microRNA as well as a
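The idea behind quantile normalization of single-color arrays can be sketched in a few lines of Python. This is a simplified illustration, not the Affy implementation used in the study: the function name and input layout are assumptions, and tied intensities are simply ranked in order of appearance rather than averaged.

```python
def quantile_normalize(arrays):
    """Force every array to share the same intensity distribution.

    arrays: list of equal-length lists, one list of probe intensities per array
    (hypothetical layout). Returns new lists; the input is not modified.
    """
    n_probes = len(arrays[0])
    if any(len(array) != n_probes for array in arrays):
        raise ValueError("all arrays must contain the same number of probes")

    # Reference distribution: mean of the k-th smallest value across arrays.
    sorted_arrays = [sorted(array) for array in arrays]
    reference = [sum(column) / len(arrays) for column in zip(*sorted_arrays)]

    normalized = []
    for array in arrays:
        # Probe indices ordered by increasing intensity within this array.
        order = sorted(range(n_probes), key=lambda i: array[i])
        result = [0.0] * n_probes
        for rank, probe_index in enumerate(order):
            # Each probe receives the reference value of its rank.
            result[probe_index] = reference[rank]
        normalized.append(result)
    return normalized


# Two made-up arrays with three probes each.
print(quantile_normalize([[5.0, 2.0, 3.0], [4.0, 1.0, 6.0]]))
```

After this step each array contains exactly the same set of values, so remaining differences between arrays reflect probe ranks rather than overall intensity shifts.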
Statistics and mathematical interactions
Group analyses were performed by t-testing or ANOVA and frequency analysis by χ². Statistical interaction analyses (correlations, co-variations, significant alterations) were based on
Results
DNA alterations
Tumor tissue vs. normal colon tissue
Significant tumor DNA copy number changes increased with tumor progression defined as early (Dukes A plus B) versus late tumors (Dukes C plus D) (Fig. 1, Fig. 2, Table 2). Dukes A, B, C, and D tumors displayed DNA alterations in 4%, 4%, 21% and 16% respectively of the entire genome compared to normal colon tissue (
Copy number gain and loss in CGH analysis across Dukes A–D tumors compared to normal colon tissue from the same patients.
Gained or lost bases (kb) per chromosome among Dukes tumor stages were detected by the DNAcopy segmentation algorithm. Significance thresholds were specified by the 80th and 20th percentiles, respectively.

Genome wide overview of DNA segments with sequence variations across chromosomes in Dukes A, B, C and D. Solid lines outside or close to the confidence interval (dashed lines) suggest significant DNA sequence alterations.

DNA copy number changes on chromosome 8 in Dukes A, B, C and D tumors. Significantly altered DNA segments are indicated by solid lines. Genes with significantly altered expression are indicated by red (mRNA) and green (microRNAs). Dashed lines indicate thresholds for statistically significant DNA segment alterations.
Chromosomes 1–11, 13–18 and 20–21 showed 102 minimal common regions (MCRs) in Dukes A, B, C and D tumors; 78% represented gained and 22% lost regions (not shown). These aberrations equaled 30% of the entire genome (X and Y chromosomes excluded). Of the aberrant bases covered by MCRs, 14% were altered in at least 3 out of 4 Dukes groups when analyzed in iterated combinations (ABCD, ABC, ACD, or BCD). These alterations were mainly located on chromosomes 7, 13, 18 and 20. Chromosomes 13 (1 Mb) and 20 (41 Mb) showed gains in all Dukes A–D tumors. Of the MCRs, 55% were found in Dukes A and B tumors and may be considered most relevant for carcinogenesis and early tumor progression. Overall, 75% of the MCRs were found in Dukes C and D tumors (not shown).
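Regions altered in at least a given number of Dukes groups can be found with a simple interval sweep, sketched below in Python. This is an illustration of the overlap idea only and does not reproduce the cghMCR definition of minimal common regions; the function name, the half-open (start, end) coordinates and the example intervals are assumptions.

```python
def regions_shared_by(groups, min_groups):
    """Return intervals covered by altered segments in at least min_groups groups.

    groups: dict mapping a group label to a list of half-open (start_bp, end_bp)
    intervals on one chromosome, assumed non-overlapping within each group.
    """
    events = []
    for intervals in groups.values():
        for start_bp, end_bp in intervals:
            events.append((start_bp, 1))   # an altered segment begins
            events.append((end_bp, -1))    # an altered segment ends
    # At equal positions, process starts before ends so that a region stays
    # open when one group's segment ends exactly where another's begins.
    events.sort(key=lambda event: (event[0], -event[1]))

    shared = []
    depth = 0
    open_at = None
    for position, delta in events:
        new_depth = depth + delta
        if depth < min_groups <= new_depth:
            open_at = position
        elif new_depth < min_groups <= depth:
            if position > open_at:  # skip zero-length touches
                shared.append((open_at, position))
        depth = new_depth
    return shared


# Made-up gained intervals (bp) for four stage groups on one chromosome.
example = {
    "A": [(0, 100)],
    "B": [(20, 80)],
    "C": [(50, 120)],
    "D": [(60, 70)],
}
print(regions_shared_by(example, min_groups=3))
```

Applied separately to gains and losses, a sweep of this kind yields the bases altered in at least 3 out of 4 groups, which is the quantity summarized in the preceding paragraph.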
mRNA expression
Tumor tissue vs. normal colon tissue
Distribution of genes with altered expression among Dukes A–D tumors is summarized in Figure 3b and Table 3. There was no significant relationship between the number of expressed genes and tumor progression (Fig. 3b). Six, 8, 8 and 6 percent of all genes showed significantly altered expression (FC > 1, FDR < 0.05) in tumor tissue compared to normal colon tissue in Dukes A, B, C and D respectively. Downregulation was more common than upregulation in Dukes A and B tumors (
Number of transcripts with significantly altered RNA expression in genome wide analyses of Dukes A–D tumors compared to normal colon tissue from the same patients.
Transcription was considered significantly altered with log fold change >1 and adjusted

A) Distribution of aberrant DNA copy numbers across Dukes A, B, C and D tumors (solid line), DNA loss (dashed line) and gained (semidashed line).
microRNA expression
Tumor tissue vs. normal colon tissue
There was no relationship between tumor stage and the number of differentially expressed microRNAs (Fig. 3c). Dukes A, B, C and D tumors showed 17%, 21%, 18% and 15%, respectively, of microRNAs with altered expression (FC > 0.5, FDR < 0.05) compared to normal colon tissue (Table 4). In total, 173 microRNAs showed significantly altered expression in one or several combinations of Dukes stages, and 55 microRNAs, located on chromosomes 1–9, 11, 13, 17–20 and 22, were altered in all Dukes groups. Six microRNAs showed significant changes in expression between Dukes A plus B vs. Dukes C plus D stages (Table 5).
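A selection rule that combines a fold-change cut-off with a false discovery rate threshold can be sketched as follows. The Benjamini-Hochberg adjustment and the use of the absolute log fold change are assumptions made for illustration, since the text states only the FC and FDR thresholds; the function names and example values are hypothetical.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values in the original order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotone adjusted values.
    for rank in range(m, 0, -1):
        index = order[rank - 1]
        running_min = min(running_min, p_values[index] * m / rank)
        adjusted[index] = running_min
    return adjusted


def significantly_altered(log_fold_changes, p_values, fc_cutoff, fdr_cutoff=0.05):
    """Flag features whose |log FC| exceeds fc_cutoff with adjusted p below fdr_cutoff.

    Assumption: the fold-change threshold is applied to the absolute log value.
    """
    adjusted = benjamini_hochberg(p_values)
    return [
        abs(log_fc) > fc_cutoff and q_value < fdr_cutoff
        for log_fc, q_value in zip(log_fold_changes, adjusted)
    ]


# Made-up statistics for four features; fc_cutoff=1.0 as for mRNA, 0.5 as for microRNA.
flags = significantly_altered([1.5, 2.0, 0.2, -3.0], [0.01, 0.04, 0.03, 0.20], fc_cutoff=1.0)
print(flags)
```

In the example only the first feature passes both filters, because the other nominally small p-values no longer fall below 0.05 after adjustment.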
Number of micro RNAs in tumor tissue with significantly altered expression in genome wide analyses among Dukes A-D tumors compared to normal colon tissue from the same patients.
microRNAs with significantly altered expression among early (Dukes A plus B) and late tumors (Dukes C plus D).
↑ Upregulation ↓ Downregulation—Lack of significant change in expression between tumor and normal colon mucosa.
miRÒ, The miR-Ontology Database.
Mees, ST et al. Involvement of CD40 targeting miR-224 and miR-486 on the progression of pancreatic ductal adenocarcinomas. Ann Surg Oncol 16:2339–50, 2009.
Monzo, M et al. Overlapping expression of microRNAs in human embryonic colon and colorectal cancer.
Combined statistical analyses of DNA and RNA alterations
Genome-wide interactions
Each Dukes tumor stage showed some genome-wide statistical interactions between structural and transcriptional alterations (Fig. 3b), but only Dukes C and D tumors showed interactions accounting for DNA alterations that discriminated significantly between early (A plus B) and late (C plus D) tumors (Table 6). Altogether, 29% (6498/22094) of all genes had significant copy number changes or showed significantly altered expression in one or several combinations in Dukes A, B, C and D tumors. Of these genes, 1231 (19%, 1231/6498) showed chromosomal alterations in all four Dukes A–D stages and 406 (6%, 406/6498) showed combined interactions in the same direction (i.e. gain and upregulation or loss and downregulation).
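The direction rule and the proportions quoted above can be checked with a short Python sketch. The function names and the string labels for copy number state and expression change are assumptions chosen for illustration; the counts are those given in the text.

```python
def interaction_direction(copy_state, expression_change):
    """Classify a gene by whether copy number and expression change agree.

    copy_state: "gain" or "loss"; expression_change: "up" or "down"
    (hypothetical labels).
    """
    pair = (copy_state, expression_change)
    if pair in {("gain", "up"), ("loss", "down")}:
        return "same direction"
    if pair in {("gain", "down"), ("loss", "up")}:
        return "inverse"
    raise ValueError(f"unexpected combination: {pair}")


def percent(part, whole):
    """Percentage of whole represented by part, rounded to the nearest integer."""
    return round(100.0 * part / whole)


print(interaction_direction("gain", "up"))
print(percent(6498, 22094), percent(1231, 6498), percent(406, 6498))
```

The rounded values reproduce the 29%, 19% and 6% reported for the gene counts.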
Transcripts (mRNA) with significantly altered expression located within DNA segments with significant copy number change in progressive colorectal tumors (Dukes C plus D versus Dukes A plus B).
Genes with unknown function have been reported. 56
Chromosomal interactions
Chromosomes 4, 8, 13 and 20 displayed significant interactions between copy number changes and genes with significantly altered expression when isolated chromosomes were tested separately. The number of chromosomes with significant within-interactions increased with tumor progression according to Dukes stage; Dukes A showed one interaction and Dukes D 4 interactions.
Twenty-three microRNAs were located within altered DNA segments in Dukes A, B, C and D on chromosomes 1, 4, 7–9, 13, 17, 18 and 20, with 3, 3, 16 and 16 microRNAs altered in Dukes A, B, C and D, respectively. One microRNA (microR-663 at 20p11.1) showed interactions with altered DNA sequences in all Dukes A–D tumor stages. All interacting microRNAs in Dukes A and B were present in Dukes C and D tumors, which implies that alterations in microRNA may be an early tumor phenomenon.
Segmental interactions
The number of significant segmental interactions increased with tumor progression as illustrated for chromosome 8 (Fig. 2). Dukes A comprised 3 segments (66 Mb), Dukes B 3 (23 Mb), Dukes C 5 (358 Mb) and Dukes D 7 segments (244 Mb) with interactions between DNA and RNA. Three segments on chromosomes 8p and 18q showed interactions between DNA segments with loss and downregulation of expression. Eight regions at chromosome 7p/q, 8q, 13q and 20p/q showed interactions between DNA segments with copy number gain and upregulation.
Genes assumed important for carcinogenesis and tumor progression
Sixteen genes with significant mathematical interaction and upregulation were found in all Dukes tumors and were all located on chromosome 20. The DNA segment covered 40 Mb on chromosome 20p11.21–20q13.33. These genes represented 0.2% of the total number of structurally altered genes on all chromosomes and may be relevant for the appearance of malignancy.
Genome-wide DNA segment alterations with mathematical interaction to gene expression contained altogether 41 genes with significantly altered expression in a manner that statistically discriminated between early (Dukes A plus B) and late (Dukes C plus D) tumors (not shown); 28 of these genes were expressed in Dukes C plus D tumors and 17 in Dukes D tumors and may thus be relevant for tumor progression (Table 6). Ten of these genes (WDR67, RFXAP, RP11–50D16.3, CAB39L, THSD1, SPRY2, TGDS, CLDN10, SLC10A2, CD33L3) have been reported as changed in tumor tissues, while only 2 (RP11–50D16.3, SLC10A2) have been reported as changed in colorectal cancer.
Discussion
Technological progress in cancer research has been extraordinary, generating enormous amounts of information, particularly on genomic and epigenetic alterations. It therefore appears unlikely that isolated and well-defined causes behind malignant transformation or progression of cancer can be described. It is easily recognized that combined alterations in gene structure, expression and processing of genetic information, and epigenetic control of regulatory elements may represent an infinite number of alterations when ranking critical events related to clinical outcome. Therefore, in the present study we used surrogate markers for outcome, such as the well-established Dukes tumor stage classification of colon carcinoma, in a purposely small group of individuals selected by chance, as applied by others,18 since the relationship between Dukes stage and survival is well established worldwide. We combined DNA, RNA and microRNA arrays to identify tumor-specific DNA copy number changes in relation to early (Dukes A plus B) and late (Dukes C plus D) tumors. Tumor material and normal mucosa were all taken from the same individuals, and genomic DNA and total RNA were processed from the same tissue specimens. Statistical interaction analyses were based on DNA segments defined as aberrant by the DNAcopy algorithm, with subsequent determination of correlations to defined genes or transcripts with significantly altered expression or tissue content of mRNA or microRNA. Pooled patient materials were intentionally used to stabilize for inter-specimen variation, which enhances specificity but limits sensitivity in testing.
DNA sequence alterations in general, and in early and late tumor stages, agreed with our previous findings, where we used tiling BAC arrays to subclassify DNA sequence alterations in patients selected according to long- and short-term survival.6 Frequent early-stage DNA changes included gains on chromosome 20 and parts of chromosomes 7p and 13q and loss in parts of chromosome 18q, while late tumor stages included gains of 7p, 7q, 8q and 13q and loss of 8p, 18p and 21q, suggesting great complexity within specific chromosomes, as reported by others.1 Structural DNA and RNA alterations with statistically significant interactions increased from early to late tumor stages at chromosomal, sub-chromosomal and gene levels. Interactions between DNA and microRNA also increased significantly at gene level in a similar way across Dukes A–D tumors. Chromosome 20 showed interaction between DNA and RNA in all Dukes A, B, C and D stages, with MCRs across all tumor stages. Thus, 40% of the aberrant bases in 3 out of 4 Dukes groups were located on chromosome 20, which makes it likely to be related to carcinogenesis and early invasiveness.19 DNA alterations on chromosome 20 have been reported by others, indicating correlations between gains and the transition from colon adenoma to carcinoma.20–23 Altered genes on chromosome 20 in the present study included AURKA and CSE1L, which have also been reported by others in relation to colorectal cancer.23,24 Thus, our results and conclusions agreed with findings reported by others based on genomic and transcriptomic information from different sources and patients,18 when our computations were performed on specific chromosomes. However, a different pattern appeared when early versus late tumor stages were used as covariates; chromosomes 13 and 18 then appeared most important for transcriptional alterations due to changed DNA.
Copy number changes in DNA may reflect a natural adaptation of DNA to altered environmental conditions. This phenomenon may represent selection in the development of life based on genetic recombination. Thus, cellular DNA that contains polymorphic regions may or may not represent future blueprints for improved functions. On this basis, it is not easy to judge what the appearance of altered DNA sequences really implies in cells that override contact inhibition and normal growth control, including attenuated apoptosis. Such altered DNA structures may represent suitable adaptations to withstand hypoxia and other challenges, or they may only be the result of chance events leading to further compromised cell function and growth control. A third explanation may be the appearance of significant alterations without any impact at all on cell function; i.e. cells can continue to accumulate aberrant DNA as long as it does not compromise cell survival. However, DNA alterations important for carcinogenesis should be present in all subsequent tumor stages or tumor cell clones as long as the malignant cell remains. Late-appearing DNA alterations may thus either imply changes determining tumor progression or simply that such changes do not destabilize the genome too much. Therefore, a simplistic interpretation of our model approach was to discriminate and correlate early and late DNA copy number changes to statistically significant alterations in gene expression. This approach should exclude most structural DNA changes that are not translated into functional dynamics. Candidate DNA regions with interactions should therefore contain a majority of copy number changes that could potentially influence defined cellular functions by splicing and either increased or decreased translation. However, this simplistic approach would not identify DNA alterations that are related to as yet unconfirmed changes in gene expression.
With this perspective it was also interesting to evaluate significant statistical interactions between microRNA and DNA copy number change, which may identify important interactions based on more recent dimensions of gene expression.
Genome-wide chromosomal copy number gain represented the only structural change that alone predicted progressive malignancy. Three genes showed inverse relationships between DNA structure and expression, i.e. gain and downregulation or loss and upregulation, a kind of combined alteration that makes them less likely to be functional adaptations. We also observed that altered expression in early-stage tumors could disappear in later tumor stages, probably as a consequence of DNA loss. A majority of the 28 genes with altered expression in Dukes C and D appeared to code for proteins in translational and transcriptional control, cellular transport, membrane protein interactions and posttranslational modifications, although some genes had more or less unknown functions. Differentially expressed genes in Dukes A and B tumors did not correlate with confirmed aberrant DNA copy numbers (Table 6). A large proportion of genes with significantly altered expression and DNA interactions mapped to chromosome 13 (17/28), but 35% of these genes had unknown function. Our results indicated a clear-cut relationship between an increasing number of combined genetic events (DNA and RNA or DNA and microRNA) and late Dukes stage, when we used a relatively wide selection criterion for microRNA (FC > 0.5). However, as few as 6 microRNAs (including microR-602) were altered so as to discriminate between Dukes A and B versus Dukes C and D. Only 4 microRNA genes were altered in Dukes A and B but not in C and D, indicating few differences in microRNA between early and late tumor progression, although it has been reported that microR-602 and microR-373 may impact on systemic tumor spread. Accordingly, microR-373 was recently suggested to be a promoter of metastasis in breast cancer cells,25,26 now with similar indication in colon cancer.
Upregulation of microR-21 was reported to correlate with poor outcome in colorectal cancer patients,27 but we did not find such an implication in our present analysis accounting for tumor stage (Dukes A to D). Indeed, a lot of clinical and prognostic information appears to be confined to altered microRNAs in colon cancer,28 but such alterations seemed less related to DNA copy number changes, since similar findings occurred in embryonic and transformed cells.29 Such observations agree with the findings in our present modeling. Only one of these six microRNAs (microR-486) was found to have a predicted target gene (CLDN10, Table 6; TargetScanHuman, the microR-Ontology Database) when the search was performed among the top 100 predicted target genes. Our observations on deregulated microRNAs agreed to 70%–80% with selected sets of microRNAs from published reports.30–33
In conclusion, our present and previous observations indicate thousands of aberrant DNA copy numbers in genome-wide analysis of colon cancer, as expected.6,7,34 These numerous altered segments with potential importance for tumor progression were filtered by means of mathematical interaction analysis to a final group of 17 candidate genes (Dukes D) with hypothetical relevance for tumor progression. Our modeling supports that colon cancer progression is related to genomic instability accompanied by altered gene expression. However, the new information is that carcinogenesis and the early appearance of invasive tumor growth may rather be related to functional genomic alterations and less to DNA copy number changes. Our model may be a tool to accept or reject structural and functional genetic alterations in the appearance and progression of colorectal cancer in small groups of patients.
Disclosures
This manuscript has been read and approved by all authors. This paper is unique and is not under consideration by any other publication and has not been published elsewhere. The authors and peer reviewers of this paper report no conflicts of interest. The authors confirm that they have permission to reproduce any copyrighted material.
