Abstract
Introduction
Drought is a severe environment stressor that affects both the productivity and growth of plants. 1 Drought tolerance is a complex trait, and the inability to appropriately respond to drought stress generally leads to loss of water in plant cells, decreasing turgor pressure, and a significant rise in the osmolarity of leaf sap in affected plants. 2 Plants have evolved to adapt to drought stress using multiple strategies including changes of various morphological and physiological traits: leaf structure, root morphology, extension growth, etc.3,4 Previous studies reported that drought tolerance is associated with hundreds of genes and pathways, including regulation of the hormone abscisic acid (ABA), signal transduction, protein metabolism, and carbohydrate metabolism.4,5 Dehydration and water stress usually stimulate the production of ABA, which further induces various genes and pathways. 6 Genes involved in drought stress have been identified through transcriptomic, proteomic, metabolomic, and epigenetic levels. 7 However, the molecular basis of drought stress, especially regarding signal transduction and stress adaptation, remains unclear.
By taking advantage of next-generation sequencing, studies have broadened their scope from a few genes to the whole genome and from model species to nonmodel species. Analysis of the whole transcriptome enables us to better understand expression-level changes in response to environmental perturbation. Transcriptomes of nonmodel organisms via de novo transcriptome assembly has been reported for numerous plants.8–12 Similarly, progress has been made in the study of drought resistance and drought endurance by transcriptomic analysis. In
Materials and Methods
Sample preparation and sequencing
The seeds of
For Illumina sequencing, total RNA was extracted according to Cetyltrimethyl Ammonium Bromide procedure. 19 The RNA quality and quantity were determined using an Agilent 2100 Bioanalyzer (Agilent Technologies, Santa Clara, CA, USA). Fragmentation buffer (Ambion, Austin, TX, USA) was added to produce short mRNA fragments. After fragmentation, cDNA was synthesized. The cDNA library was constructed based on poly-A–enriched RNA method, and 200–300 bp fragments were selected for non-directional, 100-nucleotide, paired-end sequencing according to Illumina protocols (San Diego, CA, USA). With random hexamer primers, the first strand was synthesized and then the second strand. The short fragments were purified with the QIAquick PCR Purification kit (Qiagen, Valencia, CA, USA) for both end repair and poly-A addition reaction. Finally, the purified DNA libraries were amplified by polymerase chain reaction and sequenced by Illumina HiSeq™ 2500 (Encode Genomics Bio-Technology Company, China).
De novo assembly and clustering
In-house
Homology search and functional annotation
All unigenes were annotated using BLASTx by searching against three commonly used databases: the National centre for biotechnology information (NCBI) nonredundant (NR) database, the Swiss-Prot database, and the KOG (Eukaryotic ortholog groups) database. The cut-off
Differential gene expression analysis
All raw reads were mapped to
Single-sequence repeat mining and single-nucleotide polymorphism identification
We predicted microsatellites using MISA (MIcroSAtellite identification tool; available at http://pgrc.ipk-gatersleben.de/misa/). The criteria used for the single-sequence repeat (SSR) detection is unigenes containing 2–6 bp motifs and the minimum number of repeats are as follows: seven for dinucleotide motifs and five for trinucleotide, tetranucleotide, pentanucleotide, and hexanucleotide motifs.
We used the SOAP Program 28 to align the raw reads with the assembled reference sequence and used the SOAPsnp program to identify potential single-nucleotide polymorphisms (SNPs). To reduce the possibility of false-positive hits, we filtered out the reads with quality values <20 or read depth below 10x.
Results and Discussion
Reads generation and de novo assembly
Characteristics of clustered contigs.

length distributions of all-unigenes (A) and the characterization of unigenes against known public databases (B). The majority of contig length fall in the range of 201–600 bp. Altogether 95,586 unigenes (59.52%) had hits in the three public databases.
Characterization of unigenes
All unigenes were annotated against the NCBI NR database, Eukaryotic ortholog groups (KOG) database, and Swiss-Prot protein database. With a cutoff of
Functional classification
All unigenes were assigned with the GO classification by Blast2go, and then visualized using WEGO. A total of 9,349 unigenes were categorized into 38 functional subgroups (Supplementary Fig. S4), including Molecular Function (86,763), Biological Process (206,944), and Cellular Components (94,459). In general, the groups of “cell” (87.2%) and “cell part” (87.2%) were the most common. For the Cellular Components category, the cell and cell part were the most highly represented categories. For the Molecular Function category, genes associated with binding (45.5%) and catalytic proteins (42.3%) have the highest fractions. Under the Biological Process category, proteins related to cellular processes (66.4%) and metabolic processes (57.8%) were most frequent.
Identification of differentially expressed genes
We computed the RPKM values for each unigene using the GFOLD algorithm and detected 20,666 differentially expressed unigenes between control and drought conditions after excluding the unigenes with zero GFOLD values (Supplementary Table S1). Then, DEGs were identified with Fold Change ≥1 between two conditions and FDR ≤0.001. Altogether 1416 DEGs unigenes were identified, with 871 of them upregulated and 545 downregulated. GO enrichment was used to identify the major function groups of DEGs affected by drought stress, and 38 overrepresented groups in the GO classification were found (Fig. 2). Among those groups, we are particularly interested in the “response to stimulus” (40.3%) category under “biological process”, because the most drought-related genes may be contained in this group. Genes associated with stimulus comprised the “response to stress” and “response to abiotic stimulus” with fractions of 28.9% and 22.0%, respectively. Interestingly, we also found genes associated with “response to water deprivation” that were expressed differentially between the control and drought conditions (Fig. 3A). Those genes contained the putative conserved domains of DNA-binding domains, Annexin, NAD(P)+-dependent aldehyde dehydrogenase, lysozyme-like domain and WAX2 C-terminal.
GO classifications of differentially expressed unigenes. The differential unigenes are summarized into three main categories: biological process, cellular components, and molecular function. In total, 614 unigenes were assigned to gene ontologies.
For the KEGG analysis, 462 differentially expressed genes obtained hits in KEGG database, and those genes were associated with 140 enzyme classes and 341 KEGG pathways. These enzymes were further categorized into oxidoreductases, transferases, hydrolases, lyases, isomerases, and ligases, of which oxidoreductases (28.57%) and transferases (28.05%) were found to dominate (Supplementary Fig. S5). In addition, the top 12 KEGG pathways were shown in Figure 3B. Of the top 12 KEGG pathways, genes associated with metabolic pathways (33.24%) were most representative, followed by biosynthesis of secondary metabolites (19.2%). Those pathways include plant hormone signal transduction (ko04075), glycolysis/gluconeogenesis (ko00010), glyoxylate and dicarboxylate metabolism (ko00630), photosynthesis (ko00195), oxidative phosphorylation (ko00190), and DNA replication (ko03030).
The heatmap of genes in related to response to water deprivation (A) and The top 12 pathways assignment based on KEGG (B).
Using the criterion of the absolute value of RPKM ratio >1,000, 59 differentially expressed unigenes were retained. Among those genes, 54 showed overexpression and 5 decreased expression (Fig. 4A). Blast searches were conducted against NCBI and TAIR (Supplementary Table S2, Fig. 4B). We obtained 15 homologs from The regulations of genes expression (A) and the distribution of 59 significant differential expression genes (B). In the picture (A), different colors were used to refer the level (RPKM ratio) of DEGs. Genes with red color and yellow color upregulated and genes with purple color and blue color downregulated. Genes with significant up- or downregulation have the absolute value of RPKM ratio >2,000. Especially, genes with yellow color showed significant upregulation while genes with blue color showed significant downregulation. (Genes with zero expression are not shown in this picture due to the insignificance of ln0).
Regulation of plant hormone pathways genes
Plant hormones are known to play a vital role for plant growth and development including drought resistance. They are able to regulate many important processes, such as cell enlargement, shoot initiation, stem growth, stomata closure, and cell division (Fig. 5). Auxin, a class of plant growth hormones, promotes plant growth against a wide range of stresses including drought.29,30 In 
We also identified differentially expressed genes involved in the brassinosteroid and jasmonic acid signal transduction pathways. Some of them are upregulated under drought condition, including
Notably, plants have different modulation of hormones in response for plant abiotic stress. Gibberellin, cytokinin, ethylene, and salicylic acid are involved in regulating several aspects of plant growth and development.36–38 In the gibberellin signal transduction pathway, the
ABA, a plant growth regulator, is involved in the regulation of the final stage of plant development and accelerates the yellowing of both attached and detached leaves.
47
Its function can be enhanced by drought and could reduce stomatal closure.
48
In the early phase of drought, stomata close in response to water deficiency. But as drought conditions persist, the stoma reopens passively.
49
In this study, we found that ABA receptor family PYL is upregulated. Phosphatase 2C (
In summary, we found a number of genes associated with various hormone pathways, including
Metabolism associated drought resistance.
Under drought conditions, plant cells could accumulate a variety of small organic metabolites including sugars, amino acids, and related compounds.51,52 Soluble sugars are known to regulate osmotic pressure of plants, such as starch and sucrose.53,54 Drought exposure induces the hydrolyzation of polysaccharides and accumulation of soluble sugars and causes a decrease in leaf osmotic potential.4,55 We identified six genes associated with metabolism of starch and sucrose (Supplementary Fig. S6) in this study. Regarding sucrose degradation, two genes, encoding the β-fructofuranosidase sacA and catalyzing the conversions of sucrose-6P into α-D-Glucose-6P, increased their expression and facilitated the conversions of sucrose into β-D-Fructose. Regarding starch degradation, one unigene encoded beta-glucosidase bglX, which converts cellulose into cellobiose, and another gene encoded endoglucanase, may be involved the conversion of cellobiose into β-D-Glucose. Both of them are upregulated at the transcriptome level. In addition, genes which encoded sucrose synthase could also promote the conversion of Uridine Diphosphate-glucose into sucrose. Those results suggest that polysaccharide and disaccharide metabolism tend toward degradative processes under drought conditions to increase substance solubility and weaken osmotic pressure.
Proline can be functional as a mediator of osmotic adjustment, a stabilizer of macromolecules, a compatible solute to protect enzymes, and as a store for carbon and nitrogen for use during water-deficit and other stress regimes.58,59 In
The oxidative phosphorylation and photosynthesis
Detoxification pathways could also enhance plant drought tolerance.
62
Variation of drought tolerance under similar conditions has been reported in different plants and cultivars of crops, emphasizing the significance in drought adaptation. We found 20 genes associated with oxidative phosphorylation and photosynthesis that showed obvious changes in expression level (Fig. 6 and Supplementary Fig. S8). Seven genes associated with oxidative phosphorylation increased expression under drought treatment. Eleven photosynthesis genes demonstrated increase in expression, and two genes, encoding photosystem II oxygen-evolving enhancer protein 2 (psbP) and photosystem II PsbW protein (psbW), decreased expression. The predicted explanation was that unless simultaneous and proportionate reductions in growth and carbon consumption take place, a negative carbon balance can occur as a result of diminished photosynthetic capacity during drought. Similar to 
Under drought conditions, the fraction of carbohydrate lost through respiration determines the overall metabolic efficiency of the plant.
65
Photosynthesis fuels plant metabolism by converting the light energy into carbohydrate molecules. We hypothesized that the increased respiration and photosynthesis provide sufficient energy for growth during drought adaption and tolerance. When lacking water,
SSR mining and SNP detecting
Single sequence repeats (SSRs) identification.
Single nucleotide polymorphisms (SNPS) statistics.
Conclusions
To the best of our knowledge, this study represents the first attempt to identify possible genes associated with drought using transcriptome sequencing of
Author Contributions
Conceived and designed the experiments: BT, HM. Analyzsed the data: ZZ, YZ, YC. Contributed to the writing of the manuscript: ZZ, KL, ZX, BT, LW. Agreed with manuscript results and conclusions: BT. All authors reviewed and approved the final manuscript.
Supplementary Materials
Supplementary Table S1
Gene expression level measured by GFOLD.
Supplementary Table S2
Information of 59 significant differential expression genes.
Supplementary Table S3
Regulation of plant hormone pathways genes.
Supplementary Table S4
SSRs results.
Supplementary Table S5
SNP results.
Supplementary Figure S1
The quality distribution of two samples.
Supplementary Figure S2
Reads of filtering steps. After trimming and quality control, 56,782,314 and 46,561,748 reads were remained for each sample.
Supplementary Figure S3
Eukaryotic ortholog group (KOG) classification. In total, 24,774 sequences were grouped into 24 COG classifications.
Supplementary Figure S4
GO classifications of assembled unigenes. The unigenes are summarized into three main categories: biological process, cellular location, and molecular function. In total, 9,349 unigenes were assigned with GO number.
Supplementary Figure S5
Enzyme assignment based on KEGG pathways. Oxidoreductases and transferases were dominated.
Supplementary Figure S6
The pathway map of starch and sucrose metabolism. The identified genes involved in starch and sucrose metabolism are marked with green.56,57
Supplementary Figure S7
The pathway map of proline metabolism. The identified genes involved in proline metabolism are marked with green.56,57
Supplementary Figure S8
The pathway map of photosynthesis. The identified genes involved in photosynthesis are marked with green.56,57
