Root-to-shoot ratios of the Katherine and Mt. Isa populations increased under stress treatment, while those of the Petford population decreased; however, these differences were not significant.

RNA sequencing and differential gene expression

In total, 52 million reads were generated from 12 samples. Reads per sample ranged from two to nine million, with an average of 4 million reads per sample. Reads from high-throughput sequencing were analysed with the TopHat package to develop gene models. Reference-guided mapping was used to predict gene models by mapping the reads against the E. grandis reference genome sequence without using the E. grandis annotations. Using the coordinates of the predicted gene models, we identified the E. grandis genes that map to the predicted gene regions of E. camaldulensis.
While several of the predicted E. camaldulensis gene models map to E. grandis gene models, several others did not. We used E. grandis gene names wherever the predicted models mapped to E. grandis models. Where no E. grandis annotation mapped to a predicted gene model, we used gene names with a CUFF prefix. The coordinates of these genes are presented in Additional file 2.

Reference guided transcriptome mapping

Reads from all 12 libraries were mapped against the Eucalyptus reference genome sequence to generate gene annotations using the TopHat and Cufflinks packages. A total of 32,474 transcripts were predicted, including a large number of alternatively spliced transcripts.
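The naming rule described above (an E. grandis gene name where a predicted model maps to an annotated gene, otherwise a CUFF-prefixed identifier) can be illustrated with a minimal Python sketch. This is not the authors' pipeline: the overlap criterion (any shared base on the same scaffold, strand ignored), the `Interval` type, the function names, and all coordinates and gene names below are hypothetical and chosen only for illustration.

```python
from typing import NamedTuple


class Interval(NamedTuple):
    """A genomic feature; all example values below are hypothetical."""
    chrom: str
    start: int  # inclusive
    end: int    # inclusive
    name: str


def overlaps(a: Interval, b: Interval) -> bool:
    """True if the two features share at least one base on the same scaffold."""
    return a.chrom == b.chrom and a.start <= b.end and b.start <= a.end


def name_predicted_models(predicted, reference):
    """Map each predicted model ID to the reference gene names it overlaps.

    Assumed rule: if no reference gene overlaps, the model keeps its own
    CUFF-prefixed identifier.
    """
    names = {}
    for model in predicted:
        hits = [ref.name for ref in reference if overlaps(model, ref)]
        names[model.name] = hits if hits else [model.name]
    return names


# Hypothetical reference annotations and predicted gene models.
reference = [
    Interval("scaffold_1", 100, 500, "Egrandis_gene_A"),
    Interval("scaffold_1", 900, 1500, "Egrandis_gene_B"),
]
predicted = [
    Interval("scaffold_1", 450, 700, "CUFF.1"),    # overlaps gene A
    Interval("scaffold_1", 2000, 2400, "CUFF.2"),  # no overlapping annotation
    Interval("scaffold_2", 100, 500, "CUFF.3"),    # different scaffold
]

assigned = name_predicted_models(predicted, reference)
for model_id, gene_names in assigned.items():
    print(model_id, "->", ", ".join(gene_names))
```

A linear scan over the reference list is adequate for a toy example; for genome-scale annotations an interval index or a sorted sweep would be the more usual choice.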
The identity of the transcripts was investigated by BLAST searches against the Arabidopsis protein database. This analysis revealed 15,538 unique genes among the total transcripts. Read counts mapping to the gene annotations generated by reference-guided transcriptome mapping were used to test for differential expression of genes between control and stress treatments using the edgeR package. Before testing for differential expression, diagnostic tests were performed to check the consistency of the data between the populations. A high correlation in gene expression, as measured by read counts, was observed between the three populations within a given treatment. The Pearson correlation coefficient between the read counts of the three populations before stress treatment ranged from 0.94 to 0.99, and that between the three populations of control plants at the end of the experiment ranged from 0.93 to 0.95. Similarly, in the stress treatment the correlation coefficients between the populations ranged from 0.94 to 0.97. This is further reflected in the clustering analysis. A multidimensional scaling plot of the count data clearly separated the 12 libraries into three groups. The six libraries from the three populations before treatment clustered together.
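The consistency check described above, pairwise Pearson correlation of per-gene read counts between populations within one treatment, can be sketched in a few lines of Python. This is an illustrative sketch only: the text does not state whether raw or normalised counts were used, so raw counts are assumed here, and the count values and function names are hypothetical.

```python
from itertools import combinations
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length numeric sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return sxy / sqrt(sxx * syy)


def pairwise_correlations(counts):
    """Correlate every pair of libraries; keys are sorted (name, name) tuples."""
    return {
        (a, b): pearson(counts[a], counts[b])
        for a, b in combinations(sorted(counts), 2)
    }


# Hypothetical per-gene read counts for one treatment (same gene order in each list).
counts = {
    "Katherine": [10, 200, 35, 0, 80],
    "Mt_Isa": [12, 180, 40, 1, 75],
    "Petford": [9, 210, 30, 0, 90],
}

correlations = pairwise_correlations(counts)
for pair, r in correlations.items():
    print(pair, round(r, 3))
```

Because a few highly expressed genes can dominate a correlation computed on raw counts, a log transform such as log2(count + 1) is often applied first; whether that was done here is not stated.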