1 Department of Biology, Graduate School of Science, Osaka University, 1-1 Machikaneyama, Toyonaka, 560-0043, Japan
2 Kobe Advanced ICT Research Center, National Institute of Information and Communications Technology, 588-2 Iwaoka, Iwaoka-cho, Nishi-ku, Kobe 651-2492, Japan
3 Graduate School of Frontier Biosciences, Osaka University, 1-3 Yamadaoka, Suita, 565-0871, Japan
| Abstract |
|---|
|
|
|---|
| Introduction |
|---|
|
|
|---|
Although the question of codon usage bias has been extensively studied in many organisms, including humans as well as human infectious viruses (e.g. Plotkin & Dushoff 2003; Meintjes & Rodrigo 2005; Ren et al. 2007; van Hemert et al. 2007), unicellular organisms can provide a simpler model system for examining whether codon usage bias is related to cellular metabolism. In Escherichia coli, clear evidence has been obtained for correlation between codon usage preference and tRNA abundance (Ikemura 1981a,b). Also, in the budding yeast Saccharomyces cerevisiae, it has been shown that codon usage bias is correlated with mRNA copy numbers of the gene (Sharp & Cowe 1991; Akashi 2003). Translational efficiency can be directly related to growth rates, and thus it is expected that codon usage has been optimized for growth through evolution in such situations.
Here we report codon usage in another unicellular eukaryote, the fission yeast Schizosaccharomyces pombe, for which the entire genome sequence has been determined (Wood et al. 2002). The usage of synonymous codons has been reported previously for three highly expressed genes and another 153 genes that are expressed at a lower level (Forsburg 1994). We examined codon usage for genes in the S. pombe genome and found a significant difference in codon usage bias with respect to the expression level of the gene. Thus, we propose that the codon usage of a gene can predict its expression level in S. pombe.
| Results |
|---|
|
|
|---|
Expression levels of 4932 open reading frames (ORFs) in the S. pombe genome were determined by DNA microarray analysis (Fig. 1A). Among these ORFs, genes for ribosomal proteins showed high levels of expression (Fig. 1B). We first examined the usage of synonymous codons in the ribosomal protein genes, and found a significant difference compared with the genome average (Table 1). The difference can be seen clearly in two-codon amino acids (Lys, Asn, Phe, His, Glu, Cys, Tyr, Asp and Gln), for which the bias between the two codons in ribosomal proteins was opposite to that in total proteins (Fig. 2). In contrast, codon usage in meiotic proteins was similar to that in total proteins (Table 1; Fig. 2). The fraction of codons with G or C at the third position approximately corresponds to the GC content of the S. pombe genome (38%) in total proteins and in meiotic proteins, but is significantly higher in ribosomal proteins, with the exception of Gln (Fig. 2).
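The third-position comparison for two-codon amino acids can be illustrated with a minimal Python sketch. It is not the procedure used in this study: the function names `count_codons` and `gc3_fraction` are hypothetical, the codon pairs are taken from the standard genetic code, and the input is assumed to be a list of in-frame coding sequences.

```python
from collections import Counter

# Two-codon amino acids named in the text. In each pair the first codon
# ends in A or T and the second ends in G or C (standard genetic code).
TWO_CODON = {
    "Lys": ("AAA", "AAG"),
    "Asn": ("AAT", "AAC"),
    "Phe": ("TTT", "TTC"),
    "His": ("CAT", "CAC"),
    "Glu": ("GAA", "GAG"),
    "Cys": ("TGT", "TGC"),
    "Tyr": ("TAT", "TAC"),
    "Asp": ("GAT", "GAC"),
    "Gln": ("CAA", "CAG"),
}


def count_codons(orfs):
    """Count triplet codons over a list of in-frame coding sequences."""
    counts = Counter()
    for seq in orfs:
        seq = seq.upper().replace("U", "T")
        # Any trailing partial codon is ignored.
        for i in range(0, len(seq) - len(seq) % 3, 3):
            counts[seq[i:i + 3]] += 1
    return counts


def gc3_fraction(counts):
    """For each two-codon amino acid, the fraction of its codons ending in G or C.

    Returns None for an amino acid that does not occur in the input.
    """
    result = {}
    for aa, (at_codon, gc_codon) in TWO_CODON.items():
        total = counts[at_codon] + counts[gc_codon]
        result[aa] = counts[gc_codon] / total if total else None
    return result
```

Applying `gc3_fraction` separately to the codon counts of ribosomal, meiotic and total protein genes would give per-amino-acid fractions that can be compared with the genomic GC content.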
To examine whether codon usage bias is related to the level of gene expression, we compared codon usage in selected genes at seven levels of expression. We found that codon usage varied according to the level of gene expression (Table 2, Figs 3–5), and that codon usage in ribosomal proteins was characteristic of that in highly expressed genes.
Codon usage profiles in other organisms
The codon usage of ribosomal protein genes was compared with the genomic average in Saccharomyces cerevisiae, Arabidopsis thaliana, Caenorhabditis elegans, Oryza sativa, Drosophila melanogaster, Takifugu rubripes and Homo sapiens (Supporting Table S1). Codon usage of the ribosomal proteins in Saccharomyces cerevisiae, A. thaliana and C. elegans was significantly different from the genomic average, which is especially clear in two-codon amino acids. In contrast, no significant difference in codon usage was observed between the ribosomal protein genes and the genomic average in O. sativa, D. melanogaster, T. rubripes and H. sapiens (Fig. 6). Codon usage of two-codon amino acids in ribosomal proteins was plotted against that in total proteins. For O. sativa, D. melanogaster, T. rubripes and H. sapiens, the points lay along the diagonal line (Fig. 6A), indicating that these two groups of proteins have a similar usage of codons. In Saccharomyces cerevisiae, A. thaliana and C. elegans, the points were scattered away from the diagonal line (Fig. 6B), indicating that codon usage of ribosomal proteins differs from that of total proteins, as in S. pombe. However, a recent report shows that synonymous codon usage bias is only moderately correlated with expression levels of the gene in A. thaliana (dos Reis & Wernisch 2009). This apparent discrepancy may arise because our results are based only on ribosomal protein genes. We examined ribosomal protein genes as representatives of highly expressed genes to avoid possible complexity associated with tissue-specific expression levels in multicellular organisms.
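One simple way to summarize how far such a plot departs from the line y = x is the mean absolute difference between the paired usage values. The Python sketch below is illustrative only and is not the statistic used in this study. The function name `deviation_from_identity` is hypothetical, and each input is assumed to map an amino acid to the fraction of its codons ending in G or C.

```python
def deviation_from_identity(ribosomal, total):
    """Mean absolute difference between paired usage fractions.

    A value of 0 means every point lies exactly on the line y = x;
    larger values mean the two gene groups use codons more differently.
    """
    shared = [aa for aa in ribosomal if aa in total]
    if not shared:
        raise ValueError("no amino acid is present in both inputs")
    return sum(abs(ribosomal[aa] - total[aa]) for aa in shared) / len(shared)
```

A summary of this kind does not replace a significance test. It only gives a single number per organism that can be ranked alongside the plots.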
| Discussion |
|---|
|
|
|---|
From these results, we formulated a set of coefficients, one for each codon, to determine the type of codon usage of a gene of interest (Table 3). The formula yields a codon usage score, which serves as a criterion for distinguishing two types of codon usage: the high-expression type and the low-expression type (above or below 32 copies of mRNA, respectively) (see Experimental procedures). It should be noted that the formula does not predict the expression level of the gene, but instead predicts its type of codon usage. Examples for two S. pombe genes, cdc2 and act1, are shown in Table 3. Positive and negative values represent the high- and low-expression types of codon usage, respectively.
Mutations causing amino acid substitutions can directly affect the function of the protein, and thus have been effectively selected through evolution. In contrast, mutations that change only codon usage may be less effectively selected. However, in unicellular organisms such as yeasts, translational efficiency can be directly related to growth rate, and thus optimization of codon usage is expected to occur as a consequence of evolutionary pressure.
| Experimental procedures |
|---|
|
|
|---|
Codon usages in the genome of the species examined were obtained from the web site "Codon Usage Database", http://www.kazusa.or.jp/codon/ (Nakamura et al. 2000). Genes for ribosomal proteins were obtained from the web site "Ribosomal Protein Gene Database", http://ribosome.med.miyazaki-u.ac.jp/ (Nakao et al. 2004).
Analysis of codon usage and gene expression in Schizosaccharomyces pombe
Levels of gene expression in S. pombe were determined using a DNA microarray containing 4932 ORFs (Chikashige et al. 2007). Relative amounts of mRNA for each ORF were measured by DNA microarray using genomic DNA as a reference, and the copy number of mRNA was calculated using an estimate of 100 000 copies for the total number of mRNA molecules in the cell. The complete DNA microarray data were deposited in the Gene Expression Omnibus (GEO, <http://www.ncbi.nlm.nih.gov/geo/index.cgi>; accession number GSE13554). Expression levels of the ORFs are available through the web site: <http://www2.nict.go.jp/w/w103/w131103/CellMagic/index.html>.
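The conversion from relative signal to copy number can be sketched as follows. This is a minimal illustration under a stated assumption: each ORF's share of the summed relative signal is taken to equal its share of the 100 000 mRNA molecules per cell. The text gives only the total, so the exact normalization may differ, and the function name `estimate_copy_numbers` is hypothetical.

```python
TOTAL_MRNA_PER_CELL = 100_000  # estimate of total mRNA copies stated in the text


def estimate_copy_numbers(relative_signal, total=TOTAL_MRNA_PER_CELL):
    """Scale relative mRNA signals so that they sum to `total` copies per cell.

    `relative_signal` maps an ORF name to its mRNA signal relative to the
    genomic DNA reference. Proportional scaling is an assumption.
    """
    signal_sum = sum(relative_signal.values())
    if signal_sum <= 0:
        raise ValueError("relative signals must sum to a positive value")
    return {orf: total * s / signal_sum for orf, s in relative_signal.items()}
```

With copy numbers in hand, ORFs can be sorted and binned into expression groups such as those averaging 256, 128, 64, 32, 16, 8 and 4 copies.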
The thirty ORFs with the highest expression levels express an average of 256 copies of mRNA, and contain a total of 7326 codons. Other groups of ORFs that express an average of 128, 64, 32, 16, 8 and 4 copies of mRNA were selected such that each group contained at least 7000 codons or 20 ORFs (7263 codons of 34 ORFs, 7291 codons of 20 ORFs, 7234 codons of 20 ORFs, 12 182 codons of 20 ORFs, 10 964 codons of 20 ORFs, and 10 075 codons of 20 ORFs, respectively). For S. pombe ribosomal proteins, 141 ORFs containing 233 320 codons were selected. For S. pombe meiotic proteins, 18 ORFs containing 9039 codons were selected. The selected genes are listed in Supporting Table S2.
Analysis of codon usage in other organisms
Genes for ribosomal proteins were obtained in "Ribosomal Protein Gene Database", <http://ribosome.med.miyazaki-u.ac.jp/>, and codon usage of these genes was examined by the Countcodon program in "Codon Usage Database", <http://www.kazusa.or.jp/codon/>.
Formulation for codon usage score
Codon usage score (S) was defined as an average of expression coefficients over all codons contained in the gene of interest:
$$S = \frac{1}{N}\sum_{r=1}^{N} \mathit{Ec}_r \qquad (1)$$
Here N is the number of codons in the gene, Ec is the expression coefficient of each triplet codon as listed in Table 3, and the suffix r is the position of the codon in the gene, running from 1 to N. The expression coefficients were obtained by subtracting, for each triplet codon, the codon usage of the group expressing 32 copies of mRNA from that of the group expressing 256 copies, and were normalized so that SD32 equals unity, where SD32 is the standard deviation of the S-value for the group of genes expressing 32 copies of mRNA. Positive and negative signs of the codon usage score indicate the high-expression type (above 32 copies) and the low-expression type (below 32 copies) of codon usage, respectively; its absolute value is expressed in multiples of SD32, which provides a measure of the likelihood of the prediction.
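The calculation described above can be sketched in Python. This is a minimal illustration rather than the authors' implementation. It assumes that codon usage is expressed as the fraction of all codons in a group and that SD32 is a population standard deviation; neither detail is stated in the text. All function names are hypothetical, and real coefficients should be taken from Table 3.

```python
from collections import Counter
from statistics import pstdev


def split_codons(seq):
    """Split an in-frame coding sequence into triplet codons."""
    seq = seq.upper().replace("U", "T")
    return [seq[i:i + 3] for i in range(0, len(seq) - len(seq) % 3, 3)]


def group_usage(orfs):
    """Fraction of each codon among all codons of a group of ORFs (assumed unit)."""
    counts = Counter(c for seq in orfs for c in split_codons(seq))
    total = sum(counts.values())
    return {codon: n / total for codon, n in counts.items()}


def codon_usage_score(seq, coeff):
    """S: the average of the expression coefficients Ec over the N codons of a gene."""
    codons = split_codons(seq)
    if not codons:
        raise ValueError("sequence contains no complete codon")
    return sum(coeff.get(c, 0.0) for c in codons) / len(codons)


def expression_coefficients(high_orfs, ref_orfs):
    """Ec per codon: usage in the 256-copy group minus usage in the 32-copy group,
    rescaled so that the SD of S over the 32-copy group (SD32) equals 1."""
    u_high, u_ref = group_usage(high_orfs), group_usage(ref_orfs)
    raw = {c: u_high.get(c, 0.0) - u_ref.get(c, 0.0)
           for c in set(u_high) | set(u_ref)}
    # S is linear in the coefficients, so dividing them by SD32 sets SD32 to 1.
    sd32 = pstdev([codon_usage_score(seq, raw) for seq in ref_orfs])
    if sd32 == 0:
        raise ValueError("reference group scores have zero variance")
    return {c: v / sd32 for c, v in raw.items()}
```

Given coefficients built this way, `codon_usage_score(gene, coeff)` returns a signed value whose sign indicates the type of codon usage and whose magnitude is in units of SD32.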
| Acknowledgements |
|---|
| Footnotes |
|---|
* Correspondence: hiraoka{at}fbs.osaka-u.ac.jp
| References |
|---|
|
|
|---|
Akashi, H. (2003) Translational selection and yeast proteome evolution. Genetics 164, 1291–1303.
Andersson, S.G. & Kurland, C.G. (1990) Codon preferences in free-living microorganisms. Microbiol. Rev. 54, 198–210.
Chikashige, Y., Tsutsumi, C., Okamasa, K., Yamane, M., Nakayama, J., Niwa, O., Haraguchi, T. & Hiraoka, Y. (2007) Gene expression and distribution of Swi6 in partial aneuploids of the fission yeast Schizosaccharomyces pombe. Cell Struct. Funct. 32, 149–161.
Forsburg, S.L. (1994) Codon usage table for Schizosaccharomyces pombe. Yeast 10, 1045–1047.
van Hemert, F.J., Berkhout, B. & Lukashov, V.V. (2007) Host-related nucleotide composition and codon usage as driving forces in the recent evolution of the Astroviridae. Virology 361, 447–454.
Ikemura, T. (1981a) Correlation between the abundance of Escherichia coli transfer RNAs and the occurrence of the respective codons in its protein genes. J. Mol. Biol. 146, 1–21.
Ikemura, T. (1981b) Correlation between the abundance of Escherichia coli transfer RNAs and the occurrence of the respective codons in its protein genes: a proposal for a synonymous codon choice that is optimal for the E. coli translational system. J. Mol. Biol. 151, 389–409.
Ikemura, T. (1985) Codon usage and tRNA content in unicellular and multicellular organisms. Mol. Biol. Evol. 2, 13–34.
Kanaya, S., Yamada, Y., Kinouchi, M., Kudo, Y. & Ikemura, T. (2001) Codon usage and tRNA genes in eukaryotes: correlation of codon usage diversity with translation efficiency and with CG-dinucleotide usage as assessed by multivariate analysis. J. Mol. Evol. 53, 290–298.
Meintjes, P.L. & Rodrigo, A.G. (2005) Evolution of relative synonymous codon usage in Human Immunodeficiency Virus type-1. J. Bioinform. Comput. Biol. 3, 157–168.
Nakamura, Y., Gojobori, T. & Ikemura, T. (2000) Codon usage tabulated from international DNA sequence databases: status for the year 2000. Nucleic Acids Res. 28, 292.
Nakao, A., Yoshihama, M. & Kenmochi, N. (2004) RPG: the Ribosomal Protein Gene database. Nucleic Acids Res. 32, D168–D170.
Plotkin, J.B. & Dushoff, J. (2003) Codon bias and frequency-dependent selection on the hemagglutinin epitopes of influenza A virus. Proc. Natl. Acad. Sci. USA 100, 7152–7157.
dos Reis, M. & Wernisch, L. (2009) Estimating translational selection in Eukaryotic genomes. Mol. Biol. Evol. 26, 451–461.
Ren, L., Gao, G., Zhao, D., Ding, M., Luo, J. & Deng, H. (2007) Developmental stage related patterns of codon usage and genomic GC content: searching for evolutionary fingerprints with models of stem cell differentiation. Genome Biol. 8, R35.
Sharp, P.M. & Cowe, E. (1991) Synonymous codon usage in Saccharomyces cerevisiae. Yeast 7, 657–658.
Sharp, P.M., Stenico, M., Peden, J.F. & Lloyd, A.T. (1993) Codon usage: mutational bias, translational selection, or both? Biochem. Soc. Trans. 21, 835–841.
Wood, V., Gwilliam, R., Rajandream, M.A. et al. (2002) The genome sequence of Schizosaccharomyces pombe. Nature 415, 871–880.
Accepted: 15 January 2009