1 Kobe Advanced ICT Research Center, National Institute of Information and Communications Technology, Kobe 651-2492, Japan
2 Graduate School of Frontier Biosciences, Osaka University, 1-3 Yamadaoka, Suita 565-0871, Japan
| Abstract |
|---|
| Introduction |
|---|
Towards this end, GFP-fusion libraries have been constructed for several organisms. In the days before whole genome sequences were available, large-scale GFP-fusion libraries were constructed by fusing random fragments of genomic DNA or cDNA, and screened by microscopic observation (Niedenthal et al. 1996; Sawin & Nurse 1996; Ding et al. 2000; Simpson et al. 2000). Today, complete genome sequences have been determined for many organisms, thereby allowing the application of high-throughput approaches to the study of molecular dynamics on a genome-wide basis.
For the budding yeast Saccharomyces cerevisiae, following several studies that used epitope tagging and immunolocalization, a C-terminal GFP-fusion library of 4156 proteins, covering 75% of the proteome, was constructed and analyzed (Ross-Macdonald et al. 1999; Kumar et al. 2002; Huh et al. 2003). A previous large-scale analysis in the fission yeast Schizosaccharomyces pombe involved the construction of a plasmid library in which random genomic DNA fragments were cloned into a multicopy plasmid to express partial ORFs fused with GFP under the control of the authentic promoter (Ding et al. 2000). A disadvantage of that library was that truncation of the ORFs can alter the localization of their products. Later, a yellow fluorescent protein (YFP) library was constructed in which full-length coding sequences were fused with the YFP coding sequence at their 3'-end and integrated into an ectopic locus (leu1) on the chromosome, so that a YFP-fusion protein was expressed under the control of the nmt1 promoter in addition to the respective native gene. This YFP library covered almost 90% of the proteome (Matsuyama et al. 2006) and, because the fusion construct is expressed from the inducible nmt1 promoter, allowed intracellular localization of gene products that are not normally expressed. At the same time, however, the nmt1 promoter can lead to over-expression or non-physiological expression of the fusion proteins, which might result in mislocalization.
To complement these existing S. pombe libraries, we constructed a library in which the GFP-HA coding sequence was integrated at the 3'-end of selected genes on the chromosome to allow expression of full-length GFP-fusions under the control of their original promoters. We selected 1630 genes for tagging with GFP; the selection was biased towards genes related to nuclear structure and cell cycle control. For these 1630 genes, we obtained 1058 strains in which the GFP gene is fused to the chromosomal gene, with chromosomal integration confirmed by PCR. Sufficient levels of GFP signal were detected in 710 strains. These 710 strains were classified into eight categories based on the localization of their GFP fusion products: nucleus, nuclear dots, nucleolus, nuclear membranes, cytoplasmic membranes, microtubule/spindle pole body (SPB), cytoplasm and cytoplasmic structures. We further quantified the abundance of the nuclear proteins and compared this with the expression levels of their respective mRNAs. In summary, our approach has provided useful complementary information regarding cellular protein localization and abundance, as well as a large collection of GFP-tagged strains.
| Results |
|---|
We selected 1630 genes for GFP tagging, focusing on those believed to be related to nuclear structures and cell cycle regulation. Genes, both characterized and uncharacterized, were selected from three sources. One hundred and fifty-six genes were selected from a truncated GFP library we previously constructed (Ding et al. 2000); 925 genes were selected based on predicted function and localization from the fission yeast genome project (S. pombe Gene DB: <http://www.genedb.org/genedb/pombe/index.jsp>); and, in addition, 37 meiosis-specific genes and 74 genes with coiled-coil domains were selected. We also selected 437 genes based on localization data from a fission yeast YFP-fusion library (Matsuyama et al. 2006).
To obtain high-efficiency integration of the GFP tag, we employed a two-step PCR method which produced a marker cassette with long flanking homologous regions for integration into the genome (Wach 1996) (Fig. 1). We synthesized sets of four specific oligonucleotide primers for 1630 ORFs (F1-R1 and F2-R2 pairs of primers, Fig. 1A) to make the GFP-tagged fusion library. In the first PCR step, using the F1-R1 and F2-R2 pairs of primers and genomic DNA as a template, two PCR products of approximately 250 bp that share homology with the regions before and after the stop codon of an ORF were produced. These two PCR products were then used as primers for the second PCR, with plasmid pAH90, which contains the GFP-HA sequence and the kanamycin resistance marker gene, as the template (see Experimental procedures). The second PCR step produced a 3.1 kbp cassette containing the GFP-HA and the selection marker gene sandwiched between the two 250 bp genomic DNA fragments (Fig. 1B). After purification, the second PCR products were transformed into an S. pombe h90 strain (AY160-14D) (Fig. 1C). Colonies grown on YES-G418 plates were selected for observation. Integration was checked by genomic PCR with a specific primer (primer c in Fig. 1A) and a primer that annealed to the GFP sequence.
Efficiency of GFP tagging
We obtained high efficiency of PCR amplification: 1580 of the desired 1630 genomic DNA fragment pairs were amplified in the first PCR step, and 1537 PCR products were then obtained from the second PCR step. Thus, the success rate of the PCR amplification was 94% (Table 1). From the transformation we obtained 1468 clones of which 1058 clones possessed the GFP-HA cassette integrated at the desired locations as confirmed by genomic PCR. In total, the efficiency of the library generation was 65% (Table 1).
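The percentages quoted above can be re-derived from the counts given in the text. The short Python sketch below does only that arithmetic; the variable names are illustrative, and because Table 1 is not reproduced here, the choice of denominators is inferred from the wording of the paragraph rather than taken from the table.

```python
# Counts as reported in the text; names are illustrative only.
genes_selected = 1630   # ORFs chosen for tagging
first_pcr_ok = 1580     # fragment pairs amplified in the first PCR
second_pcr_ok = 1537    # cassettes obtained from the second PCR
transformants = 1468    # clones obtained after transformation
confirmed = 1058        # clones with the cassette at the desired locus

def percent(part: int, whole: int) -> int:
    """Return part/whole as a percentage rounded to the nearest integer."""
    return round(100 * part / whole)

pcr_success = percent(second_pcr_ok, genes_selected)   # PCR amplification
integration = percent(confirmed, transformants)        # correct integration
overall = percent(confirmed, genes_selected)           # whole pipeline

print(f"PCR success: {pcr_success}%")
print(f"Integration among transformants: {integration}%")
print(f"Overall library efficiency: {overall}%")
```

Under these assumptions the three values come out at 94%, 72% and 65%, matching the figures cited in the Results and Discussion.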
Localization categories
We observed each of the transformants by fluorescence microscopy and categorized them according to the localization of their GFP signals. In 710 strains, GFP signals with specific subcellular localizations were observed (Fig. 2). In the other 348 strains, only weak or background-level GFP signals could be observed. Expression of these tagged proteins under the conditions used for screening was probably too low for their localization to be detected by fluorescence microscopy.
Table 2 summarizes the number of proteins in each category. As we intended, this library is biased toward nuclear proteins. Combining all of the proteins related to nuclear organization, we obtained 473 proteins, which account for approximately 10% of the proteome of S. pombe. Detailed information is available from our website <http://www-karc.nict.go.jp/w131103/CellMagic/GFP-lib-New/indexGFP.html>.
We designed the library such that the coding sequence of GFP-HA is integrated at the 3'-end of the chromosomal ORF and the full-length GFP fusion is expressed under the control of its own promoter. This construction allows us to detect the abundance of the various target proteins using the same epitope-antibody interaction and quantitative immunoblot analysis. We thus quantified the protein abundances of 449 nuclear proteins in vegetatively growing cells. Figure 3A shows an image of an immunoblot. The bands were quantified using an Image-Pro analyzer; the amount of tagged protein in 100 µg of cell extract and that per cell were calculated from the intensity of standard proteins loaded on the same gel (see Experimental procedures; Fig. 3A,B). The number of protein molecules per mRNA molecule was calculated using previously reported expression levels for each ORF (accession number GSE13554, <http://www.ncbi.nlm.nih.gov/geo/index.cgi>) (Fig. 3C). All the data obtained are shown in Supporting Table S1.
Combining the mRNA expression data with the protein abundance data obtained in this study, the distribution of protein molecules per mRNA was calculated (Fig. 3C). Whereas the number of protein molecules per mRNA ranged from 1 to 7000, 86% of the values were below 1000 protein molecules per mRNA. There was a single peak at approximately 180–320 protein molecules per mRNA, and the average was approximately 237 protein molecules per mRNA, a value that falls within the most populated bin. This suggests that, for most genes, the number of protein molecules correlates with the abundance of mRNA in a cell.
Relationship between protein abundance, mRNA expression and localization
We next compared the number of protein molecules and mRNA molecules per cell for the four sub-nuclear categories (Fig. 4). In three of the four categories, namely the nucleus (NU), nuclear dots (ND) and nuclear membranes (NM), we found a similar distribution pattern of protein abundance versus mRNA. In contrast, the nucleolus (NO) category showed higher levels of mRNA expression and protein abundance (Fig. 4). The average number of mRNA transcripts in the nucleolus category was 27.3 per cell. In the other categories, the averages were 14.5 (NU), 11.1 (ND) and 13.6 (NM). The average numbers of protein molecules per cell were 4824.7 (NU), 3500 (ND), 3874.2 (NM) and 7505.8 (NO). Thus, both protein and mRNA in the nucleolus category were present at higher levels than in the other categories. Most of the proteins that were localized to the nucleolus are involved in RNA metabolism. This result suggests that proteins that function in RNA metabolism are more abundant in a cell. Interestingly, the distribution of the amount of protein is less broad than that of the mRNA. This tendency is most pronounced in the nucleolus category, with protein molecules kept at a relatively constant level regardless of the level of mRNA expression (Fig. 4).
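As a rough illustration of the comparison above, the category averages quoted in the text can be tabulated and divided to give an approximate protein-per-mRNA figure for each category. This Python sketch uses only those reported averages. The ratio of two category averages is not the same quantity as the per-gene average shown in Fig. 3C, so the output should be read as an approximation under that assumption, not as a result of the study.

```python
# Category averages per cell, as quoted in the text.
avg_mrna = {"NU": 14.5, "ND": 11.1, "NM": 13.6, "NO": 27.3}
avg_protein = {"NU": 4824.7, "ND": 3500.0, "NM": 3874.2, "NO": 7505.8}

# Ratio of averages (an approximation, not the per-gene mean ratio).
protein_per_mrna = {
    category: avg_protein[category] / avg_mrna[category]
    for category in avg_mrna
}

for category in ("NU", "ND", "NM", "NO"):
    print(
        f"{category}: {avg_mrna[category]:5.1f} mRNA, "
        f"{avg_protein[category]:7.1f} protein, "
        f"ratio {protein_per_mrna[category]:6.1f}"
    )
```

With these inputs the nucleolus category has the highest average mRNA and protein counts, while its ratio of averages is of the same order as in the other three categories.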
| Discussion |
|---|
We generated 1058 strains with chromosomal GFP-HA tags, including 473 strains for nuclear-localized proteins, covering both characterized and uncharacterized proteins. Proteins localized in the SPB, microtubules and other cytoplasmic structures were also included in this library. This is the first large-scale chromosomally-tagged GFP library in S. pombe. In these strains, the GFP-HA coding sequence is inserted at the 3'-end of the target ORF and thus the fusion protein is expressed under the control of the native promoter. One of the biggest advantages of this library is that the tagged full-length protein is expressed under physiological conditions, for example during mitosis, meiosis and the stress response. Because the native gene is replaced by the fusion gene, viability of a strain in which an essential gene is tagged indicates that the fusion protein is functional. Furthermore, the GFP-HA tag makes biochemical analysis possible. Although this collection is biased and does not cover the entire proteome, it will provide a useful tool for analysis of nuclear architecture and dynamics.
In the budding yeast S. cerevisiae, homologous sequences of 25–60 bp are sufficient for chromosomal integration because of the high frequency of homologous recombination (Hayden & Byers 1992). In S. pombe, however, long homologous DNA fragments are required to integrate the GFP tag into the chromosome as the frequency of homologous recombination is relatively low (Bähler et al. 1998). In this study, we used two 250 bp homologous DNA sequences sandwiching a GFP-HA tagging cassette to integrate the tag into the chromosome. Even with these long homologous sequences, successful integration only occurred in 72% of the transformants (Table 1), regardless of whether the target gene was essential or non-essential. Thus, to achieve a higher success rate of integration, more than 250 bp of homologous sequence may be required.
In our GFP-tagged library, 348 strains out of the 1058 PCR-positive transformants did not show apparent GFP signals in any compartment of the cell. There are three possible reasons for this. First, expression of the GFP-fusion protein might have been below the detection limit of the fluorescence microscope under the conditions used for screening. Some of those proteins may be expressed under certain other conditions. Second, C-terminal tagging might disrupt proper localization. Third, technical problems such as PCR errors might have occurred during construction of the integration cassette. For these genes, N-terminal tagging, expression under a strong promoter or the use of a multicopy plasmid might provide alternative ways of localizing their products.
Relationship between protein abundance and mRNA expression in the nuclear organization-related categories
Taking advantage of the chromosomal tags in our library, we determined the number of protein molecules in a cell by immunoblot analysis and then compared the abundance of the nuclear proteins to mRNA levels (Figs 3 and 4). It has been reported for S. cerevisiae that mRNA levels of functionally-related genes are similar to each other (Ihmels et al. 2002), as are those of genes whose products share a cellular localization (Drawid et al. 2000). In S. pombe, we found similar distribution patterns of protein abundance and mRNA expression in the nucleus-related categories. However, in the NO category, levels of both protein and mRNA were higher than for other categories (Fig. 4). In the nucleus-related gene categories, with the exception of the NO category, protein abundance correlated with mRNA expression. A similar correlation was previously reported for S. cerevisiae (Ghaemmaghami et al. 2003). The NO category, however, showed a slightly different pattern: most of the NO proteins were detected at high levels (approximately 1.0 × 10⁴ molecules per cell) despite their mRNA being present at a wide range of levels (Fig. 4). This might be an indication of protein stability in the NO category because protein levels are kept high regardless of mRNA levels. The high level of expression of NO category proteins may be required to ensure that processes involved in cell growth can be carried out under a variety of environmental conditions which might affect levels of mRNA expression.
| Experimental procedures |
|---|
AY160-14D (h90 ade6-216, leu1, lys1, ura4) was used as the host strain of this library. All media used in this study are described in Moreno et al. (1991).
Library construction
Chromosomal GFP tags were introduced at the carboxyl-terminus of ORFs using a two-step method described in Wach (1996). Two pairs of specific primers, forward 1 (F1) with reverse 1 (R1) and forward 2 (F2) with reverse 2 (R2) (Fig. 1A), were designed for each ORF using a Perl program to select primers with appropriate melting temperatures. We first defined the positions of the stop codons (marked as "X" in Fig. 1A) in all ORFs, which were obtained from the Sanger Institute website <http://www.sanger.ac.uk/Projects/S_pombe/access.shtml>. Next, optimum primer sequences were selected in regions approximately 250–300 bp before and after the position X to achieve a GC content of between 33% and 58%. R1 primers were 44 bp in length, designed to contain 20 bp of the chromosomal sequence (positions from X –3 to X –22) and 24 bp of the amino terminus of the GFP gene. F2 primers were 44 bp in length, designed to contain 20 bp selected from the chromosomal sequence (positions from X +3 to X +50) and 24 bp sequence from the terminator of the kanamycin resistance gene (kanr) in the pAH90 plasmid. Synthesized primers were diluted to 2.5 mM in TE in 96-well plates. A list of all of the primers is shown in Supporting Table S1. Plasmid pAH90 was constructed by subcloning a fragment of the GFP-S65T gene obtained from pGFP3-2 (Ding et al. 2000) into the PacI site of the pFA6a-3xHA-kanMX6 plasmid (Bähler et al. 1998). Sequence encoding three amino acids (Leu-Gly-Ser) was integrated in-frame between sequence encoding the C-terminus of the target protein and the GFP gene. In the first PCR, 250–300 bp fragments were amplified from 10 ng genomic DNA using the F1-R1 and the F2-R2 primer pairs (final concentration of 5 mM) in a total volume of 50 µL. ExTaq DNA polymerase (TaKaRa) was used for all PCR procedures. Genomic DNA was prepared from isolated nuclei as described in Matsumoto et al. (1987). 
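The GC-content filter described for primer selection can be sketched as a simple sliding-window scan. The Python below is a minimal illustration, not the authors' Perl program: the function names, the toy sequence and the fixed 20 bp window are assumptions made for the example, and the melting-temperature check mentioned in the text is deliberately left out.

```python
from typing import Iterator, Tuple

def gc_fraction(seq: str) -> float:
    """Fraction of G and C bases in a non-empty DNA sequence."""
    seq = seq.upper()
    return (seq.count("G") + seq.count("C")) / len(seq)

def candidate_primers(
    region: str,
    length: int = 20,       # genome-specific part of the primer (from the text)
    gc_min: float = 0.33,   # lower GC bound quoted in the text
    gc_max: float = 0.58,   # upper GC bound quoted in the text
) -> Iterator[Tuple[int, str]]:
    """Yield (offset, window) for each window whose GC fraction is in range."""
    for offset in range(len(region) - length + 1):
        window = region[offset:offset + length]
        if gc_min <= gc_fraction(window) <= gc_max:
            yield offset, window

# Hypothetical 40 bp stretch standing in for a region near the stop codon.
toy_region = "ATATATATATATATATATAT" + "GCGCATGCATTAGCATCGAT"
hits = list(candidate_primers(toy_region))

for offset, primer in hits:
    print(offset, primer, f"GC={gc_fraction(primer):.2f}")
```

In a fuller version, each accepted window would also be scored for melting temperature, and the 24 bp GFP or kanr terminator sequence would be appended to build the 44 bp R1 and F2 primers.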
The second PCR used pAH90 as a template with the two PCR products from the first PCR (2 µL from the first PCR reaction) and 10 mM of the F1 and R2 primers. PCR reaction mixtures were prepared in 96-well plates by an automatic pipetting apparatus (Beckman Coulter, BioMek 2000). The products of the second PCR were purified using a 96-well purification filter (Nippon Genesis). All PCR products were assayed by electrophoresis on pre-made gels (Ready-To-Run, Amersham Pharmacia) to confirm the length and the amount. The products from the second PCR were then transformed into an S. pombe h90 strain (AY160-14D) using a lithium-acetate method. Kanamycin-resistant transformant colonies were selected on YES-G418 plates (YES+200 mg/L G418). Insertion of GFP at the proper chromosomal locus was confirmed by PCR with a GFP-R3 primer (5'-TGT GGA CAG GTA ATG GTT GTC-3') and ORF-specific primer (primer c, Fig. 1A). Two to ten colonies for each transformant were observed with the aid of a DeltaVision microscope system to localize GFP signals in the cells. This library of GFP-tagged S. pombe strains is available from the National BioResource Project on Yeast <http://yeast.lab.nig.ac.jp/nig/index_en.html>.
Fluorescence microscopy
Live cell imaging was carried out as described in Ding et al. (2000) with some modifications. Cells were cultured on a YES-G418 plate at 26 °C. For vegetative cells, cells were incubated in YES liquid medium at 26 °C to early log phase. The cells were suspended in EMM2 medium and mounted on a 10-hole glass slide (Polysciences Inc.) for microscopic observation. For meiotic cells, cells were cultured on an ME plate at 26 °C for 16 h to induce meiosis. The cells were suspended in EMM2-N medium and mounted on a 10-hole glass slide for microscopic observation. A DeltaVision microscope system (Applied Precision Inc.) was used for image data acquisition and analysis (Haraguchi et al. 1999).
Protein quantification
Cells were grown in 10 mL of YES medium to mid-log phase at 30 °C. After washing with 1x Laemmli buffer (1% SDS, 10% glycerol, 0.06 M Tris–HCl (pH 6.6)), the cells were suspended in 60 µL of 1x Laemmli buffer and boiled. After being frozen in liquid nitrogen, the cells were disrupted with 0.5 mm glass beads using a Multibeads Shocker (Yasui Kikai). The protein concentration of the cell extracts was determined by measuring absorbance (OD655), and 100 µg samples of the cell extracts were separated on 28-well 10% SDS-PAGE gels. Cell extracts were sorted in descending order of their mitotic phase expression levels and grouped in sets of 24. The mRNA data used in this study were taken from Chikashige et al. (2007). One, five and 10 ng of the protein standards, GFP-3HA and Mep33-GFP-3HA, were loaded on each gel. The GFP-3HA gene of pAH90 was subcloned into the pRSET-A plasmid (pAH92). A fragment of the Mep33-GFP-3HA gene was obtained by PCR using genomic DNA prepared from a Mep33-GFP-3HA tagged strain and cloned into the pRSET-A plasmid (pAH93). GFP-3HA and Mep33-GFP-3HA proteins were over-expressed in Escherichia coli and purified using Ni²⁺ columns (ProBond, Invitrogen). Serial dilutions of the purified proteins were mixed with 100 µg cell extracts and loaded on the gels as standards for estimation of protein abundance. The gels were run at 20 mA for 100 min and transferred using a semi-dry blotter (Continental Lab Products) onto PVDF membrane (Immobilon-P, Millipore) at a constant current of 70 mA per gel for 60 min. Tagged proteins in the cell extracts were detected using 3F10 rat monoclonal anti-HA antibody (Roche). The membranes were probed with HRP-conjugated mouse anti-rat IgG antibody (Cappel) as the secondary antibody. All procedures were carried out essentially as described previously (Hayashi et al. 2006). X-ray film was exposed to ECL-treated membranes for 5, 10 and 20 min. Quantitative immunoblot analysis was carried out using a Gel-Pro analyzer (MediaCybernetics).
The integrated intensities of bands corresponding to the tagged proteins were measured and converted to molecular numbers per cell using the intensity of the protein standards (Ghaemmaghami et al. 2003). We defined the weight of a single cell as 10 pg (Sanger Institute, Fission Yeast Handbook), and the total mRNA number of a cell as 1 × 10⁵ (Chikashige et al. 2007) for calculation of protein molecules per cell and per mRNA molecule. Protein molecules per cell = (intensity of immunoblot band/intensity of standard protein)/(cell number per lane) × Avogadro's constant (6.02 × 10²³)/molecular weight of standard protein.
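The conversion above can be written out as a small function to make the units explicit. The Python sketch below is an illustration under stated assumptions, not the authors' calculation. It assumes that the intensity ratio is scaled by the mass of standard loaded (for example 5 ng), and that the cell number per lane is the 100 µg of extract divided by the 10 pg cell weight. The 30 kDa molecular weight in the example is a hypothetical placeholder, not a value from the text.

```python
AVOGADRO = 6.02e23           # value quoted in the text
CELL_MASS_G = 10e-12         # 10 pg per cell, as defined in the text
EXTRACT_PER_LANE_G = 100e-6  # 100 µg of extract loaded per lane

# Assumption: cells per lane = extract mass / single-cell mass.
CELLS_PER_LANE = EXTRACT_PER_LANE_G / CELL_MASS_G

def molecules_per_cell(
    band_intensity: float,
    std_intensity: float,
    std_mass_g: float,        # mass of standard giving std_intensity (assumed)
    std_mw_g_per_mol: float,  # molecular weight of the standard protein
    cells_per_lane: float = CELLS_PER_LANE,
) -> float:
    """Convert a band intensity to tagged-protein molecules per cell."""
    # Mass of standard that would give the same signal as the band.
    equivalent_mass_g = band_intensity / std_intensity * std_mass_g
    # Moles -> molecules in the lane, then divide among the cells loaded.
    molecules_in_lane = equivalent_mass_g / std_mw_g_per_mol * AVOGADRO
    return molecules_in_lane / cells_per_lane

# Hypothetical example: a band as intense as 5 ng of a 30 kDa standard.
example = molecules_per_cell(
    band_intensity=1.0,
    std_intensity=1.0,
    std_mass_g=5e-9,
    std_mw_g_per_mol=30_000.0,
)
print(f"{example:.0f} molecules per cell")
```

With these placeholder inputs the estimate is on the order of 10⁴ molecules per cell. Dividing such a value by the mRNA copy number of the same gene would give the protein-per-mRNA figure used in Fig. 3C.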
Amounts of protein measured by immunoblotting can be underestimated for proteins larger than 100 kDa because of lower efficiency of transfer to the membrane. To estimate this effect, we plotted measured amounts of protein as a function of the protein size, together with a plot of measured amounts of mRNA as a control, which is independent of the protein size (Supporting Fig. S1). Results showed that protein amounts were comparable to mRNA amounts, and were not significantly reduced for large proteins.
| Acknowledgements |
|---|
| Footnotes |
|---|
aPresent address: RIKEN, Kobe, Center for Developmental Biology, 2-2-3, Minatozima-minami, Kobe 650-0047, Japan.
| References |
|---|
Chikashige, Y., Tsutsumi, C., Okamasa, K., Yamane, M., Nakayama, J., Niwa, O., Haraguchi, T. & Hiraoka, Y. (2007) Gene expression and distribution of Swi6 in partial aneuploids of the fission yeast Schizosaccharomyces pombe. Cell Struct. Funct. 32, 149–161.
Ding, D.Q., Tomita, Y., Yamamoto, A., Chikashige, Y., Haraguchi, T. & Hiraoka, Y. (2000) Large-scale screening of intracellular protein localization in living fission yeast cells by the use of a GFP-fusion genomic DNA library. Genes Cells 5, 169–190.
Drawid, A., Jansen, R. & Gerstein, M. (2000) Genome-wide analysis relating expression level with protein subcellular localization. Trends Genet. 16, 426–430.
Ghaemmaghami, S., Huh, W.K., Bower, K., Howson, R.W., Belle, A., Dephoure, N., O'Shea, E.K. & Weissman, J.S. (2003) Global analysis of protein expression in yeast. Nature 425, 737–741.
Haraguchi, T., Ding, D.Q., Yamamoto, A., Kaneda, T., Koujin, T. & Hiraoka, Y. (1999) Multiple-color fluorescence imaging of chromosomes and microtubules in living cells. Cell Struct. Funct. 24, 291–298.
Hayashi, A., Asakawa, H., Haraguchi, T. & Hiraoka, Y. (2006) Reconstruction of the kinetochore during meiosis in fission yeast Schizosaccharomyces pombe. Mol. Biol. Cell 17, 5173–5184.
Hayden, M.S. & Byers, B. (1992) Minimal extent of homology required for completion of meiotic recombination in Saccharomyces cerevisiae. Dev. Genet. 13, 498–514.
Huh, W.K., Falvo, J.V., Gerke, L.C., Carroll, A.S., Howson, R.W., Weissman, J.S. & O'Shea, E.K. (2003) Global analysis of protein localization in budding yeast. Nature 425, 686–691.
Ihmels, J., Friedlander, G., Bergmann, S., Sarig, O., Ziv, Y. & Barkai, N. (2002) Revealing modular organization in the yeast transcriptional network. Nat. Genet. 31, 370–377.
Kumar, A., Agarwal, S., Heyman, J.A., Matson, S., Heidtman, M., Piccirillo, S., Umansky, L., Drawid, A., Jansen, R., Liu, Y., Cheung, K.H., Miller, P., Gerstein, M., Roeder, G.S. & Snyder, M. (2002) Subcellular localization of the yeast proteome. Genes Dev. 16, 707–719.
Matsumoto, S., Yanagida, M. & Nurse, P. (1987) Histone transcription in cell cycle mutants of fission yeast. EMBO J. 6, 1093–1097.
Matsuyama, A., Arai, R., Yashiroda, Y., Shirai, A., Kamata, A., Sekido, S., Kobayashi, Y., Hashimoto, A., Hamamoto, M., Hiraoka, Y., Horinouchi, S. & Yoshida, M. (2006) ORFeome cloning and global analysis of protein localization in the fission yeast Schizosaccharomyces pombe. Nat. Biotechnol. 24, 841–847.
Moreno, S., Klar, A. & Nurse, P. (1991) Molecular genetic analysis of fission yeast Schizosaccharomyces pombe. Methods Enzymol. 194, 795–823.
Niedenthal, R.K., Riles, L., Johnston, M. & Hegemann, J.H. (1996) Green fluorescent protein as a marker for gene expression and subcellular localization in budding yeast. Yeast 12, 773–786.
Ross-Macdonald, P., Coelho, P.S., Roemer, T. et al. (1999) Large-scale analysis of the yeast genome by transposon tagging and gene disruption. Nature 402, 413–418.
Sawin, K.E. & Nurse, P. (1996) Identification of fission yeast nuclear markers using random polypeptide fusions with green fluorescent protein. Proc. Natl Acad. Sci. USA 93, 15146–15151.
Simpson, J.C., Wellenreuther, R., Poustka, A., Pepperkok, R. & Wiemann, S. (2000) Systematic subcellular localization of novel proteins identified by large-scale cDNA sequencing. EMBO Rep. 1, 287–292.
Wach, A. (1996) PCR-synthesis of marker cassettes with long flanking homology regions for gene disruptions in S. cerevisiae. Yeast 12, 259–265.
Received: 29 September 2008
Accepted: 9 November 2008