Can different proteins be produced during translation of a single mRNA in eukaryotes?

Is there a translational mechanism that eukaryotes can use to produce different proteins from a single transcribed mRNA?

There are multiple mechanisms that are known to lead to translation of substantially (or entirely) different proteins from a single mRNA. While these mechanisms are more typically seen in viruses, I'm focusing on examples documented within the endogenous transcriptome of eukaryotes.

Alternative translation initiation

One process that can lead to different proteins being translated from the same mRNA in eukaryotes is the use of alternative translation initiation sites.1,2 Translation typically starts with a pre-initiation complex recognizing the 5' cap and loading onto the mRNA. This complex scans until it finds an appropriate start site. The choice of start site depends on how well the ribonucleotide sequence around it matches the Kozak consensus sequence.3 If the region around the first AUG is not a good match for that consensus, a process known as leaky scanning occurs: the pre-initiation complex can bypass that AUG and continue along the mRNA until a "good" start site is found.
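To make leaky scanning concrete, here is a minimal sketch that lists every AUG in an mRNA and grades its context. The grading rule is a deliberate simplification and an assumption of this example: it only checks for a purine at position −3 and a G at position +4 (counting the A of AUG as +1), which are commonly treated as the most influential positions of the Kozak consensus. The function names and the test sequence are hypothetical.

```python
def kozak_strength(mrna: str, aug_index: int) -> str:
    """Grade the context of the AUG that starts at aug_index.

    Simplified rule (assumption for illustration): a purine (A/G) at -3
    and a G at +4 each count as one match; 2 = strong, 1 = adequate, 0 = weak.
    """
    # -3 is three bases upstream of the A; +4 is the base right after AUG.
    minus3 = mrna[aug_index - 3] if aug_index >= 3 else ""
    plus4 = mrna[aug_index + 3] if aug_index + 3 < len(mrna) else ""
    matches = (minus3 in ("A", "G")) + (plus4 == "G")
    return ("weak", "adequate", "strong")[matches]


def candidate_starts(mrna: str):
    """Return (index, strength) for every AUG, in 5'->3' scanning order."""
    mrna = mrna.upper().replace("T", "U")
    starts = []
    i = mrna.find("AUG")
    while i != -1:
        starts.append((i, kozak_strength(mrna, i)))
        i = mrna.find("AUG", i + 1)
    return starts


# Hypothetical transcript: the first AUG sits in a poor context,
# the second one in a good context.
print(candidate_starts("CCUUUAUGCCCGCCACCAUGGCC"))
```

In this toy transcript a scanning complex would often skip the first AUG and initiate at the second, which is the situation that produces two protein variants from one mRNA.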

While this can result in proteins with different functions, they will typically still have large regions of amino acid sequence in common. One example of this is in a kinase known as MK2, which is "a key regulator of transcription, migration, death signaling and post-transcriptional gene regulation".4

Polycistronic mRNAs

While more common in prokaryotes, in some cases eukaryotes also have polycistronic transcripts5,6. These transcripts encode multiple separate proteins (i.e. from independent open reading frames). The examples that I have found for mammals are all bicistronic (operons with two genes): LASS1-GDF1, SNRPN-SNURF, MTPN-LUZP6 and MFRP-C1QTNF5. You can search for those gene pairs, but there doesn't seem to be a huge amount of information available and in many cases one of the genes is almost completely uncharacterized. Eukaryotic operons (aka polycistronic mRNAs) are ubiquitous in trypanosomes, appear to be very common in nematodes (round worms), and are also frequently seen in Drosophila (a fly).6

Translational read-through

Translational readthrough (also known as stop codon suppression) occurs when an in-frame stop codon is ignored, either stochastically or under specific conditions, and translation continues beyond that point. This leads to a C-terminal extension of the protein, which in some cases has been shown to have functional implications.7,8 An example of this is the [PSI+] prion in yeast, which promotes translational readthrough throughout the yeast transcriptome by inactivating a factor involved in translation termination.7
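As a rough illustration of what readthrough does to the product, the sketch below finds the in-frame stop codons of a reading frame and reports how many extra codons would be decoded if the first stop were suppressed. The function names and the example sequence are hypothetical, and the model ignores everything that determines whether readthrough actually happens.

```python
STOP_CODONS = {"UAA", "UAG", "UGA"}


def in_frame_stops(mrna: str, start: int):
    """Indices of stop codons in the reading frame that begins at `start`."""
    return [i for i in range(start, len(mrna) - 2, 3)
            if mrna[i:i + 3] in STOP_CODONS]


def readthrough_extension(mrna: str, start: int):
    """Extra codons decoded if the first in-frame stop is read through.

    Counts the suppressed stop codon itself (it is decoded as an amino
    acid) plus every codon up to the next in-frame stop. Returns None
    when there is no second in-frame stop in the sequence.
    """
    stops = in_frame_stops(mrna, start)
    if len(stops) < 2:
        return None
    return (stops[1] - stops[0]) // 3


# AUG GCU UGA GGU CCA UAA: normal product has 2 residues,
# the readthrough product has 2 + 3 = 5.
print(readthrough_extension("AUGGCUUGAGGUCCAUAA", 0))
```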

Translational frameshifts

This is well covered in the answer by @Dirigible.


1: Kochetov, A. V. (2008). Alternative translation start sites and hidden coding potential of eukaryotic mRNAs. Bioessays, 30(7), 683-691.

2: Wan, J., & Qian, S. B. (2013). TISdb: a database for alternative translation initiation in mammalian cells. Nucleic acids research, 42(D1), D845-D850.

3: Acevedo, J. M., Hoermann, B., Schlimbach, T., & Teleman, A. A. (2018). Changes in global translation elongation or initiation rates shape the proteome via the Kozak sequence. Scientific reports, 8(1), 4018.

4: Trulley, P., Snieckute, G., Bekker-Jensen, D., Menon, M. B., Freund, R., Kotlyarov, A.,… & Gaestel, M. (2019). Alternative Translation Initiation Generates a Functionally Distinct Isoform of the Stress-Activated Protein Kinase MK2. Cell reports, 27(10), 2859-2870.

5: Tautz, D. (2008). Polycistronic peptide coding genes in eukaryotes-how widespread are they?. Briefings in Functional Genomics and Proteomics, 8(1), 68-74.

6: Blumenthal, T. (2004). Operons in eukaryotes. Briefings in Functional Genomics, 3(3), 199-211.

7: Schueren, F., & Thoms, S. (2016). Functional translational readthrough: a systems biology perspective. PLoS genetics, 12(8), e1006196.

8: Loughran, G., Jungreis, I., Tzani, I., Power, M., Dmitriev, R. I., Ivanov, I. P.,… & Atkins, J. F. (2018). Stop codon readthrough generates a C-terminally extended variant of the human vitamin D receptor with reduced calcitriol response. Journal of Biological Chemistry, 293(12), 4434-4444.

Separate from the alternative translation start site mechanism that tyersome describes is programmed translational frameshifting. This is independent of alternative splicing or post-translational modifications, and happens when a ribosome switches reading frames while a protein is already being translated. Typically, this phenomenon is associated with viral translation, and allows viruses to encode many proteins on relatively short genomes. Take HIV as an example: the polyprotein gag-pol requires efficient −1 frameshifting for expression of the individual gag and pol gene products.

In eukaryotes, examples are more sparse. Check out this review from 2012 --

… examples of mammalian genes that utilize −1 frameshifting are the mouse embryonic carcinoma differentiation regulated (EDR) gene and its human ortholog PEG10. A slippery sequence of G GGA AAC, in combination with a pseudoknot, mediates highly efficient −1 frameshifting, similar to viral frameshifting motifs (Clark et al., 2007). Recently, a programmed ribosomal −1 frameshift has been identified in the adenomatous polyposis coli (APC) mRNA in Caenorhabditis elegans that is mediated by a slippery sequence A AAA AAA or A AAA AAC (Baranov et al., 2011). The functional relevance of this frameshift is uncertain.
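The slippery sequences quoted above (G GGA AAC, A AAA AAA, A AAA AAC) all fit the heptamer pattern X XXY YYZ: three identical bases, three more identical bases, then one further base. A minimal sketch for scanning a sequence for that pattern follows. The regular expression is a simplification assumed for this example (it places no restriction on which bases X, Y and Z may be), and a match is only a candidate site, since efficient frameshifting also depends on downstream structures such as pseudoknots.

```python
import re

# Lookahead so that overlapping heptamers are all reported.
# Group 2 is X (repeated three times), group 3 is Y (repeated three times),
# and the final character class is Z.
SLIPPERY = re.compile(r"(?=(([ACGU])\2\2([ACGU])\3\3[ACGU]))")


def find_slippery_sites(mrna: str):
    """Return (index, heptamer) for every X XXY YYZ candidate site."""
    mrna = mrna.upper().replace("T", "U")
    return [(m.start(), m.group(1)) for m in SLIPPERY.finditer(mrna)]


# Hypothetical sequence containing the PEG10-style heptamer GGGAAAC.
print(find_slippery_sites("CCGGGAAACUU"))
```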

Frameshifting events are often mediated by conserved RNA secondary structures, like pseudoknots and stem-loops. For some specific examples of the types of structures involved in frameshifting, see the following publications:

Identification of a New Antizyme mRNA +1 Frameshifting Stimulatory Pseudoknot in a Subset of Diverse Invertebrates and its Apparent Absence in Intermediate Species

Structural probing and mutagenic analysis of the stem-loop required for Escherichia coli dnaX ribosomal frameshifting: programmed efficiency of 50%

−1 Frameshifting at a CGA AAG Hexanucleotide Site Is Required for Transposition of Insertion Sequence IS1222

Some more recent examples of (potential) programmed translational frameshifts in eukaryotes:

Search for potential reading frameshifts in cds from Arabidopsis thaliana and other genomes

Preprint: Extensive programmed ribosomal frameshifting in human as revealed by a massively parallel reporter assay


The central dogma of molecular biology proposes that information flows from DNA to RNA through the process of transcription and from RNA to protein through translation. In eukaryotes, these two processes are thought of as disconnected: nuclear factors control transcription, and a different set of factors control translation in the cytoplasm. During times of stress, cells exhibit large transcriptional changes including the upregulation of many genes important for survival. Concurrent with these large transcriptional changes, cells dramatically decrease overall protein synthesis, thereby dampening the stress response at the level of protein expression. Brian Zid and Erin O’Shea wondered if there may be preferential translation of transcripts under stress. To find out, they studied glucose starvation in yeast, a condition in which translation is rapidly repressed while large transcriptional changes are taking place.

Using ribosome profiling, a technique that measures the number of ribosomes on mRNAs using next-generation sequencing, the authors found that a subset of transcriptionally upregulated mRNAs were preferentially translated during glucose starvation. A conserved phenomenon upon stress is the formation of mRNA-protein aggregates, called stress granules. Using fluorescent microscopy, they found that mRNAs that were preferentially translated remained diffuse throughout the cytoplasm, while transcriptionally upregulated but poorly translated mRNAs were found to aggregate into stress granules.

Surprisingly, the information specifying differential localization and protein production of these two classes of mRNAs is not found directly in the mRNA sequence, but instead is encoded in the promoter sequence driving mRNA production. The authors found that promoter responsiveness to the transcription factor heat shock factor (Hsf1) specifies diffuse cytoplasmic localization and higher protein production upon glucose starvation, whereas promoter elements upstream of poorly translated mRNAs direct these mRNAs to stress granules under glucose starvation.

This alters the current paradigm that transcription and translation are disconnected in eukaryotes; instead, the authors find that these spatially distinct processes are coupled during nutrient limitation. A linkage between transcriptional regulation and cytoplasmic localization may be a general adaptation during times of stress, enabling the cell to coordinately regulate the production of entire classes of proteins. Under non-stress conditions, upregulation of a class of transcripts by a transcription factor would produce similar amounts of protein from each of the mRNAs, as translation would proceed at a generally high rate. Under stressful conditions when overall translation is reduced, selective translation may be required to produce proteins needed for adaptation to the new condition. This suggests that the translation of gene sets can be coordinated using a single nuclear factor, without the need to modulate the sequence of each mRNA.

This work, which will appear in the August 7th issue of Nature, offers a new direction for investigating whether promoter-dependent mRNA localization regulates protein expression in a variety of cell states. The O’Shea lab is currently working to identify factors that may be co-transcriptionally loaded onto mRNAs to determine an mRNA’s fate.

Biochemistry. 5th edition.

The basic plan of protein synthesis in eukaryotes and archaea is similar to that in bacteria. The major structural and mechanistic themes recur in all domains of life. However, eukaryotic protein synthesis entails more protein components than does prokaryotic protein synthesis, and some steps are more intricate. Some noteworthy similarities and differences are as follows:

Ribosomes. Eukaryotic ribosomes are larger. They consist of a 60S large subunit and a 40S small subunit, which come together to form an 80S particle having a mass of 4200 kd, compared with 2700 kd for the prokaryotic 70S ribosome. The 40S subunit contains an 18S RNA that is homologous to the prokaryotic 16S RNA. The 60S subunit contains three RNAs: the 5S and 28S RNAs are the counterparts of the prokaryotic 5S and 23S molecules; its 5.8S RNA is unique to eukaryotes.

Initiator tRNA. In eukaryotes, the initiating amino acid is methionine rather than N-formylmethionine. However, as in prokaryotes, a special tRNA participates in initiation. This aminoacyl-tRNA is called Met-tRNAi or Met-tRNAf (the subscript “i” stands for initiation, and “f” indicates that it can be formylated in vitro).

Initiation. The initiating codon in eukaryotes is always AUG. Eukaryotes, in contrast with prokaryotes, do not use a specific purine-rich sequence on the 5′ side to distinguish initiator AUGs from internal ones. Instead, the AUG nearest the 5′ end of mRNA is usually selected as the start site. A 40S ribosome attaches to the cap at the 5′ end of eukaryotic mRNA (Section 28.3.1) and searches for an AUG codon by moving step-by-step in the 3′ direction (Figure 29.33). This scanning process in eukaryotic protein synthesis is powered by helicases that hydrolyze ATP. Pairing of the anticodon of Met-tRNAi with the AUG codon of mRNA signals that the target has been found. In almost all cases, eukaryotic mRNA has only one start site and hence is the template for a single protein. In contrast, a prokaryotic mRNA can have multiple Shine-Dalgarno sequences and, hence, start sites, and it can serve as a template for the synthesis of several proteins. Eukaryotes utilize many more initiation factors than do prokaryotes, and their interplay is much more intricate. The prefix eIF denotes a eukaryotic initiation factor. For example, eIF-4E is a protein that binds directly to the 7-methylguanosine cap (Section 28.3.1), whereas eIF-4A is a helicase. The difference in initiation mechanism between prokaryotes and eukaryotes is, in part, a consequence of the difference in RNA processing. The 5′ end of mRNA is readily available to ribosomes immediately after transcription in prokaryotes. In contrast, pre-mRNA must be processed and transported to the cytoplasm in eukaryotes before translation is initiated. Thus, there is ample opportunity for the formation of complex secondary structures that must be removed to expose signals in the mature mRNA. The 5′ cap provides an easily recognizable starting point. In addition, the complexity of eukaryotic translation initiation provides another mechanism for gene expression that we shall explore further in Chapter 31.

Elongation and termination. Eukaryotic elongation factors EF1α and EF1βγ are the counterparts of prokaryotic EF-Tu and EF-Ts. The GTP form of EF1α delivers aminoacyl-tRNA to the A site of the ribosome, and EF1βγ catalyzes the exchange of GTP for bound GDP. Eukaryotic EF2 mediates GTP-driven translocation in much the same way as does prokaryotic EF-G. Termination in eukaryotes is carried out by a single release factor, eRF1, compared with two in prokaryotes. Finally, eIF3, like its prokaryotic counterpart IF3, prevents the reassociation of ribosomal subunits in the absence of an initiation complex.

Figure 29.33

Eukaryotic Translation Initiation. In eukaryotes, translation initiation starts with the assembly of a complex on the 5′ cap that includes the 40S subunit and Met-tRNAi. Driven by ATP hydrolysis, this complex scans the mRNA until the first AUG…

Messenger RNA Modifications

Translation occurs in the cytoplasm. Before leaving the nucleus, the pre-mRNA must undergo several modifications. Sections of the transcript that do not code for amino acids, called introns, are removed. A poly-A tail, consisting of many adenine nucleotides, is added to the 3′ end of the mRNA, while a 7-methylguanosine cap is added to the 5′ end. These modifications remove unneeded sections and protect the ends of the mRNA molecule. Once all modifications are complete, the mRNA is exported and is ready for translation.

Characteristics of the mRNA of Prokaryotes and Eukaryotes

mRNA produced through the transcription process is also known as mRNA transcripts. Although they have a number of similar characteristics, they also have several differences. The prokaryotic mRNA transcript can be divided into a number of parts/sections that include: the non-coding region (located at the 5' end of the transcript), the Shine-Dalgarno sequence, a second non-coding region, the start codon, the coding region, stop codon and another non-coding region on the 3' end.

The eukaryotic mRNA, on the other hand, starts with a 5' cap, which consists of a guanine nucleotide. This nucleotide carries a methyl group and is bound to the neighboring nucleotide. The cap is followed by a non-coding region, similar to the one in prokaryotic mRNA. The next section is the start codon, from which the coding region extends.

The coding region ends at the stop codon. This is followed by a non-coding region and lastly the poly-A-tail (made up of adenines and may consist of as many as 2200 nucleotides) at the 3' end. In eukaryotes, the 5' cap and the poly-A tail prevent the mRNA from being degraded.

Here, it's important to remember that in eukaryotes, the mRNA has to be released into the cytoplasm where translation takes place. Therefore, the two sections play an important role in maintaining the integrity of the mRNA. In prokaryotes, transcription and translation can occur at the same time and thus these sections are not necessary.

Unlike the eukaryote transcript, this mRNA does not have to be transported a long distance and thus does not encounter various enzymes that are likely to degrade it. As a result, the mRNA in prokaryotes does not require additional protection to prevent damage.

As mentioned, translation is the process through which the building blocks of proteins (polypeptides/amino acid chains) are built using the information contained in the mRNA. It's an important process given that it produces proteins that are required for various cell functions.

In order to understand the process it is important to know some of the components and terminologies used in translation.

Apart from mRNA (messenger RNA), they include:

· Polypeptides - Chains of amino acids and are the molecules that make up proteins.

· Nucleotides - Structural components of DNA and RNA. They are themselves made up of nucleoside and phosphate and include adenine, thymine, cytosine, and guanine (as well as Uracil).

· Codons - Groups of three nucleotides; AUG is one example. Most codons specify an amino acid, while others (stop codons) end the process once the polypeptide is complete.

· tRNA (transfer RNA) - Act as the bridge between mRNA codons and amino acids.

· Ribosomes - Consist of rRNA and protein, and are the structures in which polypeptides are manufactured.
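The relationship between these components can be sketched in a few lines: the mRNA is read three nucleotides at a time, each codon is looked up, and a stop codon ends the chain. The codon table below is intentionally a small subset of the standard genetic code, chosen for this example only, and the function names are hypothetical.

```python
# Small subset of the standard genetic code (illustration only).
CODON_TABLE = {
    "AUG": "Met", "GCU": "Ala", "UUU": "Phe", "GGU": "Gly",
    "UAA": "Stop", "UAG": "Stop", "UGA": "Stop",
}


def codons(mrna: str):
    """Split an mRNA into consecutive triplets, dropping any trailing remainder."""
    usable = len(mrna) - len(mrna) % 3
    return [mrna[i:i + 3] for i in range(0, usable, 3)]


def translate(mrna: str):
    """Translate from the first nucleotide until a stop codon is reached.

    Raises KeyError for codons outside the small table above.
    """
    peptide = []
    for codon in codons(mrna):
        residue = CODON_TABLE[codon]
        if residue == "Stop":
            break
        peptide.append(residue)
    return peptide


print(translate("AUGGCUUUUGGUUAA"))  # AUG GCU UUU GGU UAA
```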

The Beginning of mRNA Is Not Translated

Interestingly, not all regions of an mRNA molecule correspond to particular amino acids. In particular, there is an area near the 5' end of the molecule that is known as the untranslated region (UTR) or leader sequence. This portion of mRNA is located between the first nucleotide that is transcribed and the start codon (AUG) of the coding region, and it does not affect the sequence of amino acids in a protein (Figure 3).

So, what is the purpose of the UTR? It turns out that the leader sequence is important because it contains a ribosome-binding site. In bacteria, this site is known as the Shine-Dalgarno box (AGGAGG), after scientists John Shine and Lynn Dalgarno, who first characterized it. A similar site in vertebrates was characterized by Marilyn Kozak and is thus known as the Kozak box. In bacterial mRNA, the 5' UTR is normally short; in human mRNA, the median length of the 5' UTR is about 170 nucleotides. If the leader is long, it may contain regulatory sequences, including binding sites for proteins, that can affect the stability of the mRNA or the efficiency of its translation.

Additional Processing

Before the mRNA leaves the nucleus, it is given two protective “caps” that prevent the ends of the strand from degrading during its journey. The two ends of a DNA or RNA strand are referred to as 3′ and 5′, which refers to numbered carbon positions on the sugar molecules of the backbone. The 5′ cap is placed on the 5′ end of the mRNA. The poly-A tail, which is attached to the 3′ end, is usually composed of a long chain of adenine (A) nucleotides. These changes protect the two ends of the RNA from being broken down by enzymes in the cell.

The sugar of a nucleotide is composed of 5 carbons; each one is numbered (the mark “′” is read as “prime”). Phosphates bind to the 3′ carbon and the 5′ carbon of each sugar. One end of a DNA or RNA molecule ends with the 3′ carbon exposed, and the other end ends with a phosphate attached to the last sugar’s 5′ carbon. This is why one end is referred to as 3′ and the other end is referred to as 5′.

The central dogma hinges on the existence and properties of an army of mRNA molecules that are transiently brought into existence in the process of transcription and often, shortly thereafter, degraded away. During the short time that they are found in a cell, these mRNAs serve as a template for the creation of a new generation of proteins. The question posed in this vignette is this: On average, what is the ratio of translated message to the message itself?

Though there are many factors that control the protein-mRNA ratio, the simplest model points to an estimate in terms of just a few key rates. To see that, we need to write a simple “rate equation” that tells us how the protein content will change in a very small increment of time. More precisely, we seek the functional dependence between the number of protein copies of a gene (p) and the number of mRNA molecules (m) that engender it. The rate of formation of p is equal to the rate of translation times the number of messages, m, since each mRNA molecule can itself be thought of as a protein source. However, at the same time new proteins are being synthesized, protein degradation is steadily taking proteins out of circulation. Further, the number of proteins being degraded is equal to the rate of degradation times the total number of proteins. These cumbersome words can be much more elegantly encapsulated in an equation which tells us how in a small instant of time the number of proteins changes, namely,

$$\frac{dp}{dt} = \beta m - \alpha p,$$

where α is the degradation rate and β is the translation rate (though the literature is unfortunately torn between those who define the notation in this manner and those who use the letters with exactly the opposite meaning).

Figure 1: Ribosomes on mRNA as beads on a string (from:

We are interested in the steady-state solution, that is, what happens after a sufficiently long time has passed and the system is no longer changing. In that case dp/dt = 0 = βm − αp. This tells us in turn that the protein to mRNA ratio is given by p/m = β/α. We note that this is not the same as the number of proteins produced from each mRNA; that value requires us to also know the mRNA turnover rate, which we take up at the end of the vignette. What is the value of β? A rapidly translated mRNA will have ribosomes decorating it like beads on a string, as captured in the classic electron micrograph shown in Figure 1. Their distance from one another along the mRNA is at least the size of the physical footprint of a ribosome (≈20 nm, BNID 102320, 105000), which is the length of about 60 nucleotides (length of nucleotide ≈0.3 nm, BNID 103777), equivalent to ≈20 aa. The rate of translation is about 20 aa/sec. It thus takes at least one second for a ribosome to move along its own physical footprint over the mRNA, implying a maximal overall translation rate of β = 1 s⁻¹ per transcript.
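The steady-state result p/m = β/α can be checked numerically. The following is a minimal sketch that integrates dp/dt = βm − αp with a simple forward-Euler step; the function name, the step size and the choice of three mRNA copies are assumptions made for illustration, while β = 1 s⁻¹ and α = 1/(1000 s) are the values used in the text for E. coli.

```python
def simulate_protein(m: float, beta: float, alpha: float,
                     dt: float, steps: int) -> float:
    """Forward-Euler integration of dp/dt = beta*m - alpha*p from p = 0."""
    p = 0.0
    for _ in range(steps):
        p += (beta * m - alpha * p) * dt
    return p


m = 3.0             # mRNA copies (hypothetical value)
beta = 1.0          # proteins per mRNA per second (maximal rate from the text)
alpha = 1.0 / 1000  # effective degradation rate, 1/tau with tau = 1000 s

# Run for 20 cell cycles so the transient has decayed.
p = simulate_protein(m, beta, alpha, dt=1.0, steps=20_000)
print(p / m, beta / alpha)  # the two values should nearly coincide
```

The simulated ratio approaches β/α regardless of the mRNA copy number, which is why the estimate depends only on the two rates.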

The effective degradation rate arises not only from degradation of proteins but also from a dilution effect as the cell grows. Indeed, of the two effects, often the cell division dilution effect is dominant and hence the overall effective degradation time, which takes into account the dilution, is about the time interval of a cell cycle, τ. We thus have α = 1/τ.

In light of these numbers, the ratio p/m is therefore (1 s⁻¹)/(1/τ) = τ, with τ expressed in seconds. For E. coli, τ is roughly 1000 s and thus p/m ≈ 1000. Of course, if mRNAs are not translated at the maximal rate the ratio will be smaller. Let’s perform a sanity check on this result. Under exponential growth at medium growth rate E. coli is known to contain about 3 million proteins and 3000 mRNA (BNID 100088, 100064). These constants imply that the protein to mRNA ratio is ≈1000, precisely in line with the estimate given above. We can perform a second sanity check based on information from previous vignettes. In the vignette on “What is heavier an mRNA or the protein it codes for?” we derived a mass ratio of about 10:1 for mRNA to the proteins they code for. In the vignette on “What is the macromolecular composition of the cell?” we mentioned that protein is about 50% of the dry mass in E. coli cells while mRNA are only about 5% of the total RNA in the cell, which is itself roughly 20% of the dry mass. This implies that mRNA is about 1% of the overall dry mass, so protein outweighs mRNA about 50-fold. Because each mRNA is about 10 times heavier than the protein it encodes, the ratio of protein to mRNA copy numbers should be about 50 times 10, or 500 to 1. From our point of view, all of these sanity checks hold together very nicely.
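The two sanity checks are plain arithmetic and can be written out step by step. This sketch only restates the numbers quoted in the text; the variable names are chosen for this example.

```python
# Check 1: direct copy-number ratio in E. coli.
protein_count = 3e6
mrna_count = 3e3
count_ratio = protein_count / mrna_count            # 1000

# Check 2: mass fractions combined with the per-molecule mass ratio.
protein_mass_fraction = 0.50                        # share of dry mass
rna_mass_fraction = 0.20                            # share of dry mass
mrna_share_of_rna = 0.05                            # share of total RNA
mrna_mass_fraction = rna_mass_fraction * mrna_share_of_rna   # about 0.01
mass_ratio = protein_mass_fraction / mrna_mass_fraction      # about 50

mrna_to_protein_mass = 10                           # one mRNA weighs ~10 of its proteins
number_ratio = mass_ratio * mrna_to_protein_mass    # about 500

print(count_ratio, round(number_ratio))
```

Both routes land within a factor of two of each other, which is the level of agreement an order-of-magnitude estimate of this kind can offer.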

Figure 2: Simultaneous measurement of mRNA and protein in E. coli. (A) Microscopy images of mRNA level in E. coli cells. (B) Microscopy images of protein in E. coli cells. (C) Protein copy number vs mRNA levels as obtained using both microscopy methods like those shown in part (A) and using sequencing based methods. From Taniguchi et al. Science. 329, 533 (2010).

Experimentally, how are these numbers on protein to mRNA ratios determined? One elegant method is to use fluorescence microscopy to simultaneously observe mRNAs using fluorescence in-situ hybridization (FISH) and their protein products which have been fused to a fluorescent protein. Figure 2 shows microscopy images of both the mRNA and the corresponding translated fusion protein for one particular gene in E. coli. Figure 2C shows results using these methods for multiple genes and confirms a 100- to 1000-fold excess of protein copy numbers over their corresponding mRNAs. As seen in that figure, not only is direct visualization by microscopy useful, but sequence-based methods have been invoked as well.

For slower growing organisms such as yeast or mammalian cells we expect a larger ratio, with the caveat that our assumptions about maximal translation rate are becoming ever more tenuous and with that our confidence in the estimate. For yeast under medium to fast growth rates, the number of mRNA was reported to be in the range of 10,000-60,000 per cell (BNID 104312, 102988, 103023, 106226, 106763). As yeast cells are ≈50 times larger in volume than E. coli, the number of proteins can be estimated as larger by that proportion, or 200 million. The ratio p/m is then ≈(2×10⁸)/(2×10⁴) ≈ 10⁴, in line with the experimental value of about 5,000 (BNID 104185, 104745). For yeast dividing every 100 minutes this is on the order of the number of seconds in its generation time, in agreement with our crude estimate above.

Figure 3: Protein to mRNA ratio in fission yeast. (A) Histogram illustrating the number of mRNA and protein copies as determined using sequencing methods and mass spectrometry, respectively. (B) Plot of protein abundance and mRNA abundance on a gene-by-gene basis. Adapted from S. Marguerat et al., Cell, 151:671, 2012. Recent analysis (R. Milo, Bioessays, 35:1050, 2014) suggests that the protein levels have been underestimated and a correction factor of about 5-fold increase should be applied, thus making the ratio of protein to mRNA closer to 10⁴.

As with many of the quantities described throughout the book, the high-throughput, genome-wide craze has hit the subject of this vignette as well. Specifically, using a combination of RNA-Seq to determine the mRNA copy numbers and mass spectrometry methods and ribosomal profiling to infer the protein content of cells, it is possible to go beyond the specific gene-by-gene estimates and measurements described above. As shown in Figure 3 for fission yeast, the genome-wide distribution of mRNA and protein confirms the estimates provided above, showing more than a thousand-fold excess of protein to mRNA in most cases. Similarly, in mammalian cell lines a protein to mRNA ratio of about 10⁴ is inferred (BNID 110236).

Figure 4: Dynamics of protein production. (A) Bursts in protein production resulting from multiple rounds of translation on the same mRNA molecule before it decays. (B) Distribution of burst sizes for the protein beta-galactosidase in E. coli. (Adapted from L. Cai et al., Nature, 440:358, 2006.)

So far, we have focused on the total number of protein copies per mRNA and not the number of proteins produced per production burst occurring from a given mRNA. This so-called burst size measurement is depicted in Figure 4, showing for the protein beta-galactosidase in E. coli the distribution of observed burst sizes, quickly decreasing from the common handful to much fewer cases of more than 10.

Finally, we note that there is a third meaning to the question that entitles this vignette, where we could ask how many proteins are made from each individual mRNA before it is degraded. For example, in fast growing E. coli, mRNAs are degraded roughly every 3 minutes, as discussed in the vignette on “What is the degradation rates of mRNA and proteins?”. This time scale is some 10-100 times shorter than the cell cycle time. As a result, to move from the statement that the protein to mRNA ratio is typically 1000 to the number of proteins produced from an mRNA before it is degraded, we need to divide by the number of mRNA lifetimes per cell cycle. We find that in this rapidly dividing E. coli scenario, each mRNA gives rise to about 10-100 proteins before being degraded.
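The conversion from the steady-state ratio to a per-mRNA yield is a single division, sketched here with the range quoted in the text. The function name is hypothetical.

```python
def proteins_per_mrna(protein_to_mrna_ratio: float,
                      mrna_lifetimes_per_cell_cycle: float) -> float:
    """Proteins made from one mRNA molecule before that molecule is degraded."""
    return protein_to_mrna_ratio / mrna_lifetimes_per_cell_cycle


# The text states that the mRNA lifetime is 10-100 times shorter than the cell cycle.
for lifetimes in (10, 100):
    print(lifetimes, proteins_per_mrna(1000, lifetimes))
```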

A recent study (G. Csardi et al., PLOS genetics, 2015) suggests revisiting the basic question of this vignette. Careful analysis of tens of studies on mRNA and protein levels in budding yeast, the most common model organism for such studies, suggests a non-linear relation where genes with high mRNA levels will have a higher protein to mRNA ratio than lowly expressed mRNAs. This suggests the correlation between mRNA and protein does not have a slope of 1 in log-log scale but rather a slope of about 1.6 which also explains why the dynamic range of proteins is significantly bigger than that of mRNA.
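A slope of about 1.6 on a log-log plot means that protein level scales roughly as mRNA level raised to the power 1.6, so the protein to mRNA ratio itself grows as mRNA level to the power 0.6. The sketch below simply evaluates that power law; the function names are hypothetical, and the relation is taken as an idealized fit rather than a rule that holds gene by gene.

```python
import math


def protein_fold_change(mrna_fold: float, slope: float = 1.6) -> float:
    """Fold change in protein level implied by a fold change in mRNA level."""
    return mrna_fold ** slope


def ratio_fold_change(mrna_fold: float, slope: float = 1.6) -> float:
    """Fold change in the protein-to-mRNA ratio for the same mRNA change."""
    return mrna_fold ** (slope - 1)


# A 1000-fold (3 decade) span in mRNA levels maps to about 4.8 decades of protein.
print(math.log10(protein_fold_change(1000)))
print(ratio_fold_change(1000))
```

This is the sense in which the dynamic range of proteins ends up wider than that of mRNA under the non-linear relation.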

Spliceosomes, Assembled from snRNPs and a Pre-mRNA, Carry Out Splicing

Even before splicing was accomplished in vitro, several observations led to the suggestion that small nuclear RNAs (snRNAs) assist in the splicing reaction. First, the short consensus sequence at the 5′ end of introns was found to be complementary to a sequence near the 5′ end of the snRNA called U1. Second, snRNAs were found associated with hnRNPs in nuclear extracts. Five U-rich snRNAs (U1, U2, U4, U5, and U6), ranging in length from 107 to 210 nucleotides, participate in RNA splicing.

In the nucleus of eukaryotic cells, snRNAs are associated with six to ten proteins in small nuclear ribonucleoprotein particles (snRNPs). Some of these proteins are common to all snRNPs, and some are specific for individual snRNPs. Experiments with a synthetic oligonucleotide that hybridizes with the 5′-end region of U1 snRNA and later studies with pre-mRNAs that were mutated in the 5′ splice-site consensus sequence provided strong evidence that base pairing between the 5′ splice site of a pre-mRNA and the 5′ region of U1 snRNA is required for RNA splicing.

Involvement of U2 snRNA in splicing initially was suspected when it was found to have an internal sequence that is largely complementary to the consensus sequence flanking the branch point in pre-mRNAs (see Figure 11-14). Mutation experiments, similar to those conducted with U1 snRNA and 5′ splice sites, demonstrated that base pairing between U2 snRNA and the branch-point sequence in pre-mRNA is critical to splicing. These studies with U1 and U2 snRNAs indicate that during splicing they base-pair with pre-mRNA as shown in Figure 11-17. Significantly, the branch-point A itself, which is not base-paired to U2 snRNA, “bulges out,” allowing its 2′ hydroxyl to participate in the first transesterification reaction of RNA splicing (see Figure 11-16).

Figure 11-17

Diagram of interactions between pre-mRNA, U1 snRNA, and U2 snRNA early in the splicing process. The 5′ region of U1 snRNA initially base-pairs with nucleotides at the 5′ end of the intron (blue) and 3′ end of the 5′ exon…

Similar studies with other snRNAs demonstrated that RNA-RNA interactions involving them also occur during splicing. For example, an internal region of U6 snRNA initially base-pairs with the 5′ end of U4 snRNA. Rearrangements later in the splicing process result in U6 snRNA base pairing with the 5′ end of U2 snRNA, which remains base-paired to the branch-point sequence in the intron. Later in the splicing process, base pairing of U5 snRNA with four exon nucleotides adjacent to the splice sites displaces U1 snRNA from the pre-mRNA.

Based on the results of these experiments, identification of reaction intermediates, and other biochemical analyses, the five splicing snRNPs are thought to sequentially assemble on the pre-mRNA forming a large ribonucleoprotein complex called a spliceosome, which is roughly the size of a ribosome (Figure 11-18). According to the model depicted in Figure 11-19, assembly of a spliceosome begins with the base pairing of U1 and U2 snRNAs, as part of the U1 and U2 snRNPs, to the pre-mRNA (see Figure 11-17). Extensive base pairing between the snRNAs in the U4 and U6 snRNPs forms a complex that associates with U5 snRNP. The U4/U6/U5 complex then associates, presumably via protein-protein interactions, with the previously formed complex consisting of a pre-mRNA base-paired to U1 and U2 snRNPs to yield a spliceosome.

Figure 11-18

Electron micrograph of a spliceosome. Extracts of HeLa cells were mixed with a β-globin pre-mRNA; the reaction was interrupted before splicing was completed, so that the spliceosomes, containing snRNPs and the pre-mRNA substrate, could be purified.

Figure 11-19

The spliceosomal splicing cycle. The splicing snRNPs (U1, U2, U4, U5, and U6) associate with the pre-mRNA and with each other in an ordered sequence to form the spliceosome. This large ribonucleoprotein complex then catalyzes the two transesterification …

After formation of the spliceosome, extensive rearrangements occur in the pairing of snRNAs and the pre-mRNA, as noted previously. The rearranged spliceosome then catalyzes the two transesterification reactions that result in RNA splicing. After the second transesterification reaction, the ligated exons are released from the spliceosome while the lariat intron remains associated with the snRNPs. This final intron-snRNP complex is unstable and dissociates. The individual snRNPs released participate in a new cycle of splicing. The excised intron is rapidly degraded by a “debranching enzyme,” which hydrolyzes the 5′,2′-phosphodiester bond at the branch point, and other nuclear RNases.

It is estimated that at least one hundred proteins are involved in RNA splicing, making this process comparable in complexity to protein synthesis and initiation of transcription. Some of these splicing factors are associated with snRNPs, but others are not. Sequencing of yeast genes encoding splicing factors has revealed that they contain domains with the RNP motif, which interacts with RNA, and the SR motif, which interacts with other proteins and may contribute to RNA binding. Some splicing factors also exhibit sequence homologies to known RNA helicases; these may be necessary for the base-pairing rearrangements that occur in snRNAs during the spliceosomal splicing cycle.

Introns whose splice sites do not conform to the standard consensus sequence recently were identified in some pre-mRNAs. This class of introns begins with AU and ends with AC rather than following the usual “GU–AG rule” (see Figure 11-14). Research on the biochemistry of splicing for this special class of introns soon identified four novel snRNPs. Together with the standard U5 snRNP, these snRNPs appear to participate in a splicing cycle analogous to that discussed above.
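As a rough illustration of the two intron classes, the sketch below sorts an intron by its terminal dinucleotides only. The function name `intron_class` and the toy sequences are hypothetical, and real intron classification relies on more than the first and last two bases.

```python
def intron_class(intron: str) -> str:
    """Classify an intron by its first and last two nucleotides only."""
    rna = intron.upper().replace("T", "U")  # accept DNA or RNA spelling
    ends = (rna[:2], rna[-2:])
    if ends == ("GU", "AG"):
        return "GU-AG"   # standard class
    if ends == ("AU", "AC"):
        return "AU-AC"   # special class described above
    return "other"


# Toy introns, made up for illustration.
print(intron_class("GUAAGUACUAACUUUCUUUCAG"))
print(intron_class("AUAUCCUUUUCCUUAACUCCAC"))
```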

Recent advances in mRNA vaccine technology

Various mRNA vaccine platforms have been developed in recent years and validated in studies of immunogenicity and efficacy 18,19,20 . Engineering of the RNA sequence has rendered synthetic mRNA more translatable than ever before. Highly efficient and non-toxic RNA carriers have been developed that in some cases 21,22 allow prolonged antigen expression in vivo (Table 1). Some vaccine formulations contain novel adjuvants, while others elicit potent responses in the absence of known adjuvants. The following section summarizes the key advances in these areas of mRNA engineering and their impact on vaccine efficacy.

Optimization of mRNA translation and stability

This topic has been extensively discussed in previous reviews 14,15; thus, we briefly summarize the key findings (Box 1). The 5′ and 3′ UTR elements flanking the coding sequence profoundly influence the stability and translation of mRNA, both of which are critical concerns for vaccines. These regulatory sequences can be derived from viral or eukaryotic genes and greatly increase the half-life and expression of therapeutic mRNAs 23,24 . A 5′ cap structure is required for efficient protein production from mRNA 25 . Various versions of 5′ caps can be added during or after the transcription reaction using a vaccinia virus capping enzyme 26 or by incorporating synthetic cap or anti-reverse cap analogues 27,28 . The poly(A) tail also plays an important regulatory role in mRNA translation and stability 25; thus, an optimal length of poly(A) 24 must be added to mRNA either directly from the encoding DNA template or by using poly(A) polymerase. Codon usage also affects protein translation. Replacing rare codons with frequently used synonymous codons that have abundant cognate tRNA in the cytosol is a common practice to increase protein production from mRNA 29 , although the accuracy of this model has been questioned 30 . Enrichment of G:C content constitutes another form of sequence optimization that has been shown to increase steady-state mRNA levels in vitro 31 and protein expression in vivo 12 .
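A minimal sketch of the two sequence-optimization ideas mentioned above: synonymous codon replacement and G:C enrichment. The `PREFERRED` table is an illustrative assumption, not real codon-usage data. It only maps a few codons to synonymous, more G:C-rich alternatives (Leu, Ala, Arg), and the coding sequence is a made-up toy.

```python
# Illustrative synonymous substitutions (assumed, not measured usage data):
# UUA/CUA -> CUG (Leu), GCU -> GCC (Ala), AGA -> CGC (Arg).
PREFERRED = {"UUA": "CUG", "CUA": "CUG", "GCU": "GCC", "AGA": "CGC"}


def gc_content(rna: str) -> float:
    """Fraction of bases that are G or C."""
    return sum(base in "GC" for base in rna) / len(rna)


def optimize_codons(cds: str) -> str:
    """Swap each codon for its preferred synonym, if one is listed."""
    if len(cds) % 3 != 0:
        raise ValueError("coding sequence length must be a multiple of 3")
    codons = [cds[i:i + 3] for i in range(0, len(cds), 3)]
    return "".join(PREFERRED.get(codon, codon) for codon in codons)


toy_cds = "AUGUUAGCUAGACUAUAA"  # Met-Leu-Ala-Arg-Leu-stop (toy example)
optimized = optimize_codons(toy_cds)
print(optimized, gc_content(toy_cds), gc_content(optimized))
```

The encoded amino acid sequence is unchanged because every substitution is synonymous; only the nucleotide composition shifts.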

Although protein expression may be positively modulated by altering the codon composition or by introducing modified nucleosides (discussed below), it is also possible that these forms of sequence engineering could affect mRNA secondary structure 32 , the kinetics and accuracy of translation and simultaneous protein folding 33,34 , and the expression of cryptic T cell epitopes present in alternative reading frames 30 . All these factors could potentially influence the magnitude or specificity of the immune response.

Box 1: Strategies for optimizing mRNA pharmacology

A number of technologies are currently used to improve the pharmacological aspects of mRNA. The various mRNA modifications used and their impact are summarized below.

• Synthetic cap analogues and capping enzymes 26,27 stabilize mRNA and increase protein translation via binding to eukaryotic translation initiation factor 4E (EIF4E)

• Regulatory elements in the 5′-untranslated region (UTR) and the 3′-UTR 23 stabilize mRNA and increase protein translation

• Poly(A) tail 25 stabilizes mRNA and increases protein translation

• Modified nucleosides 9,48 decrease innate immune activation and increase translation

• Separation and/or purification techniques: RNase III treatment (N.P. and D.W., unpublished observations) and fast protein liquid chromatography (FPLC) purification 13 decrease immune activation and increase translation

• Sequence and/or codon optimization 29 increase translation

• Modulation of target cells: co-delivery of translation initiation factors and other methods alters translation and immunogenicity

Modulation of immunogenicity

Exogenous mRNA is inherently immunostimulatory, as it is recognized by a variety of cell surface, endosomal and cytosolic innate immune receptors (Fig. 1) (reviewed in Ref. 35). Depending on the therapeutic application, this feature of mRNA could be beneficial or detrimental. It is potentially advantageous for vaccination because in some cases it may provide adjuvant activity to drive dendritic cell (DC) maturation and thus elicit robust T and B cell immune responses. However, innate immune sensing of mRNA has also been associated with the inhibition of antigen expression and may negatively affect the immune response 9,13 . Although the paradoxical effects of innate immune sensing on different formats of mRNA vaccines are incompletely understood, some progress has been made in recent years in elucidating these phenomena.

Innate immune sensing of two types of mRNA vaccine by a dendritic cell (DC), with RNA sensors shown in yellow, antigen in red, DC maturation factors in green, and peptide−major histocompatibility complex (MHC) complexes in light blue and red; an example lipid nanoparticle carrier is shown at the top right. A non-exhaustive list of the major known RNA sensors that contribute to the recognition of double-stranded and unmodified single-stranded RNAs is shown. Unmodified, unpurified (part a) and nucleoside-modified, fast protein liquid chromatography (FPLC)-purified (part b) mRNAs were selected for illustration of two formats of mRNA vaccines where known forms of mRNA sensing are present and absent, respectively. The dashed arrow represents reduced antigen expression. Ag, antigen; PKR, interferon-induced, double-stranded RNA-activated protein kinase; MDA5, interferon-induced helicase C domain-containing protein 1 (also known as IFIH1); IFN, interferon; m1Ψ, 1-methylpseudouridine; OAS, 2′-5′-oligoadenylate synthetase; TLR, Toll-like receptor.

Studies over the past decade have shown that the immunostimulatory profile of mRNA can be shaped by the purification of IVT mRNA and the introduction of modified nucleosides as well as by complexing the mRNA with various carrier molecules 9,13,36,37 . Enzymatically synthesized mRNA preparations contain double-stranded RNA (dsRNA) contaminants as aberrant products of the IVT reaction 13 . As a mimic of viral genomes and replication intermediates, dsRNA is a potent pathogen-associated molecular pattern (PAMP) that is sensed by pattern recognition receptors in multiple cellular compartments (Fig. 1). Recognition of IVT mRNA contaminated with dsRNA results in robust type I interferon production 13 , which upregulates the expression and activation of protein kinase R (PKR; also known as EIF2AK2) and 2′-5′-oligoadenylate synthetase (OAS), leading to the inhibition of translation 38 and the degradation of cellular mRNA and ribosomal RNA 39 , respectively. Karikó and colleagues 13 have demonstrated that contaminating dsRNA can be efficiently removed from IVT mRNA by chromatographic methods such as reverse-phase fast protein liquid chromatography (FPLC) or high-performance liquid chromatography (HPLC). Strikingly, purification by FPLC has been shown to increase protein production from IVT mRNA by up to 1,000-fold in primary human DCs 13 . Thus, appropriate purification of IVT mRNA seems to be critical for maximizing protein (immunogen) production in DCs and for avoiding unwanted innate immune activation.

Besides dsRNA contaminants, single-stranded mRNA molecules are themselves a PAMP when delivered to cells exogenously. Single-stranded oligoribonucleotides and their degradative products are detected by the endosomal sensors Toll-like receptor 7 (TLR7) and TLR8 (Refs 40,41), resulting in type I interferon production 42 . Crucially, it was discovered that the incorporation of naturally occurring chemically modified nucleosides, including but not limited to pseudouridine 9,43,44 and 1-methylpseudouridine 45 , prevents activation of TLR7, TLR8 and other innate immune sensors 46,47 , thus reducing type I interferon signalling 48 . Nucleoside modification also partially suppresses the recognition of dsRNA species 46,47,48 . As a result, Karikó and others have shown that nucleoside-modified mRNA is translated more efficiently than unmodified mRNA in vitro 9 , particularly in primary DCs, and in vivo in mice 45 . Notably, the highest level of protein production in DCs was observed when mRNA was both FPLC-purified and nucleoside-modified 13 . These advances in understanding the sources of innate immune sensing and how to avoid their adverse effects have substantially contributed to the current interest in mRNA-based vaccines and protein replacement therapies.

In contrast to the findings described above, a study by Thess and colleagues found that sequence-optimized, HPLC-purified, unmodified mRNA produced higher levels of protein in HeLa cells and in mice than its nucleoside-modified counterpart 12 . Additionally, Kauffman and co-workers demonstrated that unmodified, non-HPLC-purified mRNA yielded more robust protein production in HeLa cells than nucleoside-modified mRNA, and resulted in similar levels of protein production in mice 49 . Although not fully clear, the discrepancies between the findings of Karikó 9,13 and these authors 12,49 may have arisen from variations in RNA sequence optimization, the stringency of mRNA purification to remove dsRNA contaminants and the level of innate immune sensing in the targeted cell types.

The immunostimulatory properties of mRNA can conversely be increased by the inclusion of an adjuvant to increase the potency of some mRNA vaccine formats. These include traditional adjuvants as well as novel approaches that take advantage of the intrinsic immunogenicity of mRNA or its ability to encode immune-modulatory proteins. Self-replicating RNA vaccines have displayed increased immunogenicity and effectiveness after formulating the RNA in a cationic nanoemulsion based on the licensed MF59 (Novartis) adjuvant 50 . Another effective adjuvant strategy is TriMix, a combination of mRNAs encoding three immune activator proteins: CD70, CD40 ligand (CD40L) and constitutively active TLR4. TriMix mRNA augmented the immunogenicity of naked, unmodified, unpurified mRNA in multiple cancer vaccine studies and was particularly associated with increased DC maturation and cytotoxic T lymphocyte (CTL) responses (reviewed in Ref. 51). The type of mRNA carrier and the size of the mRNA–carrier complex have also been shown to modulate the cytokine profile induced by mRNA delivery. For example, the RNActive (CureVac AG) vaccine platform 52,53 depends on its carrier to provide adjuvant activity. In this case, the antigen is expressed from a naked, unmodified, sequence-optimized mRNA, while the adjuvant activity is provided by co-delivered RNA complexed with protamine (a polycationic peptide), which acts via TLR7 signalling 52,54 . This vaccine format has elicited favourable immune responses in multiple preclinical animal studies for vaccination against cancer and infectious diseases 18,36,55,56 . A recent study provided mechanistic information on the adjuvanticity of RNActive vaccines in mice in vivo and human cells in vitro 54 . Potent activation of TLR7 (mouse and human) and TLR8 (human) and production of type I interferon, pro-inflammatory cytokines and chemokines after intradermal immunization was shown 54 . 
A similar adjuvant activity was also demonstrated in the context of non-mRNA-based vaccines using RNAdjuvant (CureVac AG), an unmodified, single-stranded RNA stabilized by a cationic carrier peptide 57 .

Progress in mRNA vaccine delivery

Efficient in vivo mRNA delivery is critical to achieving therapeutic relevance. Exogenous mRNA must penetrate the barrier of the lipid membrane in order to reach the cytoplasm to be translated to functional protein. mRNA uptake mechanisms seem to be cell type dependent, and the physicochemical properties of the mRNA complexes can profoundly influence cellular delivery and organ distribution. Two basic approaches for the delivery of mRNA vaccines have been described to date: first, loading of mRNA into DCs ex vivo, followed by re-infusion of the transfected cells 58; and second, direct parenteral injection of mRNA with or without a carrier. Ex vivo DC loading allows precise control of the cellular target, transfection efficiency and other cellular conditions, but as a form of cell therapy, it is an expensive and labour-intensive approach to vaccination. Direct injection of mRNA is comparatively rapid and cost-effective, but it does not yet allow precise and efficient cell-type-specific delivery, although there has been recent progress in this regard 59 . Both of these approaches have been explored in a variety of forms (Fig. 2; Table 1).

Commonly used delivery methods and carrier molecules for mRNA vaccines along with typical diameters for particulate complexes are shown: naked mRNA (part a); naked mRNA with in vivo electroporation (part b); protamine (cationic peptide)-complexed mRNA (part c); mRNA associated with a positively charged oil-in-water cationic nanoemulsion (part d); mRNA associated with a chemically modified dendrimer and complexed with polyethylene glycol (PEG)-lipid (part e); protamine-complexed mRNA in a PEG-lipid nanoparticle (part f); mRNA associated with a cationic polymer such as polyethylenimine (PEI) (part g); mRNA associated with a cationic polymer such as PEI and a lipid component (part h); mRNA associated with a polysaccharide (for example, chitosan) particle or gel (part i); mRNA in a cationic lipid nanoparticle (for example, 1,2-dioleoyloxy-3-trimethylammoniumpropane (DOTAP) or dioleoylphosphatidylethanolamine (DOPE) lipids) (part j); mRNA complexed with cationic lipids and cholesterol (part k); and mRNA complexed with cationic lipids, cholesterol and PEG-lipid (part l).

Ex vivo loading of DCs. DCs are the most potent antigen-presenting cells of the immune system. They initiate the adaptive immune response by internalizing and proteolytically processing antigens and presenting them to CD8 + and CD4 + T cells on major histocompatibility complexes (MHCs), namely, MHC class I and MHC class II, respectively. Additionally, DCs may present intact antigen to B cells to provoke an antibody response 60 . DCs are also highly amenable to mRNA transfection. For these reasons, DCs represent an attractive target for transfection by mRNA vaccines, both in vivo and ex vivo.

Although DCs have been shown to internalize naked mRNA through a variety of endocytic pathways 61,62,63 , ex vivo transfection efficiency is commonly increased using electroporation; in this case, mRNA molecules pass through membrane pores formed by a high-voltage pulse and directly enter the cytoplasm (reviewed in Ref. 64). This mRNA delivery approach has been favoured for its ability to generate high transfection efficiency without the need for a carrier molecule. DCs that are loaded with mRNA ex vivo are then re-infused into the autologous vaccine recipient to initiate the immune response. Most ex vivo-loaded DC vaccines elicit a predominantly cell-mediated immune response; thus, they have been used primarily to treat cancer (reviewed in Ref. 58).

Injection of naked mRNA in vivo. Naked mRNA has been used successfully for in vivo immunizations, particularly in formats that preferentially target antigen-presenting cells, as in intradermal 61,65 and intranodal injections 66,67,68 . Notably, a recent report showed that repeated intranodal immunizations with naked, unmodified mRNA encoding tumour-associated neoantigens generated robust T cell responses and increased progression-free survival 68 (discussed further in Box 2).

Physical delivery methods in vivo. To increase the efficiency of mRNA uptake in vivo, physical methods have occasionally been used to penetrate the cell membrane. An early report showed that mRNA complexed with gold particles could be expressed in tissues using a gene gun, a microprojectile method 69 . The gene gun was shown to be an efficient RNA delivery and vaccination method in mouse models 70,71,72,73 , but no efficacy data in large animals or humans are available. In vivo electroporation has also been used to increase uptake of therapeutic RNA 74,75,76; however, in one study, electroporation increased the immunogenicity of only a self-amplifying RNA and not a non-replicating mRNA-based vaccine 74 . Physical methods can be limited by increased cell death and restricted access to target cells or tissues. Recently, the field has instead favoured the use of lipid or polymer-based nanoparticles as potent and versatile delivery vehicles.

Protamine. The cationic peptide protamine has been shown to protect mRNA from degradation by serum RNases 77; however, protamine-complexed mRNA alone demonstrated limited protein expression and efficacy in a cancer vaccine model, possibly owing to an overly tight association between protamine and mRNA 36,78 . This issue was resolved by developing the RNActive vaccine platform, in which protamine-formulated RNA serves only as an immune activator and not as an expression vector 52 .

Cationic lipid and polymer-based delivery. Highly efficient mRNA transfection reagents based on cationic lipids or polymers, such as TransIT-mRNA (Mirus Bio LLC) or Lipofectamine (Invitrogen), are commercially available and work well in many primary cells and cancer cell lines 9,13 , but they often show limited in vivo efficacy or a high level of toxicity (N.P. and D.W., unpublished observations). Great progress has been made in developing similarly designed complexing reagents for safe and effective in vivo use, and these are discussed in detail in several recent reviews 10,11,79,80 . Cationic lipids and polymers, including dendrimers, have become widely used tools for mRNA administration in the past few years. The mRNA field has clearly benefited from the substantial investment in in vivo small interfering RNA (siRNA) administration, where these delivery vehicles have been used for over a decade. Lipid nanoparticles (LNPs) have become one of the most appealing and commonly used mRNA delivery tools. LNPs often consist of four components: an ionizable cationic lipid, which promotes self-assembly into virus-sized (~100 nm) particles and allows endosomal release of mRNA to the cytoplasm; lipid-linked polyethylene glycol (PEG), which increases the half-life of formulations; cholesterol, a stabilizing agent; and naturally occurring phospholipids, which support lipid bilayer structure. Numerous studies have demonstrated efficient in vivo siRNA delivery by LNPs (reviewed in Ref. 81), but it has only recently been shown that LNPs are potent tools for in vivo delivery of self-amplifying RNA 19 and conventional, non-replicating mRNA 21 . Systemically delivered mRNA–LNP complexes mainly target the liver owing to binding of apolipoprotein E and subsequent receptor-mediated uptake by hepatocytes 82 , and intradermal, intramuscular and subcutaneous administration have been shown to produce prolonged protein expression at the site of the injection 21,22 . The mechanisms of mRNA escape into the cytoplasm are incompletely understood, not only for artificial liposomes but also for naturally occurring exosomes 83 . Further research into this area will likely be of great benefit to the field of therapeutic RNA delivery.

The magnitude and duration of in vivo protein production from mRNA–LNP vaccines can be controlled in part by varying the route of administration. Intramuscular and intradermal delivery of mRNA–LNPs has been shown to result in more persistent protein expression than systemic delivery routes: in one experiment, the half-life of mRNA-encoded firefly luciferase was roughly threefold longer after intradermal injection than after intravenous delivery 21 . These kinetics of mRNA–LNP expression may be favourable for inducing immune responses. A recent study demonstrated that sustained antigen availability during vaccination was a driver of high antibody titres and germinal centre (GC) B cell and T follicular helper (TFH) cell responses 84 . This process was potentially a contributing factor to the potency of recently described nucleoside-modified mRNA–LNP vaccines delivered by the intramuscular and intradermal routes 20,22,85 . Indeed, TFH cells have been identified as a critical population of immune cells that vaccines must activate in order to generate potent and long-lived neutralizing antibody responses, particularly against viruses that evade humoral immunity 86 . The dynamics of the GC reaction and the differentiation of TFH cells are incompletely understood, and progress in these areas would undoubtedly be fruitful for future vaccine design (Box 3).
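To make the half-life comparison concrete, here is a small sketch that assumes simple first-order (exponential) decay of the reporter signal. The absolute half-lives of 7 h and 21 h are hypothetical values chosen only to reproduce the roughly threefold ratio quoted above; the text does not give the measured numbers.

```python
def fraction_remaining(t_hours: float, half_life_hours: float) -> float:
    """Fraction of signal left after t hours under first-order decay."""
    return 0.5 ** (t_hours / half_life_hours)


# Hypothetical half-lives; only their 3:1 ratio comes from the text.
HALF_LIFE_INTRAVENOUS = 7.0
HALF_LIFE_INTRADERMAL = 21.0

for t in (7, 21, 42):
    iv = fraction_remaining(t, HALF_LIFE_INTRAVENOUS)
    idm = fraction_remaining(t, HALF_LIFE_INTRADERMAL)
    print(f"t={t:>2} h  intravenous={iv:.3f}  intradermal={idm:.3f}")
```

Under these assumptions, at the time when an eighth of the intravenous signal remains, half of the intradermal signal is still present, which is the sense in which a longer half-life prolongs antigen availability.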

Box 2: Personalized neoepitope cancer vaccines

Sahin and colleagues have pioneered the use of individualized neoepitope mRNA cancer vaccines 121 . They use high-throughput sequencing to identify every unique somatic mutation of an individual patient's tumour sample, termed the mutanome. This enables the rational design of neoepitope cancer vaccines in a patient-specific manner, and has the advantage of targeting non-self antigen specificities that should not be eliminated by central tolerance mechanisms. Proof of concept has been recently provided: Kreiter and colleagues found that a substantial portion of non-synonymous cancer mutations were immunogenic when delivered by mRNA and were mainly recognized by CD4 + T cells 176 . On the basis of these data, they generated a computational method to predict major histocompatibility complex (MHC) class II-restricted neoepitopes that can be used as vaccine immunogens. mRNA vaccines encoding such neoepitopes have controlled tumour growth in B16-F10 melanoma and CT26 colon cancer mouse models. In a recent clinical trial, Sahin and colleagues developed personalized neoepitope-based mRNA vaccines for 13 patients with metastatic melanoma, a cancer known for its high frequency of somatic mutations and thus neoepitopes. They immunized against ten neoepitopes per individual by injecting naked mRNA intranodally. CD4 + T cell responses were detected against the majority of the neoepitopes, and a low frequency of metastatic disease was observed after several months of follow-up 68 . Interestingly, similar results were also obtained in a study of analogous design that used synthetic peptides as immunogens rather than mRNA 177 . Together, these recent trials suggest the potential utility of the personalized vaccine methodology.

Box 3: The germinal centre and T follicular helper cells

The vast majority of potent antimicrobial vaccines elicit long-lived, protective antibody responses against the target pathogen. High-affinity antibodies are produced in specialized microanatomical sites within the B cell follicles of secondary lymphoid organs called germinal centres (GCs). B cell proliferation, somatic hypermutation and selection for high-affinity mutants occur in the GCs, and efficient T cell help is required for these processes 178 . Characterization of the relationship between GC B and T cells has been actively studied in recent years. The follicular homing receptor CXC-chemokine receptor 5 (CXCR5) was identified on GC B and T cells in the 1990s 179,180 , but the concept of a specific lineage of T follicular helper (TFH) cells was not proposed until 2000 (Refs 181, 182). The existence of the TFH lineage was confirmed in 2009 when the transcription factor specific for TFH cells, B cell lymphoma 6 protein (BCL-6), was identified 183,184,185 . TFH cells represent a specialized subset of CD4 + T cells that produce critical signals for B cell survival, proliferation and differentiation in addition to signals for isotype switching of antibodies and for the introduction of diversifying mutations into the immunoglobulin genes. The major cytokines produced by TFH cells are interleukin-4 (IL-4) and IL-21, which play a key role in driving the GC reaction. Other important markers and functional ligands expressed by TFH cells include CD40 ligand (CD40L), Src homology domain 2 (SH2) domain-containing protein 1A (SH2D1A), programmed cell death protein 1 (PD1) and inducible T cell co-stimulator (ICOS) 186 . The characterization of rare, broadly neutralizing antibodies to HIV-1 has revealed that unusually high rates of somatic hypermutation are a hallmark of protective antibody responses against HIV-1 (Ref. 187). 
As TFH cells play a key role in driving this process in GC reactions, the development of new adjuvants or vaccine platforms that can potently activate this cell type is urgently needed.

mRNA Splicing

Transcription and processing (which includes splicing) of the newly made mRNA occur in the nucleus of the cell. Once a mature mRNA transcript is made, it is transported to the cytoplasm for translation into protein.

Figure 1. (CC BY-NC-SA)

Most eukaryotic genes and their pre-mRNA transcripts contain noncoding stretches of nucleotides, regions that are not meant to be made into protein. These noncoding segments are called introns and must be removed before the mature mRNA can be transported to the cytoplasm and translated into protein. The stretches of DNA that do code for amino acids in the protein are called exons. During the process of splicing, introns are removed from the pre-mRNA by the spliceosome and exons are spliced back together. If the introns were not removed, the RNA would be translated into a nonfunctional protein. Splicing occurs in the nucleus before the RNA migrates to the cytoplasm. Once splicing is complete, the mature mRNA (containing uninterrupted coding information) is transported to the cytoplasm, where ribosomes translate the mRNA into protein.

A Detailed Look at mRNA Splicing

The pre-mRNA Transcript
The pre-mRNA transcript contains both introns and exons. The introns are removed during the process of splicing. In this example, the pre-mRNA contains two exons and one intron.

Introns contain several important and conserved sequences that guide the splicing process: a 5′ GU sequence (the 5′ splice site), an A branch site located near a pyrimidine-rich region (a region with many cytosine and uracil bases) and a 3′ AG sequence (the 3′ splice site).

The Spliceosome
A large ribonucleoprotein complex known as the spliceosome controls mRNA splicing. The spliceosome is composed of particles made up of both RNA and protein. These particles are called small nuclear ribonucleoproteins, or snRNPs (pronounced “snurps”) for short. The snRNPs recognize the conserved sequences within introns, bind them soon after the pre-mRNA is made, and initiate splicing.

The spliceosome is built in distinct steps. First, the U1 snRNP binds the 5′ splice site and the U2 snRNP binds the branch site.

A number of other snRNPs (U4, U6 and U5) bind the pre-mRNA transcript, forming the mature spliceosome complex. This causes the intron to form a loop and brings the 5′ splice site and 3′ splice site together.

Now that the spliceosome is assembled, splicing can begin. First, the 5′ end of the intron is cut. The 5′ GU end of the intron is then connected to the A branch site, which creates a lariat structure.

At this stage the U1 and U4 snRNPs are released and the 3′ splice site is cleaved. Once the intron has been fully cleaved, the two exons are attached to each other. The intron, in the form of a lariat, is released along with the U2, U5 and U6 snRNPs.

The intron will be degraded and the snRNPs are used again to splice other pre-mRNAs. The mature mRNA transcript is now ready to be exported to the cytoplasm for translation.
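The net result of the steps above can be sketched in a few lines of Python. This is a toy model under stated assumptions: the intron coordinates are supplied by the caller rather than discovered, the sequences `EXON1`, `INTRON` and `EXON2` are invented, and the snRNPs, the lariat and the branch site are not modelled at all. The function only checks the GU and AG boundaries and joins the exons.

```python
def splice(pre_mrna: str, intron_start: int, intron_end: int) -> tuple[str, str]:
    """Remove one intron (half-open interval) and join the flanking exons.

    Returns (mature_mrna, excised_intron).
    """
    intron = pre_mrna[intron_start:intron_end]
    if not (intron.startswith("GU") and intron.endswith("AG")):
        raise ValueError("intron must begin with GU and end with AG")
    mature = pre_mrna[:intron_start] + pre_mrna[intron_end:]
    return mature, intron


# Invented toy sequences: two exons separated by one intron.
EXON1 = "AUGGCAG"
INTRON = "GUAAGUACUAACUUUCUUUCAG"
EXON2 = "GUUCUGA"

pre_mrna = EXON1 + INTRON + EXON2
start = len(EXON1)
mature, excised = splice(pre_mrna, start, start + len(INTRON))
print(mature)
print(excised)
```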

Alternative Splicing

The example of a gene with a single intron and two exons used above is a very simple model of RNA splicing. Many genes contain multiple exons as well as multiple introns. A process known as alternative splicing allows for different combinations of exons to be included in the final mature mRNA, making different versions of proteins (called isoforms) that are all encoded by the same gene. Alternative splicing of mRNA allows for many proteins to be made, with different functions, all produced from a single gene. One of the most dramatic examples of alternative splicing is the Dscam gene in Drosophila melanogaster (a fruit fly). This single gene contains 116 exons! Some exons are always included, others may or may not be included. Over 18,000 different proteins from this single gene have been found in Drosophila! Theoretically, this system is capable of producing 38,016 different proteins all from a single gene!
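The theoretical figure of 38,016 can be reproduced with a short calculation. The cluster sizes below are not stated in this tutorial; they are the values commonly reported for Dscam, in which one exon is chosen from each of four clusters of mutually exclusive alternatives.

```python
import math

# Number of mutually exclusive alternatives in each variable exon cluster
# (commonly reported values for Drosophila Dscam, assumed here).
DSCAM_CLUSTERS = {"exon 4": 12, "exon 6": 48, "exon 9": 33, "exon 17": 2}

# One alternative is picked per cluster, so the choices multiply.
total_isoforms = math.prod(DSCAM_CLUSTERS.values())
print(total_isoforms)
```

Because the choice in each cluster is independent of the others, the counts multiply rather than add, which is why a modest number of alternative exons yields tens of thousands of possible isoforms.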

Below is an example of alternative splicing of a pre-mRNA transcript. In this case, there are two different, alternatively spliced mRNAs that can be made from this pre-mRNA. The two mature mRNAs can contain either the yellow or the green exon. This produces two distinct protein isoforms when the mRNAs are translated into protein.

mRNA Splicing Tutorial by Dr. Katherine Harris is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 3.0 Unported License.
