BIOCHEMISTRY DISCUSSION 9

37GeneExpressioninEukaryotes.pdf

All right - so now we’ve covered the basics of gene expression in RNA synthesis in prokaryotes. Now we’re going to move to eukaryotes, which are very similar but have a few significant differences.


Created by Brett Barbaro

Biochemistry: A Short Course

Fourth Edition

CHAPTER 37 Gene Expression in Eukaryotes

Tymoczko • Berg • Gatto • Stryer

© 2019 W. H. Freeman and Company.

Two haploid cells join to form a single diploid cell at fertilization. This joining initiates an astounding biochemical process—the parceling out of genetic information, in the process of development, to construct, for example, the person reading these words. The regulation of gene transcription is a key component of development. [Don W. Fawcett/Science Source.]

Here we have a composite photo of a human sperm and egg meeting. And in this picture is contained all of the genetic information that would be present in a human being. This cell will divide, and then divide again, and continue dividing until it is all of the cells that are in a human body. And all of these cells contain all of that genetic information. But the difference is - which genes are going to be expressed? A different set of genes gets expressed in the brain, versus the liver, versus the heart, versus the muscle. And a key component of which genes are expressed is their transcription. So in this chapter, we’re going to cover the mechanisms that regulate the transcription of these genes.


CHAPTER 37 Gene Expression in Eukaryotes

Eukaryotic cells, like our own, have 3 different types of RNA polymerases, named Polymerase I, II, and III. We’re going to pay special attention to Polymerase II, because that’s the one responsible for making messenger RNA, which gets translated into protein. So the regulation of those mRNAs is the most complex and, one might argue, the most important form of regulation in the cell. It’s basically what determines which genes are activated and expressed and which are not. Some of this regulation can be mediated by hormones, which get passed around the whole body and tell cells which genes to start expressing and which ones to stop. Another mechanism we’ll talk about is the addition of acetyl groups to histones. Remember - histones are the proteins that bind up the DNA, and the attachment of acetyl groups loosens the histone’s hold on the DNA and exposes it for transcription by the polymerase.


Chapter 37: Outline

37.1 Eukaryotic Cells Have Three Types of RNA Polymerases

37.2 RNA Polymerase II Requires Complex Regulation

37.3 Gene Expression Is Regulated by Hormones

37.4 Histone Acetylation Results in Chromatin Remodeling

Figure 37.1 Transcription and translation. (A) In prokaryotes, the primary transcript serves as mRNA and is used immediately as the template for protein synthesis. (B) In eukaryotes, mRNA precursors are processed and spliced in the nucleus before being transported to the cytoplasm for translation. [After J. Darnell, H. Lodish, and D. Baltimore, Molecular Cell Biology, 2nd ed. (Scientific American Books, 1990), p. 230.]

So one of the main differences between the prokaryotes and the eukaryotes is that the eukaryotes are often multicellular organisms. This means that they have very different cell types involved in the same organism, and therefore they need to regulate the transcription of the genes very carefully. The regulation of transcription in eukaryotes is much more complicated than in prokaryotes. You can see in the diagram on the left, {in prokaryotes} basically an mRNA gets transcribed from the DNA and then immediately starts being translated by the ribosomes into proteins. Whereas on the right hand side you can see there are a few more steps involved in the process. First, there is a transcription of the genes, and that is much more complicated than it is in prokaryotes. The second is that there is some processing that occurs to the mRNA before it even leaves the nucleus. And of course, the third thing is the nucleus itself separates the translation of the genes from the transcription. And this delay, including the processing of the RNA, is a time in which a great deal of the regulation takes place. We did discuss the processing of the RNA a little bit in previous lectures, when we were talking about the RNA editing, but we’ll get to that in the next chapter in more detail.


Differential Gene Regulation

• Multicellular organisms use differential gene regulation to generate different cell types.

• Gene expression in eukaryotes is influenced by three important characteristics:

1. More complex transcriptional regulation

2. RNA processing, including extensive processing of mRNA precursors

3. The nuclear membrane, which separates the site of RNA synthesis from that of protein synthesis

So there are 3 different types of RNA polymerase that we’re going to discuss, and they are very similar in structure (except that polymerase II, which is responsible for making mRNA, has a special domain for its regulation). They are all guided and regulated by promoters in the sequence of the DNA, just like in prokaryotes. But these eukaryotic promoters are a little bit more complicated, with different elements involved. Proteins bind to these promoter elements - in eukaryotes we call these proteins transcription factors - and those regulate the activity of the polymerases.


Section 37.1 Eukaryotic Cells Have Three Types of RNA Polymerases

• RNA synthesis is catalyzed by three RNA polymerases that differ in DNA substrate specificity, location, and sensitivity to the toxin α-amanitin.

• All of the polymerases are similar in structure, but RNA polymerase II has a unique domain, called the carboxyl-terminal domain, that plays an important regulatory role.

• Eukaryotic promoters, also called cis-acting elements, are more complicated than bacterial promoters. Each type of polymerase has distinct promoters.

• The promoters bind to proteins, called trans-acting elements or transcription factors, that regulate polymerase activity.

DID YOU KNOW? Cis-acting elements are DNA sequences that regulate the expression of a gene located on the same molecule of DNA.

Trans-acting elements are proteins that recognize cis-acting elements and regulate RNA synthesis. They are commonly called transcription factors.

Figure 37.2 RNA polymerase poison. α-Amanitin is produced by the poisonous mushroom Amanita phalloides, also called the death cap or the destroying angel. More than a hundred deaths result worldwide each year from the ingestion of poisonous mushrooms. [Jacana/Science Source.]

So we’ve mentioned that the RNA polymerases in eukaryotes are very similar. There’s Type I, II, and III. The Type I RNA polymerase is inside the nucleolus, {which is} inside the nucleus of the cell, and that’s where your rRNA (your ribosomal RNA) is made. The other 2 are outside of the {nucleolus – but still inside the nucleus!} and make mRNA and other regulatory RNAs, tRNAs, and some ribosomal RNA. We’ve mentioned that their structures are all very similar, but not the same. For example, there is a molecule called α-amanitin, which is produced by certain mushrooms, specifically the “death cap” mushroom, which you see pictured here. Don’t ever eat one of those! This α-amanitin will interfere with RNA polymerase II, but not so much with polymerase I or III.


Table 37.1 Eukaryotic RNA polymerases

http://www.rcsb.org/pdb/education_discussion/molecule_of_the_month/images/1k83.gif

ALL ARE IN THE NUCLEUS

So talking about the promoters that affect RNA polymerases.

RNA polymerase I, which is in the nucleolus and transcribes genes for ribosomal RNA, has a promoter called the ribosomal initiator element, or rInr. And that’s just upstream of the initiation site. And remember upstream means 5’ along the coding sequence. So a little bit before the coding sequence would start for that RNA. And then there’s another element, that is called the upstream promoter element (UPE), which is approximately 150 bp upstream.

RNA polymerase II, which makes mRNA in the nucleus, has a whole bunch of different promoters that are involved in regulating the transcription of these genes. But one important one is called the TATA box. And as you might guess, that is the DNA sequence T A T A. Which is similar to the bacterial promoter as well, and exists near the start site of transcription. But there are a number of sequences called enhancers that exist at varying distances from transcription start sites. Some of them can be quite far away.

RNA polymerase III actually has some promoters that are inside the genes to be transcribed.


RNA Polymerases

• RNA polymerase I, located in nucleoli, transcribes genes for rRNA. One promoter, the ribosomal initiator element (rInr), lies at the transcriptional start site, whereas the upstream promoter element (UPE) lies approximately 150 bp upstream.

• RNA polymerase II, the catalyst for mRNA synthesis, is controlled by a wide array of promoters, including the TATA box and enhancers.

• RNA polymerase III responds to promoters in the genes to be transcribed, such as those encoding tRNA and 5S ribosomal RNA.
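The three bullets above amount to a small lookup table. Here is a minimal sketch of that table in Python, assuming hypothetical field names (`location`, `products`, `promoter_notes`) and filling it only with what these notes state. It is a study aid, not a complete description of what each polymerase transcribes.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Polymerase:
    # Field names are hypothetical; the values paraphrase the lecture notes.
    location: str
    products: tuple
    promoter_notes: str


POLYMERASES = {
    "I": Polymerase(
        location="nucleolus",
        products=("rRNA",),
        promoter_notes="rInr at the start site; UPE about 150 bp upstream",
    ),
    "II": Polymerase(
        location="nucleus (outside the nucleolus)",
        products=("mRNA",),
        promoter_notes="wide array of promoters, including the TATA box; enhancers",
    ),
    "III": Polymerase(
        location="nucleus (outside the nucleolus)",
        products=("tRNA", "5S rRNA"),
        promoter_notes="promoter elements inside the transcribed gene",
    ),
}


def polymerases_for(product):
    """Return the names of polymerases whose listed products include `product`."""
    return sorted(name for name, p in POLYMERASES.items() if product in p.products)


print(polymerases_for("mRNA"))
print(polymerases_for("tRNA"))
```

The lookup matches product names exactly, so "rRNA" and "5S rRNA" are treated as separate entries, which mirrors how the slide lists them.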

Now let’s talk a little bit about the other types of RNA that exist. Several of them are used as regulatory elements for the expression of other genes. Small nuclear RNA {snRNA} is an important component of the RNA splicing machinery. We’ve talked about RNA splicing before, and we’re going to talk about it a little bit more later, but snRNA is actually a catalytic element in that machinery. Small nucleolar RNA {snoRNA}, which is also inside the nucleus, is important for rRNA biogenesis and for the modification of {rRNAs}. MicroRNAs {miRNAs} can pair with complementary mRNAs and cause them to be degraded or possibly expressed at higher or lower rates. Small interfering RNA {siRNA} is an antiviral defense, which involves pairing up with viral RNA that has been injected into the cell and then targeting it for destruction. And then there’s piwi-interacting RNA {piRNA} and long noncoding RNA {lncRNA}. Their functions are important for gene regulation, but they’re not terribly well understood yet. This is an active area of research. It’s also an extraordinarily complex area of research, because you have all kinds of RNA being produced at the same time, and some RNAs are interfering with others, and some RNAs are building structures with others. There is feedback between them, so it’s all very interwoven and complicated. So one of the things we’re really trying to figure out right now is all of the different ways that RNAs interact in order to regulate gene expression.


Table 37.2 Additional Classes of RNA

• A variety of small RNAs play important roles in processing and regulating the products of the three polymerases.

Figure 37.3 Common eukaryotic promoter elements. Each eukaryotic RNA polymerase recognizes a set of promoter elements—sequences in DNA that promote transcription. The RNA polymerase II promoter includes an initiator element (Inr) and may also include either a TATA box or a downstream promoter element (DPE). Separate from the promoter region, enhancer elements bind specific transcription factors. The RNA polymerase I promoter also consists of a ribosomal initiator (rInr) and an upstream promoter element (UPE). RNA polymerase III promoters consist of conserved sequences that lie within the transcribed genes.

So let’s talk a little bit about promoters. We see here on the top the TATA box - that’s the big promoter that everybody talks about more than anything else. It is very important. To the right of it you see an initiator element (Inr) and that sometimes exists. And we’ll go into the roles that these individual sequences play in RNA polymerase II promotion. But also see the enhancer there - there’s a double line, meaning that the enhancer can be quite far from the promoter sequence. And not only can it be upstream, but it can be downstream, even inside the genes that are to be transcribed. So enhancer elements encompass a very wide range of regulatory elements.

A second type of promoter can be the initiator element (Inr) combined with a downstream promoter element (DPE), which you see there in the second diagram.


Diagram of Common Eukaryotic Promoter Elements [diagram labels: Inr, rInr, Inr]

Downstream promoter element just means that it’s just downstream of the initiation site.

The RNA polymerase I promoter usually involves an initiator element, the rInr, which stands for ribosomal initiator, because it’s ribosomal RNA that is made by RNA polymerase I. There’s also an upstream promoter element. And the ribosomal RNA can be several different sequences that are transcribed at once and then processed afterward.

The promoters for RNA polymerase III are found inside the genes to be transcribed, and sometimes they’re close to the front of the gene, sometimes they’re spread close to the front and close to the end. But we’re not going to go into too much detail on all of this. This is also not a complete list of all of the different sort of promoter elements. So it’s better just to get an idea of the ways all of the promoter elements can show up in genes. And yes, the promoter elements would be considered part of the gene they are promoting.


So let’s talk specifically about RNA polymerase II, because that’s the one that transcribes the mRNAs, and the mRNAs are the ones that eventually turn into proteins. So these need to be very carefully regulated. As we mentioned, the TATA box, which is the sequence T A T A, is located about negative 24 to negative 32 base pairs upstream of the initiation site. And the initiator element is very close to the initiation site – it can actually correspond with the initiation site between negative 3 and positive 5. So zero is where the initiation site would be, and negative would be upstream, positive downstream. Speaking of downstream, the downstream core promoter element {DPE} can be 28 to 35 base pairs downstream, and is often there when there is no TATA box. A couple of other well known regulatory elements are the CAAT box and the GC box, which are quite a bit more upstream, negative 40 to negative 150 base pairs upstream. The GC box is common in genes that are continuously expressed, and there are several of those, which are important for the basic functions of the cells. And the CAAT box and GC box can be on either strand of the DNA, which is a little bit different than some of the other elements, which need to be on one strand or the other.


Section 37.2 RNA Polymerase II Requires Complex Regulation

Learning objective 3: Describe how transcription is regulated in eukaryotes. • Common promoters for RNA polymerase II include:

1. The TATA box, which is located around −25 bp upstream of the initiation site.

2. The initiator element (Inr), located around +1, is often paired with the TATA box.

3. The downstream core promoter element (DPE), located around +30, works in cooperation with Inr when the TATA box is absent.

4. Other regulatory elements are the CAAT box and GC box, located between −40 and −150 bp upstream. The GC box is common in genes that are continuously expressed. The CAAT box and the GC box can be located on either strand of the DNA.
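The position ranges quoted for these elements can be written down as simple windows relative to the transcription start site. Below is a minimal sketch, assuming the approximate ranges given in the lecture narration (TATA box −32 to −24, Inr −3 to +5, DPE +28 to +35, CAAT and GC boxes −150 to −40). The names `ELEMENT_WINDOWS` and `elements_at` are made up for illustration, and real promoters vary from these figures.

```python
# Approximate windows in bp relative to the initiation site, as quoted in the
# lecture. Negative is upstream, positive is downstream. Illustrative only.
ELEMENT_WINDOWS = {
    "CAAT/GC box": (-150, -40),
    "TATA box": (-32, -24),
    "Inr": (-3, 5),
    "DPE": (28, 35),
}


def elements_at(position):
    """Return the elements whose approximate window contains `position`."""
    return [
        name
        for name, (low, high) in ELEMENT_WINDOWS.items()
        if low <= position <= high
    ]


print(elements_at(-25))   # inside the TATA box window
print(elements_at(30))    # inside the DPE window
print(elements_at(-10))   # between the TATA box and Inr windows
```

This only answers "which element would you expect around here?". It says nothing about whether the sequence at that position actually matches the element's consensus.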

Figure 37.4 The TATA box. Comparisons of the sequences of more than 100 eukaryotic promoters led to the consensus sequence shown. The subscripts denote the frequency (%) of the base at that position.

Let’s talk a little bit about the TATA box. The sequence T A T A is where the TATA box gets its name from. There are a couple of other A’s that are often found on the downstream side of the TATA box. And why they call it a box, I don’t know, it doesn’t really look boxy at all. It’s just a common way of referring to a short sequence of DNA that shows up frequently. You’ll see along the top the percentages of the time that the individual bases are present in the DNA sequence - because, remember, these are consensus sequences, so it’s not always TATA, but 82% of the time the first letter is a T, 97% of the time the second letter is an A, so that one’s very important. The third one is a T 93% of the time, so that one’s obviously very important too, and on down the line. There’s another common way of showing these consensus sequences, and it’s actually much more common than the method shown at the top of the page. You can see here a representation of the TATA box, and the size of the letter tells you how frequently it’s found in that position. So this diagram would indicate that the A T A in the 2, 3, 4 positions are the most common, and then the T in the 1 position, and then the A and the A in the 5 and 6 positions are less common, but still quite common. Now if there is no T in the first position, what’s the most likely other one? That would be the G, ‘cause you see it squished down there at the bottom. And likewise for positions 5 and 6, you would see that G and T are the next most likely bases at those positions. I’m not totally sure why the numbers don’t really correspond to the ones at the top - these could be in different organisms, or could be referring to a subset of genes in the same organism. There are a lot of reasons why that might be different.

Diagram of the TATA Box

The TATA box is located between positions -24 and -32 bp upstream of the initiation site.

This is a common way of representing a consensus sequence:

http://genomics.imim.es/~malba/MASTER/TATA_logo.png
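Both the percentages in the figure and the letter sizes in the logo come from the same bookkeeping: line up many promoter sequences and count which base appears in each column. Here is a minimal sketch of that counting. The five aligned sequences are invented for illustration (they are not real promoter data), and the function names are hypothetical.

```python
from collections import Counter

# Hypothetical aligned promoter fragments, made up purely for this example.
ALIGNED = ["TATAAA", "TATAAA", "TATATA", "GATAAA", "TATAAG"]


def position_frequencies(seqs):
    """Return, for each aligned column, a dict of base -> percentage."""
    length = len(seqs[0])
    if any(len(s) != length for s in seqs):
        raise ValueError("sequences must all have the same aligned length")
    table = []
    for column in zip(*seqs):
        counts = Counter(column)
        table.append({base: 100 * n / len(seqs) for base, n in counts.items()})
    return table


def consensus(seqs):
    """Pick the most frequent base in each column."""
    return "".join(max(col, key=col.get) for col in position_frequencies(seqs))


for i, col in enumerate(position_frequencies(ALIGNED), start=1):
    print(i, col)
print(consensus(ALIGNED))
```

In the invented data, the first column is mostly T with an occasional G, which is the same kind of pattern the logo shows with a big T and a small G squished underneath.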


Figure 37.5 The CAAT and GC boxes. Consensus sequences for the CAAT and GC boxes of eukaryotic promoters for mRNA precursors. N signifies that any nucleotide can occupy the position.

Here’s just a quick look at the CAAT box and GC box, and some of the surrounding consensus elements. You’ll see an N up at the top, next to the CAAT box, that means that it could be any nucleotide. There’s no preferred nucleotide, at that position, but the other positions do have the preferred nucleotides that are indicated.


Diagram of the CAAT and GC Boxes

Now the proteins that bind to these promoter sequences are called “transcription factors”. And there are a whole bunch of them. We have the ones that are associated with RNA polymerase II, and those are known as the TFII transcription factors. And there are TFIIA, TFIIB…the one that binds directly to the TATA box is part of TFIID, and it’s called the TATA box binding protein, or TBP. And that one’s obviously very important for the transcription of a lot of genes. There are also a lot of other transcription factors that we’re not going to get into. There’s a whole mess of them! But we’re not going to discuss all of them in detail, it’s just good to know that there are all of these different ones, such as the transcription factor TFIIH, which binds to the whole complex, and phosphorylates the carboxyl terminal domain of the polymerase.


The Transcription Factor IID Protein Complex Initiates the Assembly of the Active Transcription Complex

• Proteins called transcription factors bind to promoters to regulate gene expression. The set of transcription factors for RNA polymerase II are collectively known as TFII and individually as TFIIA, TFIIB, etc.

• In genes containing a TATA box, the TATA-box-binding protein (TBP), a component of TFIID, binds to the TATA box, nucleating the formation of the preinitiation complex (PIC). Other transcription factors bind to generate the PIC.

• TFIIH binds to the complex, completing the formation of the PIC. TFIIH is an ATP-dependent helicase that unwinds the DNA in order for transcription to occur. TFIIH also phosphorylates the carboxyl-terminal domain of the polymerase, facilitating the transition from initiation phase to the elongation phase.

Figure 37.6 Transcription initiation. Transcription factors TFIIA, B, D, E, F, and H are essential in initiating transcription by RNA polymerase II. The step-by-step assembly of these general transcription factors begins with the binding of TFIID (purple) to the TATA box. (The TATA-box-binding protein, or TBP, a component of TFIID, recognizes the TATA box.) TFIIA then joins the complex followed by TFIIB. TFIIF, RNA polymerase II, TFIIE, and TFIIH bind the complex sequentially. After this assembly, TFIIH opens the DNA double helix and phosphorylates the carboxyl-terminal domain (CTD), allowing the polymerase to leave the promoter and begin transcription. The red arrow marks the transcription start site.

So here’s just an example of how these things might work. You’ve got your TATA box up at the top, and the TBP (TATA box binding protein) binds to that, and it’s an element of the transcription factor IID. The next to attach is A, and then B, and then you’ve got F, RNA polymerase II itself, E, and H. So they don’t attach in alphabetical order, like you’d think they might. Once this has all been assembled, TFIIH opens up the DNA and phosphorylates the carboxyl-terminal domain, and then the polymerase leaves the promoter and begins to transcribe the DNA.
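The assembly order in Figure 37.6 can be captured as an ordered list with a small helper that reports what is expected to bind next. This is a minimal sketch that treats the assembly as strictly sequential, which is a simplification of the figure caption. `ASSEMBLY_ORDER` and `next_component` are hypothetical names for this example.

```python
# Binding order as described in the Figure 37.6 caption. Treating it as a
# strict sequence is an assumption made for this sketch.
ASSEMBLY_ORDER = [
    "TFIID",
    "TFIIA",
    "TFIIB",
    "TFIIF",
    "RNA polymerase II",
    "TFIIE",
    "TFIIH",
]


def next_component(bound):
    """Return the next component expected to bind, or None if the complex is complete."""
    if len(bound) > len(ASSEMBLY_ORDER):
        raise ValueError("more components listed than the complex contains")
    for expected, actual in zip(ASSEMBLY_ORDER, bound):
        if expected != actual:
            raise ValueError(f"{actual!r} bound out of order; expected {expected!r}")
    if len(bound) == len(ASSEMBLY_ORDER):
        return None
    return ASSEMBLY_ORDER[len(bound)]


print(next_component([]))
print(next_component(["TFIID", "TFIIA", "TFIIB"]))
print(next_component(ASSEMBLY_ORDER))
```

Once the helper returns `None`, every listed component is bound, which corresponds to the point where TFIIH can open the DNA and phosphorylate the carboxyl-terminal domain.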


Diagram of Transcription Initiation

Figure 37.7 The complex formed by the TATA-box-binding protein and DNA. The saddlelike structure of the protein sits atop a DNA fragment. Notice that the DNA is significantly unwound and bent. [Drawn from 1CDW.pdb by Adam Steinberg.]

Here’s an up-close look at the TATA box binding protein, the TBP, bound to the DNA. This is the figure that is in the text, but I don’t think it shows the bending of the DNA very well. So let’s take a look at the next one.


Diagram of the Complex Formed by the TATA-Box-Binding Protein and DNA

Here you can clearly see that when TATA box binding protein is bound to the DNA, it bends it almost at a 90 degree angle it looks like.


Here are space-filling and stick representations of that binding process. Up at the top, you can see 4 or 5 lysine or arginine residues, which are colored dark blue. And remember - lysine and arginine residues tend to be positively charged. The backbone of the DNA is negatively charged, so the lysine and arginine facilitate the interaction between the backbone of the DNA and the binding protein. In the bottom diagram there, in the stick representation, you can see there are 4 phenylalanine residues that jam into the minor groove of the DNA and cause it to kink, which makes it bend like it does. There are also 2 asparagine residues that form hydrogen bonds at the center of the molecule. Remember, this is actually a very weakly bound portion of the DNA, because it’s made up entirely of A-T base pairs, which have only 2 hydrogen bonds between the bases, as opposed to G-C base pairs, which have 3 hydrogen bonds between them.
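The point about weak pairing in the TATA box is just arithmetic: 2 hydrogen bonds per A-T pair versus 3 per G-C pair. Here is a minimal sketch of that count. The function name is hypothetical, and the GC-rich comparison sequence is an arbitrary example string, not a sequence taken from these notes.

```python
# Hydrogen bonds contributed by the base pair that each base takes part in.
H_BONDS = {"A": 2, "T": 2, "G": 3, "C": 3}


def hydrogen_bonds(seq):
    """Total Watson-Crick hydrogen bonds holding a duplex of this sequence together."""
    return sum(H_BONDS[base] for base in seq.upper())


print(hydrogen_bonds("TATAAA"))  # six A-T pairs
print(hydrogen_bonds("GGGCGG"))  # six G-C pairs, arbitrary comparison
```

For two stretches of the same length, the all A-T one has two thirds as many hydrogen bonds as the all G-C one, which fits with the TATA region being an easier place to bend and open the helix.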


Now let’s talk a little bit more about enhancer sequences. Those can be very far away from the transcription start site - but that’s far away in the sequence. They are actually folded back, so that they interact with the transcription machinery - and so, in 3-dimensional space, even though the enhancer sites might be far away from the transcription start site, in 3-dimensional space they are right next to each other. And just like promoter sequences, enhancer sequences have specific proteins that bind to them and affect the transcription of the DNA.


Enhancer Sequences Can Stimulate Transcription at Start Sites Thousands of Bases Away

• Enhancer sequences are cis-acting elements that have no promoter activity but can stimulate the effectiveness of promoters even when located thousands of nucleotides from the start site.

• Enhancers operate in conjunction with specific enhancer-binding proteins called transcription activators.

Now, enhancers can be extremely important in DNA transcription, and can really affect the rate at which the DNA is transcribed - and that can cause some serious problems if the DNA gets rearranged. And sometimes that happens. There are events called chromosomal translocations, where 2 chromosomes will actually recombine, and part of one chromosome will switch places with part of another chromosome, as you’ll see in the diagram on the bottom. You’ll see chromosomes 8 and 14, when they’re normal, have the orange and blue sections at the bottom. And those orange and blue sections can be swapped. Now, normally, on chromosome 14, there is the immunoglobulin H enhancer, which is responsible for enhancing the expression of genes in that blue portion. That is a very powerful enhancer, so it causes lots of production of that immunoglobulin. If that blue portion gets switched with the portion carrying the myc gene, then the enhancer starts enhancing the expression of the myc gene. And the myc gene encodes a transcription factor itself, so that can have a lot of downstream effects. If you produce a whole lot of this transcription factor, you are going to up-regulate the transcription of all of the genes that it’s responsible for. In this case, this often leads to cancers, such as Burkitt’s lymphoma and B-cell leukemia. So enhancer elements can be really important.


Clinical Insight: Inappropriate Enhancer Use May Cause Cancer


• Chromosomal translocations can sometimes place a gene under the control of a powerful enhancer.

• For instance, dysregulation of the gene myc, a transcription factor, resulting from the translocation of an enhancer to a region near the myc gene, plays a role in the development of Burkitt lymphoma and B-cell leukemia.

http://image.slidesharecdn.com/20071002-120823040324-phpapp01/95/20071002-47-728.jpg?cb=1345694932

So quick quiz: What is the difference between a promoter and an enhancer? I’ll give you a second. The main difference is that promoters are very close to the transcription start site, while enhancers can be far away. Of course, promoters also bind to different proteins than enhancers do. Promoters are often responsible for the assembly of the RNA polymerase complex, whereas enhancers are more responsible for triggering the beginning of RNA polymerization, but are not responsible for the actual assembly of the polymerase.


Quick Quiz 1

QUICK QUIZ 1 Differentiate between a promoter and an enhancer.

Much like we discussed with DNA replication and synthesis, there are a whole bunch of proteins involved in RNA synthesis, more so than in DNA synthesis. All of these proteins come together to regulate the speed at which genes are transcribed, and also which genes are transcribed. The existence of all of these regulatory factors means that there can be very specific control over these genes. You can have 9 out of 10 elements necessary for the transcription of a gene, but if that 10th element is not there, the gene won’t be transcribed. So it’s basically waiting for everything to be in the right place at the right time, and then the gene will be expressed. An example of this might be cell division. Before dividing, the cell needs to be a certain size, it needs to have all the DNA inside of it replicated, and a great deal of the other elements have to be assembled in the right place. Each one of these factors results in a specific biochemical environment in the cell, meaning that there are proteins that are present or absent depending on whether or not these specific conditions have been met. When all of the conditions are right, all of the proper proteins will be in place, the division-specific genes will be expressed, and the cell will divide. We’ll talk a little bit about a large complex called mediator, which sits between the polymerase and the enhancer elements and incorporates a lot of these signals. It’s also important to realize that a given regulatory factor can have a different effect based on the other proteins that are present. So if proteins A, B, and C are present, perhaps the gene would be transcribed, and if proteins D, E, and F are present then the gene would perhaps be repressed. It’s the combination of all of the proteins that are together at that site that decides whether or not the gene will be transcribed. So we call this combinatorial control.

Multiple Transcription Factors Interact with Eukaryotic Promoters and Enhancers

• In addition to TFII, other proteins play a role in regulating the efficiency and specificity of gene transcription.

• These factors can stimulate or repress the transcription of specific genes.

• A large complex called mediator acts as a bridge between enhancer-bound activators and proteins, including the polymerase, at the promoter.

• A given regulatory factor can have different effects on transcription depending upon the nature of the other components of the regulatory complex, a phenomenon called combinatorial control.
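Combinatorial control can be pictured as a decision that depends on the whole set of bound proteins rather than on any single one. Here is a minimal sketch using the lecture's hypothetical proteins A through F. The rule that a complete repressing set wins over a complete activating set is an assumption added so that the example is well defined. It is not something stated in the lecture.

```python
# Hypothetical factor sets taken from the lecture's A/B/C versus D/E/F example.
ACTIVATING = frozenset({"A", "B", "C"})
REPRESSING = frozenset({"D", "E", "F"})


def transcription_outcome(bound):
    """Classify a gene's state from the set of proteins bound at its control region."""
    bound = set(bound)
    # Assumption for this sketch: a complete repressing set overrides activation.
    if REPRESSING <= bound:
        return "repressed"
    if ACTIVATING <= bound:
        return "transcribed"
    # Anything short of a complete set leaves the gene off, like the
    # "9 out of 10 elements" case described in the lecture.
    return "inactive"


print(transcription_outcome({"A", "B", "C"}))
print(transcription_outcome({"A", "B"}))
print(transcription_outcome({"D", "E", "F"}))
```

The middle call is the interesting one: two of the three activating proteins are present, but the gene stays off until the last one arrives.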


Figure 37.8 Mediator. A large complex of protein subunits, mediator acts as a bridge between transcription factors bearing activation domains and RNA polymerase II. These interactions help recruit and stabilize RNA polymerase II near specific genes that are then transcribed.

In the diagram on the left we see the mediator complex represented by 5 blue blobs. That’s a fair representation perhaps, because the mediator complex could consist of a very wide variety of possible elements. You can see on the right there’s one arrangement that contains at least 20 different elements. And you’ll see all the different elements at the transcription start site with the RNA polymerase, and also you can see the transcription factors that are upstream of the start site. Those would be the enhancer elements. Folding over on the top, an enhancer element then interacts with the mediator, and triggers the formation of the active complex and starts RNA polymerase doing its job. An excellent video of this process is available on YouTube, and there’s a link at the bottom of the slide that I highly recommend that you check out.


Diagram of Mediator

http://mol-biol4masters.masters.grkraj.org/html/Gene_Expression_II5B-Mechanism_of_Transcription_files/image023.jpg

Video of gene transcription: https://www.youtube.com/watch?v=SMtWvDbfHLo

Now, you’ve probably heard of induced pluripotent stem cells. You’ve almost certainly heard of stem cells. But this is a good example of how powerful these transcription factors can be. By taking a differentiated cell, such as a skin cell or a muscle cell, you can introduce only 4 specific transcription factors - Oct4 {wrong in audio}, Sox2, cMyc, and Klf4 - and the introduction of those transcription factors will transform the cell into a pluripotent stem cell, which can then be differentiated into almost any tissue in the human body. This can be really helpful if, for example, you want to grow a new patch of skin for somebody who has terrible burns, or perhaps reconstruct an organ (this is technology which is in development right now). But mostly the benefit of having induced pluripotent stem cells is that you would be making these new tissues out of your own tissues. One of the biggest problems that exists with transplants of organs is that when you introduce a foreign organ into someone’s body, their system will {often} reject it because it’s foreign (it’s what we’re designed to do). If we are able to make the organs out of our own cells, then our bodies will not reject them. There’s no threat of rejection. Another advantage is that you don’t need to use embryonic stem cells, which cause all of the uproar over the moral concerns people have about ending an embryo’s life. So induced pluripotent stem cells are a very promising medical technology that’s in development right now. And the transformation can be triggered simply by 4 transcription factors. Perhaps an important thing to note, in that little box there on the right, is that pluripotent stem cells can develop into any type of human tissue, but not into the extraembryonic tissues that are necessary for an embryo to form. So there is no ability to create a new human or clone a person off of these pluripotent stem cells at this time. Totipotent stem cells, on the other hand, have the ability to grow into these extraembryonic tissues, and can therefore grow into an entire organism. So a fertilized egg, for example, would be considered a totipotent stem cell.

Clinical Insight: Induced Pluripotent Stem Cells Can Be Generated by Introducing Four Transcription Factors into Differentiated Cells

• Pluripotent stem cells can differentiate into a variety of cells upon appropriate stimulation.

• Induced pluripotent stem cells (iPS) are generated from differentiated cells by the insertion of genes for only four specific transcription factors: Oct4, Sox2, cMyc, and Klf4.

• The iPS cells are a powerful research tool and may have therapeutic value.

DID YOU KNOW? Pluripotent cells are stem cells that can develop into any type of fetal or adult cell. Totipotent stem cells not only can develop into any fetal or adult cell type but also can develop extraembryonic tissues and thus grow into an entire organism.

23

Now, a lot of gene expression is regulated by hormones. And we’ll talk specifically about steroid hormones. But hormones are basically molecules that are made in one part of the body and then delivered to other parts of the body, usually by the bloodstream. Steroid hormones, such as estradiol, which we see on the right, have a familiar shape - they’re all derived from cholesterol - and they have the advantage of being able to diffuse through membranes. So they can enter cells, and also the nucleus, without having to be specifically carried in by transport proteins. This makes the steroid hormones very good for regulating gene expression, because they can get into the nucleus rather quickly. Once they’re in the nucleus, they need to interact with proteins there, and those proteins are called nuclear hormone receptor proteins. Estradiol is one of the hormones responsible for female secondary sex characteristics, and its receptor inside the nucleus is called the estrogen receptor.

Section 37.3 Gene Expression Is Regulated by Hormones

• Steroid hormones are powerful regulatory molecules that control gene expression.

• 17β-Estradiol controls the genes in the development of female secondary sex characteristics.

• 17β-Estradiol exerts its effects by forming a complex with a specific receptor protein called the estradiol receptor.

• The estradiol receptor is part of a larger class of regulatory proteins called nuclear hormone receptors, all of which are activated by binding of small molecules or ligands.

I think it’s important to realize also that these hormone receptors in the nucleus are a subclass of transcription factors. So these are a group of transcription factors that have similar characteristics. They all bind to hormones, and then they bind to the DNA. They bind to the DNA at enhancer-like sites, but for this particular class of receptors, we call those sites “response elements”. So the estrogen receptor binds to the estrogen response element (ERE) on the DNA, which is probably best thought of as an example of an enhancer region. A couple of very conserved domains in these nuclear hormone receptors are: 1) The DNA binding domain, which is toward the center of the primary structure (remember - the primary structure is just the list of amino acids in order). The DNA binding domain has these zinc-finger domains, which incorporate the metal zinc and look like fingers, and those are especially well suited for binding DNA. We’ll go into a little more detail on those in a minute. 2) The ligand binding domain, which is the domain that interacts with the hormone, and which is always toward the carboxyl terminus of the primary structure. When the hormone binds, it induces a conformational change in the protein that enables it to interact with other proteins and regulate transcription.

Nuclear Hormone Receptors Have Similar Domain Structures

• Nuclear hormone receptors bind to specific regions of the DNA called response elements. Thus, the estrogen receptor binds to the estrogen response element (ERE).

• Nuclear hormone receptors have four highly conserved domains:

1. The DNA binding domain lies toward the center of the primary structure and is characterized by zinc-finger domains that confer specific DNA binding.

2. The ligand binding domain that contains an activation domain lies toward the carboxyl terminus of the primary structure. Ligand binding causes a structural change that enables the receptor to recruit other proteins to regulate transcription.

3. The amino terminal activation domain, which, along with the other activation domain, enables the receptor to interact with other proteins.

4. The hinge domain, which contains a nuclear localization signal.

Figure 37.9 The structure of the estradiol receptor (A) and (B). Nuclear hormone receptors contain four domains: (1) an N-terminal activation domain, (2) a DNA-binding domain, (3) a hinge domain, and (4) a ligand-binding domain that also contains a second activation domain. The structure of a dimer of the DNA-binding domain bound to DNA is shown, as is one monomer of the normally dimeric ligand-binding domain. [Drawn from 1HCQ and 1LBD.pdb.]

Here’s a diagram of the primary sequence of the protein, the estradiol receptor (that’s along the bottom). And you can see the center part, which is colored pink, is the DNA binding domain. That folds up into these structures called zinc fingers. And what you’re looking at here, on the upper left, is 2 different molecules - it’s a dimer, as these types of molecules often are - and the zinc fingers are interacting with the DNA in the major groove. On the right we see the ligand binding domain, which is toward the far end of the primary sequence, at the C-terminal end. The ligand binding domain folds up into a number of α-helices, and there’s a pocket in there that fits the ligand closely. In this case we’re talking about estradiol. I believe that both of these structures actually come from that protein.

Diagram of the Structure of Two Nuclear Hormone-Receptor Domains

Figure: You can look at the DNA-binding part of the estrogen receptor in PDB entry 1hcq. The receptor binds to DNA using two "zinc fingers." These are small domains built around zinc ions. Four cysteine amino acids (shown in yellow here) surround each zinc ion (shown in green), forming a sturdy core that gives the domain a rigid structure. The receptor places an alpha helix in the major groove of DNA; in this picture, we are looking straight down the helix. Several amino acids on one side of this helix reach out and grip the edges of the base pairs, making sure that the DNA is of the appropriate sequence. In the lower right is a space-filling representation of the estrogen receptor bound to a short piece of DNA. Flexible portions of the protein that are not included in the structures are shown schematically with dots.
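The "four cysteines around each zinc ion" idea can be turned into a rough sequence search. What follows is only a sketch under stated assumptions: the spacing pattern (two cysteines three positions apart, a loop, then two more cysteines three positions apart) and the loop length bounds are illustrative choices, not values given on the slides, and the toy sequence is invented rather than taken from a real receptor. A hit is only a candidate; it says nothing about whether the region really folds around a zinc ion.

```python
import re

# ASSUMPTION: a Cys4-style spacing of C-X2-C, a loop of 10 to 20 residues,
# then C-X2-C. These bounds are illustrative, not taken from the slides.
CYS4_PATTERN = re.compile(r"C.{2}C.{10,20}C.{2}C")


def find_cys4_candidates(sequence):
    """Return (start, end, matched_text) for each non-overlapping candidate.

    Indices are 0-based, with the end index exclusive (Python slicing style).
    """
    return [(m.start(), m.end(), m.group())
            for m in CYS4_PATTERN.finditer(sequence)]


# Invented toy sequence (hypothetical, not a real protein): two flanking
# residues, a Cys pair, a 13-residue glycine loop, a second Cys pair.
TOY_SEQUENCE = "MK" + "CAAC" + "G" * 13 + "CAAC" + "LE"

if __name__ == "__main__":
    for start, end, text in find_cys4_candidates(TOY_SEQUENCE):
        print(f"candidate at {start}-{end}: {text}")
```

A regular expression is used here because the feature being described is purely a spacing rule; anything that depends on three-dimensional structure would need the actual coordinates, such as those in the PDB entry named in the figure note.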

So just to get a little closer look at the zinc fingers interacting with the DNA - we can see on the left the same diagram that we saw on the last slide (but it’s been turned on its side), and the corresponding diagram on the right, showing a more space-filling model, and also the specific interactions with the DNA made by the amino acids on the zinc finger recognition helix, which is kind of fascinating. Note there {on the right} that you are looking down the axis of the alpha helix, and you see the amino acid side chains reaching out like fingers to touch the major groove of the DNA. Zinc ions, in green on that diagram, are holding these structures in place. So this is a specific sequence of amino acids that tends to fold up into this specific structure and interact directly with DNA and zinc ions. And this is a common motif that you’ll see in a lot of these proteins. You can make small changes to the amino acid sequence in the recognition helix to change which DNA sequences it will bind, and thus which genes will be affected by the binding of these proteins. On the bottom right, we see one of David Goodsell’s illustrations, which includes the DNA binding domain and also the ligand binding domain on the left. Remember this is a dimeric molecule. And, interestingly, it looks like the DNA binding domain and the ligand binding domain find themselves on opposite sides of the DNA. You can see in purple where the ligands would bind in the ligand binding site.

Figure 37.10 Ligand binding to nuclear hormone receptor. The ligand binding causes structural alteration in the receptor so that the hormone lies completely surrounded within a pocket in the ligand-binding domain. Notice that helix 12 (shown in purple) folds into a groove on the side of the structure on ligand binding. [Drawn from 1LBD and 1ERE.pdb.]

So what happens when ligands bind to the ligand binding domain? Well, I said there was a conformational change that took place, and here it is. You have on the left an inactive form of the ligand binding domain, and the addition of estradiol into the ligand binding pocket causes helix 12, which is illustrated in purple there on the left, to fold up against the globular region of the rest of the protein, and that exposes new sites that can interact with other proteins.

Diagram of Ligand Binding to Nuclear Hormone Receptor

There are 2 general ways that nuclear hormone receptors can alter transcription. In the first, they change conformation when bound to the hormone and recruit other proteins, called coactivators, which stimulate transcription. In the second, some receptors actually inhibit transcription when they are not bound to a hormone. When the ligand binds, that inhibition is relieved, and transcription can occur.

Nuclear Hormone Receptors Recruit Coactivators and Corepressors

• Nuclear hormone receptors can alter transcription in two general ways:

1. The complex of receptor and ligand can recruit proteins, called coactivators, that stimulate transcription.

2. In the unbound form, some receptors bind to corepressors and inhibit transcription. Upon ligand binding, the repression is relieved and transcription occurs. The ligand-bound form of the receptor may then bind to a coactivator.

Figure 37.11 Coactivator recruitment. The binding of ligand to a nuclear hormone receptor induces a conformational change in the ligand-binding domain. This change in conformation generates favorable sites for the binding of a coactivator.

In the case of estrogen, you see, on the upper left, there’s the unbound form, and that α-helix is sticking out. And you can see that clearly in the diagram on the lower left. When estrogen gets attached, then the α-helix that’s sticking out folds in, and coactivators are able to interact with the estradiol receptor.

Diagram of Coactivator Recruitment

http://www.rcsb.org/pdb/education_discussion/molecule_of_the_month/images/1hcq-1a52.gif

Now these steroid-hormone receptors, because they are so important in regulating gene expression, are very good targets for drugs. Drugs that activate a nuclear hormone receptor are called agonists, and drugs that inactivate or shut down hormone receptors are called antagonists. Both of these would probably mimic the original structure of the hormone to a large extent, and probably bind in the ligand binding pocket. An agonist would then cause the same conformational change that the hormone would cause, whereas an antagonist would prevent that change from taking place. 2 examples of antagonists that we see below are tamoxifen and raloxifene. Those are similar enough in structure to estradiol that they will bind in the ligand binding pocket, but they prevent the estradiol receptor from changing its conformation and binding to coactivators. Some cancers, like breast cancer for example, are dependent on the estradiol-receptor complex. So these 2 drugs can shut down the estrogen receptors, and can therefore be used as a treatment for these cancers.

Clinical Insight: Steroid-Hormone Receptors Are Targets for Drugs

• Ligands that activate a nuclear hormone receptor are called agonists, whereas ligands that inhibit the receptor are called antagonists.

• Some cancers are dependent on the action of the estradiol-receptor complex. The growth of these cancers can be slowed by administering receptor antagonists, such as tamoxifen and raloxifene.

Here we can see how the binding of estradiol to the receptor induces its change in conformation. The diagrams at the bottom show: 1) the proper formation of the estradiol-estrogen-receptor complex; 2) tamoxifen, which, when it gets inside the ligand binding pocket, prevents helix 12 from folding in, leaving it sticking out so that {the receptor} can’t interact with its coactivators.
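The logic of agonists, antagonists, coactivators, and corepressors can be summarized as a small decision table. This is a deliberately simplified sketch, not a model of real receptor behavior: the names `Ligand`, `helix12_folded_in`, and `receptor_outcome` are hypothetical, and it assumes only what the slides state, namely that an agonist lets helix 12 fold in, an antagonist keeps it sticking out, and some unliganded receptors sit on corepressors.

```python
from enum import Enum


class Ligand(Enum):
    NONE = "none"
    AGONIST = "agonist"        # e.g. estradiol
    ANTAGONIST = "antagonist"  # e.g. tamoxifen, raloxifene


def helix12_folded_in(ligand):
    """Helix 12 folds in only when an agonist occupies the pocket."""
    return ligand is Ligand.AGONIST


def receptor_outcome(ligand, binds_corepressor_when_unbound=False):
    """Return a short description of the receptor's effect on transcription.

    binds_corepressor_when_unbound marks the subset of receptors that
    actively repress transcription when no ligand is bound.
    """
    if helix12_folded_in(ligand):
        return "coactivator recruited: transcription stimulated"
    if ligand is Ligand.NONE and binds_corepressor_when_unbound:
        return "corepressor bound: transcription inhibited"
    return "no coactivator recruited"


if __name__ == "__main__":
    for ligand in Ligand:
        print(ligand.value, "->", receptor_outcome(ligand))
```

The point of writing it this way is that the antagonist and the empty receptor end up in the same place for coactivator binding: in both cases helix 12 never folds in, so there is nothing for the coactivator to dock on.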

So now we’re going to talk about another type of activation that occurs, and that is through histone acetylation. Histones, as you recall, are the proteins that the DNA is wound around. And when the DNA is wound around these histones, it is less accessible: the outside face is exposed, but the side facing the histone isn’t. In any case, before you transcribe this DNA, you’re probably going to need to loosen its grip on the histones, and one way of doing that is by adding acetyl groups to them. This histone acetylation occurs primarily on lysine residues, and if you’ll recall, lysine residues are positively charged, and therefore interact strongly with the negatively charged backbone of DNA. An acetyl group itself carries no charge, but attaching it to the lysine side chain turns the positively charged amino group into a neutral amide. So by acetylating the lysines you are removing the positive charge, weakening the interaction, and freeing the DNA from the histone so that it is exposed and ready for transcription. Acetylation is probably the most common and well-studied mechanism of modifying histones, but histones can be modified in a LOT of other ways. Two of these ways are methylation and phosphorylation, but we’ll discuss those in a bit.
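A quick bit of arithmetic shows how acetylation changes the charge of a histone tail. The sketch below rests on several assumptions that are not on the slides: it counts lysine and arginine as +1, aspartate and glutamate as -1, and an acetylated lysine as 0; it ignores the termini, histidine, and any pH effects; and the 30-residue sequence is the commonly cited start of the histone H3 tail, included only for illustration. The function name `tail_charge` is hypothetical.

```python
# ASSUMPTION: simple integer side-chain charges near neutral pH.
SIDE_CHAIN_CHARGE = {"K": +1, "R": +1, "D": -1, "E": -1}

# ASSUMPTION: commonly cited first 30 residues of the histone H3 tail,
# used here for illustration only (position 1 is the first letter).
H3_TAIL = "ARTKQTARKSTGGKAPRKQLATKAARKSAP"


def tail_charge(sequence, acetylated_positions=()):
    """Sum side-chain charges, treating acetylated lysines as neutral.

    Positions are 1-based, matching names such as K14.
    """
    acetylated = set(acetylated_positions)
    for position in acetylated:
        in_range = 1 <= position <= len(sequence)
        if not in_range or sequence[position - 1] != "K":
            raise ValueError(f"position {position} is not a lysine")

    total = 0
    for position, residue in enumerate(sequence, start=1):
        if residue == "K" and position in acetylated:
            continue  # acetyl-lysine carries no charge
        total += SIDE_CHAIN_CHARGE.get(residue, 0)
    return total


if __name__ == "__main__":
    print("unmodified tail:", tail_charge(H3_TAIL))
    print("K14 acetylated: ", tail_charge(H3_TAIL, [14]))
```

Each acetylation removes exactly one positive charge in this model, so the tail's pull on the negatively charged DNA backbone drops step by step as more lysines are modified.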

Section 37.4 Histone Acetylation Results in Chromatin Remodeling

• Coactivators can stimulate transcription by loosening the interaction between histones and DNA, making the DNA more accessible to the transcription machinery.

• A common means of weakening the interaction of the histones with the DNA is by acetylation of the histones on specific lysine residues.

• Histones can also be modified by other means such as methylation and phosphorylation.

So where do these acetyl groups come from? They come from our good old friend, acetyl CoA. So there we go - a product of metabolism can directly modify the histones that package DNA and alter the expression of certain genes. Though to be fair, this acetyl CoA is not the same acetyl CoA that’s created in the mitochondria. This is created in the nucleus by ATP-citrate lyase. But I imagine it is dependent to some extent on glycolysis. Acetyl CoA is a very handy way of passing around acetyl groups in metabolism, and the nucleus uses the same carrier. Remember the coenzyme is just a handle for the acetyl group. In the nucleus this handle is recognized by proteins called histone acetyltransferases (or HATs), which transfer the acetyl groups to histones. So you would have your nuclear hormone receptor bind to the DNA, and then your histone acetyltransferase will bind to the hormone receptor as a coactivator. And then it will start acetylating the histones around the site. The acetylation of histones loosens their hold on the DNA and also creates docking sites for other components, such as chromatin remodeling machines.

Metabolism in Context: Acetyl CoA Plays a Key Role in the Regulation of Transcription

• ATP-citrate lyase located in the nucleus generates acetyl CoA that is used by histone acetyltransferases (HATs) to modify histones.

• HATs are components of coactivators or are recruited by coactivators.

• Acetylation reduces the affinity of histones for the DNA and generates a docking site for other components of the transcription machinery, such as chromatin-remodeling engines, ATP-powered complexes that make DNA in chromatin more accessible.

Figure 37.12 The structure of histone acetyltransferase. The amino-terminal tail of histone H3 extends into a pocket in which a lysine side chain can accept an acetyl group from acetyl CoA bound in an adjacent site. [Drawn from 1QSN.pdb.]

Here’s a diagram of a histone acetyltransferase interacting with the tail of the H3 histone and with the coenzyme A, bringing them together so that the acetyl group can be transferred from coenzyme A onto the lysine of the histone tail.

Diagram of the Structure of Histone Acetyltransferase

Figure 37.13 Chromatin remodeling. Eukaryotic gene regulation begins with an activated transcription factor bound to a specific site on DNA. One scheme for the initiation of transcription by RNA polymerase II requires five steps: (1) recruitment of a coactivator, (2) acetylation of lysine residues in the histone tails, (3) binding of a remodeling-engine complex to the acetylated lysine residues, (4) ATP-dependent remodeling of the chromatin structure to expose a binding site for RNA polymerase or for other factors, and (5) recruitment of RNA polymerase II. Only two subunits are shown for each complex, although the actual complexes are much larger.

So here we have an example of how this chromatin remodeling might take place. First, there is the binding of a transcription factor - and you can see that as it binds here in the space between 2 histones, in the “string” portion of the beads on a string, so to speak. When the transcription factor is bound and activated, such as with estrogen, then it’s able to bind to coactivators. In this case the coactivator is a histone acetyltransferase. So the coactivator then starts acetylating the histones right next to the transcription factor, and that helps to unwind the DNA, and also to recruit the remodeling engine. Once the DNA is unwound a little bit, it has exposed a new site that can be bound by polymerase II, and then transcription of the gene can start.

Diagram of Chromatin Remodeling

I mentioned methylation and phosphorylation of histones - there are some examples here. Histone H4 can be acetylated on lysine 8 (that’s what the K8 stands for). Histone H3 can be acetylated on K14, and it can be methylated on K27 or R17 (R is arginine). All of these modifications activate transcription in the nearby area. But there can be other modifications that have different effects, such as the phosphorylation of histone H2B on serine 14, which recruits proteins involved in cell death, or apoptosis.
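The naming scheme just described (histone, then a one-letter residue code, then a position) is regular enough to parse mechanically. The sketch below assumes the compact shorthand often used for histone marks, such as `H3K14ac`, where a two-letter suffix names the modification; that shorthand and the names `HistoneMark` and `parse_mark` are assumptions for illustration and do not appear on the slide. Only the residues and modifications mentioned above are supported.

```python
import re
from typing import NamedTuple


class HistoneMark(NamedTuple):
    histone: str
    residue: str
    position: int
    modification: str


# Only the residues and modifications mentioned in the notes above.
RESIDUES = {"K": "lysine", "R": "arginine", "S": "serine"}
MODIFICATIONS = {"ac": "acetylation", "me": "methylation", "ph": "phosphorylation"}

# ASSUMPTION: shorthand of the form <histone><residue><position><suffix>.
MARK_PATTERN = re.compile(r"(H1|H2A|H2B|H3|H4)([KRS])(\d+)(ac|me|ph)")


def parse_mark(text):
    """Parse a shorthand such as 'H3K14ac' into its four parts."""
    match = MARK_PATTERN.fullmatch(text)
    if match is None:
        raise ValueError(f"unrecognized histone mark: {text!r}")
    histone, residue, position, suffix = match.groups()
    return HistoneMark(histone, RESIDUES[residue], int(position), MODIFICATIONS[suffix])


if __name__ == "__main__":
    for shorthand in ["H4K8ac", "H3K14ac", "H3K27me", "H3R17me", "H2BS14ph"]:
        print(shorthand, "->", parse_mark(shorthand))
```

Note that the parser records what the mark is, not what it does; the effect on transcription depends on the specific mark and its context, so that would have to come from a lookup table such as the one on the slide.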

Table 37.3 Selected Histone Modifications

Figure: Histone proteins are decorated by a variety of protein posttranslational modifications called histone marks that modulate chromatin structure and function, contributing to the cellular gene expression program. This SnapShot summarizes the reported human, mouse, and rat histone marks, including recently identified lysine acylation marks.

It turns out that there are a whole lot of modifications that can take place on histones, and the various modifications have various effects. This is now known as the histone code. This is an active area of investigation, and I’m not going to hold you responsible for any of this, but I thought it would be good for you to know how many of these modifications can take place. So we see here 5 different histones: H3, H4, H2A, H2B, and H1. In the center of each one of these histones you see a boxed area, and that is the globular core. The rest of each histone is the tails that are exposed to the outside. And, as you can see from this diagram, it looks like almost half of the residues in these histones are modifiable. Not only by acetyl groups or by methyl groups or phosphoryl groups, but by ubiquitin, formyl groups, hydroxyl groups, even ADP. All of these possibilities correspond to an incredible diversity of signals that can be present on histones and have very specific and exquisite regulatory effects on the genes that are associated with them.

Most of these, I think all of these, reactions, though, are reversible, so if you have modified the histone in one way then it can be changed back. An example of that is the histone deacetylases {HDACs}. Those are another very important class of proteins, and they are being targeted by drugs right now. They turn off genes by removing acetyl groups from histones, much in the same way that histone acetyltransferases activate genes by attaching acetyl groups to them.

Histone Deacetylases Contribute to Transcriptional Repression

• The acetylation of histones is not an irreversible reaction. Genes may need to be expressed at certain times and then be repressed.

• Histone deacetylases catalyze the removal of acetyl groups from histones, resulting in the inhibition of transcription.

• All covalent modifications of histone tails are reversible.