Clinical genetics assignment


Module 10A Transcript DNA sequencing

1. Why do we sequence DNA? I bet you can think of a few reasons.

a. Did you think of detecting mutations in tumors? Mutations in fetuses? Ancestry DNA? Forensics?

2. The process of DNA sequencing is very similar to synthesizing DNA in vitro (like PCR, but not a chain reaction). Let’s remember how to synthesize DNA in vitro. Write it down before I blab on about it.

a. The components for PCR are DNA template, DNA pol, dNTP, primers, buffer, water, Mg.

b. For sequencing, they are DNA template, DNA pol, dNTP, primer (just one), buffer, water, Mg, and fluorescently labeled terminator ddNTP.

3. Let’s watch this video, and then chat. I’ll do the chatting, but you can talk to yourself. https://www.yourgenome.org/video/dna-sequencing

4. Questions about the video:

a. In the video, we see that the genomic DNA is cut into smaller fragments and inserted into a plasmid. What do you think is used to cut it with?

b. How does the scientist know what primer sequence to use?

i. If she is looking for a specific sequence (to analyze a mutation in a specific gene), she can use a specific primer that is upstream of the mutation.

ii. If she is sequencing the genome, like the video, she didn’t need to know each sequence. The sequencing primer binds to a sequence in the plasmid (brilliant! So it doesn’t matter what the sequence in the insert is, the primer binds to plasmid).

c. One primer, one direction. All of the amplicons start at the same 5’ location, but they end at different places, to include every possible position on the strand.

d. How do those terminators work?

5. Remember that DNA pol always synthesizes DNA in the 5’ to 3’ direction, and needs the 3’OH. What happens if there is no 3’OH? It stops. Let’s go back to looking at sugars. Here’s deoxyribose (note where the 3’ C is and the 3’OH). The sugar on the right is 2’,3’-dideoxyribose. It has no 3’OH. A nucleotide with dideoxyribose will stop the elongation, since DNA pol can’t add anything.

a. The ddNTPs are labeled with specific fluorescent colors. You don’t have to memorize the color of each one. That can change. Draw a color on this one, I will too.

b. The dNTPs are not labeled.

6. The production of the amplicon starts at the 5’end, and terminates randomly when a ddNTP is added to match the template. The procedure has enough materials and runs for a period of time that should allow for every possible fragment to be made. There are about 100 times more unlabeled dNTPs than labeled ddNTP. Your chapter uses this terminology, the ratio of dXTP:ddXTP is approximately 100:1. X indicates any of the 4 bases. When we use NTP we mean all of the bases.
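Here is a minimal Python sketch of that idea, not a model of the real chemistry. It lists every possible terminated fragment for the example strand TCACAGTTCTC, and it estimates how likely a strand is to stop at each position. The function names are made up for illustration, and the sketch assumes a ddNTP is picked with the same chance (1 in 101 for a 100:1 ratio) at every position, which is a simplification.

```python
def nested_fragments(new_strand):
    """Every chain-terminated product is a prefix of the newly made strand."""
    return [new_strand[:k] for k in range(1, len(new_strand) + 1)]


def termination_probabilities(n_positions, ratio=100):
    """Chance that a growing strand stops exactly at position 1..n_positions.

    Assumption: a ddNTP is picked with probability 1 / (ratio + 1) at every
    position, regardless of the base. Real reactions are less tidy.
    """
    p_stop = 1 / (ratio + 1)
    return [(1 - p_stop) ** (k - 1) * p_stop for k in range(1, n_positions + 1)]


fragments = nested_fragments("TCACAGTTCTC")
print(fragments[:4])  # ['T', 'TC', 'TCA', 'TCAC']

probs = termination_probabilities(500)
print(probs[0] > probs[-1])  # True: short fragments are more common than long ones
```

Under this assumption the long fragments are the rarest, which fits the point that sequencing is limited in length.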

7. The process uses electrophoresis, but a different type. The matrix background is a polymer that is liquid, and flows past the laser detector in a thin tube called a capillary tube.

8. Let’s watch the video again. It should be looking familiar.

9. Now let’s break down the parts of figure 14.17 sequencing the DNA. Step 1

10. Step 2

11. Step 3

12. Step 4

13. In the final panel of that figure, we see the DNA electropherogram. The peaks are the bands of electrophoresis and they are only different in size by 1 nucleotide, so they are close together. The color is the fluorescent dye on each terminating ddNTP. The first peak is red, it is thymine. Look above the peak, the electropherogram has interpreted it for you. The sequence is written in FASTA. Notice that all the T’s are red, all the C are blue, etc.
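If it helps to see the logic written out, here is a small Python sketch of how the peaks become letters: sort the fragments from shortest to longest, then translate each dye color to its base. T = red and C = blue come from the figure; the colors used here for A and G are assumptions for illustration only, and the function name is made up.

```python
# T = red and C = blue follow the figure; A = green and G = black are
# assumed here just so the example has all four bases.
DYE_TO_BASE = {"red": "T", "blue": "C", "green": "A", "black": "G"}


def call_bases(peaks):
    """peaks: list of (fragment_length, dye_color). Returns the read sequence."""
    ordered = sorted(peaks, key=lambda peak: peak[0])  # shortest fragment first
    return "".join(DYE_TO_BASE[color] for _, color in ordered)


# The detector sees the fragments in size order; shuffled here to show the sort.
peaks = [(3, "green"), (1, "red"), (4, "blue"), (2, "blue")]
print(call_bases(peaks))  # TCAC
```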

14. A couple of details: sequencing is not great at the beginning of the sequence, where the fragments are very short. Sequencing is also limited in length, so very long sequences are not good. So you can’t get an entire chromosome, and probably not an entire gene, in a single read.

15. When you evaluate the DNA electropherogram, you can see that some peaks are higher than others. For example in this figure the 4th nucleotide is C. The height measures how much fluorescence was detected. Look at the C in position 11. It’s not that tall. This means that there were more DNA amplicons made (randomly) that stopped after TCAC than stopped at TCACAGTTCTC. Let me summarize by saying that the height is insignificant to the sequence.

a. If I gave you an electropherogram and asked you to read the sequence in FASTA, just go to the top and read the letters.

16. A result might look like this: GATTACAGATTANGTT. The N indicates that the peak was not robust enough to be classified. It is kept as a space holder, something is there, but it could not be detected.

a. In the chapter on genomics, we get to see how the relatively short sequences that are created by this method are aligned to get a more complete sequence.
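A quick Python sketch for finding those N place holders in a read, using the example sequence above. The function name is made up for illustration.

```python
def ambiguous_positions(read):
    """Return the 1-based positions where the base could not be called (N)."""
    return [i + 1 for i, base in enumerate(read.upper()) if base == "N"]


read = "GATTACAGATTANGTT"
print(ambiguous_positions(read))  # [13]
```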

17. BTW DNA sequencing was invented by Frederick Sanger in 1977. Automated sequencing made it much more user friendly, about 20 years later.

18. There are other methods for sequencing DNA. In Molecular Diagnostics class you will learn about those.

Module 10B Construction & Screening of DNA libraries

1. Now you’ve got the basis to understand what DNA libraries are, and what they are used for. In the video we watched on DNA sequencing, we saw that DNA is cut with Restriction enzymes, and ligated into vectors. The vectors (usually plasmid-like) are grown in E. coli. An individual bacterium will make a colony of the same DNA sequence.

2. A set of DNA clones containing the entire genome of an organism is called a library. Some libraries are even chromosome specific, which is helpful in identifying a gene that is known to be on a specific chromosome.

a. Another type of library is called a cDNA library. The c stands for complementary. A cDNA is ds DNA that has been made from mRNA. It’s a genetic trick that takes advantage of an enzyme found in some viruses called reverse transcriptase. Here’s the backstory. The genetic material of some viruses is RNA. After invading their host, they use the enzyme reverse transcriptase to copy the RNA into DNA. Of course reverse transcriptase (aka RT) has been cloned and mass produced, and we can buy it in a cute little microcentrifuge tube.

3. Let’s work through this figure to understand how cDNAs are made. The “template” is mature mRNA. Think back to mRNA processing. What happens during processing? Three important features make the mRNA attractive:

a. 1) The introns have already been spliced out. This makes the insert smaller. If it is used to produce a recombinant protein, it does not need to be spliced later.

b. 2) mRNAs have a poly(A) tail. A complementary primer of poly(T) will hybridize to it. This is referred to as an oligo(dT); in this figure there is an n to indicate that the length can be varied.

c. 3) Most genomic DNA does not code for proteins, but mRNA does. In our search to identify genes, we can eliminate DNA that is not of interest.

d. The enzyme reverse transcriptase will extend the oligo-dT primer, at the 3’OH, to make a double strand. Since it has 2 different molecules, RNA and DNA, it is called a hybrid.

e. The next steps destroy the RNA of the hybrid with an enzyme called RNase H. This is very specific to destroying RNA that is hybridized to DNA. A step we don’t see here is that the resulting product is single stranded DNA, with some remaining RNA that can be used as a primer for DNA pol I to make double stranded DNA. Ligase will repair nicks after the RNA primer has been removed.

f. So many reasons to create and use a cDNA library. Here’s another advantage that is also a disadvantage. We’ll use your lifespan. We can create a cDNA library from your liver cells, yours, as an adult. That library is organ specific and temporally specific. A cDNA library from a fetal liver will express different genes. A cDNA library from a liver that has cancer will express genes that are turned on by the tumor. Those are advantages. The disadvantage is that not all genes will be represented, only those that were turned on when the tissue was collected.
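To make the first-strand step concrete, here is a minimal Python sketch. It checks that the mRNA has a poly(A) tail for the oligo(dT) primer, then writes the cDNA as the reverse complement of the mRNA. The mRNA sequence, the function name, and the minimum tail length are all made up for illustration; it does not model RNase H or second-strand synthesis.

```python
RNA_TO_DNA = {"A": "T", "U": "A", "G": "C", "C": "G"}


def first_strand_cdna(mrna, min_tail=10):
    """Return first-strand cDNA (written 5'->3') primed by oligo(dT)."""
    mrna = mrna.upper()
    if not mrna.endswith("A" * min_tail):
        raise ValueError("no poly(A) tail for the oligo(dT) primer to anneal to")
    # Reverse transcriptase copies the mRNA, so the cDNA is its reverse
    # complement, with T in place of U.
    return "".join(RNA_TO_DNA[base] for base in reversed(mrna))


mrna = "AUGGCCUAA" + "A" * 12  # hypothetical tiny mRNA with a poly(A) tail
cdna = first_strand_cdna(mrna)
print(cdna)  # TTTTTTTTTTTTTTAGGCCAT
```

Notice the cDNA starts with the run of T’s from the oligo(dT) primer.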

4. So now you’ve got a pot of cDNAs (really a cute microcentrifuge tube of them). Cut them with restriction enzyme, and insert into a vector. This figure should look familiar from last week’s lecture. Both gDNA restriction fragments and cDNA restriction fragments can be inserted into vectors to clone the DNA sequence. See the sticky ends where the overhangs are?

a. Did you write a lot of notes? Be sure you can answer those questions about cDNA, and cDNA libraries. In this next section we’ll talk about how you can find the book in the library: the clone that you want to work with.

5. Screening DNA libraries for genes of interest

a. The libraries have a million or more different sequences. How can the scientist find the one they want? Two strategies we’ll delve into are genetic selection, and molecular hybridization.

b. Think back to complementation analysis. An organism with a mutation can be rescued (cured) by introducing the wild type gene.

6. Genetic selection tests for the ability of a clone to rescue a mutation

a. It uses complementation analysis, and asks “does the clone rescue the mutant phenotype?” The scientist wants to find the gene that confers penicillin resistance in Salmonella.

b. A DNA library is constructed from a strain of Salmonella that is resistant to penicillin, the gene of interest is among many other clones. Genes are inserted into plasmids.

c. E. coli that is sensitive to penicillin is transformed with the library of plasmid clones.

d. The E. coli is plated to medium that has penicillin. The only E. coli that can grow on that plate has the plasmid with the gene of interest. Grow more of that colony.

e. In this example, the gene comes from a bacterium, and the rescue is in a bacterium. A limitation of this procedure is that eukaryotic gDNA and gene expression are different from prokaryotic, so Saccharomyces cerevisiae is often used for eukaryotic expression.
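The selection logic can be sketched in a few lines of Python. Think of the penicillin plate as a filter: only the transformants whose insert carries the resistance gene come through. The colony labels and gene names, including "penR", are hypothetical place holders, not real gene names.

```python
def plate_on_penicillin(transformants, resistance_gene):
    """Return the colonies that survive on penicillin medium."""
    return sorted(
        colony
        for colony, insert_genes in transformants.items()
        if resistance_gene in insert_genes
    )


# Hypothetical library: colony label -> genes carried on that clone's insert
transformants = {
    "colony 1": {"geneA"},
    "colony 2": {"geneB", "penR"},
    "colony 3": {"geneC"},
    "colony 4": set(),  # plasmid with no insert
}
survivors = plate_on_penicillin(transformants, "penR")
print(survivors)  # ['colony 2']
```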

7. Molecular hybridization is a good gimmick. You need to know some information about the gene you are looking for to make a probe. The example given in the book is to look for the DNA of the gene that codes for beta globin of hemoglobin. Young red blood cells called reticulocytes have a lot of mRNA that codes for beta globin. It’s like Geico, it’s what they do.

a. Imagine in this slide, that we are looking for the transformed E. coli colony that has the DNA from the beta globin gene. Which one is it? Make a probe that is complementary to the gene.

b. I will explain how to make the probe from that mRNA to probe for the beta globin gene. Recall that mRNA has a poly A tail, so a primer with oligo dT can hybridize to it.

c. Recall that reverse transcriptase can convert RNA into DNA, making a cDNA.

d. Recall that the backbone of DNA has phosphorus (P) in it, in the phosphate groups of the nucleotides.

e. In the reagents for making the cDNA, one of the nucleotides is radioactive, e.g. 32P. This makes the probe radioactive.

f. So now we have the probe, and we have the library plated. A technique called replica plating makes an imprint of the bacteria in their same orientation on a piece of paper. It has to be waterproof, so we’ll use a nylon membrane instead. Looks like paper. Keep the original plate, and the membrane. Think of the orientation of the plate and membrane in terms of the face of a clock.

g. The probe is hybridized to the prepared membrane, and the membrane is exposed to film to make an autorad. The spot that is radioactive is the colony we want, it has the DNA of interest, and we can now make a lot of it.
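Here is a rough Python sketch of the screening step, assuming perfect base pairing only. Each colony sits at a clock position on the membrane, and a colony is positive if its insert has a stretch the probe can pair with on either strand. The insert sequences, the probe, and the function names are made up for illustration; they are not the real beta globin sequence.

```python
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def reverse_complement(dna):
    return "".join(COMPLEMENT[base] for base in reversed(dna))


def positive_colonies(colonies, probe):
    """colonies: clock position -> insert sequence (top strand, 5'->3')."""
    target = reverse_complement(probe)  # what the probe pairs with
    # The membrane DNA is denatured, so the probe can bind either strand.
    return [pos for pos, insert in colonies.items()
            if target in insert or probe in insert]


# Hypothetical inserts, flanked by EcoRI (GAATTC) and HindIII (AAGCTT) sites
colonies = {
    "12 o'clock": "GAATTCTTGACCGGATAAGCTT",
    "3 o'clock": "GAATTCATGGTGCACCTGACTAAGCTT",
    "7 o'clock": "GAATTCCCGTTAGCAATGAAGCTT",
}
probe = reverse_complement("ATGGTGCACCTGACT")  # stands in for the labeled cDNA
print(positive_colonies(colonies, probe))  # ["3 o'clock"]
```

The clock position tells you which colony to pick from the original plate.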

8. Let’s talk through the entire figure.

9. We use the term hybridization when we talk about probes matching to their complementary DNA. Why are they hybrids? They are 2 different things – DNA and cDNA from a different source. When RNA and DNA form a double strand, it is a hybrid.

a. To form a double stranded nucleic acid with two single strands of DNA or RNA by allowing the base pairs of the separate strands to form complementary bonds

b. Examples

i. DNA-RNA

ii. DNA probes on denatured DNA

iii. cDNA and DNA

c. Hybrids are held together by Hydrogen bonds
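A small Python sketch of a hybrid, counting the hydrogen bonds between a DNA strand and its complementary RNA. It uses the standard Watson-Crick counts (2 for A-T or A-U, 3 for G-C). The sequences and the function name are just for illustration.

```python
# Hydrogen bonds per Watson-Crick pair
PAIR_H_BONDS = {
    ("A", "T"): 2, ("T", "A"): 2,
    ("A", "U"): 2, ("U", "A"): 2,
    ("G", "C"): 3, ("C", "G"): 3,
}


def hybrid_h_bonds(strand, partner):
    """Both strands are written 5'->3'; they pair antiparallel."""
    if len(strand) != len(partner):
        raise ValueError("strands must be the same length")
    total = 0
    for a, b in zip(strand, reversed(partner)):
        bonds = PAIR_H_BONDS.get((a, b))
        if bonds is None:
            raise ValueError(f"{a}-{b} is not a Watson-Crick pair")
        total += bonds
    return total


# DNA GATTACA hybridized to its complementary RNA
print(hybrid_h_bonds("GATTACA", "UGUAAUC"))  # 16
```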

10. A Southern blot uses probes and hybridization to detect DNA, so let’s switch to that technique. A Southern blot …

a. Uses labelled probes to identify DNA of interest that has been run on gel electrophoresis

b. DNA from the gel is transferred to a nylon membrane paper

c. Membrane is hybridized with the probe, then exposed to Xray film

d. The autorad shows a dark band where the radioactive probe has bound the DNA of interest

11. This figure talks through the entire process.
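Here is a simplified Python sketch of what a Southern predicts, not a lab protocol: cut the DNA with EcoRI (which cuts GAATTC between the G and the first A), then ask which fragment contains the sequence the probe pairs with. Only that fragment’s size would show on the autorad. The DNA sequence and function names are made up, and the sketch only looks at the top strand.

```python
ECORI_SITE = "GAATTC"  # EcoRI cuts between G and AATTC


def ecori_digest(dna):
    """Return the fragments, in order, from cutting linear DNA with EcoRI."""
    fragments, start = [], 0
    pos = dna.find(ECORI_SITE)
    while pos != -1:
        fragments.append(dna[start:pos + 1])  # cut right after the G
        start = pos + 1
        pos = dna.find(ECORI_SITE, start)
    fragments.append(dna[start:])
    return fragments


def southern_bands(dna, probe_target):
    """Sizes of the fragments the probe would light up."""
    return [len(frag) for frag in ecori_digest(dna) if probe_target in frag]


# Hypothetical DNA with two EcoRI sites
dna = "ACGTACGTACGT" + "GAATTC" + "TTGACCGGATA" + "GAATTC" + "CCCC"
print([len(frag) for frag in ecori_digest(dna)])  # [13, 17, 9]
print(southern_bands(dna, "TTGACCGG"))            # [17]
```

All three fragments are on the gel, but only the 17-base fragment shows up with this probe.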

12. Southern Blot

a. Technique that transfers DNA molecules that have been separated by gel electrophoresis onto nitrocellulose or nylon membranes

b. It is used to identify DNA of interest by hybridizing it with labelled probes

c. Invented by E.M. Southern

d. Southerns are used to identify DNA. They were the standard method for paternity testing for a long time.

e. What if you want to identify RNA?

13. Northern Blot

a. Analogous study looking at RNA transcripts

b. Compares the expression of genes in tissues, in time, in species, in tumors

c. Due to the secondary structure of RNA, a special denaturing step is added

d. RNA are isolated, denatured and electrophoresed

e. Blotted to nitrocellulose

f. Probed

g. autorad

14. Analysis of RNA by reverse transcriptase can be used for medical diagnosis

a. RT-PCR is used to create cDNA – we’ve talked about this already

b. It can also be used to investigate the presence of retroviruses, and to quantitate mRNA transcription

c. It is the basis of the method we use to identify gonorrhea – by using RT-PCR on bacterial rRNA

d. It can be used to find mRNA transcripts of translocations, like BCR-ABL the translocation that causes chronic myeloid leukemia

15. Let’s review the process again. Read the slide and follow along. This is Figure 14.4. I’ll add that a probe at the end will verify what RNA was amplified.

a. BTW, note this: PCR requires DNA as the template.

16. A western blot is another molecular tool, but it looks at proteins, not nucleic acids

a. Electrophoresis of protein from tissue

b. Blot/transfer to nitrocellulose membrane

c. Probe with a labeled antibody to the protein of interest

d. It can be used in 2 ways

i. To detect proteins: Use a reagent antibody to find out if your favorite protein has been expressed

ii. To detect antibodies: It’s a common strategy in serology to find out if a patient has been exposed to a pathogen to see if they produced antibodies. For example, in Lyme disease. Can we find antibodies to Lyme disease? The electrophoresis is proteins from Borrelia burgdorferi, the agent of Lyme disease. The antibody is in the patient’s serum.

17. This is the last topic for this chapter. We love puzzles, right? I’ve used this figure before – it indicates the restriction map of a plasmid. Spend 30 seconds orienting yourself to it. This map was put together using a series of restriction enzymes alone and in pairs to deduce where the cuts were.

a. How did they figure out where the pieces fit relative to each other?

18. Let’s use this figure from your book. This is an investigation of a DNA fragment.

a. Lane 1 is the MW ladder, with the sizes marked. We buy it.

b. Lane a is the DNA sequence cut with NotI. Look at the left panel a to match it. There are no cuts. The length of the entire piece is 6000. Beware, this is not a circular plasmid, it is one long piece of DNA. There are no sites for NotI.

c. Look at lane b. What sizes are the products? Match them to the ladder. Now look at the left panel. Which is it, the top or the bottom?

d. Look at lane c for the HindIII cuts. What sizes? What’s the orientation, top or bottom?

e. The final lane has a double digest. Which bands are new and what bands did they come from? Highlight them. We have a new 3000 size.

f. We kept the 1000 and 2000 pieces, so they can’t overlap. Let’s circle them in the figures on the left panel. Between them, the distance is 3000, so now we know the orientation: 1000 for HindIII, then 3000, then 2000 for EcoRI.
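You can check a proposed map by running the digests forward. This Python sketch assumes the linear piece is 6000 long (the sum of the double-digest bands), with the EcoRI site 2000 from one end and the HindIII site 1000 from the other end. The function name and variable names are made up for illustration.

```python
def digest(length, cut_sites):
    """Fragment sizes (largest first) from cutting a linear molecule."""
    edges = [0] + sorted(cut_sites) + [length]
    return sorted((b - a for a, b in zip(edges, edges[1:])), reverse=True)


LENGTH = 6000      # assumed total: 3000 + 2000 + 1000
ECORI = [2000]     # proposed EcoRI site, measured from the left end
HINDIII = [5000]   # proposed HindIII site (1000 from the right end)

print(digest(LENGTH, []))               # no sites, one band: [6000]
print(digest(LENGTH, ECORI))            # [4000, 2000]
print(digest(LENGTH, HINDIII))          # [5000, 1000]
print(digest(LENGTH, ECORI + HINDIII))  # [3000, 2000, 1000]
```

If the predicted bands match every lane on the gel, the map is consistent with the data.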

19. I want to write some rules for solving this. So I searched the internet for a great way. After watching the video by Shomu, here’s the rules for this simple problem.

a. Identify the bands in double digest as A = 3000, B = 2000, C = 1000

b. Investigate the single digest (EcoRI). In a linear molecule, 1 cut makes 2 fragments. The sizes are 2000 = B and 4000 = A + C. Since there are 2 pieces, there is only 1 cut.

c. Investigate the HindIII digest. 2 pieces, so 1 cut. The sizes are 1000 = C and 5000 = A + B

d. Use the letters from each single digest to create an overlap. Set up the letters to cause an overlap, like this

B + A

A + C

So the order is B A C. We know EcoRI produces a B of 2000, so the EcoRI site sits between B and A. The other cut site, between A and C, is the HindIII site.
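If you want to check your answer by brute force, here is a Python sketch. This is not Shomu’s method, just a try-everything approach: it tries every order of the double-digest bands and every way of assigning EcoRI or HindIII to the junctions, and keeps the arrangements that reproduce both single digests. It assumes a linear molecule and that each junction is cut by exactly one enzyme; the function names are made up.

```python
from itertools import permutations, product


def single_digest(order, junction_enzymes, enzyme):
    """Fragment sizes when only `enzyme` cuts at its junctions."""
    sizes, current = [], 0
    for size, cutter in zip(order, junction_enzymes + (None,)):
        current += size
        if cutter == enzyme or cutter is None:  # None marks the end of the DNA
            sizes.append(current)
            current = 0
    return sorted(sizes)


def solve_linear_map(double, ecori, hindiii):
    """Return every (band order, enzyme at each junction) that fits the gel."""
    solutions = []
    for order in set(permutations(double)):
        for enzymes in product(("EcoRI", "HindIII"), repeat=len(double) - 1):
            if (single_digest(order, enzymes, "EcoRI") == sorted(ecori)
                    and single_digest(order, enzymes, "HindIII") == sorted(hindiii)):
                solutions.append((order, enzymes))
    return sorted(solutions)


solutions = solve_linear_map(
    double=[3000, 2000, 1000], ecori=[2000, 4000], hindiii=[1000, 5000]
)
for order, enzymes in solutions:
    print(order, enzymes)
```

It finds two answers, B A C and C A B, which are the same map read from opposite ends.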

20. So let’s watch Shomu. https://www.youtube.com/watch?v=GgaMq69zZ2M Shomu’s restriction map problem 16 minutes

21. This homework problem is not that hard if you use Shomu’s method. Try it now. If you can’t solve it, watch his video again, and try it. It will take about 10-15 minutes the first time, but once you have the steps, it will be much quicker, and you’ll be more confident

22. Here’s the other homework problem.
