Essential techniques of molecular genetics

Reminder: an excellent (though slightly out of date) web site to visit for relevant information is The US Department of Energy's Primer of Molecular Genetics. Another really well thought out and presented site is the MIT Biology Hypertextbook, and in particular, the molecular genetics chapter.

Electrophoresis

Sieving properties of agarose and polyacrylamide

Matrix           Optimum size range of the molecules to be separated
Agarose          0.5 - 20 kb
Polyacrylamide   10 - 1000 bp (DNA), 0 - 100 kDa (proteins)

Electrophoresis is one of the most convenient methods to separate molecules which differ in any combination of size or charge. There are many forms of electrophoresis in use but all operate by putting a solution of the molecules to be separated into an electric field. Depending on their charge the molecules will be attracted towards one or other of the electrodes. In order to prevent convection effects from disrupting the separation the solution is supported by an immovable matrix. This may do little more than prevent convection currents (which is the case with starch gel electrophoresis) or it may, by presenting a barrier to the easy passage of the moving molecules, play an active role in sieving the molecules and thus contribute to their separation on the basis of size. In all electrophoresis experiments electrodes are positioned at opposite ends of the separation matrix which may be in the form of a tube or of a thick or thin slab. In recent years the most common forms of electrophoresis have used either agarose or polyacrylamide matrices. There are many different commercial and home made apparatuses in use. For unimportant technical reasons, acrylamide gels are usually held vertically and the samples are loaded in wells formed in the top edge whereas agarose gels are usually held horizontally and the samples are inserted into wells in their top face.

You will find many variations on this basic theme in the course of this lecture series.

Cloning

This is the general term given to the isolation of a specific fragment of target (e.g. human) DNA by propagating that fragment in a microorganism (usually a bacterium but sometimes a yeast). Using standard microbiological techniques, a single microorganism can be isolated and grown in culture carrying within its genome one small fragment of human DNA. The human DNA can readily be purified from the bacterium in gram quantities if necessary.

Almost all clones come from libraries. A library is essentially one of two kinds, cDNA or genomic, and it is vital to remember the distinction between them.

cDNA clones

Remember that cDNA means complementary DNA: it has been copied directly from mRNA. Because that RNA will have been processed on its journey from gene to cytoplasm (to test tube), it will not contain introns, nor will it contain any sequences from upstream or downstream of the gene. Every cDNA library is made from a particular tissue source and will contain only a representation of the sequences transcribed in that tissue; for example, do not expect to isolate beta globin cDNA from a skin fibroblast cDNA library. Even though cDNA is made from mRNA it can still contain repetitive sequences: 5%-10% of human transcripts contain an interspersed repetitive element such as an Alu repeat in the 3´ UTR.
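As a minimal sketch of what "complementary" means here, the snippet below derives the first-strand cDNA that reverse transcription would copy from an mRNA. The function name and the nine-base toy mRNA are illustrative assumptions, not sequences from this lecture.

```python
# Map each RNA base to the DNA base that pairs with it (U pairs with A).
RNA_TO_DNA_COMPLEMENT = str.maketrans("ACGU", "TGCA")


def first_strand_cdna(mrna: str) -> str:
    """Return the first-strand cDNA for an mRNA, both written 5' to 3'.

    The cDNA is the reverse complement of the mRNA, written in DNA bases.
    """
    return mrna.upper().translate(RNA_TO_DNA_COMPLEMENT)[::-1]


# Toy mRNA (hypothetical): a start codon, one codon, a stop codon.
print(first_strand_cdna("AUGGCCUAA"))  # TTAGGCCAT
```

Because the template is processed mRNA, nothing in the output can correspond to an intron or to flanking genomic sequence.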

Most libraries contain cDNA clones in the same relative abundance as their corresponding mRNAs had in the tissue of origin. For example, liver cDNA libraries are a rich source of serum albumin cDNA clones. Some libraries have been normalised, i.e. an attempt has been made to equalise the frequencies with which all types of cDNA are found within the library (if they were to be found at all in the mRNA source).

cDNA clones may be made for various purposes

From the point of view of mapping and sequencing the genome, the latter two classes of clone are most relevant.

Genomic clones

Genomic clones are designed to include as much genomic DNA as possible in order to minimise the number of clones that need to be isolated. Over the years vector systems have evolved. The first generation of genomic libraries were built in vectors based on lambda; later libraries used plasmid-phage hybrid vectors such as cosmids. Recently yeast artificial chromosomes (YACs) have been popular but are now gradually being replaced by bacterial systems based on either the phage P1 or the F element origins of replication (PACs and BACs).


Cloning Vectors

Vector   Maximum insert size   Approx. no. of clones required in library   Advantages                                                Disadvantages
lambda   20 kb                 5 x 10^5                                    easy to construct libraries; relatively stable inserts    many clones required; hard to prepare DNA from clones
cosmid   45 kb                 2 x 10^5                                    easy to construct libraries; easy to prepare DNA from clones   not always stable
YAC      1 Mb                  10^4                                        few clones required                                       very prone to rearrangement; difficult to construct
PAC      ~120 kb               10^5                                        fewer clones required than for cosmids; stable            single copy origin of replication therefore harder to prepare DNA
BAC      > 500 kb              5 x 10^4                                    few clones required; very stable                          single copy origin of replication therefore harder to prepare DNA
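
The relationship between insert size and the number of clones needed can be estimated with a standard probability argument that is not part of this lecture: if each clone carries a random fraction f of the genome, then N = ln(1 - P) / ln(1 - f) clones are needed to have probability P of including any given sequence. The sketch below assumes a haploid human genome of 3 x 10^9 bp and P = 0.95; both values, and the function name, are assumptions made for illustration, and the figures in the table may have been derived differently.

```python
import math

HUMAN_GENOME_BP = 3_000_000_000  # assumed haploid genome size, not from the lecture


def clones_required(insert_bp: int,
                    genome_bp: int = HUMAN_GENOME_BP,
                    probability: float = 0.95) -> int:
    """Estimate the library size N = ln(1 - P) / ln(1 - f), where f = insert/genome."""
    fraction = insert_bp / genome_bp
    # log1p(-f) is ln(1 - f), computed accurately when f is very small.
    return math.ceil(math.log(1.0 - probability) / math.log1p(-fraction))


# Insert sizes taken from the table above.
for vector, insert_bp in [("lambda", 20_000), ("cosmid", 45_000),
                          ("PAC", 120_000), ("YAC", 1_000_000)]:
    print(f"{vector:6s} {insert_bp:>9,d} bp  ~{clones_required(insert_bp):,d} clones")
```

Under these assumptions the estimates for lambda, cosmid, PAC and YAC inserts come out at the same order of magnitude as the figures in the table, which illustrates why larger inserts mean fewer clones to handle.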

Another innovation has been the use of gridded and chromosome specific libraries. In a gridded library every clone has its own unique address where it is to be found in a well in a microtitre tray. This has huge advantages over ungridded, amplified libraries for our ability to exchange information about clones. Chromosome specific libraries have been made by flow sorting individual metaphase chromosomes using a machine originally designed to sort different populations of cells.

Southern and Northern Blotting

Before PCR and cheap fast sequencing changed our view of the universe that is genetics, the Southern Blot was a universal workhorse. There was not an experiment in molecular genetics which did not at some stage employ a Southern Blot. It is still a useful tool and you need to know about it so that you can interpret historical data.

Sickled and normal erythrocytes

Named after its inventor, Prof. Ed Southern, the blot is a fast way of analysing a small number of DNA fragments which may be present in a complex mixture. For instance, the sickle cell mutation is a point mutation in the beta globin gene which changes a single amino acid in the beta globin polypeptide. In homozygotes, in conditions of low oxygen concentration, the mutant globin polymerises, forcing the cell into a bizarre shape.

Suppose that we wish to ask whether the sickle cell mutation is present in an individual and the only material which we have available is a DNA sample. We could employ a Southern Blot:

Diagram of a Southern blot

In the case of the sickle cell mutation, the single base change involved, as well as causing a missense mutation in the beta globin gene, also destroys a restriction site for the enzyme MstII. As a consequence the size of the restriction fragment containing the 5´ end of the gene is altered from 1.15 kb to 1.35 kb. See the figure below where MstII sites are shown as arrows.
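
The loss of the restriction site can be sketched in a few lines. MstII recognises CCTNAGG, and the sickle mutation changes codon 6 from GAG to GTG, so CCTGAGG becomes CCTGTGG and no longer matches. The toy locus below is hypothetical: the filler sequence and the spacing of the flanking sites are invented purely so that the fragment sizes reproduce the 1.15 kb and 1.35 kb quoted above.

```python
import re

# MstII recognition sequence CCTNAGG (N = any base); it cuts after the first two bases.
MSTII_SITE = re.compile(r"CCT[ACGT]AGG")


def internal_fragments(seq: str) -> list[int]:
    """Lengths of the fragments lying between consecutive MstII cut positions."""
    cuts = [m.start() + 2 for m in MSTII_SITE.finditer(seq)]
    return [b - a for a, b in zip(cuts, cuts[1:])]


def toy_locus(codon6_region: str) -> str:
    """Hypothetical locus: two fixed flanking sites with the codon 5-7 region between them."""
    flank = "CCTGAGG"
    return flank + "A" * 1143 + codon6_region + "A" * 193 + flank


normal = toy_locus("CCTGAGG")  # CCT GAG G..: MstII site present
sickle = toy_locus("CCTGTGG")  # CCT GTG G..: A to T change removes the site

print(internal_fragments(normal))  # [1150, 200]
print(internal_fragments(sickle))  # [1350]
```

On a Southern blot, a probe for the 5´ end of the gene would therefore light up a 1.15 kb band for the normal allele and a 1.35 kb band for the sickle allele, with both bands in a heterozygote.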

The Polymerase Chain Reaction

This invention has revolutionised molecular genetics by doing away with the need to clone DNA in many circumstances where it used to be necessary. It is so powerful that it has made it possible to produce microgram amounts (that's a lot!) of DNA starting from just a single molecule. This has applications in forensic science, in archeology (Neandertal mitochondrial DNA amplified from ancient bones by PCR was recently sequenced), and in medicine where, for example, it can enable antenatal DNA tests to be performed in just a few hours' work and large populations can be screened for particular mutations very quickly and cheaply.

Diagram of two rounds of a PCR amplification

Normally about 30 cycles of amplification are carried out. If the efficiency of the reaction were 100%, each cycle would double the amount of product, giving approximately a 10^9-fold amplification (2^30). In fact, although early rounds are highly efficient, the later rounds of amplification are much less so because of depletion of nucleoside triphosphates and primers in the reaction and gradual destruction of the DNA polymerase.
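
The arithmetic behind that figure can be checked directly. The sketch below assumes a single, constant per-cycle efficiency E, so that the fold amplification after n cycles is (1 + E)^n. That is a simplification: as noted above, real efficiency falls off in the later cycles. The function name is illustrative.

```python
def fold_amplification(cycles: int, efficiency: float = 1.0) -> float:
    """Fold amplification after `cycles` rounds at a constant per-cycle efficiency.

    efficiency = 1.0 means every template is copied in every cycle (perfect doubling).
    """
    if not 0.0 <= efficiency <= 1.0:
        raise ValueError("efficiency must be between 0 and 1")
    return (1.0 + efficiency) ** cycles


print(f"{fold_amplification(30):.2e}")  # 1.07e+09, i.e. 2^30

# Lower assumed efficiencies give a much smaller yield over the same 30 cycles.
for eff in (0.9, 0.8):
    print(f"efficiency {eff:.0%}: {fold_amplification(30, eff):.2e}-fold")
```

Even a modest drop in per-cycle efficiency compounds over 30 cycles, which is why the theoretical 10^9-fold figure is an upper bound rather than a typical yield.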

DNA sequencing

At the heart of the so-called "new genetics" is our ability to sequence DNA rapidly and cheaply. With knowledge of sequence comes knowledge of gene structure and very often a beginning of understanding of gene function. The simplest method of sequencing DNA was invented by Dr. Fred Sanger, for which he was awarded his second Nobel Prize. In the UK our major national Human Genome Project DNA sequencing centre is named after him: the Sanger Centre in Cambridgeshire.

The Sanger method is also known as the "dideoxy chain termination method".

Diagram of dideoxy-sequencing
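
The logic of chain termination can be sketched as follows. In each of four reactions, synthesis stops wherever the corresponding dideoxynucleotide happens to be incorporated, so each reaction yields fragments whose lengths mark every position of that base; reading the bands from shortest to longest across the four lanes gives the sequence. This is a simplified model that ignores the primer length and assumes every possible termination product is present; the function names are illustrative.

```python
def terminated_lengths(new_strand: str) -> dict[str, list[int]]:
    """For each ddNTP reaction, list the fragment lengths it would produce.

    A fragment of length i appears in lane B when base i of the newly
    synthesised strand is B (lengths counted from the end of the primer).
    """
    lanes: dict[str, list[int]] = {base: [] for base in "ACGT"}
    for length, base in enumerate(new_strand, start=1):
        lanes[base].append(length)
    return lanes


def read_gel(lanes: dict[str, list[int]]) -> str:
    """Read the sequence by taking bands from shortest to longest across all lanes."""
    bands = sorted((length, base) for base, lengths in lanes.items() for length in lengths)
    return "".join(base for _, base in bands)


lanes = terminated_lengths("GATTACA")
print(lanes)            # {'A': [2, 5, 7], 'C': [6], 'G': [1], 'T': [3, 4]}
print(read_gel(lanes))  # GATTACA
```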

STSs and ESTs

PCR and sequencing together have made possible the creation of useful landmarks in the genome. These are several thousand short fragments of known DNA sequence whose presence in any DNA sample can be tested by PCR. They are known as STSs (Sequence Tagged Sites). If an STS is part of a transcribed sequence it is known as an EST (Expressed Sequence Tag). Hundreds of thousands of ESTs have been created and can be accessed by computer. One major attempt to classify them all is known as UniGene.

Recommended reading

The topics include:

Reading:


Self Assessment Questions

  1. Dystrophin is the protein product of a gene called DMD on the human X chromosome. Mutations in this gene cause the muscle wasting disease Duchenne muscular dystrophy. If, as part of a study to compare human and kangaroo muscles, you wanted to compare the sequences of the human and kangaroo dystrophins, which of the following resources might provide useful material and how?
    1. A cDNA library made from kangaroo brain
    2. A cDNA library made from the muscle of a patient with Duchenne muscular dystrophy.
    3. A cDNA library made from normal human muscle.
    4. A human genomic library in a BAC vector made from leukocyte DNA
    5. An arrayed genomic library in a YAC vector made from kangaroo liver DNA
    6. RNA purified from kangaroo muscle.

Answers

