Molecular Biology Protocols
10. Analysis of DNA Sequences
Editing with Chromas
The sequence files that
you will receive from the
The ABI Prism automated
sequencer does its best to identify evenly-spaced peaks in the electrophoretic
pattern and where the pattern is clean the job is straightforward:

However, at the
beginning and end of the electrophoresis, the pattern of peaks usually becomes
difficult to interpret and these sections must be clipped away:
Chromas
lets you do just that by selecting a “left cutoff” and
“right cutoff”. It also lets you assign identifications to peaks that the
ABI software cannot identify. Make any
modifications of this sort in lower case, so that you can recognize such
editorial changes later. You may
“delete cutoff sequences” before you save the resulting file.
Once you are satisfied with the validity of the
sequence, copy the sequence in FASTA format
and paste it into any other software package, including Word. You will see that it is simply a sequence of
nucleotides.
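To see what the clipping and FASTA export amount to, here is a minimal Python sketch. It is not part of Chromas or the ABI software. The function name `trim_and_format`, the sample read, and the cutoff positions are hypothetical, and the cutoffs are assumed to be 0-based Python slice positions rather than the numbering Chromas displays. Lower-case editorial changes are left untouched.

```python
def trim_and_format(name, sequence, left_cutoff, right_cutoff, width=60):
    """Keep sequence[left_cutoff:right_cutoff] and return it as FASTA text.

    Cutoffs are 0-based slice positions (an assumption of this sketch).
    """
    if not 0 <= left_cutoff <= right_cutoff <= len(sequence):
        raise ValueError("cutoffs must satisfy 0 <= left <= right <= len(sequence)")
    kept = sequence[left_cutoff:right_cutoff]
    # A FASTA record is a '>' header line followed by the sequence,
    # wrapped here at a fixed line width.
    lines = [">" + name]
    lines += [kept[i:i + width] for i in range(0, len(kept), width)]
    return "\n".join(lines)


# Hypothetical read: unreadable ends (N) and one manually assigned base (lower-case g).
raw = "NNNNGATTACAgATTACANNN"
print(trim_and_format("clone1_forward", raw, 4, 18))
```

The header name after `>` is free text; choose something that identifies the clone and the primer used.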
Finding an Open Reading Frame
At http://www.ncbi.nlm.nih.gov/gorf/gorf.html
you can paste your FASTA sequence to identify potential open reading
frames (ORFs) that may code for a protein. If your sequence is accurate and
codes for a protein, you should see one long open region in a particular reading
frame. The other reading frames are
unlikely to encode amino acid sequence because they contain frequent stop codons.
Alternatively, you can use EditSeq
or SeqBuilder, components of Lasergene 6, to search for open reading frames.
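The reasoning about stop codons can be checked by hand with a short Python sketch. This is only an illustration, not the algorithm used by the NCBI tool or by Lasergene. The function name `frame_report` and the example sequence are hypothetical, and only the three forward frames are scanned; to examine the opposite strand, reverse-complement the sequence first and run it again.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def frame_report(seq):
    """For forward frames 1-3, return {frame: (stop_count, longest_open_run)}.

    longest_open_run is the longest stretch of consecutive codons
    that contains no stop codon.
    """
    seq = seq.upper()
    report = {}
    for frame in range(3):
        stops = 0
        longest = run = 0
        # Step through complete codons only, starting at the frame offset.
        for i in range(frame, len(seq) - 2, 3):
            if seq[i:i + 3] in STOP_CODONS:
                stops += 1
                run = 0
            else:
                run += 1
                longest = max(longest, run)
        report[frame + 1] = (stops, longest)
    return report


# Hypothetical short sequence, for illustration only.
for frame, (stops, longest) in frame_report("ATGAAACCCGGGTTTTAA").items():
    print("frame", frame, "stops:", stops, "longest open run (codons):", longest)
```

On a real coding sequence of several hundred bases, the correct frame typically stands out as the one with a single long open run, while the other frames are broken up by stops.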
BLASTing GenBank
Using the same FASTA sequence, you can search
GenBank for any similar sequences using BLAST at http://www.ncbi.nlm.nih.gov/BLAST/. There are several versions of the BLAST
search; the most useful may be BLASTX, where the nucleotide sequence is
translated in all six possible reading frames and the resulting putative amino
acid sequences are compared against the protein sequences in the database. This is the point at which you may be able to
proclaim a likely identification of your sequence based on evolutionary
relationships to sequences of other species that may already be in
GenBank. BLASTX will also indicate the
orientation of the reading frame.
Alternatively,
you can use EditSeq or SeqBuilder,
components of Lasergene 6, to automate BLAST
searches.
Translation
Once
you are confident of the proper reading frame of a cDNA sequence, you can
obtain a translation of the appropriate region with the Translation Machine at http://www2.ebi.ac.uk/translate/.
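What a translation tool does can be sketched in a few lines of Python. This is a minimal illustration using the standard genetic code, not the code behind the EBI service. The function name `translate`, its parameters, and the example sequence are hypothetical. Codons containing ambiguous bases are reported as X.

```python
from itertools import product

# Standard genetic code, written in the conventional T, C, A, G codon order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(seq, frame=1, to_stop=True):
    """Translate a nucleotide sequence in forward frame 1, 2, or 3.

    Stop codons are written as '*'. With to_stop=True, translation
    ends at the first stop codon, which is not included in the result.
    """
    if frame not in (1, 2, 3):
        raise ValueError("frame must be 1, 2, or 3")
    seq = seq.upper().replace("U", "T")
    protein = []
    for i in range(frame - 1, len(seq) - 2, 3):
        aa = CODON_TABLE.get(seq[i:i + 3], "X")  # X marks an unrecognized codon
        if aa == "*" and to_stop:
            break
        protein.append(aa)
    return "".join(protein)


# Hypothetical sequence, for illustration only.
print(translate("ATGAAACCCGGGTTTTAA"))
```

Because the input is converted to upper case first, lower-case bases that you assigned by hand are translated like any others.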
Sequence Assembly
If
you have two or more sequences of the same amplification product, you can
assemble these sequences using SeqMan, a
component of Lasergene 6. SeqMan will also
automate the trimming of the original chromatograms, although you should
examine the results carefully before accepting them. Since SeqMan is not
programmed with information relating to whether a particular sequence was generated
with a forward or with a reverse primer, you should look at the resulting
assembly to see if it makes sense with regard to open reading frame and
orientation. You may request a reverse
complement to correct the assembly if necessary.
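For reference, a reverse complement is simply the sequence read backwards with each base swapped for its partner. The Python sketch below illustrates the operation; it is not SeqMan's implementation, and the function name `reverse_complement` and the example read are hypothetical. Case is preserved so that lower-case editorial changes remain recognizable.

```python
# Map each base to its complement; N stays N, and case is preserved.
COMPLEMENT = str.maketrans("ACGTNacgtn", "TGCANtgcan")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]


# Hypothetical read from a reverse primer, with one hand-edited base (lower-case a).
reverse_read = "ATGCa"
print(reverse_complement(reverse_read))
```

Applying the function twice returns the original sequence, which is a quick way to confirm that nothing was lost.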
Multiple Alignment
As
you find related sequences through BLAST, you may wish to generate a multiple
alignment to compare similarities with your own sequence. Refer back to Protocol 1 for instructions!
Amino Acid Sequence Analysis
Regions
of α-helix or transmembrane domains in your newly translated protein can
be predicted using a variety of tools available at http://iubio.bio.indiana.edu/soft/molbio/ibmpc/antheprot-readme.html. Alternatively, you may use Protean, a
component of the Lasergene 6 package.