Setting the optimality criterion to distance
1 Set the optimality criterion
3 Build a neighbor joining tree
Setting the optimality criterion to likelihood
1 Set the optimality criterion
3 Set likelihood model parameters
Submitting commands in batch file
The following hands-on tutorial provides a brief overview of the basic usage of PAUP* 4.0. The tutorial takes you step by step through an analysis of one of the sample data files included on your distribution disk, which is also available on the World Wide Web at http://paup.csit.fsu.edu/data/primate-mtDNA-interleaved.nex . This tutorial was designed for people with no prior experience using PAUP*; if you are already familiar with PAUP*, you will probably wish to skip it. We assume that users are familiar with basic phylogenetic terminology and with operating-system-specific issues. As you become more experienced with PAUP* 4.0, you will discover that there are many alternative ways to execute the operations described below. We have chosen not to describe all the possibilities in this tutorial; however, we encourage you to explore other menu and command-line options as your time permits.
The Windows interface is almost entirely command-line driven. Some menu functions are available in the Windows interface; however, these functions mostly include file and edit operations. This tutorial will use both menu options and command-line syntax to demonstrate the different environments under which PAUP* may be run.
Throughout this tutorial we follow several typographical conventions. First, menus, menu items, and items contained in dialog boxes or elsewhere on the screen are given in a bold sans serif font. For example, the text File > Open means click "File" in the main menu and then select "Open" from the menu items under "File." Second, text that is intended to be typed by the user at the command-line prompt or into a dialog box is given in a plain fixed-width font. For example, the instruction "Type: weights 2:1stpos" means that everything after "Type:" should be entered exactly as it appears. Finally, interface-specific instructions are offset and bulleted, whereas all other text pertains to all of the PAUP* interfaces.


In the editor, scroll through the sample file. Notice that the file is divided into blocks of text, delimited by the words "begin" and "end". The word following "begin" defines the block type. In this example, the following types of blocks are used: taxa, characters, assumptions, and paup. There are, however, numerous other NEXUS block types. In fact, one of the advantages of the NEXUS format is that applications will simply skip over blocks that they do not recognize. For a more detailed discussion of the NEXUS format, see Maddison et al. (1997). For this example, you will not need to modify the original sample file.
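The block structure described above can be illustrated with a minimal Python sketch. Python is not part of PAUP*, and the miniature file text, the function names block_names and known_blocks, and the simplified matching rule (no handling of bracketed comments) are all assumptions made for illustration, not a description of how PAUP* reads files.

```python
import re

# Hypothetical miniature NEXUS text; the real sample file is much larger.
SAMPLE = """#NEXUS
begin taxa;
    dimensions ntax=2;
    taxlabels Homo_sapiens Pan;
end;
begin characters;
    dimensions nchar=4;
    format datatype=dna;
    matrix
        Homo_sapiens ACGT
        Pan          ACGA
    ;
end;
begin paup;
    cstatus;
end;
"""

def block_names(nexus_text):
    """Return the block-type names in order of appearance, lowercased."""
    # A block opens with "begin <name>;" at the start of a line.
    pattern = re.compile(r"^\s*begin\s+(\w+)\s*;", re.IGNORECASE | re.MULTILINE)
    return [match.group(1).lower() for match in pattern.finditer(nexus_text)]

def known_blocks(nexus_text, recognized):
    """Mimic an application that skips block types it does not recognize."""
    return [name for name in block_names(nexus_text) if name in recognized]

print(block_names(SAMPLE))                          # ['taxa', 'characters', 'paup']
print(known_blocks(SAMPLE, {"taxa", "characters"}))  # ['taxa', 'characters']
```

The second call shows the skipping behavior: a program that knows only taxa and characters blocks ignores the paup block rather than failing.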
Close the sample file and do the following:
After executing the sample file, PAUP* will display comments and some general information about the data. For this example, the source of the data set is given, followed by a section reporting the dimensions of the data matrix, the type of data, etc. As yet, no analyses have been conducted; PAUP* has simply processed the data and is now waiting to be told what to do next.
Ordinarily, you will want to log the results of a PAUP* session to a disk file to have a record of the results of your analyses.

Logging can be started and stopped at any time during your PAUP* session. To stop logging, do the following:
Now that the data matrix has been processed, you can use PAUP* to obtain basic summary information about the data set. To start, you will display information about the characters included in the sample data set.

PAUP* will display a summary of the current character status (i.e., types, weights, etc.). Remember, if logging was turned on, the summary information displayed to your screen will also be saved to the log file. You may also choose to display a summary of the taxa (tstatus), the entire data matrix (showmatrix), and more.
PAUP* provides several ways to restrict analyses to a subset of the taxa and characters included in a data matrix. For example, the sample data set includes protein-coding and non-coding regions of primate mitochondrial DNA. Suppose we wish to analyze only the coding regions of the data. The characters belonging to these regions have already been identified in the sample file using the charset command. Character sets simplify certain procedures by allowing you to refer to a group of characters by a single name. You will start by excluding all characters in the data set except for the coding regions.
You will also restrict your analyses to five species of hominoids and three other primate species, which serve as the outgroup taxa. The five hominoids (Homo sapiens, Pan, Gorilla, Pongo, and Hylobates) have already been identified in the sample file using the taxset command. In the same way that charset allows you to refer to a group of characters by a single name, taxset allows you to refer to a group of taxa by a single name.
Notice that spaces in taxon names must be replaced with an "_" (underscore character) or the name must be enclosed in single quotes when entered at the command line. Also, PAUP* does not pay attention to character case in taxon labels. Finally, be aware that when you exclude characters or delete taxa using the exclude and delete commands, respectively (or the menu equivalents), you do not actually modify the data file. That is, the next time you execute the sample data set, all of the characters and taxa will be included.
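The naming rules just described (underscore or single quotes for spaces, case ignored) can be sketched in a few lines of Python. This is only an illustration of the rules as stated in the text; the function name normalize_label is hypothetical and the code does not reproduce PAUP*'s actual parser.

```python
def normalize_label(label):
    """Reduce a taxon label to a canonical form for comparison.

    Assumed rules, taken from the text above: a label in single quotes
    keeps its spaces, an unquoted label uses underscores in place of
    spaces, and character case is ignored.
    """
    label = label.strip()
    if len(label) >= 2 and label[0] == "'" and label[-1] == "'":
        label = label[1:-1]               # drop the enclosing single quotes
    else:
        label = label.replace("_", " ")   # an underscore stands in for a space
    return label.lower()                  # case is not significant

# All three spellings refer to the same taxon under these rules.
print(normalize_label("Homo_sapiens"))    # homo sapiens
print(normalize_label("'Homo sapiens'"))  # homo sapiens
print(normalize_label("HOMO_SAPIENS"))    # homo sapiens
```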
Before you begin an analysis, there is a good chance that you know something about the characters in your data matrix that suggests they should be differentially weighted. For example, we know that substitutions at first codon positions generally occur less frequently than substitutions at third positions. The simple explanation is that a first-position substitution usually results in an amino acid substitution, whereas a third-position change can often occur without changing the amino acid translation. You will incorporate this information into the following analysis by applying a higher weight to substitutions occurring at first codon positions. Codon positions have already been identified in the sample file using the charset command.
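To make the effect of such weighting concrete, here is a minimal Python sketch that computes a weighted tree length as the sum of (weight x number of steps) over characters. The per-site step counts are made-up numbers, the function names are hypothetical, and the sketch assumes a coding region that begins at site 1 in reading frame; it is not PAUP*'s own computation.

```python
def codon_position(site_index):
    """Codon position (1, 2, or 3) of a 1-based site index.

    Assumes the coding region starts at site 1 in reading frame.
    """
    return (site_index - 1) % 3 + 1

def weighted_length(steps_per_site, position_weights):
    """Sum weight * steps over all sites, weighting by codon position."""
    total = 0
    for site_index, steps in enumerate(steps_per_site, start=1):
        total += position_weights[codon_position(site_index)] * steps
    return total

# Hypothetical numbers of changes required at six consecutive sites.
steps = [1, 0, 2, 1, 0, 3]

equal_weights = {1: 1, 2: 1, 3: 1}
first_position_doubled = {1: 2, 2: 1, 3: 1}   # first positions count twice

print(weighted_length(steps, equal_weights))           # 7
print(weighted_length(steps, first_position_doubled))  # 9
```

With first positions weighted 2, each change at sites 1 and 4 contributes two units instead of one, so trees that require fewer first-position changes are favored more strongly.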
By default, PAUP* considers all transformation costs to be equal. In this section, you will invoke a character type that assigns a higher cost to transversions than to transitions. More specifically, we will assume that transversions (changes between a purine, A or G, and a pyrimidine, C or T) cost twice as much as transitions (changes from one purine to the other or from one pyrimidine to the other). One way to incorporate this assumption into the analysis is to set up a transition/transversion "step matrix". Such a step matrix has already been defined in the sample file. To apply the transformation cost to all of the characters currently being considered, do the following:
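Separately from the PAUP* instructions themselves, the 2:1 cost scheme described above can be written out as a small Python sketch. The function names step_cost and step_matrix are hypothetical, and the resulting table only illustrates the costs stated in the text; it is not the step matrix definition stored in the sample file.

```python
PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}

def step_cost(a, b, transition=1, transversion=2):
    """Cost of changing nucleotide a into nucleotide b."""
    a, b = a.upper(), b.upper()
    if a == b:
        return 0
    # A transition stays within one chemical class; a transversion crosses classes.
    same_class = {a, b} <= PURINES or {a, b} <= PYRIMIDINES
    return transition if same_class else transversion

def step_matrix(states="ACGT"):
    """Full table of costs, indexed as matrix[from_state][to_state]."""
    return {a: {b: step_cost(a, b) for b in states} for a in states}

matrix = step_matrix()
for from_state in "ACGT":
    print(from_state, [matrix[from_state][to_state] for to_state in "ACGT"])
# A [0, 2, 1, 2]
# C [2, 0, 2, 1]
# G [1, 2, 0, 2]
# T [2, 1, 2, 0]
```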
Up to this point you have excluded characters, deleted taxa, weighted characters, and defined character transformation types. If for some reason you had to abandon your analyses and close PAUP*, you would have to select all of the menu options or repeat the commands previously entered to get back to where you are now. One way to avoid this potentially time-consuming task is to save your assumptions to a file that can be recalled at a later time.
Restart PAUP* and execute the file primate-mtDNA-interleaved.nex as you did in the beginning of the tutorial. Do the following to recall the previous set of assumptions:
You should now be back to where you left off. To be sure the assumptions are in effect, issue the command cstatus from the command line. You should get the following output:
Character-status summary:
  Current optimality criterion = parsimony
  205 characters are excluded
  Of the remaining 693 included characters:
    All characters are of user-defined type "2 1"
    462 characters have weight 1
    231 characters have weight 2
    296 characters are constant
    155 variable characters are parsimony-uninformative
    Number of (included) parsimony-informative characters = 242
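The counts in this summary can be cross-checked with simple arithmetic: the two weight classes should add up to the number of included characters, and so should the constant, uninformative, and informative classes. The Python below is only a sanity check on the numbers printed above; the variable and function names are hypothetical.

```python
# Values copied from the character-status summary.
excluded = 205
included = 693
weight_counts = {1: 462, 2: 231}
constant = 296
uninformative = 155
informative = 242

def summary_is_consistent():
    """Check that each way of partitioning the included characters sums to 693."""
    by_weight = sum(weight_counts.values()) == included
    by_variability = constant + uninformative + informative == included
    return by_weight and by_variability

print(summary_is_consistent())   # True
print(excluded + included)       # 898 characters in total
print(included // 3)             # 231, one third of the included characters
```

The last line matches the count of weight-2 characters, which is what you would expect if exactly the first codon positions of the included coding regions received weight 2.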
PAUP* 4.0 has the advantage of being able to analyze data using several different optimality criteria: parsimony, likelihood, and distance. Several chapters in this manual and a plethora of published literature are devoted to comparing the performance of optimality criteria. Rather than spend time here discussing the relative merits of the available optimality criteria, we will just say that each criterion has its strengths and limitations. To begin with, you will use the default criterion, maximum parsimony, to search for optimal trees. Later in this tutorial you will search under the other criteria.
PAUP* provides two basic classes of methods for searching for optimal trees: exact and heuristic. Exact methods are guaranteed to find the optimal tree(s) but may require prohibitive amounts of computer time for medium to large-sized data sets. Heuristic methods do not guarantee optimality but generally require far less computer time. Even though the current data set is relatively small, you will start by conducting a heuristic search.
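One way to see why exact methods become prohibitive is to count the candidate trees. For n taxa there are 1 x 3 x 5 x ... x (2n - 5) distinct unrooted, strictly bifurcating trees. The Python sketch below evaluates that product; the function name is hypothetical, and the count says nothing about how PAUP* actually organizes its searches.

```python
def num_unrooted_trees(n_taxa):
    """Number of distinct unrooted, strictly bifurcating trees for n_taxa taxa."""
    if n_taxa < 3:
        raise ValueError("need at least three taxa")
    count = 1
    # Adding the k-th taxon to a tree of k - 1 taxa can be done on any of
    # its 2k - 5 branches, so the counts multiply.
    for k in range(3, n_taxa + 1):
        count *= 2 * k - 5
    return count

print(num_unrooted_trees(8))    # 10395
print(num_unrooted_trees(10))   # 2027025
print(num_unrooted_trees(20))   # 221643095476699771875
```

The eight taxa retained in this tutorial (five hominoids and three outgroup species) give only 10,395 possible trees, which is why this data set counts as small; at 20 taxa the number already exceeds 2 x 10^20.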
Once the search is started, PAUP* will display general information about the options and assumptions being used during the search. If you were logging results, this information would be saved to the log file. When the search completes, PAUP* will display general information about the results of the search.

According to the output on your screen, there is a single tree currently in memory. To display the tree, do the following:
The showtrees command draws a simple picture of the branching order of the taxa.

Suppose, for example, that you want to know something about the branch lengths of the tree. To get a more detailed picture of the tree, do the following:

 /------------------------------------ Lemur catta
 |
 |                                                /---- Homo sapiens
 |                                                |
 |                                     /---------14 /------- Pan
 |                                     |          \13
 |                              /-----15            \--------- Gorilla
18                              |      |
 |                /------------16      \-------------- Pongo
 |                |             |
 +---------------17             \-------------- Hylobates
 |                |
 |                \-------------------- Macaca fuscata
 |
 \----------------------------- Saimiri sciureus