Softparsmap - Manual
Tree nodes and trees in Softparsmap are represented by five classes: Node, GeneNode, GeneLeaf, SpeciesNode, and SpeciesLeaf. These classes are linked together to create gene and species trees. The class names reflect whether a node is an internal node or a leaf. Species leaves and gene leaves are mapped one-to-many, since each gene sequence (represented by the class GeneLeaf) is found in one species (represented by the class SpeciesLeaf). The species tree is a subtree of the Tree of Life such that every species represented by a species leaf harbors at least one gene sequence found in the gene tree. Before gene nodes can be mapped onto species nodes in the species tree, every gene leaf has to be added to one species leaf.
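The one-to-many relation between species leaves and gene leaves can be illustrated with a small sketch. The classes SketchSpeciesLeaf and SketchGeneLeaf and the method addGeneLeaf are hypothetical stand-ins invented for this illustration; they are not Softparsmap's own classes, whose constructors and method names may differ. The labels are taken from the example tables further down this page.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical stand-in for a species leaf; NOT the Softparsmap class.
class SketchSpeciesLeaf {
    final String label;
    private final List<SketchGeneLeaf> geneLeaves = new ArrayList<>();

    SketchSpeciesLeaf(String label) {
        this.label = label;
    }

    // One species leaf may hold many gene leaves,
    // but each gene leaf may be added to one species leaf only.
    void addGeneLeaf(SketchGeneLeaf gene) {
        if (gene.species != null) {
            throw new IllegalStateException(
                "gene leaf " + gene.label + " already belongs to a species");
        }
        gene.species = this;
        geneLeaves.add(gene);
    }

    List<SketchGeneLeaf> getGeneLeaves() {
        return Collections.unmodifiableList(geneLeaves);
    }
}

// Hypothetical stand-in for a gene leaf; NOT the Softparsmap class.
class SketchGeneLeaf {
    final String label;
    SketchSpeciesLeaf species; // stays null until the leaf has been added to a species

    SketchGeneLeaf(String label) {
        this.label = label;
    }
}

class LeafMappingDemo {
    public static void main(String[] args) {
        SketchSpeciesLeaf man = new SketchSpeciesLeaf("9606");
        man.addGeneLeaf(new SketchGeneLeaf("15012045"));
        man.addGeneLeaf(new SketchGeneLeaf("16550688"));
        System.out.println(man.label + " holds "
            + man.getGeneLeaves().size() + " gene leaves");
    }
}
```

Rejecting a second assignment is a design choice of the sketch: it makes the "one species per gene leaf" side of the relation explicit.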
A reference to a node has two different meanings depending on the context. Either the reference is used in the context of the node alone, or in the context of a rooted subtree. For instance,
Set children = someNode.getChildren();
uses the reference in the context of the node someNode alone, while
Set leaves = someNode.getLeaves();
uses it in the context of the subtree of which someNode is the root.
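The difference between the two contexts can be sketched with a minimal node class. SketchNode and its addChild method are assumptions made for this illustration only; just the method names getChildren() and getLeaves() come from the text above, and their real implementations in Softparsmap may differ.

```java
import java.util.LinkedHashSet;
import java.util.Set;

// Hypothetical minimal node; it only illustrates the two contexts.
class SketchNode {
    final String label;
    private final Set<SketchNode> children = new LinkedHashSet<>();

    SketchNode(String label) {
        this.label = label;
    }

    // Returns this node so that calls can be chained while building a tree.
    SketchNode addChild(SketchNode child) {
        children.add(child);
        return this;
    }

    // Node context: only the direct children of this node.
    Set<SketchNode> getChildren() {
        return new LinkedHashSet<>(children);
    }

    // Subtree context: every leaf below this node, with this node as the root.
    Set<SketchNode> getLeaves() {
        Set<SketchNode> leaves = new LinkedHashSet<>();
        collectLeaves(this, leaves);
        return leaves;
    }

    private static void collectLeaves(SketchNode node, Set<SketchNode> out) {
        if (node.children.isEmpty()) {
            out.add(node); // a node without children is a leaf
            return;
        }
        for (SketchNode child : node.children) {
            collectLeaves(child, out);
        }
    }
}

class SubtreeContextDemo {
    public static void main(String[] args) {
        SketchNode n6 = new SketchNode("-6")
            .addChild(new SketchNode("12860621"))
            .addChild(new SketchNode("16550688"));
        SketchNode n4 = new SketchNode("-4")
            .addChild(new SketchNode("20809742"))
            .addChild(new SketchNode("15012045"));
        SketchNode n5 = new SketchNode("-5").addChild(n6).addChild(n4);

        System.out.println(n5.getChildren().size()); // 2 direct children
        System.out.println(n5.getLeaves().size());   // 4 leaves in the subtree
    }
}
```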
Depending on the method used to create gene trees, different types of edges are created. The interface EdgeType defines the edge-type properties that this package needs. Every gene node has a reference to an instance of this interface. To create your own edge type, extend AbstractEdgeType and the abstract edge tag found in def.xml. For more information, see the section Create a Tag Instance.
StandardEdgeType is the standard edge type in Softparsmap and has the following attributes.
short_name - the short name used when printing tree tables (should be 1-3 letters long).
value_limit - defines an edge value limit such that any edge value below this limit is defined as weak, otherwise as strong.
divide_value - determines whether the edge value should be divided when making a mid-point re-root.
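How these three attributes might fit together is sketched below. SketchEdgeType, its constructor, and the methods isWeak and format are hypothetical and are not the API of StandardEdgeType or EdgeType; the limit 0.9 is an arbitrary value chosen for the example. Only the weak/strong rule and the "value-short name" layout seen in the tree tables are taken from this page. The divide_value flag is stored but not acted upon, because the re-rooting procedure is not described here.

```java
// Hypothetical value object; field names follow the attribute list above,
// everything else is an assumption made for illustration.
final class SketchEdgeType {
    final String shortName;    // short_name: 1-3 letters, shown in tree tables
    final double valueLimit;   // value_limit: values below this are weak
    final boolean divideValue; // divide_value: stored only, not used in this sketch

    SketchEdgeType(String shortName, double valueLimit, boolean divideValue) {
        if (shortName.isEmpty() || shortName.length() > 3) {
            throw new IllegalArgumentException("short name should be 1-3 letters long");
        }
        this.shortName = shortName;
        this.valueLimit = valueLimit;
        this.divideValue = divideValue;
    }

    // An edge value below the limit is weak; otherwise it is strong.
    boolean isWeak(double edgeValue) {
        return edgeValue < valueLimit;
    }

    // Renders an edge value followed by the short name, e.g. "0.94-UN".
    String format(double edgeValue) {
        return edgeValue + "-" + shortName;
    }
}

class EdgeTypeDemo {
    public static void main(String[] args) {
        SketchEdgeType type = new SketchEdgeType("UN", 0.9, false);
        System.out.println(type.format(0.94) + " weak: " + type.isWeak(0.94));
        System.out.println(type.format(0.5) + " weak: " + type.isWeak(0.5));
    }
}
```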
There are three methods in the class Node used to print trees to the prompt: toStringTree(), toStringTable(), and toStringAll(). These methods can be used to print species trees as well as gene trees. Each node in the tree has a label, and the row in the table with the same label has all available data for that node. If a cell contains '-', the data for that node is not available, and if a column is missing, the whole tree is missing that data. Here is an example of what a species tree and a rooted gene tree look like after inferring mutation.
       +--(9606)-
(-9347)|
       +-(10090)-

+-------+-------------+--------------------------------+-----------------+
| Label | Class       | Seq                            | Species name    |
+-------+-------------+--------------------------------+-----------------+
| -9347 | SpeciesNode | -                              | placentals      |
| 10090 | SpeciesLeaf | [20809742, 15126606, 12860621] | transgenic mice |
| 9606  | SpeciesLeaf | [15012045, 16550688]           | man             |
+-------+-------------+--------------------------------+-----------------+
The column Seq contains the sequences that exist in each species. The column Species name is the name of the species.
     +----------------(15126606)-
     |
     |             +--(12860621)-
     |      +--(-6)|
(-92)|      |      +--(16550688)-
     +--(-5)|
            |      +--(20809742)-
            +--(-4)|
                   +--(15012045)-

+----------+----------+---------+---------------+------+---------------+-------+
| Label    | Class    | E.v.    | M(g)          | N.i. | SL(g)         | m(g)  |
+----------+----------+---------+---------------+------+---------------+-------+
| -4       | GeneNode | 0.94-UN | [9606, 10090] | -    | [10090, 9606] | -9347 |
| -5       | GeneNode | 1.0-UN  | [9606, 10090] | D1   | [10090, 9606] | -9347 |
| -6       | GeneNode | 0.92-UN | [9606, 10090] | -    | [10090, 9606] | -9347 |
| -92      | GeneNode | NaN-UN  | -             | D1L1 | -             | -     |
| 12860621 | GeneLeaf | 1.0-UN  | -             | -    | [10090]       | 10090 |
| 15012045 | GeneLeaf | 1.0-UN  | -             | -    | [9606]        | 9606  |
| 15126606 | GeneLeaf | 1.0-UN  | [10090]       | -    | [10090]       | 10090 |
| 16550688 | GeneLeaf | 1.0-UN  | -             | -    | [9606]        | 9606  |
| 20809742 | GeneLeaf | 1.0-UN  | -             | -    | [10090]       | 10090 |
+----------+----------+---------+---------------+------+---------------+-------+
Label - the label found in the tree.
Class - the name of the class used to represent this node.
E.v. - the edge value and the short name of the edge type.
M(g) - used to infer mutation. See [1] for more information.
N.i. - contains other node information. It consists of at most four parts. D1L2WC means that the node has one duplication (D1), two losses (L2), is weak (W), and has the collapse flag set to true (C).
SL(g) - used to infer mutation. In [1] it is called Z(g).
m(g) - used to infer mutation. See [1] for more information.
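For readers who post-process printed tables, the N.i. cell can be decoded with a short sketch like the one below. SketchNodeInfo and its parse method are hypothetical helpers, not part of Softparsmap. The sketch assumes that the parts always appear in the order D, L, W, C, as in the examples D1L2WC and D1L1, and that any part may be absent; a cell containing '-' is treated as carrying no information.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper for decoding an N.i. cell such as "D1L2WC".
final class SketchNodeInfo {
    // Assumed order of the optional parts: D<count>, L<count>, W, C.
    private static final Pattern FORMAT =
        Pattern.compile("(?:D(\\d+))?(?:L(\\d+))?(W)?(C)?");

    final int duplications;
    final int losses;
    final boolean weak;
    final boolean collapse;

    private SketchNodeInfo(int duplications, int losses, boolean weak, boolean collapse) {
        this.duplications = duplications;
        this.losses = losses;
        this.weak = weak;
        this.collapse = collapse;
    }

    static SketchNodeInfo parse(String cell) {
        if (cell.equals("-")) {
            // '-' marks a cell without data
            return new SketchNodeInfo(0, 0, false, false);
        }
        Matcher m = FORMAT.matcher(cell);
        if (!m.matches()) {
            throw new IllegalArgumentException("unrecognised N.i. value: " + cell);
        }
        int d = m.group(1) == null ? 0 : Integer.parseInt(m.group(1));
        int l = m.group(2) == null ? 0 : Integer.parseInt(m.group(2));
        return new SketchNodeInfo(d, l, m.group(3) != null, m.group(4) != null);
    }
}

class NodeInfoDemo {
    public static void main(String[] args) {
        SketchNodeInfo info = SketchNodeInfo.parse("D1L2WC");
        System.out.println("duplications=" + info.duplications
            + " losses=" + info.losses
            + " weak=" + info.weak
            + " collapse=" + info.collapse);
    }
}
```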