Data Sources

The DataSource interface is responsible for providing the package with data on species trees and gene sequences. To use your own data, extend the abstract class AbstractDataSource and the abstract tag found in the def.xml file. For more information, see the section Create a Tag Instance.
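
As a rough illustration of the extension pattern described above, the sketch below defines a stand-in abstract class and a subclass that serves one fixed species tree. The methods of the real AbstractDataSource are not listed in this section, so the names AbstractDataSourceSketch, getSpeciesTree and FixedTreeDataSource are assumptions made for illustration only; consult the API for the actual signatures.

```java
import java.util.Objects;

// Hypothetical stand-in for the package's AbstractDataSource.
// The real class and its method names may differ (see the API).
abstract class AbstractDataSourceSketch {
    // Assumed hook: return the species tree for a gene family as a string.
    abstract String getSpeciesTree(String geneFamilyId);
}

// Illustrative custom data source that always returns the same tree.
class FixedTreeDataSource extends AbstractDataSourceSketch {
    private final String tree;

    FixedTreeDataSource(String tree) {
        // Reject null early so a missing tree fails at construction time.
        this.tree = Objects.requireNonNull(tree, "tree");
    }

    @Override
    String getSpeciesTree(String geneFamilyId) {
        // The gene family id is ignored in this sketch.
        return tree;
    }
}
```

A real subclass would also need a matching tag definition derived from the abstract tag in def.xml, which this sketch does not cover.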

The class DataSourceXmlNcbiTaxonomy uses the NCBI Taxonomy database to extract a species tree for the gene family. Its tag is <data_source did="xml_ncbi_taxonomy" ...> and contains the following attributes.

XML Sequence Data

The class SequenceDataXml handles the XML detail in the sequence data file. Its tag is <sequence_data did="xml" ...> and contains the following attributes.

For every tag except the main tag and the sequence tag, you can define a template and a marker, which is useful when data must be extracted from the tag string. The tag names of the XML detail are case-sensitive. For more information, see def.xml or the API.
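
The exact semantics of templates and markers are defined in def.xml and the API, not in this section. The sketch below shows one plausible reading, assumed here for illustration: the template is the expected shape of the tag string, and the marker is a placeholder inside the template that stands for the data to extract. The class TemplateExtractor and the example template are hypothetical and not part of the package.

```java
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper; illustrates one possible template/marker scheme.
class TemplateExtractor {
    private final Pattern pattern;

    TemplateExtractor(String template, String marker) {
        if (marker.isEmpty()) {
            throw new IllegalArgumentException("marker must not be empty");
        }
        int at = template.indexOf(marker);
        if (at < 0) {
            throw new IllegalArgumentException("template does not contain the marker");
        }
        String prefix = template.substring(0, at);
        String suffix = template.substring(at + marker.length());
        // Literal template text is quoted; the marker becomes a capture group.
        this.pattern = Pattern.compile(
                Pattern.quote(prefix) + "(.*?)" + Pattern.quote(suffix));
    }

    // Returns the text found at the marker position, or empty if the
    // tag string does not fit the template.
    Optional<String> extract(String tagString) {
        Matcher m = pattern.matcher(tagString);
        return m.matches() ? Optional.of(m.group(1)) : Optional.empty();
    }
}
```

Under these assumptions, a template of "gi|%|ref" with the marker "%" applied to the tag string "gi|12345|ref" would yield "12345", and a tag string that does not fit the template would yield no value.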

To implement your own sequence parser, extend the abstract class AbstractSequenceData. For more information, see the section Create a Tag Instance and the API.
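
A custom parser might look roughly like the sketch below, which reads one sequence per line in an "id<TAB>sequence" layout. Both the stand-in class AbstractSequenceDataSketch and its parse method are assumptions, since the methods of the real AbstractSequenceData are not given here; the tab-separated input format is likewise chosen only to keep the example short.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical stand-in for the package's AbstractSequenceData.
// The real class and its method names may differ (see the API).
abstract class AbstractSequenceDataSketch {
    // Assumed hook: map each sequence id to its sequence string.
    abstract Map<String, String> parse(String content);
}

// Illustrative parser for lines of the form "id<TAB>sequence".
class TabSeparatedSequenceData extends AbstractSequenceDataSketch {
    @Override
    Map<String, String> parse(String content) {
        // LinkedHashMap keeps the sequences in file order.
        Map<String, String> sequences = new LinkedHashMap<>();
        for (String line : content.split("\\R")) {
            if (line.trim().isEmpty()) {
                continue; // skip blank lines
            }
            String[] parts = line.split("\t", 2);
            if (parts.length != 2) {
                throw new IllegalArgumentException(
                        "expected 'id<TAB>sequence' but got: " + line);
            }
            sequences.put(parts[0].trim(), parts[1].trim());
        }
        return sequences;
    }
}
```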