The ggtree package should not be viewed solely as standalone software. While it is useful for viewing, annotating and manipulating phylogenetic trees, it is also an infrastructure that makes evolutionary evidence inferred by commonly used software packages in the field available in R. Examples include dN/dS values or ancestral sequences inferred by CODEML [1], clade support values (posterior) inferred by BEAST [2], and short read placements by EPA [3] and pplacer [4]. This evidence can be used not only to annotate a phylogenetic tree in ggtree but also for further analysis in R.
Most tree viewer software (including R packages) focuses on a small set of file formats such as Nexus, while the file formats produced by different evolutionary analysis software contain supporting evidence within the file that is ready for annotating a phylogenetic tree. The ggtree package defines several parser functions and S4 classes to store statistical evidence inferred by commonly used software packages. It supports several file formats, including NHX and jplace, and software output from BEAST, CODEML, HYPHY, EPA, pplacer, PHYLODOG, RevBayes, r8s and RAxML.
The ggtree package implements several parser functions, including:
read.beast for parsing output of BEAST
read.codeml for parsing output of CODEML (rst and mlc files)
read.codeml_mlc for parsing the mlc file (output of CODEML)
read.hyphy for parsing output of HYPHY
read.jplace for parsing a jplace file, including output from EPA and pplacer
read.nhx for parsing an NHX file, including output from PHYLODOG and RevBayes
read.paml_rst for parsing the rst file (output of PAML)
read.r8s for parsing output of r8s
read.raxml for parsing output of RAxML
ggtree defines several S4 classes to store evolutionary evidence inferred by these software packages, including:
apeBootstrap for storing the bootstrap analysis of ape::boot.phylo() [10]
beast for storing output of read.beast
codeml for storing output of read.codeml
codeml_mlc for storing output of read.codeml_mlc
hyphy for storing output of read.hyphy
jplace for storing output of read.jplace
nhx for storing output of read.nhx
paml_rst for storing the rst file obtained by PAML
phangorn for storing ancestral sequences inferred by phangorn [11]
r8s for storing output of read.r8s
raxml for storing output of read.raxml
The jplace class is also designed to store user-specified annotation data.
Here is an overview of these S4 classes:
[Figure: overview of these S4 classes]
ggtree also supports multiPhylo (defined by ape), phylo4d (defined by phylobase), obkData (defined in OutbreakTools) and phyloseq (defined in phyloseq). In ggtree, tree objects can be merged, and evidence inferred from different phylogenetic analyses can be combined or compared and visualized.
Viewing a phylogenetic tree in ggtree is as simple as calling ggtree(tree_object), and annotating the tree is done by adding graphic layers using the grammar of graphics.
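The layering idea can be sketched with a simulated tree. This is only an illustration under assumptions: the random tree from ape::rtree() is a stand-in for real data, and the layers geom_tiplab() and geom_tippoint() are taken from ggtree releases that provide them (they may differ in the version described here).

```r
library(ape)     # rtree() simulates a random tree
library(ggtree)

set.seed(42)             # make the random tree reproducible
tree <- rtree(10)        # hypothetical example tree with 10 tips

# Base tree plot, then annotation layers added with `+`
p <- ggtree(tree) +
  geom_tiplab(size = 3) +               # tip labels
  geom_tippoint(color = "firebrick")    # a point at each tip
print(p)
```

Each `+` adds one independent layer, so annotations can be added or removed without touching the base plot.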
For each class, we defined a get.fields method that lists the annotation features available in the object, which can be used to annotate a phylogenetic tree directly in ggtree. The get.tree method converts a tree object to a phylo (or multiPhylo for r8s) object, which is widely supported by other R packages.
The groupOTU method is used for clustering related OTUs (from tips to their most recent common ancestor). Related OTUs are not necessarily within a clade; they can be distantly related. groupOTU works for monophyletic (clade), polyphyletic and paraphyletic groups, while groupClade only works for clades (monophyletic groups). These methods are useful for clustering related OTUs or clades.
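The difference between the two grouping methods can be illustrated as follows. This is a hedged sketch: the tree is simulated, the chosen tips are arbitrary, and the arguments are passed by position because their names have differed between releases. The `group` variable used for coloring is assumed to be the default name attached by these methods.

```r
library(ape)
library(ggtree)

set.seed(42)
tree <- rtree(10)                  # tips are labelled t1 ... t10

# groupOTU: group arbitrary tips; they need not form a clade
tree_otu <- groupOTU(tree, c("t1", "t5", "t9"))

# groupClade: group everything descending from one internal node
clade_node <- getMRCA(tree, c("t1", "t2"))   # ape helper returning a node number
tree_clade <- groupClade(tree, clade_node)

# Color branches by the assigned group
ggtree(tree_otu, aes(color = group))
```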
The fortify method converts a tree object to a data.frame, which is familiar to R users and easy to manipulate. The output data.frame contains the tree information and all evolutionary evidence (if available, e.g. dN/dS in a codeml object).
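As a small illustration of what fortify produces, the sketch below converts a simulated tree. The tree is a stand-in for real data, and the column names shown (node, parent, label, isTip, x, y) are those produced by the ggtree releases I am aware of; they are an assumption for other versions.

```r
library(ape)
library(ggplot2)   # provides the fortify() generic
library(ggtree)    # provides the method for tree objects

set.seed(42)
tree <- rtree(5)             # 5 tips, 4 internal nodes

df <- fortify(tree)          # one row per node
head(df)

# Ordinary data.frame operations apply, e.g. keeping only the tips
df[df$isTip, c("node", "label", "x", "y")]
```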
Detailed descriptions of the slots defined in each class are documented in the class man pages. Users can use the class?<name> syntax (e.g. class?beast) to access the man page of a class.
```r
library(ggtree)

# Locate the example BEAST tree shipped with the package
file <- system.file("extdata/BEAST", "beast_mcc.tree", package="ggtree")
beast <- read.beast(file)   # parse it into a 'beast' S4 object
beast                       # print a summary of the object
```

```
## 'beast' S4 object that stored information of
## '/tmp/RtmpNWH8ce/Rinst59a4280360d/ggtree/extdata/BEAST/beast_mcc.tree'.
##
## ...@ tree:
## Phylogenetic tree with 15 tips and 14 internal nodes.
##
## Tip labels:
## A_1995, B_1996, C_1995, D_1987, E_1996, F_1997, ...
##
## Rooted; includes branch lengths.
##
## with the following features available:
## 'height', 'height_0.95_HPD', 'height_median', 'height_range', 'length',
## 'length_0.95_HPD', 'length_median', 'length_range', 'posterior', 'rate',
## 'rate_0.95_HPD', 'rate_median', 'rate_range'.
```
Since % is not a valid character in names, all feature names that contain x% are converted to 0.x. For example, length_95%_HPD is changed to length_0.95_HPD.
The get.fields method returns all available features that can be used for annotation. For the beast object above, get.fields(beast) gives:

```
##  "height" "height_0.95_HPD" "height_median"
##  "height_range" "length" "length_0.95_HPD"
##  "length_median" "length_range" "posterior"
##  "rate" "rate_0.95_HPD" "rate_median"
##  "rate_range"
```
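The feature names listed above contain 0.95 where the BEAST file has 95%. The renaming can be mimicked in base R with a regular-expression substitution. This is a minimal sketch: percent_to_fraction is a hypothetical helper written for illustration, not the function ggtree uses internally, and it applies the x% to 0.x rule literally.

```r
# Hypothetical helper: rewrite "<digits>%" as "0.<digits>"
percent_to_fraction <- function(feature_names) {
  gsub("([0-9]+)%", "0.\\1", feature_names)
}

percent_to_fraction(c("length_95%_HPD", "height_95%_HPD", "posterior"))
# "length_0.95_HPD" "height_0.95_HPD" "posterior"
```

Names without a percent sign, such as posterior, pass through unchanged.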
Users can use ggtree(beast) to visualize the tree and add layers to annotate it.
```r
ggtree(beast, ndigits=2, branch.length = 'none') +
    geom_text(aes(x=branch, label=length_0.95_HPD), vjust=-.5, color='firebrick')
```