This vignette explains how to conduct automated morphological character partitioning as a pre-processing step for clock (time-calibrated) Bayesian phylogenetic analysis of morphological data, as introduced by Simões and Pierce (2021).
install.packages("EvoPhylo")
### OR
devtools::install_github("tiago-simoes/EvoPhylo")
Load the EvoPhylo package:
library(EvoPhylo)
Generate a Gower distance matrix with get_gower_dist() by supplying the file path of a .nex file containing a character data matrix:
# Load a character data matrix and produce a Gower distance matrix
dist_matrix <- get_gower_dist("DataMatrix.nex", numeric = FALSE)
Below, we use the example data matrix characters that accompanies EvoPhylo.
data(characters)
dist_matrix <- get_gower_dist(characters, numeric = FALSE)
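To make the distance step more concrete, here is a minimal base-R sketch of a Gower-style dissimilarity for purely categorical data. It is not EvoPhylo's implementation: the toy matrix, its layout (characters in rows, taxa in columns), and the helper gower_categorical() are assumptions made for illustration. For categorical states, the dissimilarity between two characters reduces to the fraction of mismatching states among the taxa scored for both.

```r
# Toy matrix (hypothetical data): rows are characters, columns are taxa,
# entries are discrete character states; NA marks a missing score.
toy <- rbind(
  char1 = c("0", "0", "1", "1", NA),
  char2 = c("0", "0", "1", "1", "1"),
  char3 = c("1", "0", "0", "1", "0")
)

# Hypothetical helper: Gower-style dissimilarity for categorical data,
# i.e. the fraction of mismatches among taxa scored for both characters.
gower_categorical <- function(a, b) {
  scored <- !is.na(a) & !is.na(b)
  if (!any(scored)) return(NA_real_)
  mean(a[scored] != b[scored])
}

# Fill the pairwise matrix, then convert it to a 'dist' object.
n <- nrow(toy)
d <- matrix(0, n, n, dimnames = list(rownames(toy), rownames(toy)))
for (i in seq_len(n)) {
  for (j in seq_len(n)) {
    d[i, j] <- gower_categorical(toy[i, ], toy[j, ])
  }
}
toy_dist <- as.dist(d)
print(toy_dist)
```

Characters with identical scorings across shared taxa (char1 and char2 here) get a dissimilarity of 0, so they would tend to fall into the same partition downstream.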
The optimal number of partitions (clusters) is first determined using partitioning around medoids (PAM) together with the silhouette width index (Si), via get_sil_widths(). The silhouette width estimates the quality of each PAM clustering proposal relative to the other candidate numbers of clusters.
## Estimate and plot number of clusters against silhouette width
sw <- get_sil_widths(dist_matrix, max.k = 10)
plot(sw, color = "blue", size = 1)
Decide on the number of clusters based on the plot; here, \(k = 3\) partitions appears optimal.
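The selection logic can be sketched outside EvoPhylo with the cluster package, which provides pam(). This is an illustration under stated assumptions, not a description of how get_sil_widths() is implemented: the dissimilarities come from hypothetical 2D points arranged in three tight groups, standing in for a Gower distance matrix, and the names toy_dist, sil_table, and best_k are invented for the example.

```r
library(cluster)  # pam() ships with R's recommended packages

# Stand-in dissimilarities (hypothetical): 12 points in three tight groups.
offsets <- rbind(c(0, 0), c(1, 0), c(0, 1), c(1, 1))
centers <- rbind(c(0, 0), c(10, 0), c(0, 10))
pts <- do.call(rbind, lapply(seq_len(nrow(centers)), function(g) {
  sweep(offsets, 2, centers[g, ], "+")
}))
toy_dist <- dist(pts)

# Average silhouette width of the PAM solution for each candidate k.
ks <- 2:6
avg_width <- sapply(ks, function(k) {
  pam(toy_dist, k = k, diss = TRUE)$silinfo$avg.width
})
sil_table <- data.frame(k = ks, sil_width = avg_width)
print(sil_table)

# The k with the widest average silhouette is the preferred cluster count.
best_k <- sil_table$k[which.max(sil_table$sil_width)]
```

With real morphological data the peak is usually less clear-cut than in this toy case, which is why the vignette recommends inspecting the plot rather than relying only on the maximum.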
3.1. Analyze clusters with PAM under the chosen \(k\) value (from Si) with make_clusters().
3.2. Produce a simple cluster graph.
3.3. Export clusters/partitions to a Nexus file with cluster_to_nexus().
## Generate and visualize clusters with PAM under chosen k value.
clusters <- make_clusters(dist_matrix, k = 3)
plot(clusters)
## Write clusters to Nexus file
cluster_to_nexus(clusters, file = "Clusters_Nexus.txt")
4.1. Analyze clusters with PAM under the chosen \(k\) value (from Si) with make_clusters().
4.2. Produce a graphic clustering (tSNEs), coloring data points according to PAM clusters, to independently verify the PAM clustering. This is set with the tsne argument within make_clusters().
4.3. Export clusters/partitions to a Nexus file with cluster_to_nexus(). The output can be copied and pasted into the MrBayes command block.
# User may also generate clusters with PAM and produce a graphic clustering (tSNEs)
clusters <- make_clusters(dist_matrix, k = 3, tsne = TRUE, tsne_dim = 3)
plot(clusters, nrow = 2, max.overlaps = 5)
# Write clusters/partitions in Nexus file format
cluster_to_nexus(clusters, file = "Clusters_Nexus.txt")
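For orientation, the sketch below shows how a vector of cluster assignments can be turned into MrBayes partition commands (charset, partition, set partition). It is a hand-written illustration, not the code behind cluster_to_nexus(), and the exact text that function writes may differ. The assignment vector and the charset names (morpho1, morpho2, ...) are hypothetical.

```r
# Hypothetical cluster assignments for eight characters (values are cluster ids).
assignment <- c(1, 2, 1, 3, 2, 1, 3, 2)

# One charset line per cluster, listing the character indices it contains.
ids <- as.integer(sort(unique(assignment)))
charsets <- sapply(ids, function(id) {
  sprintf("charset morpho%d = %s;", id,
          paste(which(assignment == id), collapse = " "))
})

# A partition statement naming every charset, followed by its activation.
partition <- sprintf("partition all = %d: %s;", length(ids),
                     paste0("morpho", ids, collapse = ", "))
block <- c(charsets, partition, "set partition = all;")
writeLines(block)
```

Lines of this form belong inside the mrbayes block of the Nexus file, after the data matrix, so that each morphological partition can be given its own model or clock settings.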