Statistical Analysis of Phylogenetic Trees and Networks
The study of evolution and of the history of life on Earth has attracted scientists’ attention since the dawn of modern science. Phylogenetics, the reconstruction of this history, has expanded beyond its traditional role in evolutionary studies and, with the advent of next-generation sequencing, is increasingly integrated into modern biological areas such as preventive medicine, epidemiology, and human migration. Maximum likelihood (ML) is considered the most accurate phylogenetic method. Our lab has developed expertise in analytical solutions to ML phylogenetic reconstruction. The novelty of our approach is to represent the ML problem as a constrained optimization problem and to solve it with tools from algebraic geometry. The work [2] appeared in RECOMB, the world’s most important bioinformatics conference, and has since sparked a wave of interest in this approach, in particular among mathematicians at UC Berkeley, where I later took a position as a postdoctoral research fellow.
The work relies on the Hadamard conjugation [8, 7], a very elegant mathematical tool in phylogenetics. As part of this project, together with Mike Hendy, the inventor of the “Hadamard”, we provided a combinatorial proof of the tool for the family of Kimura models of DNA evolution [9]. Joe Felsenstein’s book [5], the “bible” of phylogenetics, says of the Hadamard conjugation: “This is one of the nicest applications of mathematics to phylogenetics so far”! We are honored to be a part, however tiny, of this line of research.
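For readers unfamiliar with the Hadamard conjugation, the following is a minimal sketch for the two-state symmetric model, which is simpler than the Kimura models discussed above. It assumes the standard form s = H^{-1} exp(H q), where q holds the edge lengths indexed by splits and s holds the expected site-pattern probabilities. The function names and the three-taxon example are illustrative choices, not taken from the cited papers.

```python
import math

def hadamard_transform(v):
    """Multiply v by the Sylvester Hadamard matrix H, where
    H[a][b] = (-1)^popcount(a & b) and len(v) is a power of two."""
    n = len(v)
    return [
        sum(-x if bin(a & b).count("1") % 2 else x for b, x in enumerate(v))
        for a in range(n)
    ]

def expected_pattern_probs(q):
    """Hadamard conjugation s = H^{-1} exp(H q): edge-length spectrum q
    to expected site-pattern probabilities s."""
    r = [math.exp(x) for x in hadamard_transform(q)]
    return [x / len(q) for x in hadamard_transform(r)]  # H^{-1} = H / len(q)

def edge_spectrum(s):
    """Inverse conjugation q = H^{-1} ln(H s)."""
    rho = [math.log(x) for x in hadamard_transform(s)]
    return [x / len(s) for x in hadamard_transform(rho)]

# Star tree on taxa 1, 2, 3 with pendant edge lengths t1, t2, t3
# (expected substitutions per site). The bits of each index encode a
# subset of {1, 2}; taxon 3 serves as the reference taxon.
t1, t2, t3 = 0.1, 0.2, 0.3
q = [-(t1 + t2 + t3), t1, t2, t3]  # entries for {}, {1}, {2}, {1,2}
s = expected_pattern_probs(q)      # s[0]: all three taxa share a state
q_back = edge_spectrum(s)          # recovers q up to rounding error
```

Here s[1] is the probability that taxon 1 differs from taxa 2 and 3, and applying the inverse conjugation to s returns the original edge lengths, which is what makes the transform useful for analysis.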
Later on, we extended this framework to analyze phylogenetic networks.
Analytical ML is still our method of choice and a prevailing approach in almost every project we undertake in genomics.
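To illustrate what an analytical (closed-form) ML solution means in the simplest possible setting, the sketch below treats two aligned sequences under the Jukes-Cantor model. This is a textbook special case chosen for illustration, not the tree-level method of the papers listed below. The function names and the example values of n and k are assumptions.

```python
import math

def jc_log_likelihood(d, n, k):
    """Log-likelihood of distance d (expected substitutions per site) for two
    aligned sequences of length n that differ at k sites, under Jukes-Cantor."""
    e = math.exp(-4.0 * d / 3.0)
    p_same = 0.25 + 0.75 * e        # P(site is identical in both sequences)
    p_diff_each = 0.25 - 0.25 * e   # P(site shows one specific different base)
    return (n * math.log(0.25)
            + (n - k) * math.log(p_same)
            + k * math.log(p_diff_each))

def jc_ml_distance(n, k):
    """Closed-form maximizer of the likelihood, valid when 0 < k/n < 3/4.
    Obtained by setting the derivative of the log-likelihood to zero."""
    return -0.75 * math.log(1.0 - 4.0 * k / (3.0 * n))

# Hypothetical data: 100 aligned sites, 10 of which differ.
d_hat = jc_ml_distance(n=100, k=10)
```

The maximizer is written down directly rather than found by iterative search. The analytical approach described above pursues the same kind of solution for small trees, where the likelihood equations are much harder to solve.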
- B. Chor and S. Snir, Molecular Clock Fork Phylogenies: Closed Form Analytic Maximum Likelihood Solutions, Systematic Biology, Vol. 53, Issue 6, December 2004, pp. 963–967.
- B. Chor, M. Hendy and S. Snir, Maximum Likelihood Jukes-Cantor Triplets: Analytic Solutions, Molecular Biology and Evolution (MBE), Vol. 23, Issue 3, March 2006, pp. 626–632.
- B. Chor, A. Khetan and S. Snir, Maximum Likelihood on Molecular Clock Comb: Analytic Solutions, Journal of Computational Biology (JCB), Vol. 13, Issue 3, April 2006, pp. 819–837.
- M. Hendy and S. Snir, Hadamard Conjugation for the Kimura 3ST Model: Combinatorial Proof using Path-Sets, IEEE/ACM Transactions on Computational Biology and Bioinformatics (TCBB), Vol. 5, Issue 3, 2008, pp. 461–471.
- G. Jin, L. Nakhleh, S. Snir and T. Tuller, Maximum Likelihood of Phylogenetic Networks, Bioinformatics, Vol. 22, Number 21, November 2006, pp. 2604–2611. All authors contributed equally.
- S. Snir and T. Tuller, The NET-HMM Approach: Phylogenetic Network Inference by Combining Maximum Likelihood and Hidden Markov Models, Journal of Bioinformatics and Computational Biology (JBCB), Vol. 7, Issue 4, August 2009, pp. 625–644. Authors contributed equally.
- J. Felsenstein. Inferring Phylogenies. Sinauer Associates, 2003.
- M. D. Hendy. The relationship between simple evolutionary tree models and observable sequence data. Syst. Zool., 38:310–321, 1989.
- M. D. Hendy and D. Penny. A framework for the quantitative study of evolutionary trees. Syst. Zool., 38:297–309, 1989.