Tools and methods for constructing the Tree of Life

Environment Department, University of York, 19th and 20th December, 2016

If you wish to register a place on the workshop, please fill in this form.

Evolutionary history, or phylogeny, is the backbone of systematics, and knowledge of phylogeny is essential in a variety of fields within evolutionary biology, such as macroevolution, palaeontology, evolutionary ecology, and conservation. Non-specialists are increasingly in need of large, inclusive phylogenies, for which there are two main methods of construction: the supermatrix and the supertree. Supertrees combine a number of overlapping source trees (the source data) to create the “supertree”. In contrast, supermatrices take primary character data (including genes or morphological characters) and combine them into a single, large matrix. For both methods, the main issues are collecting, curating and processing the source data, and both can be cumbersome and time-consuming when creating large phylogenies. Individual sources will have tips and character data that do not match other sources because of misspellings, synonyms, or the use of higher-level taxa. How do you correct this for hundreds or thousands of data sources? How do you efficiently collate hundreds of data sources? And, less obviously but no less vital, how do you even begin to visualise an output of thousands of tips in any meaningful way?
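As a small illustration of the label-matching problem described above, the following Python sketch harmonises tip labels against a list of accepted names. It is only a sketch under stated assumptions: the accepted-name list, the synonym table and the function names are hypothetical and made up for this example, and it is not the method used by any of the tools covered in the workshop. It tidies the label, looks it up in the synonym table, and then falls back on fuzzy matching (Python's standard difflib module) to catch misspellings. Labels that cannot be matched, such as higher-level taxa, are returned as None so they can be checked by hand.

```python
from difflib import get_close_matches

# Hypothetical reference data for illustration only; in practice these
# would come from a curated taxonomic source.
ACCEPTED = ["Gallus gallus", "Anas platyrhynchos", "Struthio camelus"]
SYNONYMS = {"Gallus domesticus": "Gallus gallus"}


def normalise(label):
    """Tidy a raw tip label: underscores to spaces, collapse whitespace, fix case."""
    words = label.replace("_", " ").split()
    if not words:
        return ""
    return " ".join([words[0].capitalize()] + [w.lower() for w in words[1:]])


def harmonise(label, cutoff=0.85):
    """Return the accepted name for a tip label, or None if it needs manual checking."""
    name = normalise(label)
    # Replace a known synonym with its accepted name.
    name = SYNONYMS.get(name, name)
    if name in ACCEPTED:
        return name
    # Fall back on fuzzy matching to catch likely misspellings.
    close = get_close_matches(name, ACCEPTED, n=1, cutoff=cutoff)
    return close[0] if close else None


if __name__ == "__main__":
    for raw in ["gallus_gallus", "Gallus domesticus", "Anas platyrhyncos", "Aves"]:
        print(raw, "->", harmonise(raw))
```

The cutoff of 0.85 is an arbitrary choice for this example; a stricter value reduces false matches between similarly named taxa at the cost of leaving more labels for manual checking.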

A number of tools now exist to help researchers process, explore, augment and visualise their data. This workshop will cover a number of the techniques that allow researchers to create large phylogenies quickly and accurately. The workshop will cover a wide range of possible tools, before focusing on open-source and free tools that are the best available for creating large phylogenies quickly, easily and robustly. We will cover content mining for phylogenetic data using ContentMine tools, supertree construction with the Supertree Toolkit, supermatrix and phylogeny construction using phyloGenerator, adding fossil data and the use of large phylogenies in macroevolution, and efficient tools for mining and utilising other ancillary data and visualising outputs. Underpinning these tools are the concepts of software engineering and sustainable software. In learning about these tools, participants will also learn best practice in software development, a key skill in modern systematics. All software presented is available under open-source licences.

The workshop will be held over two days at the University of York in the Environment building. The first day will consist of lectures and talks covering all aspects of building the Tree of Life, from obtaining data, to building the tree, to adding additional data onto the tree.

Ross Mounce (Post-Doc, University of Cambridge, UK). ContentMine tools: mining images and texts for phylogenetic and species-related information
Katie Davis (Research Fellow, University of York, UK). Supertree Toolkit: fast and accurate supertree construction
Will Pearse (Lecturer, Utah State University, USA). phyloGenerator: fast and accurate phylogeny construction
Graeme Lloyd (Research Fellow, University of Leeds, UK). Macroevolution and fossil data in the Tree of Life
Jon Hill (Lecturer, University of York, UK). Efficient tools for pre- and post-processing data in the Tree of Life

The second day will be a hands-on practical session in which the teaching team from the first day will help participants with their own research problems, building on the material and tools covered on day 1. We ask participants to install some software beforehand.

The workshop team have a diverse range of expertise: Dr Ross Mounce is a bioinformatician, Dr Katie Davis is an evolutionary biologist, Dr Graeme Lloyd is a numerical palaeontologist, Dr Jon Hill is a numerical environmental modeller, and Dr Will Pearse is an evolutionary biologist and ecologist.

