Phylogenetics Tutorial for Biochemists

Phylogenetics is appealing to biochemists for understanding how proteins evolve, but it can be intimidating.

Where do I start? How do I create and curate a sequence alignment? What model(s) should I use when building trees, and what are these models doing anyway? How do I evaluate the quality of a tree or of reconstructed ancestral proteins? What’s with all of these obnoxious file formats?

If you’re asking yourself any or all of these questions, this tutorial is for you. We will walk through phylogenetic analysis from start to finish.

We use Python and [Jupyter notebooks](https://github.com/jupyter/notebook) to interface with established phylogenetic packages. If you are new to these tools, you can quickly get up and running [here](https://python-for-scientists.readthedocs.io/en/latest/).

Three basic steps for phylogenetic analysis:

  • gathering and aligning sequences
  • building and evaluating phylogenetic trees
  • reconstructing ancestral proteins

Commonly asked questions and pitfalls:

  • managing data and file formats with [phylopandas](https://github.com/Zsailer/phylopandas)
  • [evaluating and curating alignments]()
  • [choosing evolutionary models]()
  • [including extra information in tree-building (secondary structure, etc.)]()
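To make the file-format question concrete, here is a minimal sketch of what reading a FASTA alignment involves, written in plain Python. It does not use the phylopandas API; the `read_fasta` helper and the two toy sequences (`seq_A`, `seq_B`) are hypothetical and exist only for illustration.

```python
from io import StringIO


def read_fasta(handle):
    """Parse FASTA-formatted text into a dict mapping header -> sequence."""
    records = {}
    name = None
    for line in handle:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        if line.startswith(">"):
            # A header line starts a new record.
            name = line[1:]
            records[name] = []
        elif name is not None:
            # Sequence lines may wrap, so collect the chunks.
            records[name].append(line)
    return {name: "".join(chunks) for name, chunks in records.items()}


# Toy alignment (made-up sequences); "-" marks a gap.
example = StringIO(
    ">seq_A\n"
    "MKV-LAT\n"
    "GG\n"
    ">seq_B\n"
    "MKVQLAT\n"
    "GG\n"
)

alignment = read_fasta(example)

# In an alignment, every sequence should have the same length.
lengths = {len(seq) for seq in alignment.values()}
is_aligned = len(lengths) == 1
```

In practice a dedicated library handles the many edge cases across formats; the sketch only shows that a FASTA file is a series of `>` header lines, each followed by one or more sequence lines.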

If you have any additional questions, [please ask]()!