OrthoFinder is a program for identifying orthologous protein sequence families. It is written in Python and runs as a single command that takes a directory of FASTA files as input, one file per species. It outputs a file containing the orthologous groups of genes from these species.

You can download the latest release of OrthoFinder by clicking on the link below


OrthoFinder requires the following to be installed in your path (see README file for more detailed instructions):
1. Python together with the SciPy library stack
3. The MCL graph clustering algorithm
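Before a first run it can help to confirm that these requirements are visible from your shell. The following is a minimal sketch, not part of OrthoFinder itself: the function name `missing_requirements` is hypothetical, and it assumes the MCL executable is installed under the name `mcl` and that SciPy is importable as `scipy`. It only checks the items named in its arguments, so add any other requirement listed in the README.

```python
import importlib.util
import shutil


def missing_requirements(executables=("mcl",), modules=("scipy",)):
    """Return the names of the requested tools that cannot be found.

    The default names are assumptions: 'mcl' for the MCL executable
    and 'scipy' for the SciPy package.
    """
    missing = []
    for exe in executables:
        # shutil.which searches the directories listed in PATH.
        if shutil.which(exe) is None:
            missing.append(exe)
    for mod in modules:
        # find_spec returns None when a top-level module cannot be imported.
        if importlib.util.find_spec(mod) is None:
            missing.append(mod)
    return missing


if __name__ == "__main__":
    problems = missing_requirements()
    if problems:
        print("Not found: " + ", ".join(problems))
    else:
        print("All checked requirements were found.")
```

An empty list means everything that was checked was found. It does not guarantee that the installed versions are the ones OrthoFinder expects.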

Example Usage

OrthoFinder is run as a single command. For example, if you had a directory on your computer called “/home/user/genomes” that contained protein FASTA files for 100 different species (one file per species) and you wanted to run OrthoFinder using 17 parallel processes, all you would need to type is:

python orthofinder.py -f /home/user/genomes -t 17
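If you launch OrthoFinder from a script, it can be useful to check the input directory first and assemble the same command programmatically. The sketch below is illustrative only: the helper names `list_fasta_files` and `build_command` are hypothetical, and the list of file extensions is an assumption about how your FASTA files are named, not something OrthoFinder requires. The `-f` and `-t` options are the ones shown in the command above.

```python
import os
import tempfile

# Assumed naming convention for protein FASTA files; adjust to match your data.
FASTA_EXTENSIONS = (".fa", ".faa", ".fasta", ".fas", ".pep")


def list_fasta_files(directory):
    """Return the sorted names of files in `directory` that look like FASTA files."""
    return [
        name
        for name in sorted(os.listdir(directory))
        if name.lower().endswith(FASTA_EXTENSIONS)
        and os.path.isfile(os.path.join(directory, name))
    ]


def build_command(directory, processes, script="orthofinder.py"):
    """Build the argument list for the command shown above."""
    return ["python", script, "-f", directory, "-t", str(processes)]


if __name__ == "__main__":
    # Demonstration on a throwaway directory with two empty placeholder files.
    with tempfile.TemporaryDirectory() as genomes:
        for name in ("species_a.fa", "species_b.fasta"):
            open(os.path.join(genomes, name), "w").close()
        print(len(list_fasta_files(genomes)), "FASTA files found")
        print(" ".join(build_command(genomes, 17)))
```

The list returned by `build_command` can be passed to `subprocess.run(..., check=True)` from the directory that contains orthofinder.py. Since the input is one file per species, the number of files found should match the number of species you expect.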


Click here for links to downloads, install instructions and user manual


If you use OrthoFinder in your work please cite:

Emms, D. and Kelly, S. (2015). OrthoFinder: solving fundamental biases in whole genome comparisons dramatically improves orthogroup inference accuracy. Genome Biology.