xenoGI: reconstructing the history of genomic island insertions in clades of microbes
Microbes have acquired many important traits through the horizontal transfer of genomic islands. Understanding the evolution of these traits often requires us to understand the adaptive path that has produced them.
The goal of xenoGI is to reconstruct the history of genomic island insertions in a clade of closely related microbes. It takes as input a set of sequenced genomes and a tree specifying their phylogenetic relationships. It then identifies genomic islands and maps their origin on the phylogenetic tree, determining which branch they inserted on.
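To make "determining which branch they inserted on" concrete, here is a minimal sketch in Python. It is not xenoGI's code or its actual method. It assumes a toy tree written as nested tuples, hypothetical strain names, and a simple parsimony rule: an island inserted on the branch leading to the most recent common ancestor of the strains that carry it. The function names `leaves` and `mrca_clade` are illustrative only.

```python
# Illustrative only: toy tree and strain names are assumptions, not xenoGI input formats.

def leaves(tree):
    """Return the set of strain names at the tips of a nested-tuple tree."""
    if isinstance(tree, str):
        return {tree}
    return set().union(*(leaves(child) for child in tree))


def mrca_clade(tree, strains):
    """Return the smallest subtree that contains every strain in `strains`."""
    children = () if isinstance(tree, str) else tree
    for child in children:
        if strains <= leaves(child):
            # All carriers sit inside this child, so the MRCA is deeper in the tree.
            return mrca_clade(child, strains)
    return tree


# Hypothetical four-strain tree: ((E_coli, E_albertii), E_fergusonii) plus an outgroup.
toy_tree = ((("E_coli", "E_albertii"), "E_fergusonii"), "S_enterica")

# Strains in which a hypothetical island is observed.
carriers = {"E_coli", "E_fergusonii"}
insertion_clade = mrca_clade(toy_tree, carriers)
print(insertion_clade)
```

Under these assumptions the island is placed on the branch leading to the clade that contains E_coli, E_albertii and E_fergusonii, and the outgroup is excluded.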
The key challenge in this problem is to accurately identify the origin of genes. Every gene in the input genomes must have one of two origins. Either it is a core gene present in the most recent common ancestor of the strains, or it arrived via a horizontal transfer event. The algorithm seeks to determine which is which by creating gene families in a way that takes account of both the species tree and synteny information. It then identifies families whose members are adjacent and whose most recent common ancestor is shared, and merges them into islands reflecting a common origin.
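The merging step described above can be sketched roughly as follows. This is a deliberately simplified, single-genome illustration, not xenoGI's implementation: it assumes each gene family already has an assigned most recent common ancestor node, and it groups consecutive families only when those nodes match. The function name `merge_adjacent_families`, the family identifiers, and the node labels are all hypothetical.

```python
from itertools import groupby


def merge_adjacent_families(ordered_families):
    """Group consecutive families that share an MRCA node.

    ordered_families: list of (family_id, mrca_node) pairs in chromosomal
    order for one strain (an assumed, simplified input).
    Returns a list of (mrca_node, [family_ids]) groups.
    """
    groups = []
    # groupby only merges runs of neighbouring items, which models adjacency.
    for mrca_node, run in groupby(ordered_families, key=lambda fam: fam[1]):
        groups.append((mrca_node, [family_id for family_id, _ in run]))
    return groups


# Hypothetical gene order: two families with an MRCA at an internal node
# sit between core families whose MRCA is the root.
gene_order = [
    ("fam1", "root"),
    ("fam2", "Escherichia"),
    ("fam3", "Escherichia"),
    ("fam4", "root"),
]
groups = merge_adjacent_families(gene_order)
print(groups)
```

In this toy picture, a group whose node is not the root would be a candidate horizontally acquired island, while groups at the root correspond to core genes.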
The figure below shows an example of this sort of analysis related to acid tolerance in the Escherichia clade. gadB is a glutamate decarboxylase enzyme known to be involved in acid tolerance in E. coli. In an analysis of eleven enteric species, xenoGI finds that gadB is part of an island of eight genes that inserted on the branch leading to Escherichia, before the divergence of E. fergusonii.
Unlike previous genomic island finding methods, xenoGI operates in the context of a clade. It is also distinctive in that it is gene based, doesn't depend on the aligner MAUVE, and integrates species tree and synteny information from an early stage of its analysis.
In the past, reconstructing the history of GI insertions into a clade typically required heavy human involvement. xenoGI provides an automated solution to this problem. Beyond this, a thorough comparative analysis is the gold standard for genomic island finding (even if one's goal is to find islands in only a single genome). Our hope is that xenoGI will make this sort of analysis accessible to more users.
If you use xenoGI in a publication, please cite the reference below:
Bush EC, Clark AE, DeRanek CA, Eng A, Forman J, Heath K, Lee AB, Stoebel DM, Wang Z, Wilber M, Wu H. xenoGI: reconstructing the history of genomic island insertions in clades of closely related bacteria. BMC Bioinformatics. 19(32). 2018. [article] [bioRxiv preprint]