Rapid identification of Mycobacterium tuberculosis var bovis transmission clusters

Project Summary

Bovine tuberculosis is the form of tuberculosis that infects cattle and is caused by the bacterium Mycobacterium bovis. Tracking transmission of this infection is central to efforts to eliminate the disease globally. The main way of doing this is to use the complete DNA sequence of Mycobacterium bovis strains isolated from infected cattle. Analysing these data can be very computationally intensive, making it infeasible in many low-resource settings and countries. This project aims to create a computational pipeline for linking transmission cases together using easy, fast and computationally light programs. We will test several freely available programs on a dataset of over 400 Mycobacterium bovis samples, with the aim of validating an approach to detecting transmission that can be run on a laptop.

Full Project Description

Mycobacterium tuberculosis is a bacterial pathogen that causes disease in over 10 million people worldwide each year. A variant of this pathogen, termed Mycobacterium tuberculosis var bovis (previously named Mycobacterium bovis and hereafter referred to as such), is the primary cause of bovine tuberculosis (BTB) worldwide. In the UK alone, 44,656 cattle were slaughtered due to BTB in 2018. This variant also causes about 2% of human TB cases through zoonotic transmission. A large part of the eradication strategy for this pathogen is rapid and accurate detection of transmission events, so that interventions can be better assessed and implemented as needed.
Tracking transmission of M. bovis, and indeed of M. tuberculosis as a whole, requires whole genome sequencing (WGS) of isolates. This involves bioinformatics steps including genome assembly, single nucleotide polymorphism (SNP) calling, and construction of transmission clusters based on SNP distances. This pipeline can be computationally intensive (16 GB of RAM or more) and lengthy (2-3 days per 50 isolates), making it impossible for researchers in low-resource settings, where BTB is often found, to analyse their own WGS data and enact public or agricultural health interventions based on their findings.
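As an illustration of the final step in that pipeline, the sketch below groups isolates into clusters from pairwise SNP distances. It is a minimal example under stated assumptions, not the lab's actual method: the sequences are assumed to be already aligned to the same length, clusters are formed by single linkage (two isolates join a cluster if they are within the threshold of any member), and the function names, the toy alignment and the threshold value are all hypothetical.

```python
from __future__ import annotations

from itertools import combinations


def snp_distance(a: str, b: str) -> int:
    """Count aligned positions where two sequences differ, ignoring N and gap characters."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    skip = {"N", "-"}
    return sum(
        1
        for x, y in zip(a.upper(), b.upper())
        if x != y and x not in skip and y not in skip
    )


def cluster_by_threshold(seqs: dict[str, str], threshold: int) -> list[set[str]]:
    """Single-linkage clustering: link any pair whose SNP distance is <= threshold."""
    parent = {name: name for name in seqs}

    def find(x: str) -> str:
        # Follow parent pointers to the cluster root, shortening the path as we go.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for a, b in combinations(seqs, 2):
        if snp_distance(seqs[a], seqs[b]) <= threshold:
            parent[find(a)] = find(b)

    clusters: dict[str, set[str]] = {}
    for name in seqs:
        clusters.setdefault(find(name), set()).add(name)
    # Sort clusters by their member names so the output order is reproducible.
    return sorted(clusters.values(), key=lambda c: sorted(c))


# Hypothetical toy alignment; real input would come from SNP calling.
toy_alignment = {
    "A": "ACGTACGTAC",
    "B": "ACGTACGTAT",
    "C": "ACGTACGAAT",
    "D": "TTTTACGTAC",
}
# The threshold of 1 SNP is an arbitrary value chosen for the toy data.
print([sorted(c) for c in cluster_by_threshold(toy_alignment, threshold=1)])
```

The all-against-all comparison grows quadratically with the number of isolates, which is one reason this stage becomes slow on large datasets.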
This project will explore the possibility of detecting transmission clusters of M. bovis without the need for genome assembly and SNP calling. A variety of assembly-free computational tools, such as PopPUNK and Phylonium, will be tested against the gold standard of WGS SNP-defined clusters. These methods use a quick search approach to cluster strains based on shared DNA sequences but have never been tested on M. bovis before. Descriptive statistics on accuracy and ease of use will be gathered for each method, and those that detect clusters similar to the SNP approach will be examined further, with the aim of refining them and improving accuracy.
A dataset of over 400 publicly available M. bovis genomes will be used for this analysis, with expansion to other M. tuberculosis datasets if time allows. Outcomes from this work will feed directly into a larger project within the lab, looking at improving transmission detection for both BTB and human TB globally. The outcome of the project will be a report on the accuracy of each assembly-free tool, with recommendations for further avenues of research.
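One way the agreement between an assembly-free tool and the SNP-defined clusters could be scored is the adjusted Rand index (ARI), which is 1 for identical clusterings and close to 0 for agreement expected by chance. The project description does not say which statistic will be used, so the sketch below is only an assumption about how such a comparison might look. The function name, sample names and cluster labels are hypothetical.

```python
from __future__ import annotations

from collections import Counter
from math import comb


def adjusted_rand_index(truth: dict[str, str], predicted: dict[str, str]) -> float:
    """Adjusted Rand index between two cluster assignments of the same samples."""
    if set(truth) != set(predicted):
        raise ValueError("both assignments must cover the same samples")
    samples = sorted(truth)
    n = len(samples)
    if n < 2:
        return 1.0  # nothing to compare with fewer than two samples

    # Contingency counts: how many samples share each (truth, predicted) label pair.
    pair_counts = Counter((truth[s], predicted[s]) for s in samples)
    truth_sizes = Counter(truth[s] for s in samples)
    pred_sizes = Counter(predicted[s] for s in samples)

    # Number of sample pairs placed together in both, in truth, and in predicted.
    together_both = sum(comb(c, 2) for c in pair_counts.values())
    together_truth = sum(comb(c, 2) for c in truth_sizes.values())
    together_pred = sum(comb(c, 2) for c in pred_sizes.values())

    expected = together_truth * together_pred / comb(n, 2)
    maximum = (together_truth + together_pred) / 2
    if maximum == expected:
        # Only happens when both clusterings are all singletons or one single cluster.
        return 1.0
    return (together_both - expected) / (maximum - expected)


# Hypothetical assignments: SNP-defined clusters versus an assembly-free tool.
snp_clusters = {"s1": "c1", "s2": "c1", "s3": "c1", "s4": "c2", "s5": "c2", "s6": "c2"}
tool_clusters = {"s1": "x", "s2": "x", "s3": "y", "s4": "y", "s5": "z", "s6": "z"}
print(round(adjusted_rand_index(snp_clusters, tool_clusters), 3))
```

Because the index depends only on which samples are grouped together, the two methods do not need to use the same cluster names.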
All necessary training will be provided, so no previous bioinformatics or statistics knowledge is needed.

 

Biotechnology and Biological Sciences Doctoral Training Programme

The University of Nottingham
University Park
Nottingham, NG7 2RD

Tel: +44 (0) 115 8466946
Email: bbdtp@nottingham.ac.uk