In biology, DNA sequencing technology has become a “disruptive innovation”, opening up many new possibilities for science, with wide-ranging impacts on society. Disruptive innovations are those that effectively displace current technologies; in microbiology, genome sequencing is now used in epidemiology, for instance allowing scientists in public health laboratories to trace infections.
Identifying the source of a microbial infection needs to be done rapidly and accurately, and sequencing technology certainly helps with this. But while generating the data is no longer a bottleneck, analysing such large datasets does create problems. The computational power required to analyse hundreds, or even thousands, of samples commonly exceeds what is available to many scientists.
A new study led by Dr Arnoud van Vliet of the Gut Health and Food Safety Research Programme at the IFR has tackled this, and shows that a previously described software program, called Feature Frequency Profiling (FFP), can be utilised for rapid analysis of such large genome datasets without the need for expensive computer equipment.
The human stomach bacterium Helicobacter pylori was chosen to test whether the software gave outputs comparable to those of currently used software, as H. pylori is one of the most challenging organisms for these analyses due to the high levels of variability in its chromosome. On a small dataset, the FFP software gave the same outputs as the other programs currently used. Following this success, it was used to compare 377 H. pylori genomes for ancestry, geography and the presence of known factors involved in disease and antimicrobial resistance. Although such analyses had been done on a small scale before, the analysis in this paper was limited only by the availability of genome sequences, and was performed on a standard MS-Windows desktop PC. This opens up possibilities for the analysis of much larger datasets of microbial genome sequences, such as those available for the foodborne bacterial pathogens Campylobacter, Listeria, Clostridium botulinum and Salmonella, which are all studied at IFR.
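To give a feel for why an alignment-free comparison can run on a desktop PC, here is a minimal sketch of the general idea behind feature frequency profiling. It is not the FFP software itself: it assumes that "features" are short overlapping words (k-mers) and that two profiles are compared with the Jensen-Shannon divergence, and the function names, the word length and the toy sequences are all hypothetical. Refinements a real tool would apply, such as handling reverse complements or filtering features, are omitted.

```python
from collections import Counter
from itertools import combinations
from math import log2


def feature_profile(sequence: str, k: int) -> dict[str, float]:
    """Relative frequencies of all overlapping k-mers in a sequence."""
    sequence = sequence.upper()
    counts = Counter(sequence[i:i + k] for i in range(len(sequence) - k + 1))
    # Ignore k-mers that contain ambiguous bases such as N.
    counts = {kmer: n for kmer, n in counts.items() if set(kmer) <= set("ACGT")}
    total = sum(counts.values())
    if total == 0:
        raise ValueError("sequence contains no usable k-mers")
    return {kmer: n / total for kmer, n in counts.items()}


def jensen_shannon_divergence(p: dict[str, float], q: dict[str, float]) -> float:
    """Jensen-Shannon divergence (log base 2): 0 for identical profiles, 1 for disjoint ones."""
    # Midpoint distribution over the union of features seen in either profile.
    mid = {f: 0.5 * (p.get(f, 0.0) + q.get(f, 0.0)) for f in set(p) | set(q)}

    def kl_to_mid(profile: dict[str, float]) -> float:
        return sum(v * log2(v / mid[f]) for f, v in profile.items() if v > 0)

    return 0.5 * kl_to_mid(p) + 0.5 * kl_to_mid(q)


def pairwise_distances(genomes: dict[str, str], k: int) -> dict[tuple[str, str], float]:
    """Divergence for every pair of named genomes; no alignment is needed."""
    profiles = {name: feature_profile(seq, k) for name, seq in genomes.items()}
    return {
        (a, b): jensen_shannon_divergence(profiles[a], profiles[b])
        for a, b in combinations(sorted(profiles), 2)
    }


if __name__ == "__main__":
    # Toy sequences for illustration only; real genomes are millions of bases long.
    toy = {"strain_a": "ACGTACGTGGA", "strain_b": "ACGTACGAGGA", "strain_c": "TTTTGGGCCCA"}
    for pair, distance in pairwise_distances(toy, k=3).items():
        print(pair, round(distance, 3))
```

Each genome is reduced once to a table of word frequencies, and comparing two genomes is then a single pass over those tables. The resulting distance matrix could be passed to a standard tree-building method to group related strains.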
Dr van Vliet commented: “Many scientists working with genome sequences struggle with the computer hardware requirements for analysing large datasets. In our study we have shown that such analyses can be done on a standard desktop computer, and hence opens up the field for many scientists currently shirking away from this kind of work. We chose Helicobacter pylori as it is known to be really challenging for genome analyses, and were really pleased to see that the analysis methods stood up to the challenge. This suggests that we can do these types of analyses with many other bacteria of interest, and look forward to the field further developing.”
This work has been published in the Journal of Clinical Microbiology, one of the leading journals in its field, and has attracted more than 1,000 views per month since its early-access publication in September 2015. The paper is Gold open access, with the paper and data fully available without charge.
[Box]Reference: van Vliet AHM, Kusters JG (2015) Use of Alignment-Free Phylogenetics for Rapid Genome Sequence-Based Typing of Helicobacter pylori Virulence Markers and Antibiotic Susceptibility. Journal of Clinical Microbiology 53, 2877-88. doi: 10.1128/JCM.01357-15.[/Box]
[Box]Funding: This study was funded by the Biotechnology and Biological Sciences Research Council (BBSRC).[/Box]