In this data-rich world, visualisations are becoming increasingly important for illustrating, understanding, and gleaning insight from ever-larger datasets. Microbial ecology is revolutionizing how we look at health, the diversity and habitats of microorganisms, and their ecological interactions, but these studies are proving challenging to handle as the number of specimens and communities sampled keeps expanding. New analysis tools are required to relate the distribution of microbes across these datasets, and to integrate rich and standardized contextual information to better understand the biological factors driving these relationships. In a new Technical Note in GigaScience, Rob Knight from the University of Colorado at Boulder, USA, and colleagues present a new tool that aims to do just this: EMPeror.
EMPeror brings a set of customizations and modifications that can be applied to any dataset that is compliant with the open-source software package QIIME ("quantitative insights into microbial ecology"), another creation from the Knight lab. QIIME enables the comparison and analysis of microbial communities and has been utilised in ambitious "megasequencing" projects such as the Earth Microbiome Project (EMP), which aims to construct a microbial biomap of the planet by analysing hundreds of thousands of samples across multiple environment types, and American Gut, a crowdfunded study to compare gut microbiota in volunteers across global populations. These large-scale initiatives have driven the need to be able to share these outputs in an easily understandable and scalable manner.
QIIME has been integrated with a molecular graphics viewer. Framing the visualized outputs around data generated by the Human Microbiome Project (HMP) has enabled the production of fascinating moving pictures of the microbiome, which provide insights into how a newborn's gut microbiome develops over the first few years of life, and how the human gut can be restored after serious Clostridium difficile infection using fecal transplants.
EMPeror, developed for QIIME-compatible datasets, combines lightweight data files with hardware-accelerated graphics to enable state-of-the-art analysis of N-dimensional data. It allows users to colour samples by experimental metadata dynamically and to control colouring separately from visibility. This encourages interactive exploration, understanding and analysis, helps elucidate patterns hidden in the data, and structures the data so that it can be accessed much more easily. The outputs can be as small as 1.3 percent of the original file size, and are lightweight and simple enough to be manipulated even on a mobile device. In the accompanying video, a member of the Knight lab recreates Figure 1-B2 from their recent GigaScience Technical Note on his iPhone in just a few clicks and swipes.
As with other papers in GigaScience, the complete open peer review history is available to view to aid transparency and reusability, and supporting materials are available in the GigaScience GigaDB database, as well as on GitHub. More on microbiome research and EMPeror can be found on GigaBlog.