Metagenomics standard operating procedure v2

Note that this workflow is continually being updated. If you want to use the below commands be sure to keep track of them locally.

Last updated: 9 Oct 2018 (see revisions above for earlier minor versions and Metagenomics SOPs on the SOP for earlier major versions)

Important points to keep in mind:

The […]

Vipie: web pipeline for parallel characterization of viral populations from multiple NGS samples


Next generation sequencing (NGS) technology allows laboratories to investigate virome composition in clinical and environmental samples in a culture-independent way. There is a need for bioinformatic tools capable of parallel processing of virome sequencing data by exactly identical methods: this is especially important in studies of multifactorial diseases, or in parallel comparison of laboratory […]

Utilities / Create consensus sequence from a BAM file

Utilities / Create consensus sequence from a BAM file Description

Given an indexed BAM file and corresponding reference genome (in fasta format), this tool constructs a consensus sequence based on the alignment.




This tool uses SAMtools, bcftools and script to create a consensus sequence for the given alignment file. The actual […]

Good software

multiqc ranger ployly

Data Binning and Plotting

In statistics, data binning is a way to categorize a number of continuous values into a smaller number of buckets (bins). Each bucket defines an numerical interval. For example, if there is a variable about house-based education levels which are measured by continuous values ranged between 0 and 19, data binning will place each value […]

ChIPseeker for ChIP peak annotation (转贴)

ChIPpeakAnno WAS the only R package for ChIP peak annotation. I used it for annotating peak in my recent study.

I found it does not consider the strand information of genes. I reported the bug to the authors, but they are reluctant to change.

So I decided to develop my own package, ChIPseeker, and […]

Mapping reads with bwa and bowtie

In this tutorial, we’re going to take a set of Illumina reads from an inbred Drosophila melanogaster line, and map them back to the reference genome. (After these steps, we could do things like generate a list of SNPs at which this line differs from the reference strain, or generate a genome sequence for this […]


[Originally posted by Kat on her BacPathGenomics blog, April 2013]

This is a shameless plug for an article and accompanying tutorial I’ve just published together with David Edwards, my excellent MSc Bioinformatics student from the University of Melbourne. It’s currently available as a PDF pre-pub from BMC Microbial Informatics and Experimentation, but the web version […]


Well, after almost 6 years, our Klebsiella pneumoniae genomics paper is finally out!

It’s a beast of a thing and there are still a million and one questions to address just from this one data set. For those interested in looking at the data for themselves, the raw reads are available under accessionERP000165, the […]


Yesterday I spoke at a workshop for JAMS TOAST (Sydney’s Joint Academic Microbiology Seminars – bioinformatics workshop)… I was asked to cover tools for comparative genomics, so I put together a list of the tried and tested programs that I find most useful for this kind of analysis. So here is the list.