batch download genome based on accession

Three easy ways to download multiple sequences from NCBI

There are several ways to download multiple sequences from the NCBI databases in a single request.

  1) Using the Batch Entrez website:
http://www.ncbi.nlm.nih.gov/sites/batchentrez

  2) Using Perl: (cop[……]
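As an illustration of a scripted route, NCBI's E-utilities `efetch` endpoint accepts a comma-separated list of accessions in one request. The sketch below only builds the request URLs in batches and does not contact NCBI. The helper name `build_efetch_urls`, the batch size, and the example accessions are assumptions made for illustration, not part of the original post.

```python
from urllib.parse import urlencode

# Base URL of the NCBI E-utilities efetch endpoint.
EFETCH = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/efetch.fcgi"


def build_efetch_urls(accessions, db="nuccore", rettype="fasta", batch_size=200):
    """Return one efetch URL per batch of accessions (hypothetical helper)."""
    urls = []
    for start in range(0, len(accessions), batch_size):
        batch = accessions[start:start + batch_size]
        query = urlencode({
            "db": db,
            "id": ",".join(batch),   # several accessions in a single request
            "rettype": rettype,
            "retmode": "text",
        })
        urls.append(f"{EFETCH}?{query}")
    return urls


# Illustrative accession list, used only to show the call.
example_urls = build_efetch_urls(
    ["NC_000913.3", "NC_002695.2", "NC_011750.1"], batch_size=2
)
```

Each URL could then be fetched with any HTTP client (for example `urllib.request.urlopen`) and the responses appended to one FASTA file. Keeping batches small and pausing between requests is a cautious default.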


[…]

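Bioinformatics tools series: Using the BBTools/BBMap Suite (repost)

Link: https://www.jianshu.com/p/175c3282a61c BBMap/BBTools is a splice-aware global aligner for DNA and RNA sequencing reads. It can handle reads from Illumina, 454, Sanger, Ion Torrent, PacBio, and Nanopore. BBMap is fast and extremely accurate, particularly for highly mutated genomes or reads with long indels, even whole-gene deletions more than 100 kbp long. It has no upper limit on genome size or number of contigs, and it has been used successfully to map to an 85-gigabase soil metagenome with more than 200 million contigs. In addition, its indexing phase is very fast compared with other aligners. BBMap[……]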


[…]

how-to-extract-convert-gff3-cds-sequences-to-multifasta

https://bioinformatics.stackexchange.com/questions/2341/how-to-extract-convert-gff3-cds-sequences-to-multifasta

Using Python and this GFF parser that mimics Biopython’s SeqIO parsers:

from BCBio import GFF

# Read the gff
for seq in GFF.parse('my_file.gff'):
    # only focus on t[......]
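Because the excerpt above is cut off, here is a separate minimal sketch of the same task in plain Python, without the BCBio parser. It is not the linked answer's code. It assumes the genome is already loaded as a dict of sequence ID to sequence, that each CDS line carries a single `Parent` (or `ID`) attribute, and the function names are made up for illustration.

```python
from collections import defaultdict

COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq):
    """Return the reverse complement of a DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


def cds_to_fasta(gff_lines, genome):
    """Join CDS segments per parent feature and return multi-FASTA text.

    gff_lines: iterable of GFF3 lines; genome: dict of seqid -> sequence.
    """
    segments = defaultdict(list)  # parent -> [(start, end, seqid, strand)]
    for line in gff_lines:
        if line.startswith("#") or not line.strip():
            continue
        cols = line.rstrip("\n").split("\t")
        if len(cols) != 9 or cols[2] != "CDS":
            continue
        attrs = dict(kv.split("=", 1) for kv in cols[8].split(";") if "=" in kv)
        parent = attrs.get("Parent") or attrs.get("ID")
        if parent is None:
            continue
        segments[parent].append((int(cols[3]), int(cols[4]), cols[0], cols[6]))

    records = []
    for parent, segs in segments.items():
        segs.sort()  # ascending genomic order
        seqid, strand = segs[0][2], segs[0][3]
        # GFF3 coordinates are 1-based and inclusive.
        seq = "".join(genome[seqid][start - 1:end] for start, end, _, _ in segs)
        if strand == "-":
            # Concatenate in genomic order, then flip the whole CDS.
            seq = reverse_complement(seq)
        records.append(f">{parent}\n{seq}")
    return "\n".join(records) + "\n"
```

Grouping by `Parent` keeps multi-exon coding sequences together as one FASTA record. Phase handling and features with several parents are deliberately left out of this sketch.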


[…]

python and PBS script

#!/bin/bash
#PBS -l walltime=48:00:00,nodes=8:ppn=4
#PBS -N bbmap_batch
#PBS -l walltime=48:00:00
#megahit -1 /disk/rdisk09/zhiyshen/combined_HKG_1.fastq.gz -2 /disk/rdisk09/zhiyshen/combined_HKG_2.fastq.gz -m 0.9 -o /disk/rdisk09/zhiyshen/HKG_coassembly_out --min-contig-len 2000 -t 28
source a[……]


[…]

R heatmap for ANI

data<-read.table("fastani_matrix.txt", header=TRUE )

(base) zyshen@wyq-P310:~/work/deltaBS/fastaANI$ more fastani_matrix.txt
        B1147      E4385      E4742      E4930      EC5350     ROAR019
B1147   100        94.681488  94.696358  94.648102  99.785301  94.803284
E4385   94.681488  100        96.50248   97.408295  94.496964  95.691849
E[……]
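The header row of this file has six names while each data row has seven fields. That is why `read.table(..., header=TRUE)` takes the first column as row names. For readers who want to check the same layout outside R, the sketch below parses it in plain Python and derives a `100 - ANI` dissimilarity that could be passed to a clustering or heatmap routine. The function names are illustrative assumptions, only the two complete rows from the excerpt are used, and no plotting is shown.

```python
def read_ani_matrix(text):
    """Parse a whitespace-delimited matrix whose header has one fewer field than each row."""
    rows = [line.split() for line in text.splitlines() if line.strip()]
    columns = rows[0]
    matrix = {}
    for fields in rows[1:]:
        name, values = fields[0], fields[1:]
        if len(values) != len(columns):
            raise ValueError(
                f"row {name!r} has {len(values)} values, expected {len(columns)}"
            )
        matrix[name] = dict(zip(columns, map(float, values)))
    return columns, matrix


def to_distance(matrix):
    """Convert percent identity to a 100 - ANI dissimilarity."""
    return {
        row_name: {col: 100.0 - value for col, value in row.items()}
        for row_name, row in matrix.items()
    }


# The two complete rows visible in the excerpt above.
sample = """\
B1147 E4385 E4742 E4930 EC5350 ROAR019
B1147 100 94.681488 94.696358 94.648102 99.785301 94.803284
E4385 94.681488 100 96.50248 97.408295 94.496964 95.691849
"""
columns, ani = read_ani_matrix(sample)
distance = to_distance(ani)
```

Raising an error on a row with the wrong number of values catches truncated or misaligned files before they reach the plotting step.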


[…]