PseudoPipe pipeline (predicting pseudogenes)

shenzy@shenzy-pc:~/zhanglei/pgenes/ppipe_input/prodigal/input_file$ /usr/local/RepeatModeler/BuildDatabase -name tcs2 -engine ncbi /home/shenzy/zhanglei/pgenes/ppipe_input/prodigal/input_file/tcs2_prodigal.fa
Building database tcs2:
  Adding /home/shenzy/zhanglei/pgenes/ppipe_input/prodigal/input_file/tcs2_prodigal.fa to database
Number of sequences (bp) added to database: 4407 ( 4099764 bp )
shenzy@shenzy-pc:~/zhanglei/pgenes/ppipe_input/prodigal/input_file$ /usr/local/RepeatModeler/BuildDatabase -name tcs3 -engine ncbi /home/shenzy/zhanglei/pgenes/ppipe_input/prodigal/input_file/tcs3_prodigal.fa
Building database tcs3:
  Adding /home/shenzy/zhanglei/pgenes/ppipe_input/prodigal/input_file/tcs3_prodigal.fa to database
Number of sequences (bp) added to database: 4430 ( 4121190 bp )
shenzy@shenzy-pc:~/zhanglei/pgenes/ppipe_input/prodigal/input_file$ /usr/local/RepeatModeler/BuildDatabase -name […]

PRADA is a pipeline to analyze paired-end RNA-Seq data to generate gene expression values (RPKM) and gene-fusion candidates.

PRADA Overview

Description: PRADA is a pipeline to analyze paired-end RNA-Seq data to generate gene expression values (RPKM) and gene-fusion candidates.

Development Information
Language: Python
Current Version: 1.1
Platforms: Un*x (OpenPBS)
License: MIT
Status: Active
Last Updated: April 2013

References
Citations: No Formal Publications

Help and Support
Contact: Roel Verhaak
Discussion: Project Forum […]
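For readers unfamiliar with the RPKM unit mentioned above, here is a minimal sketch of the standard definition (reads per kilobase of transcript per million mapped reads). It is not taken from PRADA's code; the function name `rpkm` and the example numbers are made up for illustration.

```python
def rpkm(read_count, gene_length_bp, total_mapped_reads):
    """Reads Per Kilobase of transcript per Million mapped reads.

    Standard definition only; PRADA's own implementation is not shown
    in the post and may differ in detail (e.g. in how reads are counted).
    """
    if gene_length_bp <= 0 or total_mapped_reads <= 0:
        raise ValueError("gene length and total mapped reads must be positive")
    # 1e9 = 1e3 (per kilobase) * 1e6 (per million reads)
    return read_count * 1e9 / (gene_length_bp * total_mapped_reads)


# Hypothetical gene: 500 reads on a 2,000 bp gene, 10 million mapped reads.
print(rpkm(500, 2000, 10_000_000))  # 25.0
```

Normalizing by both gene length and library size is what makes values comparable across genes and across samples.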

Basic procedure for generating top and crd files with AMBER's tleap in a Linux terminal

ff99SBildn

via Basic procedure for generating top and crd files with AMBER's tleap in a Linux terminal – [AMBER] – Molecular Simulation Forums (分子模拟论坛) – Powered by haotui.com.

Pseudogene prediction

shenzy@shenzy-ubuntu:~/Desktop/zhanglei/pgenes/pseudopipe/bin$ ./pseudopipe.sh /home/shenzy/Desktop/zhanglei/pgenes/ppipe_output/prodigal/tcs4/ /home/shenzy/Desktop/zhanglei/pgenes/ppipe_input/prodigal/tcs4/dna/tcs4_genome.fasta.masked /home/shenzy/Desktop/zhanglei/pgenes/ppipe_input/prodigal/tcs4/dna/tcs4_genome.%s.fasta /home/shenzy/Desktop/zhanglei/pgenes/ppipe_input/prodigal/tcs4/pep/tcs4_prodigal.pep /home/shenzy/Desktop/zhanglei/pgenes/ppipe_input/prodigal/tcs4/mysql/contig.%s_exlocs 0
outDir=/home/shenzy/Desktop/zhanglei/pgenes/ppipe_output/prodigal/tcs4
rmkDir=/home/shenzy/Desktop/zhanglei/pgenes/ppipe_input/prodigal/tcs4/dna/tcs4_genome.fasta.masked
Making directories
Copying sequences
inputDNA=/home/shenzy/Desktop/zhanglei/pgenes/ppipe_input/prodigal/tcs4/dna/tcs4_genome.fasta.masked
Fomatting the DNAs
formatDB start…….
Preparing the blast jobs
Finished blast
Processing blast output
Finished processing blast output
Running Pseudopipe on both strands
Working on M strand
Finished Pseudopipe on strand M
Working on P strand
Finished Pseudopipe on strand […]

KEGG annotation pipeline

KEGG Pathway Pipeline:

blastall -p blastp -d KEGG -i Haiyan.Pep.fasta -m 7 -a 10 -o Haiyan.Pep.fasta.blastp.m7 &
./tBLASTnParser.pl Haiyan.Pep.fasta.blastp.m7 Haiyan.Pep.fasta.blastp.m8
sed '1,1d' Haiyan.Pep.fasta.blastp.m8 > Haiyan.Pep.fasta.blastp.m8.delhead
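A typical next step after producing the header-free `.m8` file is to keep one best KEGG hit per query protein. The sketch below assumes the parser emits the standard 12-column BLAST tabular layout (query, subject, % identity, alignment length, mismatches, gap opens, query start/end, subject start/end, e-value, bit score). That is an assumption about `tBLASTnParser.pl`, whose output format the post does not show. The function name `best_hits` and the sample rows are hypothetical.

```python
import csv


def best_hits(lines):
    """Return {query: (subject, bit_score)} keeping the highest bit score.

    Assumes tab-separated rows in the standard 12-column BLAST tabular
    layout, with the header line already removed (as the sed step does).
    """
    best = {}
    for row in csv.reader(lines, delimiter="\t"):
        if len(row) < 12 or row[0].startswith("#"):
            continue  # skip blank, comment, or malformed lines
        query, subject = row[0], row[1]
        bit_score = float(row[11])
        if query not in best or bit_score > best[query][1]:
            best[query] = (subject, bit_score)
    return best


# Made-up rows for illustration only.
sample = [
    "prot1\tko:K00001\t85.0\t200\t30\t0\t1\t200\t1\t200\t1e-90\t320",
    "prot1\tko:K00002\t60.0\t180\t72\t2\t5\t184\t3\t182\t1e-40\t150",
    "prot2\tko:K00844\t92.5\t400\t30\t0\t1\t400\t1\t400\t0.0\t780",
]
for query, (subject, score) in best_hits(sample).items():
    print(query, subject, score)
```

In practice the lines would come from an open file handle on `Haiyan.Pep.fasta.blastp.m8.delhead` rather than an in-memory list.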

/home/zhouzh/lib/454-2.5/bin/runAssembly -m -cpu 16 -cdna -nobig -o Test sff/GV1NGBM02.sff

./draw_png.py -i ACYPIprot.KO.file -p /home/shenzy/KEGG/ko_org -o map_result5

step 1: /home/soft/blast-2.2.23/bin/blastall -p blastp -d KEGG -i MBL_relation.fa -a 15 -b 30 […]

Reordering contigs in draft genomes by MAUVE

When to use Mauve Contig Mover (MCM)

The Mauve Contig Mover (MCM) can be used to order a draft genome relative to a related reference genome. The functionality of this software module has been described in Rissman et al. 2009, a publication in Bioinformatics. The Mauve Contig Mover can ease a comparative study between […]

PyroHMMvar: a sensitive and accurate method to call short INDELs and SNPs for Ion Torrent and 454 data

Motivation: The identification of short indels and SNPs from Ion Torrent and 454 reads is a challenging problem, essentially because these techniques are prone to sequence erroneously at homopolymers and can, therefore, raise indels in reads. Most of the existing mapping programs do not model homopolymer errors when aligning reads against the reference. The […]

mothur analysis pipeline script

trim.seqs(fasta=CN.fa,qfile=CN.quala, maxambig=0, maxhomop=8, flip=T, bdiffs=1, pdiffs=2, qwindowaverage=35, qwindowsize=50, processors=7)
system(./grupp_CN.pl CN.trim.fasta > CN.groups)
unique.seqs(fasta=CN.trim.fasta)
align.seqs(fasta=CN.trim.unique.fasta, reference=gg.ref.fasta, processors=7,flip=T)
summary.seqs(fasta=CN.trim.unique.align)
screen.seqs(fasta=CN.trim.unique.align, name=CN.trim.names, group=CN.groups, end=4951,start=4655,minlength=65,processors=7)
summary.seqs(fasta=current)
filter.seqs(fasta=CN.trim.unique.good.align, vertical=T, trump=., processors=7)
unique.seqs(fasta=CN.trim.unique.good.filter.fasta, name=CN.trim.good.names)
pre.cluster(fasta=CN.trim.unique.good.filter.unique.fasta, name=CN.trim.unique.good.filter.names, group=CN.good.groups, diffs=1)
chimera.uchime(fasta=CN.trim.unique.good.filter.unique.precluster.fasta, name=CN.trim.unique.good.filter.unique.precluster.names, group=CN.good.groups, processors=7)
remove.seqs(accnos=CN.trim.unique.good.filter.unique.precluster.uchime.accnos, fasta=CN.trim.unique.good.filter.unique.precluster.fasta, name=CN.trim.unique.good.filter.unique.precluster.names, group=CN.good.groups)
system(mv CN.trim.unique.good.filter.unique.precluster.pick.names CN.final.names)
system(mv CN.trim.unique.good.filter.unique.precluster.pick.fasta CN.final.fasta)
system(mv CN.good.pick.groups CN.final.groups)
system(./gr_puhas CN.final.groups > CN.proovikaupa.groups)
remove.groups(fasta=CN.final.fasta, […]
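To make the `maxambig=0, maxhomop=8` settings in the `trim.seqs` call concrete, here is a rough sketch of the two checks as I understand them: a read is dropped if it has more ambiguous bases than `maxambig`, or a homopolymer run longer than `maxhomop`. This is an illustration, not mothur's code. The helper names are hypothetical, and treating every non-ACGT character as ambiguous is a simplifying assumption.

```python
from itertools import groupby


def longest_homopolymer(seq):
    """Length of the longest run of one repeated base (0 for an empty read)."""
    return max((sum(1 for _ in run) for _, run in groupby(seq.upper())), default=0)


def passes_filters(seq, maxambig=0, maxhomop=8):
    """True if the read would survive both checks in this sketch."""
    # Assumption: any character other than A, C, G, T counts as ambiguous.
    ambiguous = sum(1 for base in seq.upper() if base not in "ACGT")
    return ambiguous <= maxambig and longest_homopolymer(seq) <= maxhomop


print(passes_filters("ACGT" + "A" * 8 + "C"))  # run of 8 is allowed
print(passes_filters("ACGT" + "A" * 9))        # run of 9 exceeds maxhomop=8
print(passes_filters("ACGNACGT"))              # one N exceeds maxambig=0
```

Capping homopolymer length matters for 454 data in particular, where long runs of a single base are a common source of sequencing error.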

Install genometools

The 'new' error message refers to a nonexistent Cairo library on your system, which is needed for the AnnotationSketch component of GenomeTools. If you do not need this, do a 'make cleanup' and recompile with the additional make option 'cairo=no', e.g. 'make errorcheck=no cairo=no'. This will disable support for AnnotationSketch and remove the cairo […]

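A step-by-step guide to differential expression analysis of transcriptomes with TopHat and Cufflinks

Today a classmate recommended a Nature Protocols article on differential expression analysis of transcriptomes. Skimming the figures and tables before reading it properly, as I usually do, I found the article rather unusual: it is a hands-on protocol for differential expression analysis using Bowtie, TopHat and Cufflinks. It explains in detail what to analyze at each step, which software to use, and the relevant commands and parameters.

According to the workflow described in the article, a transcriptome analysis, whether of strand-specific RNA-seq data or of non-strand-specific data, consists mainly of the following parts:

1) Read mapping. Two programs are recommended here: Bowtie and TopHat (unlike Bowtie or bwa, TopHat is splice-aware and can therefore detect alternative splicing of transcripts).

2) Transcript assembly (with Cufflinks), comparison of the transcripts against the existing genome annotation (with Cuffcompare), merging of the assemblies (with Cuffmerge), and differential expression analysis of the transcripts (with Cuffdiff).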
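The two parts above map onto a short chain of command-line calls. The sketch below only builds the argument lists and does not run anything. The sample names, file names and output directory names are hypothetical, and the options shown (`-G`, `-o`, `-r`, `-g`, `-s`) are a small subset, so check each tool's manual before using them on real data.

```python
from typing import Dict, List, Tuple


def build_workflow(bowtie_index: str, genome_fasta: str, annotation_gtf: str,
                   samples: Dict[str, Tuple[str, str]]
                   ) -> Tuple[List[List[str]], List[str]]:
    """Return (commands, assemblies) for a paired-end TopHat/Cufflinks run.

    `assemblies` lists the per-sample transcripts.gtf paths that would be
    written, one per line, to the assemblies.txt manifest read by cuffmerge.
    """
    commands, assemblies, bams = [], [], []
    for name, (reads_1, reads_2) in samples.items():
        tophat_dir, cufflinks_dir = f"{name}_thout", f"{name}_clout"
        # 1) read mapping with the splice-aware aligner
        commands.append(["tophat", "-G", annotation_gtf, "-o", tophat_dir,
                         bowtie_index, reads_1, reads_2])
        bam = f"{tophat_dir}/accepted_hits.bam"
        # 2a) per-sample transcript assembly
        commands.append(["cufflinks", "-o", cufflinks_dir, bam])
        assemblies.append(f"{cufflinks_dir}/transcripts.gtf")
        bams.append(bam)
    # 2b) compare the assemblies against the existing annotation
    commands.append(["cuffcompare", "-r", annotation_gtf, "-o", "cuffcmp"] + assemblies)
    # 2c) merge the assemblies (cuffmerge writes merged_asm/ by default)
    commands.append(["cuffmerge", "-g", annotation_gtf, "-s", genome_fasta,
                     "assemblies.txt"])
    # 2d) differential expression, one BAM per condition in this sketch
    commands.append(["cuffdiff", "-o", "diff_out", "merged_asm/merged.gtf"] + bams)
    return commands, assemblies


cmds, manifest = build_workflow(
    "genome", "genome.fa", "genes.gtf",
    {"C1": ("C1_1.fq", "C1_2.fq"), "C2": ("C2_1.fq", "C2_2.fq")},
)
for cmd in cmds:
    print(" ".join(cmd))
```

With replicates, Cuffdiff expects the BAM files of one condition joined by commas and the conditions separated by spaces. The sketch leaves that out for brevity.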

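Two figures from the original article are attached below for a quick overview of the analysis: Figure 1 shows the software that may be used in a transcriptome analysis and what each tool does, and Figure 2 shows the general transcript analysis workflow.

Figure 1

Figure 2

The commands and parameters used by each program during the analysis are not reproduced here; please read the original article.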

Cole Trapnell, Adam Roberts, Loyal Goff, Geo Pertea, Daehwan Kim, David R Kelley, Harold Pimentel, Steven L Salzberg, John L Rinn & Lior Pachter. Differential gene and transcript expression analysis of RNA-seq experiments with TopHat and Cufflinks. Nature Protocols 7, 562–578 […]