Sample scripts for submitting high CPU/RAM usage and parallel (MPI) jobs

For a single-CPU job

zyshen@e2lx001:/home/zyshen> cat /disk/rdisk03/test_1cpu.sh
#!/bin/csh
#
#
#PBS -l nodes=1:ppn=1
#
#******** Script for job submission with single CPU
#******** The lines above must be included in any job script
#******** 'nodes' is number of nodes and 'ppn' is number of CPU per node, which are both '=1' in this case.
#
[…]

ELPH : Estimated Locations of Pattern Hits

Overview

ELPH is a general-purpose Gibbs sampler for finding motifs in a set of DNA or protein sequences. The program takes as input a set containing anywhere from a few dozen to thousands of sequences, and searches through them for the most common motif, […]
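To make the idea of Gibbs sampling for motif finding concrete, here is a minimal Python sketch of a textbook site sampler. It is not ELPH's implementation: the function name, the parameters, the pseudocount, the absence of a background model, and the restriction to uppercase A/C/G/T input are all assumptions made for illustration.

```python
import random

ALPHABET = "ACGT"  # assumption: uppercase DNA only, no ambiguity codes


def gibbs_motif_search(seqs, width, iterations=200, pseudocount=1.0, seed=0):
    """Return one sampled motif start position per sequence (hypothetical helper)."""
    rng = random.Random(seed)
    # Start from a random motif position in every sequence.
    pos = [rng.randrange(len(s) - width + 1) for s in seqs]
    for _ in range(iterations):
        for i, s in enumerate(seqs):
            # Build per-column counts from the current motif sites
            # of all sequences except sequence i.
            counts = [{c: pseudocount for c in ALPHABET} for _ in range(width)]
            for j, t in enumerate(seqs):
                if j != i:
                    for k, c in enumerate(t[pos[j]:pos[j] + width]):
                        counts[k][c] += 1
            total = len(seqs) - 1 + pseudocount * len(ALPHABET)
            # Weight every candidate start in sequence i by the probability
            # of its window under the profile, then sample a new start.
            weights = []
            for start in range(len(s) - width + 1):
                w = 1.0
                for k, c in enumerate(s[start:start + width]):
                    w *= counts[k][c] / total
                weights.append(w)
            pos[i] = rng.choices(range(len(weights)), weights=weights)[0]
    return pos


if __name__ == "__main__":
    demo = ["TTGACGTACC", "GGACGTATTA", "CACGTAGGGT"]  # made-up sequences
    print(gibbs_motif_search(demo, width=5))
```

Because each new position is sampled rather than chosen greedily, different seeds can give different answers; a practical tool would run several restarts and keep the best-scoring alignment.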

Commonly used bioinformatics tools

http://hgdownload.cse.ucsc.edu/admin/exe/linux.x86_64/

This directory contains applications for stand-alone use, built specifically for a Linux 64-bit machine. For help on the bigBed and bigWig applications see:
http://genome.ucsc.edu/goldenPath/help/bigBed.html
http://genome.ucsc.edu/goldenPath/help/bigWig.html
View the file 'FOOTER' to see the usage statement for each of the applications.

Name              Last modified      Size  Description
Parent Directory                     -
FOOTER            12-Jun-2012 18:01  65K
bedClip           12-Jun-2012 18:01  […]

LOCAS, a new NGS assembler designed specifically for low-coverage assembly of eukaryotic genomes

Next Generation Sequencing (NGS) is a frequently applied approach to detect sequence variations between highly related genomes. Recent large-scale re-sequencing studies such as the Human 1000 Genomes Project utilize NGS data of low coverage to afford sequencing of hundreds of individuals. Here, SNPs and micro-indels can be detected by applying an alignment-consensus approach. However, computational methods capable of […]

ANNOVAR: Functional annotation of genetic variants from high-throughput sequencing data

Description

ANNOVAR is an efficient software tool that uses up-to-date information to functionally annotate genetic variants detected from diverse genomes. Given a list of variants with chromosome, start position, end position and observed nucleotides, ANNOVAR can identify whether SNPs or indels cause protein coding changes and which amino acids are affected, or identify variants […]

Glimmer3 installation failure on Ubuntu 9.10 and how to fix it

(1) Before installing, first change line 26 under src/Common from #include <string> to #include <cstring>, then run make. The following errors still occur, however:
* Make Target is all
##### Making Directory /usr/local/glimmer3.02/src/Common all #####
make[1]: Entering directory `/usr/local/glimmer3.02/src/Common'
@@@@@@@@@@@@@@@@@@@ delcher.cc @@@@@@@@@@@@@@@@@@@@@
@@@@@@@@@@@@@@@@@@@ fasta.cc @@@@@@@@@@@@@@@@@@@@@
@@@@@@@@@@@@@@@@@@@ gene.cc @@@@@@@@@@@@@@@@@@@@@
gene.cc: In member function ‘void PWM_t::Print(FILE*)’:
gene.cc:263: warning: deprecated conversion from string constant to ‘char*’
gene.cc: In function ‘int Char_Sub(char)’:
gene.cc:448: error: invalid conversion from ‘const char*’ […]

ABACAS: Algorithm Based Automatic Contiguation of Assembled Sequences

http://abacas.sourceforge.net/index.html

ABACAS is intended to rapidly contiguate (align, order, orientate), visualize and design primers to close gaps on shotgun assembled contigs based on a reference sequence.

ABACAS uses MUMmer to find alignment positions and identify syntenies of assembled contigs against the reference. The output is then processed to generate a pseudomolecule taking overlapping […]

ABBA: Assembly Boosted By Amino acid sequences

Overview

Assembly Boosted By Amino acid sequences (ABBA) is a comparative gene assembler that uses amino acid sequences from predicted proteins to help build a better assembly. See the journal paper.

[…]

One-line command for getting a consensus sequence from a BAM file

samtools mpileup -uf ref.fa aln.bam | bcftools view -cg - | vcfutils.pl vcf2fq > cns.fq

Merging separate sequence and quality files to FASTQ

#!/usr/bin/perl -w
use strict;
use Bio::SeqIO;
use Bio::Seq::Quality;
use Getopt::Long;

die "pass a fasta and a fasta-quality file\n" unless @ARGV;

my ($seq_infile,$qual_infile) = (scalar @ARGV == 1) ?($ARGV[0], "$ARGV[0].qual") : @ARGV;

## Create input objects for both a seq (fasta) and qual file
my $in_seq_obj = Bio::SeqIO->new( -file => $seq_infile, -format => 'fasta', );
my […]
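For readers without BioPerl, the same merge can be sketched in plain Python with only the standard library. This is not a port of the truncated Perl script above: the function names are hypothetical, and the sketch assumes both files list the same records in the same order, that the quality file holds whitespace-separated integer Phred scores, and that the output should use the Sanger offset of 33.

```python
import io


def read_fasta_like(handle):
    """Yield (record_id, body_lines) for each '>' record in a FASTA-style file."""
    name, chunks = None, []
    for line in handle:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if name is not None:
                yield name, chunks
            # Keep only the first word of the header as the record ID.
            name, chunks = line[1:].split()[0], []
        else:
            chunks.append(line)
    if name is not None:
        yield name, chunks


def merge_to_fastq(seq_handle, qual_handle, offset=33):
    """Pair sequence and quality records in order and return FASTQ text."""
    out = []
    # Note: zip stops at the shorter input, so a missing trailing
    # record in either file is not reported by this sketch.
    for (sid, seq_lines), (qid, qual_lines) in zip(
        read_fasta_like(seq_handle), read_fasta_like(qual_handle)
    ):
        if sid != qid:
            raise ValueError(f"record order mismatch: {sid} vs {qid}")
        seq = "".join(seq_lines)
        scores = [int(q) for q in " ".join(qual_lines).split()]
        if len(seq) != len(scores):
            raise ValueError(f"{sid}: {len(seq)} bases but {len(scores)} scores")
        qual = "".join(chr(q + offset) for q in scores)
        out.append(f"@{sid}\n{seq}\n+\n{qual}\n")
    return "".join(out)


if __name__ == "__main__":
    fasta = io.StringIO(">read1\nACGT\n")          # made-up example record
    quals = io.StringIO(">read1\n40 40 30 20\n")
    print(merge_to_fastq(fasta, quals), end="")
```

In real use the two `io.StringIO` objects would be replaced by open file handles, for example `open("reads.fasta")` and `open("reads.fasta.qual")`, which mirrors the Perl script's default of appending `.qual` to the sequence file name.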