R Plotting Basics (2): Dot Histogram

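The chart names in the previous section were somewhat inconsistent. From this section on, the following names are used (the list is not exhaustive):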

English name    Chinese name
bar             条形图
line            线图
area            面积图
pie             饼图
high-low        高低图
pareto          帕累托图
control         控制图
boxplot         箱线图
error bar       误差条图
scatter         散点图
P-P             P-P正态概率图
Q-Q             Q-Q正态概率图
sequence        序列图
ROC Curve       ROC分类效果曲线图
Time Series     时间序列图

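Now, back to the topic. What is a dot histogram? I previously called it a beeswarm plot, and others call it a jitter plot. Whatever the name, on this blog (糗世界) I will call it a dot histogram.

Let us first look at what a dot histogram looks like:

[Figure: dot histogram]

The code is as follows: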

> require(beeswarm)
> data(breast)
> head(breast)
ER ESR1 ERBB2 time_survival event_survival
100.CEL.gz neg […]

R Plotting Basics (1): Layout, Colors, and More

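1. Layout

The region occupied by an R plot is divided into two main parts: the outer margin and the plot region.

The outer margin is set with the oma parameter of par(). For example, oma=c(4,3,2,1) sets the outer margins to 4 lines at the bottom, 3 lines on the left, 2 lines at the top, and 1 line on the right. The order runs clockwise, starting from the x-axis side (bottom). A line here is the height of one line of text in the default font size. Accordingly, when using the line argument of mtext, the value should lie in the half-open interval [0, number of lines). To write in the outer margin with mtext, set outer=TRUE.

The plot region is laid out with the mfrow and mfcol parameters of par(), which split it into multiple panels. The default is mfrow=c(1,1).

For example, mfrow=c(2,3) splits the plot region into 2 rows and 3 columns and fills the panels row by row; mfcol=c(3,2) splits it into 3 rows and 2 columns and fills the panels column by column.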
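The difference in fill order can be sketched outside R. The snippet below is only a plain-Python model of the order in which panels are visited; fill_order is a hypothetical helper written for this illustration and is not part of R or of any plotting library.

```python
def fill_order(nrow, ncol, by_row=True):
    """Return the 1-based (row, col) cell that each successive plot lands in.

    by_row=True models the mfrow behaviour described above (row by row);
    by_row=False models mfcol (column by column).
    """
    if by_row:
        return [(r, c) for r in range(1, nrow + 1) for c in range(1, ncol + 1)]
    return [(r, c) for c in range(1, ncol + 1) for r in range(1, nrow + 1)]


# 2 rows x 3 columns, filled row by row (the mfrow case)
print(fill_order(2, 3, by_row=True))
# 3 rows x 2 columns, filled column by column (the mfcol case)
print(fill_order(3, 2, by_row=False))
```

In the first case the fourth plot lands in row 2, column 1; in the second case the fourth plot lands in row 1, column 2.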

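Each of these panels is in turn divided into two parts: the figure margin and the main plot.

The figure margin has to hold the axes, the axis labels, and the title. Usually only one x-axis and one y-axis are needed, so the bottom and left margins are generally set larger. Only when there are several x-axes or y-axes is it worth enlarging the top or right margin. The figure margin is set with the mar parameter of par(). For example, mar=c(4,3,2,1), like the outer margin setting, gives 4 lines at the bottom, 3 lines on the left, 2 lines at the top, and 1 line on the right, again clockwise starting from the x-axis side. A line means the same as before. The margin can also be set with mai; the only difference is that mai is measured in inches rather than in lines.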

SOUTH <- 1; WEST <- 2; NORTH <- 3; EAST <- 4;

GenericFigure <- function(ID, size1, size2)
{
  plot(0:10, 0:10, type="n", xlab="X", ylab="Y")
  text(5, 5, ID, col="red", cex=size1)
  box("plot", col="red")
  mtext(paste("cex", size2, sep=""), SOUTH, line=3, adj=1.0, cex=size2, col="blue")
  title(paste("title", ID, sep=""))
}

MultipleFigures <- function()
{
  GenericFigure("1", 3, 0.5)
  box("figure", lty="dotted", col="blue")
  GenericFigure("2", 3, 1)
  box("figure", lty="dotted", col="blue")
  GenericFigure("3", […]

A Hand-Written Standard Deviation Function in Python

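def stdDeviation(a):
    # Population standard deviation: divides by n, not n - 1.
    l = len(a)
    m = sum(a) / l              # mean
    d = 0
    for i in a:
        d += (i - m) ** 2       # accumulate squared deviations
    return (d * (1 / l)) ** 0.5

a = [5, 6, 8, 9]
print(stdDeviation(a))
========
1.5811388300841898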
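The function above divides by n, so it returns the population standard deviation. As a cross-check, the sketch below compares an equivalent hand-written version against Python's standard statistics module. The name population_std is only illustrative; statistics.pstdev and statistics.stdev are the standard-library functions for the population and sample (n - 1) forms.

```python
import math
import statistics


def population_std(values):
    """Population standard deviation: sqrt of the mean squared deviation."""
    n = len(values)
    mean = sum(values) / n
    return math.sqrt(sum((x - mean) ** 2 for x in values) / n)


data = [5, 6, 8, 9]
print(population_std(data))      # hand-written, divides by n
print(statistics.pstdev(data))   # standard library, divides by n
print(statistics.stdev(data))    # standard library, divides by n - 1
```

The first two values should agree up to floating-point rounding, while the sample form comes out larger because it divides by n - 1.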

454 pyrosequencing analysis pipeline

mothur > sffinfo(sff=454Reads_archaea.sff, flow=T)
Extracting info from 454Reads_archaea.sff …
10000
20000
30000
40000
50000
60000
70000
80000
90000
92115
It took 68 secs to extract 92115.

Output File Names:
454Reads_archaea.fasta
454Reads_archaea.qual
454Reads_archaea.flow

mothur > trim.flows(flow=454Reads_archaea.flow, oligos=oligos_LXY.txt, pdiffs=2, bdiffs=1, processors=2)
Appending files from process 15674

Output File Names:
454Reads_archaea.trim.flow
454Reads_archaea.scrap.flow
454Reads_archaea.GZ_ARC.flow
454Reads_archaea.GZ1122_ARC.flow
454Reads_archaea.GZ1122cellulose_ARC.flow
454Reads_archaea.GZ_xylan_ARC.flow
454Reads_archaea.GZ_cellulose55_ARC.flow
454Reads_archaea.SHX_xylan_ARC.flow
[…]

solve the problem of “read-only file system”

Originally Posted by prabhatsoni:
When I mount it by giving "mount -t vfat -o rw /dev/sdb1 /media/disk", it mounts. But again when I try to delete a file, it gives the error "read-only file system".

Give the mount command without any options and see whether it says it is mounted ro. If so, check […]

An R package Suite for Microarray Meta-analysis in Quality Control, Differentially Expressed Gene Analysis and Pathway Enrichment Detection

Abstract

Summary: With the rapid advances and prevalence of high-throughput genomic technologies, integrating information of multiple relevant genomic studies has brought new challenges. Microarray meta-analysis has become a frequently used tool in biomedical research. Little effort, […]

Solve problem for /usr/bin/ld: cannot find -lgfortran

gcc -std=gnu99 -shared -o vegan.so cepin.o data2hill.o decorana.o goffactor.o monoMDS.o nestedness.o ordering.o pnpoly.o stepacross.o vegdist.o -lgfortran -lm -lquadmath -L/usr/lib/R/lib -lR
/usr/bin/ld: cannot find -lgfortran
collect2: ld returned 1 exit status
make: *** [vegan.so] Error 1

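After installing libgfortran3, the linker still looks for an unversioned libgfortran.so, so rename libgfortran.so.3 to libgfortran.so (creating a symbolic link with ln -s instead would keep the versioned file in place for programs already linked against it):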

mv /usr/lib/libgfortran.so.3 /usr/lib/libgfortran.so

When compiling the source code of an application or library on Linux, the following error message often appears:

/usr/bin/ld: cannot find -lxxx

These messages vary with the type of source […]

Install mcaGUI problem

To install this package, start R and enter:

source("http://bioconductor.org/biocLite.R")
biocLite("mcaGUI")

Note: R version >= 2.13
……………………..

g++ -shared -L/usr/local/lib64 -o ShortRead.so Biostrings_stubs.o IRanges_stubs.o R_init_ShortRead.o alphabet.o io.o io_bowtie.o io_soap.o pileup.o readBfaToc.o read_maq_map.o sampler.o util.o xsnap.o -lz
installing to /home/shenzy/R/x86_64-pc-linux-gnu-library/R-2.13.1/library/ShortRead/libs
** R
** inst
** preparing package for lazy loading

Attaching package: ‘IRanges’

The following object(s) are masked […]

MetaPhlAn: Metagenomic Phylogenetic Analysis

MetaPhlAn is a computational tool for profiling the composition of microbial communities from metagenomic shotgun sequencing data. MetaPhlAn relies on unique clade-specific marker genes identified from 3,000 reference genomes, allowing:

up to 25,000 reads-per-second (on one CPU) analysis speed (orders of magnitude faster compared to existing methods);
unambiguous taxonomic assignments as the MetaPhlAn markers are […]

DySC: software for greedy clustering of 16S rRNA reads

Summary: Pyrosequencing technologies are frequently used for sequencing the 16S ribosomal RNA marker gene for profiling microbial communities. Clustering of the produced reads is an important but time-consuming task. We present Dynamic Seed-based Clustering (DySC), a new tool based on the greedy clustering approach that uses a dynamic seeding strategy. Evaluations based on the […]