Glossary of Common Terms in Solexa and HiSeq Sequencing

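Among second-generation sequencing technologies, Solexa and its upgraded successor HiSeq are currently the most widely used. To help PLoB readers better understand Solexa-related concepts, we are sharing an article found online, "Explanation of Common Terms in Solexa Sequencing Technology"; a link to the source is given at the end. For more information, you are welcome to join PLoB's 2,000-member bioinformatics QQ group (group number: 235461986), where questions about sequencing and bioinformatics can be discussed. The definitions follow directly below. They can be read alongside the schematic above to get a picture of the basic structure of Solexa and HiSeq.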

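SBS: Sequencing by synthesis. Each SBS cycle extends the strand by one base and takes about 70 minutes.

Run: A single on-instrument sequencing reaction, which can yield anywhere from 4G to 75G of sequencing throughput.

Lane: A single lane. Each lane physically separates the samples being sequenced; one run can load up to 8 lanes at a time.

Channel: Synonym for Lane.

Tile: Each lane contains 2 columns of tiles, 120 tiles in total. Each tile carries a very large number of cluster binding sites.

Cluster: Solexa sequencing uses bridge PCR to generate DNA clusters; only a whole DNA cluster produces a fluorescent spot bright enough for the CCD to resolve.

Index: A tag. In Solexa multiplexed sequencing, indexes are used to distinguish samples: after the regular sequencing is complete, an additional 7 cycles of sequencing are performed on the index portion. By identifying the index, 12 different samples can be distinguished within one lane.

Barcode: Synonym for Index.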
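The way an index (barcode) separates samples that share a lane can be sketched in Python. This is only an illustration under stated assumptions: the function name `demultiplex`, the 7-nt index sequences, the sample names, and the reads are all hypothetical, and only exact index matches are accepted, whereas real pipelines often tolerate mismatches.

```python
from collections import defaultdict


def demultiplex(reads, index_to_sample):
    """Group (sequence, index_read) pairs by sample using exact index matches."""
    by_sample = defaultdict(list)
    for seq, index_read in reads:
        # Reads whose index is not in the table go to a catch-all bin.
        sample = index_to_sample.get(index_read, "undetermined")
        by_sample[sample].append(seq)
    return dict(by_sample)


# Hypothetical 7-nt indexes and reads, for illustration only.
index_to_sample = {"ATCACGA": "sample_1", "CGATGTA": "sample_2"}
reads = [
    ("ACGTACGTAC", "ATCACGA"),
    ("TTGACCATGA", "CGATGTA"),
    ("GGGTTTAAAC", "NNNNNNN"),
    ("CCATGGTACA", "ATCACGA"),
]
groups = demultiplex(reads, index_to_sample)
```

Here `groups` maps each sample name to the reads that carried its index; the read with the unreadable index ends up under "undetermined".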

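Fasta: A sequence storage format. In a file stored in FASTA format, the first line of each record begins with ">", followed by the sequence ID (a unique identifier) and a description of the sequence. The sequence itself starts on the second line: a sequence shorter than 61 nt fits on a single line, while a longer sequence is stored 61 nt per line, with the remainder (fewer than 61 nt) on the last line. The next sequence starts on a new line, again beginning with ">" and its sequence ID, and so on.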
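The layout described above can be sketched in Python. This is a minimal sketch, not a reference implementation: the function names `write_fasta` and `read_fasta` and the example records are hypothetical, and the line width is a parameter set to 61 only to match the description here (other tools wrap at other widths).

```python
import io


def write_fasta(records, handle, width=61):
    """Write (header, sequence) pairs, wrapping sequence lines at `width`."""
    for header, seq in records:
        handle.write(f">{header}\n")
        for start in range(0, len(seq), width):
            handle.write(seq[start:start + width] + "\n")


def read_fasta(handle):
    """Yield (header, sequence) pairs, joining wrapped sequence lines."""
    header, chunks = None, []
    for line in handle:
        line = line.rstrip("\n")
        if line.startswith(">"):
            # A new ">" line closes the previous record, if there was one.
            if header is not None:
                yield header, "".join(chunks)
            header, chunks = line[1:], []
        elif line:
            chunks.append(line)
    if header is not None:
        yield header, "".join(chunks)


# Hypothetical example records, for illustration only.
records = [("seq1 example description", "ACGT" * 40), ("seq2", "GGCCTA")]
buffer = io.StringIO()
write_fasta(records, buffer)
buffer.seek(0)
parsed = list(read_fasta(buffer))
```

Reading the file back joins the wrapped lines, so `parsed` contains the same headers and sequences that were written.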

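Fastq: A file format used in Solexa sequencing that records the base quality of each sequenced read. The first line begins with "@", followed immediately by a description of the sequence; the second line is the sequence itself; the third line begins with "+", followed by the same description as on the first line; and the fourth line gives the sequencing quality value for each base of the sequence in the second line.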
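The four-line record structure can be illustrated with a short Python sketch. The names `read_fastq` and `decode_quality` and the example reads are hypothetical. The ASCII offset used to turn quality characters into numbers is an assumption that is not stated in the text: 33 is used here, while older Solexa/Illumina pipelines used 64.

```python
import io


def read_fastq(handle):
    """Yield (description, sequence, quality_string) from 4-line records."""
    while True:
        title = handle.readline().rstrip("\n")
        if not title:
            return  # end of input
        seq = handle.readline().rstrip("\n")
        plus = handle.readline().rstrip("\n")
        qual = handle.readline().rstrip("\n")
        if not title.startswith("@") or not plus.startswith("+"):
            raise ValueError("malformed FASTQ record")
        if len(seq) != len(qual):
            raise ValueError("sequence and quality lengths differ")
        yield title[1:], seq, qual


def decode_quality(qual, offset=33):
    """Convert quality characters to integer scores.

    ASSUMPTION: the offset depends on the pipeline that wrote the file.
    """
    return [ord(ch) - offset for ch in qual]


# Hypothetical two-record file, for illustration only.
text = "@read1\nACGT\n+read1\nIIII\n@read2\nGG\n+\n!5\n"
parsed = list(read_fastq(io.StringIO(text)))
scores = [decode_quality(q) for _, _, q in parsed]
```

The second record leaves the "+" line empty, which the sketch also accepts; many current files omit the repeated description.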

PF%: The percentage of clusters that pass the sequencing quality filter (PF, passing filter); it is directly related to sequencing throughput.
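As a worked example of the definition, PF% is simply the ratio of passing clusters to all clusters, expressed as a percentage. The function name `percent_pf` and the cluster counts below are hypothetical and are not taken from any real run.

```python
def percent_pf(clusters_passing_filter, total_clusters):
    """Return the percentage of clusters that passed the quality filter."""
    if total_clusters <= 0:
        raise ValueError("total_clusters must be positive")
    return 100.0 * clusters_passing_filter / total_clusters


# Hypothetical counts for one tile, for illustration only.
pf = percent_pf(850_000, 1_000_000)
```

Because only passing clusters contribute reads, a lower PF% on the same number of raw clusters means proportionally less usable output.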

Read: Solexa sequencing proceeds cluster by cluster; each cluster corresponds to one DNA fragment, which yields one read.

Source of the glossary and figure: http://www.igenomics.com.cn:7001/ajgene/jsp/ajweb/News.jsp?cid=C47825F27EC00001B8BF8B8D11C01D10

[…]

Performance Comparison of Illumina MiSeq with GS FLX/Junior and Ion Torrent PGM

Performance comparison table: Illumina MiSeq vs. GS FLX/Junior

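Workflow and turnaround time

Illumina MiSeq: Offers the fastest second-generation sequencing workflow, going from the starting DNA sample to analyzed data within 8 hours, 5 times faster than GS FLX/Junior. The workflow includes:

- Library preparation: 1.5 hours, using the fast, transposon-based Nextera method
- Automated cluster generation through sequencing on a single instrument in under 4.5 hours (1 x 36 bp)
- Primary and secondary sequencing data analysis on the same instrument in under 2 hours
- A 2 x 150 bp run takes about 27 hours*

GS FLX/Junior: The complete workflow takes several days, including:

- Library construction: 1 day
- emPCR: off-instrument and labor-intensive, 2-3 days of manual work
- Sequencing: 10 hours
- Primary and secondary sequencing data analysis: 8 hours (GS FLX), >2 hours (GS Junior)

Throughput

Illumina MiSeq: The highest-throughput personal sequencer:

- Each run yields 1-1.5 […]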

RDP Tutorials (16S Analysis)

Contents

Workflows:

Processing 16S rRNA data using an unsupervised method

Processing 16S rRNA data using a supervised method

Processing functional gene data using a supervised method

Individual tools:

Using the Pipeline Initial Process

Align 16S rRNA sequences using Infernal Aligner

Using the RDP Classifier

Using the RDP MultiClassifier

Performing Complete Linkage Clustering

–Using the […]

RAD-seq Sequencing

Rainbow v2.0

The Rainbow package consists of several programs for RAD-seq-related clustering and de novo assembly.

http://sourceforge.net/projects/bio-rainbow/files/

Motivation: The innovation of Restriction site Associated DNA sequencing (RAD-seq) method takes full advantage of next-generation sequencing technology. By clustering paired-end short reads into groups with their own unique tags, RAD-seq assembly problem is divided […]

RazerS 3: Faster, fully sensitive read mapping

Motivation: During the last years NGS sequencing has become a key technology for many applications in the biomedical sciences. Throughput continues to increase and new protocols provide longer reads than currently available. In almost all applications, read mapping is a first step. Hence, it is crucial to have algorithms and implementations that perform fast, […]

Qualimap: evaluating next generation sequencing alignment data

Motivation: The sequence alignment/map (SAM) and the binary alignment/map (BAM) formats have become the standard method of representation of nucleotide sequence alignments for next-generation sequencing data. SAM/BAM files usually contain information from tens to hundreds of millions of reads. Often, the sequencing technology, protocol, and/or the selected mapping algorithm introduce some unwanted biases in […]

MetaPhlAn: Metagenomic Phylogenetic Analysis

MetaPhlAn is a computational tool for profiling the composition of microbial communities from metagenomic shotgun sequencing data. MetaPhlAn relies on unique clade-specific marker genes identified from 3,000 reference genomes, allowing:

up to 25,000 reads-per-second (on one CPU) analysis speed (orders of magnitude faster compared to existing methods); unambiguous taxonomic assignments as the MetaPhlAn markers are […]

DySC: software for greedy clustering of 16S rRNA reads

Summary: Pyrosequencing technologies are frequently used for sequencing the 16S ribosomal RNA marker gene for profiling microbial communities. Clustering of the produced reads is an important but time-consuming task. We present Dynamic Seed-based Clustering (DySC), a new tool based on the greedy clustering approach that uses a dynamic seeding strategy. Evaluations based on the […]

TaxCollector: Modifying Current 16S rRNA Databases for the Rapid Classification at Six Taxonomic Levels

Our project TaxCollector has been published in MDPI Diversity.

Abstract

The high level of conservation of 16S ribosomal RNA gene (16S rRNA) in all Prokaryotes makes this gene an ideal tool for the rapid identification and classification of these microorganisms. Databases such as the Ribosomal Database Project II (RDP-II) and the Greengenes Project offer access […]

Microbial Community Analysis GUI–Bioconductor

http://www.bioconductor.org/packages/release/bioc/html/mcaGUI.html

mcaGUI Microbial Community Analysis GUI

Bioconductor version: Release (2.10)

Microbial community analysis GUI for R using gWidgets.

Author: Wade K. Copeland, Vandhana Krishnan, Daniel Beck, Matt Settles, James Foster, Kyu-Chul Cho, Mitch Day, Roxana Hickey, Ursel M.E. Schutte, Xia Zhou, Chris Williams, Larry J. Forney, Zaid Abdo, Poor Man’s GUI (PMG) base code by […]