Microbial Diversity Research: Differential Analysis

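1. Random forest model

Random forest is an efficient machine learning algorithm based on decision trees. It can be used to classify samples (classification) and for regression analysis (regression). It is a nonlinear classifier, so it can uncover complex nonlinear dependencies among variables. Random forest analysis can identify the key OTUs that distinguish two groups of samples.

Feature Importance Scores table (from the random forest results)

The table records how much each OTU contributes to the between-group difference.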

Note: typically, OTUs with a Mean_decrease_in_accuracy value greater than 0.05 are selected for further analy[……]


ANOSIM, PERMANOVA/Adonis, MRPP (repost)


1. ANOSIM: analysis of similarities between groups

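  • Analysis of similarities (ANOSIM) is a nonparametric test of whether differences between groups (two or more) are significantly greater than differences within groups, and therefore whether the grouping is meaningful. First, pairwise distances between samples are computed with the Bray-Curtis measure, and all distances are ranked from smallest to largest. The R value is then computed as R = (r_B - r_W) / (n(n-1)/4), where r_B and r_W are the mean ranks of the between-group and within-group distances and n is the number of samples. The samples are then permuted and the statistic is recomputed as R*. The probability that R* is greater than R is the P value.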
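The procedure described above can be sketched in plain Python. This is a minimal illustration, not the code behind any particular report: the function names, the toy OTU counts, the number of permutations, and the seed are all assumptions made for the example. It also counts permutations with R* >= R and adds one to numerator and denominator, a common convention that differs slightly from the strict "greater than" wording.

```python
import random
from itertools import combinations
from statistics import mean


def bray_curtis(a, b):
    """Bray-Curtis dissimilarity between two abundance vectors."""
    num = sum(abs(x - y) for x, y in zip(a, b))
    den = sum(x + y for x, y in zip(a, b))
    return num / den if den else 0.0


def average_ranks(values):
    """Rank values from smallest (rank 1) upward; ties share their mean rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        for k in range(start, end + 1):
            ranks[order[k]] = (start + end) / 2 + 1
        start = end + 1
    return ranks


def r_statistic(ranks, pairs, labels):
    """R = (mean between-group rank - mean within-group rank) / (M / 2),
    where M = n(n-1)/2 is the number of sample pairs."""
    between = [r for r, (i, j) in zip(ranks, pairs) if labels[i] != labels[j]]
    within = [r for r, (i, j) in zip(ranks, pairs) if labels[i] == labels[j]]
    return (mean(between) - mean(within)) / (len(ranks) / 2)


def anosim(samples, labels, permutations=999, seed=0):
    """Return (R, p) for a one-way ANOSIM on Bray-Curtis distances."""
    pairs = list(combinations(range(len(samples)), 2))
    ranks = average_ranks([bray_curtis(samples[i], samples[j]) for i, j in pairs])
    observed = r_statistic(ranks, pairs, labels)
    rng = random.Random(seed)
    shuffled = list(labels)
    hits = 0
    for _ in range(permutations):
        rng.shuffle(shuffled)  # permute group labels across samples
        if r_statistic(ranks, pairs, shuffled) >= observed:
            hits += 1
    return observed, (hits + 1) / (permutations + 1)


# Toy OTU tables (hypothetical counts): two clearly separated groups.
samples = [[10, 0], [9, 1], [10, 1], [0, 10], [1, 9], [1, 10]]
labels = ["A", "A", "A", "B", "B", "B"]
r_value, p_value = anosim(samples, labels)
print(r_value, p_value)
```

In this toy data every within-group distance is smaller than every between-group distance, so R reaches its maximum of 1. With only three samples per group there are few distinct label arrangements, so the P value cannot become very small however strong the separation is.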


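Note: the plot has N+1 boxes in total, where N is the number of groups. The "Between" box represents the differences between groups; the other boxes represent the differences within each group. R ranges from -1 to +1, although in practice it usually falls between 0 and 1. An R value close to 1 indicates that the between-group differences[……]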


What exactly are the Adonis and ANOSIM tests? (repost)

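When you do microbial 16S sequencing, the sequencing company's report often includes two tests, Adonis and ANOSIM. You may have heard of the t-test, the Wilcoxon test, ANOVA, and various others, so what are Adonis and ANOSIM?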

Adonis: multivariate analysis of variance

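Adonis is a multivariate analysis of variance, also called nonparametric multivariate analysis of variance. It uses a distance matrix (for example, one based on Bray-Curtis or Euclidean distance) to partition the total variance, estimates how much of the variation among samples each grouping factor explains, and assesses statistical significance with a permutation test.

Adonis results typically look like this: Index Df SumsOfSqs MeanSqs F.Model R2[……]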
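The variance partitioning and permutation test described above can be sketched for a single grouping factor as follows. This is a simplified illustration in plain Python, not the adonis implementation itself: the function names, the one-dimensional toy data, the permutation count, and the seed are assumptions chosen for the example. Euclidean distance is used here so that the result can be checked against a classical one-way ANOVA.

```python
import math
import random
from itertools import combinations


def pseudo_f(dist, labels):
    """Partition the sums of squared distances; return (pseudo-F, R2)."""
    n = len(labels)
    groups = sorted(set(labels))
    # Total sum of squares from all pairwise distances.
    ss_total = sum(dist[i][j] ** 2 for i, j in combinations(range(n), 2)) / n
    # Within-group sum of squares, one term per group.
    ss_within = 0.0
    for g in groups:
        members = [i for i in range(n) if labels[i] == g]
        ss_within += sum(dist[i][j] ** 2 for i, j in combinations(members, 2)) / len(members)
    ss_among = ss_total - ss_within
    f = (ss_among / (len(groups) - 1)) / (ss_within / (n - len(groups)))
    return f, ss_among / ss_total


def permanova(dist, labels, permutations=999, seed=0):
    """Return (pseudo-F, R2, p) with p estimated by permuting group labels."""
    observed_f, r2 = pseudo_f(dist, labels)
    rng = random.Random(seed)
    shuffled = list(labels)
    hits = 0
    for _ in range(permutations):
        rng.shuffle(shuffled)
        if pseudo_f(dist, shuffled)[0] >= observed_f:
            hits += 1
    return observed_f, r2, (hits + 1) / (permutations + 1)


# Hypothetical one-dimensional samples in two groups.
points = [[1.0], [2.0], [3.0], [4.0], [5.0], [6.0]]
labels = ["A", "A", "A", "B", "B", "B"]
dist = [[math.dist(p, q) for q in points] for p in points]
f_value, r2, p_value = permanova(dist, labels)
print(f_value, r2, p_value)
```

For this data the pseudo-F equals the ordinary ANOVA F of 13.5 and R2 is 13.5/17.5, which correspond to the F.Model and R2 columns of the result table. A Bray-Curtis matrix can be passed in the same way, since the function only needs pairwise distances.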


Alpha diversity

Diversity indices in amplicon data analysis: alpha diversity

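For diversity indices and their formulas, see Wikipedia.

Alpha diversity is the analysis of species diversity within a single sample. It covers two factors: richness, the number of different species in the sample, and evenness, the overall distribution of abundance across those species. Species diversity in a sample is usually assessed with indices such as Richness, Chao1, Shannon, Simpson, Dominance, and Equitability.

Richness indices

Richness, Chao1, and Shannon are three commonly used indices for assessing richness[……]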
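The indices named above can be computed from one sample's OTU count vector roughly as follows. This is a sketch under stated assumptions: the function name and the example counts are hypothetical, Shannon uses the natural logarithm (some tools use log base 2), Simpson is reported as 1 minus Dominance, and Chao1 falls back to the bias-corrected form when there are no doubletons.

```python
import math


def alpha_diversity(counts):
    """Common alpha diversity indices for one sample's OTU counts."""
    observed = [c for c in counts if c > 0]
    total = sum(observed)
    props = [c / total for c in observed]

    richness = len(observed)  # number of observed OTUs
    singletons = sum(1 for c in observed if c == 1)
    doubletons = sum(1 for c in observed if c == 2)
    if doubletons > 0:
        chao1 = richness + singletons ** 2 / (2 * doubletons)
    else:
        # Bias-corrected form used when no OTU is seen exactly twice.
        chao1 = richness + singletons * (singletons - 1) / 2

    shannon = -sum(p * math.log(p) for p in props)
    dominance = sum(p * p for p in props)
    equitability = shannon / math.log(richness) if richness > 1 else 0.0

    return {
        "richness": richness,
        "chao1": chao1,
        "shannon": shannon,
        "simpson": 1 - dominance,
        "dominance": dominance,
        "equitability": equitability,
    }


# Hypothetical samples: one perfectly even, one with rare OTUs.
even = alpha_diversity([10, 10, 10, 10])
skewed = alpha_diversity([5, 1, 1, 2, 2, 0])
print(even)
print(skewed)
```

The even sample has equitability 1 because all four OTUs are equally abundant. In the skewed sample, Chao1 exceeds the observed richness because singletons suggest that some OTUs were not sampled.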


Multivariate analyses in R (PERMANOVA)

https://rpubs.com/collnell/manova

Multivariate analyses in R

By C Nell

Types of questions

Do groups differ in composition? Does community structure vary among regions or over time? Do environmental variables explain community patterns? Which species are responsible for differences among g[……]


Correlation tests, correlation matrix, and corresponding visualization methods in R (repost)

https://rstudio-pubs-static.s3.amazonaws.com/240657_5157ff98e8204c358b2118fa69162e18.html


Correlation analysis (repost)


How to Compare Regression Slopes


If you perform linear regression analysis, you might need to compare different regression lines to see if their constants and slope coefficients are different. Imagine there is an established r[……]


Size Matters: Metabolic Rate and Longevity (Regression analysis sample)

Size Matters: Metabolic Rate and Longevity

John Tukey once said, “The best thing about being a statistician is that you get to play in everyone’s backyard.” I enthusiastically agree! I frequently enjoy reading and watching science-related material. This invariably raises questions, involving oth[……]


The Beauty of Data Analysis: How to Perform Regression Analysis

1. Determine whether the predictors are related to Y

To show: at least one of the predictors X1, X2, ..., Xp is related to the response Y.
For any given value of n (the number of observations) and p (the number of predictors X), any statistical software package can be used to compute the p-value associated with the F-statistic using this distribution. Based on this p-value, we can determine whether or not to reject H0. (using[……]
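As a rough illustration of the F-statistic mentioned above, the sketch below fits a least-squares line with a single predictor (p = 1) and computes F = ((TSS - RSS) / p) / (RSS / (n - p - 1)). The function name and the data are hypothetical, and the sketch stops at the statistic itself: turning it into a p-value needs the F distribution with p and n - p - 1 degrees of freedom, which the Python standard library does not provide.

```python
def f_statistic(x, y):
    """F-statistic for a simple linear regression of y on one predictor x."""
    n = len(x)
    p = 1  # number of predictors in this sketch
    mean_x = sum(x) / n
    mean_y = sum(y) / n

    # Ordinary least-squares estimates of slope and intercept.
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x

    # Residual and total sums of squares.
    rss = sum((yi - (intercept + slope * xi)) ** 2 for xi, yi in zip(x, y))
    tss = sum((yi - mean_y) ** 2 for yi in y)
    return ((tss - rss) / p) / (rss / (n - p - 1))


# Hypothetical observations (n = 5).
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]
f_value = f_statistic(x, y)
print(f_value)
```

For this data the fitted line has slope 0.6 and intercept 2.2, TSS is 6, RSS is 2.4, and F is 4.5. If SciPy is available, `scipy.stats.f.sf(f_value, 1, 3)` is one way to obtain the corresponding p-value.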
