Differential abundance test — amp_diffabund • ampvis2extras

Tests whether there is a significant difference in abundances between samples or groups thereof, based on selected conditions. Returns a list containing the test results as well as two different plots: an MA-plot and an abundance plot of the taxa with the most significant p-values (below the threshold).

amp_diffabund(data, group)

Arguments

data	(required) Data list as loaded with `amp_load()`.
group	(required) A categorical variable in the metadata that defines the sample groups to test.
test	The name of the test to use, either `"Wald"` or `"LRT"`. See `DESeq`. (default: `"Wald"`)
fitType	The type of fitting of dispersions to the mean intensity, either `"parametric"`, `"local"`, or `"mean"`. (default: `"parametric"`)
num_threads	The number of threads to use for parallelization by the `BiocParallel` backend. Parallelization is not supported on Windows machines. (default: `1`)
signif_thrh	Significance threshold. (default: `0.01`)
fold	Log2fold filter for displaying significant results. (default: `0`)
verbose	(Logical) Whether to print status messages during the test calculations. (default: `TRUE`)
signif_plot_type	Either `"boxplot"` or `"point"`. (default: `"point"`)
plot_nshow	The number of most significant results to display in the significance plot (`plot_signif`). (default: `10`)
plot_point_size	The size of the plotted points. (default: `2`)
tax_aggregate	The taxonomic level to aggregate the OTUs to. (default: `"OTU"`)
tax_add	Additional taxonomic level(s) to display, e.g. `"Phylum"`. (default: `NULL`)
tax_class	Converts a specific phylum to class level instead, e.g. `"p__Proteobacteria"`.
tax_empty	How to show OTUs without taxonomic information. One of the following: `"remove"`: Remove OTUs without taxonomic information. `"best"`: (default) Use the best classification possible. `"OTU"`: Display the OTU name.
adjust_zero	Keep 0 abundances in ggplot2 median calculations by adding a small constant to these.
...	Additional arguments passed on to `DESeq`.
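
As an illustration of how the arguments above combine, here is a minimal sketch of a call that overrides several defaults. It uses only arguments documented on this page and the `AalborgWWTPs` example data; the specific values (`0.05`, `1`, `5`) and the helper list `diffabund_args` are arbitrary choices made for this sketch, not recommendations. The `requireNamespace()` guard is also an addition for illustration, so the snippet does nothing on a machine without the package.

```r
# Non-default settings, collected in a plain list (hypothetical helper name).
diffabund_args <- list(
  group            = "Plant",    # metadata variable defining the groups
  tax_aggregate    = "Genus",    # aggregate OTUs to genus level
  tax_add          = "Phylum",   # also display the phylum
  signif_thrh      = 0.05,       # arbitrary threshold chosen for this sketch
  fold             = 1,          # arbitrary log2 fold filter for this sketch
  signif_plot_type = "boxplot",
  plot_nshow       = 5,
  verbose          = FALSE
)

# Only run the test if the package is actually installed.
if (requireNamespace("ampvis2extras", quietly = TRUE)) {
  library(ampvis2extras)
  data("AalborgWWTPs")
  # Same subset as in the Examples section, to keep the run small
  d <- amp_subset_taxa(AalborgWWTPs, tax_vector = c("p__Chloroflexi", "p__Actinobacteria"))
  results <- do.call(amp_diffabund, c(list(data = d), diffabund_args))
  results$plot_signif
}
```

Keeping the settings in a list makes it easy to rerun the test with one value changed, but passing the arguments directly to `amp_diffabund()` is equivalent.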

Value

A list with multiple elements:

"DESeq2_results": The raw output result from DESeq.
"DESeq2_results_signif": The raw output result from DESeq, but subset to only taxa with p-value below the threshold set by signif_thrh.
"signif_plotdata": The data used to generate the ggplots, but subset to only taxa with p-value below the threshold set by signif_thrh.
"Clean_results": A simpler version of DESeq2_results_signif containing only the adjusted p-values, log2FoldChange, and average abundance of each taxon per group.
"plot_MA": MA-plot
"plot_MA_plotly": Interactive plotly plot of MA-plot with custom hover information.
"plot_signif": Abundance plot of the n taxa with the most significant p-values (below the threshold), where n is set by plot_nshow.
"plot_signif_plotly": Interactive plotly plot of plot_signif with custom hover information.
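
Since `Clean_results` is an ordinary table, it can be filtered further after the test has run. The sketch below is base R only and does not call the package: `clean` is a hand-copied stand-in for the first four rows of the example output shown under Examples, and the cutoff `min_abs_log2fc` is a hypothetical value chosen for illustration. In that example output, a positive `Log2FC` coincides with a higher average abundance in `Aalborg_West`; this may differ for other data or group orderings.

```r
# Stand-in for results$Clean_results (first four rows of the example output)
clean <- data.frame(
  Taxonomy = c("o__AKYG1722_OTU_154", "f__Microbacteriaceae_OTU_640",
               "Actinomyces", "f__mle1-48_OTU_467"),
  padj         = c(8.6e-83, 2.1e-34, 2.6e-33, 1.4e-32),
  Log2FC       = c(11, -3, -1.8, 2.8),
  Aalborg_East = c(0.001, 0.178, 0.734, 0.027),
  Aalborg_West = c(2.73, 0.015, 0.146, 0.136),
  stringsAsFactors = FALSE
)

# Hypothetical cutoff: keep taxa with at least a four-fold change (|log2FC| >= 2)
min_abs_log2fc <- 2
strong <- clean[abs(clean$Log2FC) >= min_abs_log2fc, ]

# Most significant first
strong <- strong[order(strong$padj), ]

# Label the group with the higher average abundance (valid for this example data)
strong$higher_in <- ifelse(strong$Log2FC > 0, "Aalborg_West", "Aalborg_East")

strong[, c("Taxonomy", "padj", "Log2FC", "higher_in")]
```

With the real output, replace `clean` with `results$Clean_results`; the `fold` argument applies a comparable filter to what is displayed.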

Author

Kasper Skytte Andersen kasperskytteandersen@gmail.com

Mads Albertsen MadsAlbertsen85@gmail.com

Examples

library(ampvis2extras)
# Load example data
data("AalborgWWTPs")

# Subset to a few taxa, save the results in an object
d <- amp_subset_taxa(AalborgWWTPs, tax_vector = c("p__Chloroflexi", "p__Actinobacteria"))
#> 7584 OTUs have been filtered 
#> Before: 9430 OTUs
#> After: 1846 OTUs
results <- amp_diffabund(d, group = "Plant", tax_aggregate = "Genus")
#> Running DESeq2 differential abundance test. This may take a while depending on the size of the data. 
#> ---------------------------------
#> estimating size factors
#> estimating dispersions
#> gene-wise dispersion estimates
#> mean-dispersion relationship
#> final dispersion estimates
#> fitting model and testing
#> -- replacing outliers and refitting for 6 genes
#> -- DESeq argument 'minReplicatesForReplace' = 7 
#> -- original counts are preserved in counts(dds)
#> estimating dispersions
#> fitting model and testing
#> ---------------------------------
#> Done. Generating plots.
#> Warning: Transformation introduced infinite values in continuous x-axis

# Show plots
results$plot_signif
results$plot_MA
#> Warning: Transformation introduced infinite values in continuous x-axis

# Or show raw results
results$Clean_results
#> # A tibble: 810 × 5
#> # Groups:   Taxonomy, padj [810]
#>    Taxonomy                        padj Log2FC Aalborg_East Aalborg_West
#>    <chr>                          <dbl>  <dbl>        <dbl>        <dbl>
#>  1 o__AKYG1722_OTU_154          8.6e-83   11          0.001        2.73 
#>  2 f__Microbacteriaceae_OTU_640 2.1e-34   -3          0.178        0.015
#>  3 Actinomyces                  2.6e-33   -1.8        0.734        0.146
#>  4 f__mle1-48_OTU_467           1.4e-32    2.8        0.027        0.136
#>  5 o__Micrococcales_OTU_10052   3.6e-31    2.8        0.049        0.246
#>  6 Illumatobacter               1.1e-27    2.4        0.029        0.114
#>  7 c__Actinobacteria_OTU_297    2  e-27    4.4        0.011        0.186
#>  8 o__419_OTU_401               6.7e-27   -5.7        0.133        0.001
#>  9 c__SJA-15_OTU_926            4.4e-26    4.3        0.013        0.16 
#> 10 f__Anaerolineaceae_OTU_1308  4  e-23    3.9        0.008        0.08 
#> # … with 800 more rows