Filter a MultiAssayExperiment object to keep a top percentage of taxa
Source: R/filter_MAE.R
This function takes an animalcules-formatted MultiAssayExperiment (MAE) object and identifies all taxa at the chosen taxonomic level whose relative abundance is greater than or equal to a percent threshold, relabu_threshold, in at least occur_pct_cutoff% of the total samples. After filtration, the taxa that do not meet this criterion are consolidated into the category "Other" at the specified level and at all downstream levels.
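The filtering rule described above can be illustrated with a minimal base R sketch. This is not the package's implementation: the toy `counts` matrix, the taxon names, and the 50% occurrence cutoff are made up for illustration, and the sketch works on a plain matrix rather than a MultiAssayExperiment.

```r
# Toy counts matrix: rows are taxa, columns are samples (hypothetical data).
counts <- matrix(
  c(90, 83, 95, 88,
     8, 15,  1, 10,
     2,  2,  4,  2),
  nrow = 3, byrow = TRUE,
  dimnames = list(c("taxonA", "taxonB", "taxonC"), paste0("s", 1:4))
)

relabu_threshold <- 3   # percent relative abundance
occur_pct_cutoff <- 50  # percent of samples (chosen for this toy example)

# Convert each sample (column) to relative abundance in percent.
relabu <- sweep(counts, 2, colSums(counts), "/") * 100

# Percentage of samples in which each taxon meets the threshold.
pct_samples <- rowMeans(relabu >= relabu_threshold) * 100

# A taxon is kept if it meets the threshold in enough samples.
keep <- pct_samples >= occur_pct_cutoff

# Collapse the taxa that fail the filter into a single "Other" row.
filtered <- rbind(
  counts[keep, , drop = FALSE],
  Other = colSums(counts[!keep, , drop = FALSE])
)
filtered
```

Here taxonC reaches 3% only in sample s3 (25% of samples), so it is folded into "Other", while the per-sample totals are unchanged.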
Arguments
- dat
A MultiAssayExperiment object specially formatted as an animalcules output.
- relabu_threshold
A double (percentage) between 0 and 100 representing the relative abundance criterion that an OTU must meet to be retained. The larger the threshold, the fewer OTUs will be retained. Default is 3%.
- occur_pct_cutoff
A double (percentage) between 0 and 100 representing the minimum percentage of samples in which an OTU must meet the relabu_threshold to be retained. It is wise to keep the number of samples in mind when setting this parameter. Default is 5%.
- taxon_level
Character string indicating the taxonomic level at which to aggregate the counts data. Must be the name of a column in MultiAssayExperiment::rowData(dat).
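To see why the number of samples matters for occur_pct_cutoff, the percentage can be translated into a minimum sample count. The sketch below is plain arithmetic, not package code. It assumes the criterion is "at least occur_pct_cutoff% of samples", so the smallest whole number of samples that satisfies it is the ceiling of the product. The sample sizes are hypothetical.

```r
# Hypothetical study sizes and the default occurrence cutoff (percent).
n_samples <- c(10, 50, 200)
occur_pct_cutoff <- 5

# Smallest whole number of samples that reaches the cutoff.
min_samples <- ceiling(n_samples * occur_pct_cutoff / 100)

data.frame(n_samples, min_samples)
```

With 10 samples, a taxon that passes the abundance threshold in a single sample already satisfies a 5% cutoff, whereas with 200 samples it must pass in 10.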
Examples
in_dat <- system.file("extdata/MAE_small.RDS", package = "LegATo") |>
  readRDS()
filter_MAE(in_dat, relabu_threshold = 3, occur_pct_cutoff = 5,
           taxon_level = "genus")
#> The overall range of relative abundance counts between samples is (590, 238823)
#> Number of OTUs that exhibit a relative abundance >3% in at least 5% of the total samples: 54/100
#> A MultiAssayExperiment object of 1 listed
#> experiment with a user-defined name and respective class.
#> Containing an ExperimentList class object of length 1:
#> [1] MicrobeGenetics: SummarizedExperiment with 54 rows and 50 columns
#> Functionality:
#> experiments() - obtain the ExperimentList instance
#> colData() - the primary/phenotype DataFrame
#> sampleMap() - the sample coordination DataFrame
#> `$`, `[`, `[[` - extract colData columns, subset, or experiment
#> *Format() - convert into a long or wide DataFrame
#> assays() - convert ExperimentList to a SimpleList of matrices
#> exportClass() - save data to flat files