This function calculates summary metrics from a table of BLAST results: the best hit (the genome with the most BLAST read hits), the uniqueness score (the total number of genomes hit), the species percentage hit (the percentage of reads for which the BLAST hit species matched the MetaScope-aligned species), the genus percentage hit (the percentage of reads for which the BLAST hit genus matched the MetaScope-aligned genus), the species contaminant score (the percentage of reads that BLASTed to genomes of other species), and the genus contaminant score (the percentage of reads that BLASTed to genomes of other genera).

Usage

blast_result_metrics(
  blast_results_table_path,
  accessions_path,
  db = NULL,
  NCBI_key = NULL
)

Arguments

blast_results_table_path

Path to the BLAST results CSV file.

accessions_path

Directory where the accession files for BLAST are stored.

db

Currently accepts one of c("ncbi", "silva", "other"). The default is "ncbi", which is appropriate for samples aligned against indices compiled from NCBI whole genome databases. Use "other" when the samples were aligned against an alternate database (such as Greengenes2).

NCBI_key

(character) Optional NCBI Entrez API key. See taxize::use_entrez(). Because this function makes a large number of requests to NCBI, it is less prone to errors when an NCBI key is supplied.
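One possible way to prepare the value for NCBI_key is to read the key from the environment rather than hard-coding it. This is a minimal sketch that assumes the key was stored in the ENTREZ_KEY environment variable (the variable that taxize::use_entrez() instructs users to set); the names entrez_key and ncbi_key are illustrative only.

```r
# Read the Entrez API key from the environment (assumes ENTREZ_KEY was set,
# e.g. in ~/.Renviron). An unset variable yields an empty string.
entrez_key <- Sys.getenv("ENTREZ_KEY", unset = "")

# Fall back to NULL, the documented default for NCBI_key, when no key is set.
ncbi_key <- if (nzchar(entrez_key)) entrez_key else NULL
```

The resulting ncbi_key can then be supplied as the NCBI_key argument.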

Value

A data frame with the columns best_hit, uniqueness_score, species_percentage_hit, genus_percentage_hit, species_contaminant_score, and genus_contaminant_score.
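To make the metric definitions concrete, the following base R sketch computes them on a small invented table. It is an illustration only and not the package's implementation: the toy data, the column names (blast_genome, blast_species, and so on), and the simplification of one BLAST hit per read are all assumptions.

```r
# Toy input: one BLAST hit per read (invented data, hypothetical column names).
hits <- data.frame(
  read_id = paste0("read", 1:5),
  blast_genome = c("Escherichia coli K-12", "Escherichia coli K-12",
                   "Escherichia coli O157", "Shigella flexneri 2a",
                   "Escherichia coli K-12"),
  blast_species = c("Escherichia coli", "Escherichia coli",
                    "Escherichia coli", "Shigella flexneri",
                    "Escherichia coli"),
  blast_genus = c("Escherichia", "Escherichia", "Escherichia",
                  "Shigella", "Escherichia"),
  metascope_species = "Escherichia coli",  # taxon assigned by MetaScope
  metascope_genus = "Escherichia",
  stringsAsFactors = FALSE
)

# Count read hits per genome.
genome_counts <- table(hits$blast_genome)

# Flag reads whose BLAST taxon agrees with the MetaScope taxon.
species_match <- hits$blast_species == hits$metascope_species
genus_match <- hits$blast_genus == hits$metascope_genus

metrics <- data.frame(
  # Genome with the most read hits.
  best_hit = names(genome_counts)[which.max(genome_counts)],
  # Number of distinct genomes hit.
  uniqueness_score = length(genome_counts),
  # Percentage of reads agreeing with MetaScope at each rank.
  species_percentage_hit = 100 * mean(species_match),
  genus_percentage_hit = 100 * mean(genus_match),
  # Percentage of reads hitting a different species or genus.
  species_contaminant_score = 100 * mean(!species_match),
  genus_contaminant_score = 100 * mean(!genus_match),
  stringsAsFactors = FALSE
)

print(metrics)
```

Under the one-hit-per-read simplification used here, each contaminant score is simply the complement of the corresponding percentage hit; the actual function may treat multiple hits per read differently.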