This is the main MetaScope target library mapping function, using Rbowtie2 and multiple libraries. It aligns reads to each library separately, filters unmapped reads from each file, and then merges and sorts the .bam files from each library into one output file. If desired, the output can be passed to `filter_host_bowtie()` to remove reads that also map to filter library genomes.

Usage

align_target_bowtie(
  read1,
  read2 = NULL,
  lib_dir,
  libs,
  align_dir,
  align_file,
  bowtie2_options = NULL,
  threads = 1,
  overwrite = FALSE,
  quiet = TRUE
)

Arguments

read1

Path to the .fastq file to align.

read2

Optional: Location of the mate pair .fastq file to align.

lib_dir

Path to the directory that contains the Bowtie2 indexes.

libs

The basename of the Bowtie2 indexes to align against (without trailing .bt2 or .bt2l extensions).

align_dir

Path to the directory where the output alignment file should be created.

align_file

The basename of the output alignment file (without trailing .bam extension).

bowtie2_options

Optional: Additional Bowtie2 parameters to pass through align_target_bowtie(). To see all available parameters, use Rbowtie2::bowtie2_usage(). See Details for the default parameters. NOTE: Pass all parameters as a single string. If any optional parameters are given, the user is responsible for entering every parameter that Bowtie2 should use. The only parameter that should NOT be specified here is the number of threads; set it with the threads argument instead.

threads

The number of threads that can be utilized by the function. Default is 1 thread.

overwrite

Whether existing files should be overwritten. Default is FALSE.

quiet

Turns off most messages. Default is TRUE.
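
The libs argument expects index basenames rather than file names. As a minimal sketch, the helper below (index_basenames() is a hypothetical name, not part of MetaScope) lists the basenames found in a directory by stripping the standard Bowtie2 index suffixes. The demonstration directory contains only empty stand-in files, not a real index.

```r
# Hypothetical helper (not part of MetaScope): list the Bowtie2 index
# basenames in a directory by stripping suffixes such as ".1.bt2",
# ".rev.2.bt2" or their large-index ".bt2l" variants.
index_basenames <- function(lib_dir) {
  files <- list.files(lib_dir, pattern = "\\.bt2l?$")
  unique(sub("(\\.rev)?\\.[0-9]+\\.bt2l?$", "", files))
}

# Stand-in directory holding empty files named like a Bowtie2 index
demo_dir <- tempfile("idx_")
dir.create(demo_dir)
suffixes <- c(".1.bt2", ".2.bt2", ".3.bt2", ".4.bt2",
              ".rev.1.bt2", ".rev.2.bt2")
invisible(file.create(file.path(demo_dir, paste0("target", suffixes))))

# Basenames that could be supplied as `libs`
libs_found <- index_basenames(demo_dir)
print(libs_found)

unlink(demo_dir, recursive = TRUE)
```

The resulting character vector has the form that libs expects, with lib_dir pointing at the directory that was scanned.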

Value

Returns the path to where the output alignment file is stored.

Details

The default parameters are the same as those PathoScope 2.0 uses: "--very-sensitive-local -k 100 --score-min L,20,1.0"
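
Because bowtie2_options must be a single string containing every Bowtie2 setting, it can help to assemble that string from its parts. The sketch below rebuilds the default string quoted above using base R only. The variable name opts is illustrative, and the thread count is deliberately left out because it belongs in the threads argument.

```r
# Assemble every Bowtie2 setting into ONE string. The thread flag
# ("-p" / "--threads") is intentionally absent: the thread count is
# supplied through the separate `threads` argument.
opts <- paste("--very-sensitive-local", "-k 100", "--score-min L,20,1.0")
print(opts)

# Simple guard against accidentally including a thread flag
has_thread_flag <- grepl("(^| )(-p|--threads)( |$)", opts)
stopifnot(!has_thread_flag)
```

The resulting string would then be supplied as bowtie2_options = opts in the call to align_target_bowtie().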

If you experience any issues with reading the input files, make sure that the file(s) are not located in a read-only folder. This can be circumvented by copying files to a new location before running the function.
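
One way to apply the workaround above is to copy the reads into a fresh temporary directory before aligning. This is a minimal sketch using base R only. copy_to_writable() is a hypothetical helper, not a MetaScope function, and the FASTQ file is a one-read stand-in created for the demonstration.

```r
# Hypothetical helper (not part of MetaScope): copy a reads file into a
# newly created temporary directory and return the path of the copy.
copy_to_writable <- function(path) {
  stopifnot(file.exists(path))
  dest_dir <- tempfile("reads_")
  dir.create(dest_dir)
  dest <- file.path(dest_dir, basename(path))
  if (!file.copy(path, dest)) stop("Could not copy ", path)
  dest
}

# Stand-in FASTQ file with a single made-up read (not real data)
src <- tempfile(fileext = ".fastq")
writeLines(c("@read1", "ACGT", "+", "IIII"), src)

read1_copy <- copy_to_writable(src)

# file.access() returns 0 when the requested access (mode 2 = write) is allowed
writable <- file.access(dirname(read1_copy), mode = 2) == 0
print(writable)
```

The returned path can then be passed as read1 (and a second copy as read2 for paired reads).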

Examples

#### Align example reads to an example reference library using Rbowtie2

## Create temporary directory to store file
target_ref_temp <- tempfile()
dir.create(target_ref_temp)

## Download reference genome
MetaScope::download_refseq("Morbillivirus hominis",
                           reference = FALSE,
                           representative = FALSE,
                           compress = TRUE,
                           out_dir = target_ref_temp,
                           caching = TRUE
)
#> No ENTREZ API key provided
#>  Get one via taxize::use_entrez()
#> See https://ncbiinsights.ncbi.nlm.nih.gov/2017/11/02/new-api-keys-for-the-e-utilities/
#> No ENTREZ API key provided
#>  Get one via taxize::use_entrez()
#> See https://ncbiinsights.ncbi.nlm.nih.gov/2017/11/02/new-api-keys-for-the-e-utilities/
#> [1] "/scratch/261666.1.ood/Rtmp4IgmVN/file3f4b0e4af837c8/Morbillivirus_hominis.fasta.gz"

## Create temporary directory to store the indices
index_temp <- tempfile()
dir.create(index_temp)

## Create bowtie2 index
MetaScope::mk_bowtie_index(
  ref_dir = target_ref_temp,
  lib_dir = index_temp,
  lib_name = "target",
  overwrite = TRUE
)
#> arguments 'show.output.on.console', 'minimized' and 'invisible' are for Windows only
#> Index building complete
#> [1] "/scratch/261666.1.ood/Rtmp4IgmVN/file3f4b0e1ded6139"

## Create temporary directory for final file
output_temp <- tempfile()
dir.create(output_temp)

## Get path to example reads
readPath <- system.file("extdata", "virus_example.fastq",
                        package = "MetaScope")

## Align to target genomes
target_map <-
  MetaScope::align_target_bowtie(
    read1 = readPath,
    lib_dir = index_temp,
    libs = "target",
    align_dir = output_temp,
    align_file = "bowtie_target",
    overwrite = TRUE,
    bowtie2_options = "--very-sensitive-local"
  )
#> [1] "Samtools found on system. Using samtools to create bam file"
#> arguments 'show.output.on.console', 'minimized' and 'invisible' are for Windows only

## Remove extra folders
unlink(target_ref_temp, recursive = TRUE)
unlink(index_temp, recursive = TRUE)
unlink(output_temp, recursive = TRUE)