This function is a wrapper for the Rsubread::buildindex function. It generates one or more Subread indexes from a .fasta file. If the library is too large (by default, larger than 4 GB), it is automatically split into multiple indexes, with _1, _2, etc. appended to the ref_lib basename.
Arguments
- ref_lib
The name/location of the reference library file, in (uncompressed) .fasta format.
- split
The maximum allowed size of the genome file (in GB). If the ref_lib file is larger than this, the function will split the library into multiple parts.
- mem
The maximum amount of memory (in MB) that can be used by the index generation process (used by the Rsubread::buildindex function).
- quiet
Turns off most messages. Default is TRUE.
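The split argument is compared against the size of the ref_lib file. As a rough illustration, the base R sketch below checks whether a file would exceed that limit. The helper needs_split is hypothetical and not part of the package, and it assumes 1 GB = 1e9 bytes, which may differ from how the function measures size internally.

```r
# Hypothetical helper (not part of the package): reports whether a
# reference library is larger than the split threshold, given in GB.
needs_split <- function(ref_lib, split = 4) {
  stopifnot(file.exists(ref_lib))
  # Assumption: 1 GB = 1e9 bytes
  size_gb <- file.size(ref_lib) / 1e9
  size_gb > split
}

# Write a tiny .fasta file so the sketch is self-contained
tmp_fasta <- tempfile(fileext = ".fasta")
writeLines(c(">seq1", "ACGTACGT"), tmp_fasta)

# With the default 4 GB threshold, a tiny file is not split
small_file <- needs_split(tmp_fasta)
# An artificially small threshold forces the "too large" branch
forced_split <- needs_split(tmp_fasta, split = 1e-9)
```

A check like this can help when choosing split and mem before starting a long index build on a large library.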
Value
Creates one or more Subread indexes for the supplied reference .fasta file. If multiple indexes are created, the libraries will be named the ref_lib basename + "_1", "_2", etc. The function returns the names of the folders holding these files.
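To make the naming scheme concrete, the sketch below builds the index names that the description above implies for a given number of parts. The helper predict_index_names is hypothetical and not part of the package. It assumes the indexes are written next to the input file and that a trailing .gz and a .fasta, .fa, or .fna extension are removed to form the basename. The example on this page is consistent with that, but the function's actual behavior should be checked rather than assumed.

```r
# Hypothetical helper (not part of the package): builds the index names
# implied by the "basename + _1, _2, ..." naming scheme.
predict_index_names <- function(ref_lib, n_parts) {
  # Assumption: drop an optional .gz suffix, then the FASTA extension
  base <- sub("\\.gz$", "", basename(ref_lib))
  base <- sub("\\.(fasta|fa|fna)$", "", base)
  # Assumption: indexes are placed in the same directory as ref_lib
  file.path(dirname(ref_lib), paste0(base, "_", seq_len(n_parts)))
}

index_names <- predict_index_names("refs/Orthoebolavirus_zairense.fasta.gz", 2)
print(index_names)
```

The names are only a prediction. The value returned by mk_subread_index itself is the reliable source for the index locations.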
Examples
#### Create a subread index from the example reference library
## Create a temporary directory to store the reference library
ref_temp <- tempfile()
dir.create(ref_temp)
## Download reference genome
out_fasta <- download_refseq('Orthoebolavirus zairense', reference = FALSE,
                             representative = FALSE, out_dir = ref_temp,
                             compress = TRUE, patho_out = FALSE,
                             caching = TRUE)
#> No ENTREZ API key provided
#> Get one via taxize::use_entrez()
#> See https://ncbiinsights.ncbi.nlm.nih.gov/2017/11/02/new-api-keys-for-the-e-utilities/
#> No ENTREZ API key provided
#> Get one via taxize::use_entrez()
#> See https://ncbiinsights.ncbi.nlm.nih.gov/2017/11/02/new-api-keys-for-the-e-utilities/
## Make subread index of reference library
mk_subread_index(out_fasta)
#>
#> ========== _____ _ _ ____ _____ ______ _____
#> ===== / ____| | | | _ \| __ \| ____| /\ | __ \
#> ===== | (___ | | | | |_) | |__) | |__ / \ | | | |
#> ==== \___ \| | | | _ <| _ /| __| / /\ \ | | | |
#> ==== ____) | |__| | |_) | | \ \| |____ / ____ \| |__| |
#> ========== |_____/ \____/|____/|_| \_\______/_/ \_\_____/
#> Rsubread 2.18.0
#>
#> //================================= setting ==================================\\
#> || ||
#> || Index name : Orthoebolavirus_zairense ||
#> || Index space : base space ||
#> || Index split : no-split ||
#> || Repeat threshold : 100 repeats ||
#> || Gapped index : no ||
#> || ||
#> || Free / total memory : 197.8GB / 251.3GB ||
#> || ||
#> || Input files : 1 file in total ||
#> || o Orthoebolavirus_zairense.fasta.gz ||
#> || ||
#> \\============================================================================//
#>
#> //================================= Running ==================================\\
#> || ||
#> || Check the integrity of provided reference sequences ... ||
#> || No format issues were found ||
#> || Scan uninformative subreads in reference sequences ... ||
#> || Estimate the index size... ||
#> || 8%, 0 mins elapsed, rate=22.9k bps/s ||
#> || 16%, 0 mins elapsed, rate=45.7k bps/s ||
#> || 24%, 0 mins elapsed, rate=68.2k bps/s ||
#> || 33%, 0 mins elapsed, rate=90.7k bps/s ||
#> || 41%, 0 mins elapsed, rate=113.0k bps/s ||
#> || 49%, 0 mins elapsed, rate=135.0k bps/s ||
#> || 58%, 0 mins elapsed, rate=157.0k bps/s ||
#> || 66%, 0 mins elapsed, rate=178.7k bps/s ||
#> || 74%, 0 mins elapsed, rate=200.4k bps/s ||
#> || 3.0 GB of memory is needed for index building. ||
#> || Build the index... ||
#> || 8%, 0 mins elapsed, rate=1.8k bps/s ||
#> || 16%, 0 mins elapsed, rate=3.5k bps/s ||
#> || 24%, 0 mins elapsed, rate=5.3k bps/s ||
#> || 33%, 0 mins elapsed, rate=7.0k bps/s ||
#> || 41%, 0 mins elapsed, rate=8.8k bps/s ||
#> || 49%, 0 mins elapsed, rate=10.5k bps/s ||
#> || 58%, 0 mins elapsed, rate=12.3k bps/s ||
#> || 66%, 0 mins elapsed, rate=14.0k bps/s ||
#> || 74%, 0 mins elapsed, rate=15.8k bps/s ||
#> || Save current index block... ||
#> || [ 0.0% finished ] ||
#> || [ 10.0% finished ] ||
#> || [ 20.0% finished ] ||
#> || [ 30.0% finished ] ||
#> || [ 40.0% finished ] ||
#> || [ 50.0% finished ] ||
#> || [ 60.0% finished ] ||
#> || [ 70.0% finished ] ||
#> || [ 80.0% finished ] ||
#> || [ 90.0% finished ] ||
#> || [ 100.0% finished ] ||
#> || ||
#> || Total running time: 0.1 minutes. ||
#> ||Index /scratch/261666.1.ood/Rtmp4IgmVN/file3f4b0e4df919da/Orthoebolavi ... ||
#> || ||
#> \\============================================================================//
#>
#> [1] "/scratch/261666.1.ood/Rtmp4IgmVN/file3f4b0e4df919da/Orthoebolavirus_zairense_1"
## Remove the temporary directory and its contents
unlink(ref_temp, recursive = TRUE)