Using refgenie with iGenomes
If you're already using iGenomes, it's easy to configure refgenie to use your existing folder structure. iGenomes is a project that provides sequences and annotation files for commonly analyzed organisms. Each iGenome is available as a compressed file that contains sequences and annotation files for a single genomic build of an organism.
Initialize a refgenie config file if you don't have one you want to use for your iGenomes assets:
export REFGENIE='igenome_config.yaml'
refgenie init -c $REFGENIE
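If this setup runs more than once (for example from a provisioning script), it may be worth guarding the initialization so that an existing config is left untouched. The following is a minimal sketch that assumes the config filename used above; the `command -v` check and the messages are illustrative additions, not part of refgenie:

```shell
# Keep an already-exported REFGENIE; otherwise fall back to the filename used above.
export REFGENIE="${REFGENIE:-igenome_config.yaml}"

if [ -f "$REFGENIE" ]; then
    # A config already exists at this path, so leave it alone.
    echo "Using existing refgenie config: $REFGENIE"
elif command -v refgenie >/dev/null 2>&1; then
    # No config yet: create one with the same command shown above.
    refgenie init -c "$REFGENIE"
else
    # Assumption: refgenie should be installed before this step.
    echo "refgenie is not on PATH; install it before initializing" >&2
fi
```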
You then have two options: import a whole iGenomes archive with import_igenome, or add individual assets with refgenie add.
The import_igenome command-line tool is distributed with refgenie and is ready to use once refgenie is installed. It adds all the assets contained in a genome archive downloaded from the iGenomes website to the local refgenie asset inventory. The required inputs are:
-g: name of the genome that should be assigned to the assets,
-p: a path to the downloaded archive or a directory (unarchived iGenomes folder).
$ import_igenome -h
usage: import_igenome [-h] -p PATH -g GENOME [-c CONFIG]

Integrates every asset from the downloaded iGenomes tarball/directory with Refgenie asset management system

optional arguments:
  -h, --help            show this help message and exit
  -p PATH, --path PATH  path to the desired genome tarball or directory to integrate
  -g GENOME, --genome GENOME
                        name to be assigned to the selected genome
  -c CONFIG, --config CONFIG
                        path to local genome configuration file. Optional if 'REFGENIE' environment variable is set.
$ import_igenome -g staph -p Staphylococcus_aureus_NCTC_8325_NCBI_2006-02-13.tar.gz
Extracting 'Staphylococcus_aureus_NCTC_8325_NCBI_2006-02-13.tar.gz'
Moved 'Staphylococcus_aureus_NCTC_8325_NCBI_2006-02-13.tar.gz' to '/Users/mstolarczyk/Desktop/testing/test_genomes/staph'
Added assets:
- staph/Chromosomes
- staph/BWAIndex
- staph/BowtieIndex
- staph/AbundantSequences
- staph/Bowtie2Index
- staph/WholeGenomeFasta
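Before importing, it can help to see which asset folders an archive contains. The sketch below lists them without extracting anything. `list_igenome_assets` is a hypothetical helper, not part of refgenie, and it assumes the archive is gzipped and keeps its assets in folders under a `Sequence/` directory, as the assets in the example output above do:

```shell
# Hypothetical helper: list the distinct folder names found directly under a
# Sequence/ directory inside a gzipped iGenomes tarball, without extracting it.
list_igenome_assets() {
    tar -tzf "$1" |
        awk -F/ '{
            # Print the folder that follows "Sequence", ignoring plain files there.
            for (i = 1; i + 1 < NF; i++)
                if ($i == "Sequence") { print $(i + 1); break }
        }' |
        sort -u
}
```

Calling `list_igenome_assets Staphylococcus_aureus_NCTC_8325_NCBI_2006-02-13.tar.gz` prints one folder name per line, which can be compared with the "Added assets" list after the import.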
You can also add individual assets you want refgenie to track with refgenie add. This approach is useful if you do not plan to add all of the assets from the downloaded iGenome. It is also useful beyond iGenomes, since you can add assets from any source to your refgenie inventory.
refgenie add genome/asset -p RELATIVE_PATH
So, after downloading an archive from the iGenomes website:
tar -xf Staphylococcus_aureus_NCTC_8325_NCBI_2006-02-13.tar.gz
refgenie add staph/bowtie2_index \
  -p Staphylococcus_aureus_NCTC_8325/NCBI/2006-02-13/Sequence/Bowtie2Index
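To add a handful of assets from the same extracted folder, the command above can be generated in a loop. The sketch below only prints the commands so they can be reviewed first. `print_add_commands` is a hypothetical helper, not part of refgenie. It assumes the selected assets are folders under `Sequence/`, reuses each folder name as the asset name (rename as needed), and passes the path through unchanged:

```shell
# Hypothetical helper: print one `refgenie add` command per selected asset
# folder found under <build dir>/Sequence. Nothing is executed.
print_add_commands() {
    genome="$1"      # name to register the assets under, e.g. staph
    build_dir="$2"   # e.g. Staphylococcus_aureus_NCTC_8325/NCBI/2006-02-13
    shift 2          # remaining arguments are asset folder names
    for asset in "$@"; do
        asset_dir="$build_dir/Sequence/$asset"
        if [ -d "$asset_dir" ]; then
            echo "refgenie add $genome/$asset -p '$asset_dir'"
        else
            # Report folders that are missing instead of printing a bad command.
            echo "skipping $asset: $asset_dir not found" >&2
        fi
    done
}
```

For example, `print_add_commands staph Staphylococcus_aureus_NCTC_8325/NCBI/2006-02-13 Bowtie2Index BWAIndex` prints two `refgenie add` lines if both folders exist. Once the output looks right, it can be piped to `sh` to run the commands.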
Now we can seek any added assets:
refgenie seek staph/BWAIndex
and remove unwanted or faulty ones:
refgenie remove staph/BWAIndex
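In a pipeline script, the path printed by refgenie seek can be captured in a variable instead of hard-coding an iGenomes location. The sketch below wraps that in a check. `seek_or_fail` is a hypothetical helper, and it assumes that refgenie seek exits with a non-zero status when the asset is unknown and that the asset resolves to a file or directory that exists on disk:

```shell
# Hypothetical helper: resolve an asset path with `refgenie seek` and confirm
# that the returned path exists before handing it to downstream tools.
seek_or_fail() {
    asset_path="$(refgenie seek "$1")" || {
        echo "refgenie seek failed for $1" >&2
        return 1
    }
    [ -e "$asset_path" ] || {
        echo "$1 resolves to a missing path: $asset_path" >&2
        return 1
    }
    echo "$asset_path"
}
```

A script could then use `bwa_index="$(seek_or_fail staph/BWAIndex)" || exit 1` and pass `$bwa_index` to later steps.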
This way you can configure refgenie to use your iGenomes assets, wean yourself off the fixed iGenomes folder structure, and transition to the refgenie-managed path system.