Loading taxonomy ================ Every data file you'll load (eg. fasta, gff, blast) must belong to an organism. These are instructions to load the NCBI taxonomy, that contains most organisms you'll need. **This step is optional.** If you decide to skip this step, you should insert the organisms individually using the command *insert_organism*. NCBI Taxonomy ------------- Contains the names and phylogenetic lineages of more than 160,000 organisms that have molecular data in the NCBI databases. * **URL**: https://www.ncbi.nlm.nih.gov/taxonomy * **File**: ftp://ftp.ncbi.nlm.nih.gov/pub/taxonomy/taxdump.tar.gz After unpacking the file you'll have the required files. .. code-block:: bash python manage.py load_organism --file names.dmp --name DB:NCBI_taxonomy python manage.py load_phylotree --file nodes.dmp --name 'NCBI taxonomy tree' --organismdb 'DB:NCBI_taxonomy' * Loading these files can be faster if you increase the number of threads (--cpu). * It will take a long time anyway (hours). Remove taxonomy --------------- If, by any reason, you need to remove a taxonomy you should use the command *remove_phylotree* and *remove_organisms*. Most data files you'll load depend on the organism database (eg. fasta, gff, blast). **If you delete an organism database, every data file you loaded will be deleted on cascade**. .. code-block:: bash python manage.py remove_phylotree --help python manage.py remove_organisms --help * These commands require the names of the databases (Db.name)