Quantcast
Channel: Post Feed
Viewing all articles
Browse latest Browse all 41826

Looking For A Command Line Version Of Ncbi Taxonomy 'Stratification' Tool

$
0
0

Hi all

I am building a pipeline for taxonomically identifying a blast result at each taxa level, with a bespoke reference dataset. I want it to be easily reproduceable, which it is, other than an ugly step where I have to take a list of taxonIDs and put it into the site:

http://www.ncbi.nlm.nih.gov/Taxonomy/TaxIdentifier/tax_identifier.cgi

retrieve the output and carry on.

Does anyone know if I can get the code that is used? Clearly its calling a cgi from somewhere but I cant find it in the ncbi ftp. If not, is there an equivalent? Technically, I could pull apart the names.dmp and nodes.dmp but there is already a ncbi tool so I'm loathed to do that.

Thanks


N.B. The function is turning:

515482
515474

into

515482  |       Nitzschia dubiiformis   |       515482 2857 33852 33851 33850 33849 2836 33634 2759 131567
515474  |       Cocconeis stauroneiformis       |       515474 216715 216714 186023 33850 33849 2836 33634 2759 131567

EDIT: For the record, I have ~30,000 taxIDs that I am retrieving and gawbul's script, while elegant, won't complete. Im looking at Frederic's now for its multi-coring ability.


Viewing all articles
Browse latest Browse all 41826

Trending Articles