I am seemingly stuck with something that should be very simple and I hope I haven't overlooked something obvious.
Question: How can I build a valid BLAST database with taxids from an NCBI query export?
What I have tried so far:
For a metagenomics project I need a custom BLAST database, which I want to generate from the results of the following NCBI Nucleotide query:
Viruses[Organism] AND srcdb_refseq[PROP] NOT cellular organisms [ORGN]
The query returns 3986 entries, which I exported and saved (via 'Send to') in both FASTA and ASN.1 format. Both files appear to contain the right number of entries. As this is a metagenomics project, I would like to have the taxon IDs in the BLAST database.
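As a quick sanity check on the entry count of the FASTA export, something like the following minimal sketch could be used. It only counts header lines starting with '>'. The function name count_fasta_records and the file name in the usage comment are illustrative assumptions, not part of the original export.

```python
import sys
from typing import Iterable


def count_fasta_records(lines: Iterable[str]) -> int:
    """Count FASTA records by counting header lines that start with '>'."""
    return sum(1 for line in lines if line.startswith(">"))


if __name__ == "__main__":
    # Hypothetical usage: python count_fasta.py AllViralDNARefSeq.fasta
    # (the .fasta file name is assumed; only the .asn1 name appears in the post)
    if len(sys.argv) > 1:
        with open(sys.argv[1]) as handle:
            print(count_fasta_records(handle))
```

If the printed number matches the hit count reported by the NCBI query, the FASTA export is at least complete in terms of records.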
I was able to build a valid BLAST database from the FASTA file using makeblastdb, but the FASTA headers do not include taxids. I therefore tried to build a BLAST database from the ASN.1 export instead (the documentation does not make clear which input formats makeblastdb accepts), using the following command:
$ makeblastdb -in AllViralDNARefSeq.asn1 -dbtype nucl -out ViralASN1 -title "All Viral RefSeq DNA from NCBI ASN1"
Building a new DB, current time: 12/20/2011 10:37:28
New DB name: ViralASN1
New DB title: All Viral RefSeq DNA from NCBI ASN1
Sequence type: Nucleotide
Keep Linkouts: T
Keep MBits: T
Maximum fi ...