Hi,
I am surprised to see so few posts about megablast indexing... Is this because it does not work? If I believe this one, it should really help to get results faster. But after some trials, I cannot observe any such improvement.
One potential problem is that the makembindex command creates one file fewer than its output reports:
creating GG.00.idx
creating GG.01.idx
But only GG.00.idx appeared on the file system. (I tried on 2 computers with different processors, with BLAST+ 2.2.25 compiled independently on each machine.)
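For anyone who wants to reproduce this check, here is a minimal sketch of how the reported volumes could be compared with what is actually on disk. It assumes the makembindex output was saved to a file (the name makembindex.log is hypothetical) and that each relevant line has the form "creating NAME" as shown above. check_volumes is an illustrative helper, not part of BLAST+.

```shell
# Hypothetical helper: list index volumes named in a saved makembindex log
# and report which of them are missing from the current directory.
# Assumes log lines of the form "creating GG.00.idx".
check_volumes() {
    log_file="$1"
    missing=0
    while read -r word name; do
        # Skip any line that is not a "creating ..." line.
        [ "$word" = "creating" ] || continue
        if [ -f "$name" ]; then
            echo "found   $name"
        else
            echo "missing $name"
            missing=$((missing + 1))
        fi
    done < "$log_file"
    # Exit status 0 only if every reported volume exists.
    [ "$missing" -eq 0 ]
}
```

Usage would be something like `check_volumes makembindex.log`, run from the directory where the index files were written.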
First, I tried to megablast a file against Greengenes. Apart from the fact that it took the same time to run, the only difference was that the indexed megablast used 6 to 7 times more RAM than the non-indexed run. Despite the potentially missing index file, the blast result was exactly the same (checked with the UNIX diff command). I assumed that indexing improves the speed only for bigger DBs.
So I tried against a huge db, i.e. GenBank nt:
############ indexing db
makembindex -input nt -output nt -iformat blastdb
########################## megablast
### index
time blastn -task megablast -use_index true -db nt -query E1.454.fasta.1 -out megaBIGWithIndexNT.blast -evalue 1e-05 -num_descriptions 1 -num_alignments 1 -outfmt 6 > megaBIGWithIndexNT.out&
### without
time blastn -task m ...