
Correct Method To Blast All-Vs-All With Ncbiblast & How To Speed It Up?


Hi all,

I'm using ncbi-blast-2.2.24+ (on Ubuntu Linux) for a sizable all-vs-all BLAST of protein sequences (530,000 lines of FASTA). It has already been running for over an hour, so I'm looking into ways to speed it up.

What I've done is run:

ncbi-blast-2.2.24+/bin/makeblastdb -in good_proteins.fasta -dbtype prot -out my_prot_blast_db

followed by:

ncbi-blast-2.2.24+/bin/blastp -db my_prot_blast_db -query good_proteins.fasta -outfmt 6 -out all-vs-all.tsv -num_threads 4

Now firstly: Is this the correct way to do an all-vs-all blast?

And secondly: How can I speed this up?
I added -num_threads 4 in the hope of making it use all four of my processing cores, but only one CPU at a time runs at 100% (which one alternates), while the other three sit nearly idle. (Being a CS graduate I'm aware of the distinction between cores and threads, but I didn't see any other configuration option that seemed related: http://www.ncbi.nlm.nih.gov/books/NBK1763/)

Possibly thirdly: Is it at all reasonable to expect this all-vs-all BLAST on such a dataset to run in a manageable amount of time, or should I somehow divide this up / move to supercomputers?
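
To make the "divide this up" option concrete, here is a minimal sketch of one way it might be done: splitting the query FASTA into several chunks so that each chunk can be searched separately against the same full database. The function name split_fasta and the chunk_N.fasta output naming are hypothetical, chosen only for illustration; this is not a BLAST feature, and whether it actually helps depends on the machine.

```python
from pathlib import Path


def split_fasta(fasta_path, n_chunks, out_prefix="chunk"):
    """Distribute FASTA records round-robin over n_chunks files.

    Hypothetical helper for illustration. Returns the list of output paths.
    """
    out_paths = [Path(f"{out_prefix}_{i}.fasta") for i in range(n_chunks)]
    handles = [p.open("w") for p in out_paths]
    try:
        record_index = -1  # no header line seen yet
        with open(fasta_path) as fasta:
            for line in fasta:
                if line.startswith(">"):
                    # A header line starts a new record.
                    record_index += 1
                if record_index < 0:
                    continue  # skip anything before the first header
                # Header and sequence lines of a record go to the same chunk.
                handles[record_index % n_chunks].write(line)
    finally:
        for handle in handles:
            handle.close()
    return out_paths
```

Calling split_fasta("good_proteins.fasta", 4) would write chunk_0.fasta to chunk_3.fasta. Each chunk could then be given to its own blastp process as -query, with -db my_prot_blast_db and -outfmt 6 unchanged and a separate -out file per chunk, and the resulting tables concatenated afterwards. Because every chunk is still searched against the complete database, the combined output should cover the same all-vs-all comparison.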

(And maybe fourthly: I just chose ncbi-blast because I thought it'd be a good choice. Would any other tool be better at handling this case?)

Best regards, Tim

