
Obtaining A Maximum Number Of Blast Hits: Problem...

I am having trouble getting BLAST to give me "correct" results. I am trying to retrieve as many hits as possible with an e-value better than 1. I query the database with a sequence that should have several thousand hits in the database. However, at best, using tblastn, I get roughly 1000 matches (~250 independent hits). I am at a loss to understand what is wrong with my command:

    tblastn -query protein.fasta -db nucl.blastdb -out results.tblastnout -evalue 1 -outfmt 7 -num_descriptions 100000

I get several matches within the same sequence hit, all with e-values better than 1. But what confuses me is that the best e-value of the worst hit (sorry if this is confusing ;) is nowhere near the -evalue limit, and is usually lower than 1E-60... Obviously, even including the redundancy of matches within a hit, I certainly do not reach the 100000 limit I asked for. So I have three questions:
  1. Is it possible to only list one (the best) "match" per "hit"?
  2. Any idea why I do not get a larger number of descriptions, considering that I expect to have close to 30000 positives in my database?
  3. Any comments/suggestions to improve my search?
EDIT: After a bit of investigation and testing... For question 1), in order to limit the number of matches (HSPs) per sequence hit, I tried the "Best-Hits filtering algorithm". As a result, this brought my newly found 50000 hits (obtained with -max_target_seqs) down to 253. In reality however, it increased the number of p ...
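
For question 1, one option that does not depend on any BLAST filtering flag is to post-process the tabular output and keep only the best-scoring match (HSP) per subject sequence. The following is a minimal sketch, assuming the results were written with -outfmt 7 (or 6) using the default twelve columns. The function name `best_hsp_per_subject` and the sample rows are made up for illustration and are not part of BLAST.

```python
# Default column order of BLAST+ tabular output (-outfmt 6 / 7).
COLUMNS = ["qseqid", "sseqid", "pident", "length", "mismatch", "gapopen",
           "qstart", "qend", "sstart", "send", "evalue", "bitscore"]


def best_hsp_per_subject(lines):
    """Keep the highest-bitscore HSP for each (query, subject) pair.

    Hypothetical helper: `lines` is any iterable of tabular BLAST lines.
    Comment lines starting with '#' (as written by -outfmt 7) are skipped.
    """
    best = {}
    for line in lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        fields = [field.strip() for field in line.split("\t")]
        row = dict(zip(COLUMNS, fields))
        row["evalue"] = float(row["evalue"])
        row["bitscore"] = float(row["bitscore"])
        key = (row["qseqid"], row["sseqid"])
        # Replace the stored HSP only if this one scores strictly higher.
        if key not in best or row["bitscore"] > best[key]["bitscore"]:
            best[key] = row
    return best


# Made-up sample: two HSPs on contigA, one on contigB.
sample = "\n".join([
    "# TBLASTN (made-up sample for illustration)",
    "prot1\tcontigA\t90.0\t100\t10\t0\t1\t100\t1\t300\t1e-60\t200",
    "prot1\tcontigA\t50.0\t40\t20\t0\t5\t45\t900\t1020\t0.5\t30.1",
    "prot1\tcontigB\t70.0\t80\t24\t0\t1\t80\t10\t250\t1e-30\t120",
])

best = best_hsp_per_subject(sample.splitlines())
for (query, subject), row in best.items():
    print(query, subject, row["evalue"], row["bitscore"])
```

To run this on real output, pass an open file handle for results.tblastnout instead of the sample lines. Counting the keys of the returned dictionary then gives the number of independent hits rather than the number of matches.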
