Channel: Post Feed
Viewing all articles
Browse latest Browse all 41826

Parsing a very large BLAST result with BioPerl or other methods?

Hi all,

Recently I have been working with a set of genes to design appropriate primers. However, it is still hard for me to obtain homology information for the primers. For example, I need to design a primer pair for one exon of a gene. I first generate all possible primers of a predefined length, e.g. 18-30 bp, and then run blastall -p blastn (or megablast) with -e 1 -W 8 to determine whether the primers have homologous sequences.

However, for more than 10,000 primers the BLAST output file is larger than 200 MB, which takes a long time to parse with the Bio::SearchIO module and sometimes even exhausts the memory. Moreover, blasting primer sequences of only 18-30 bp is risky, because such short sequences sometimes fail for unknown reasons.

Another method is to blast the whole exon region with -e 0.1 -W 11. However, this also generates huge output, and it takes a long time to parse the BLAST file and to determine whether the primer region falls into a homologous part.

So far I have not found any good method to solve this problem. If anyone has experienced this issue, could you please tell me how you handled it?

Thanks.

############################## 2014.9.2

Although we could first identify the nucleotides that belong to repeat regions using RepeatMasker, and then use the -F parameter in blast to ignore these regions, those repeat regions will sometimes not share much homologous sequence. This is the only method I can find for now, but it is not perfect. Hope someone could provid ...
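One possible way around the memory problem, given only as a rough sketch and not a tested pipeline: if blastall is run with tabular output (-m 8, which writes 12 tab-separated columns per hit), the result can be read line by line and only a hit count per primer has to be kept in memory. The identity and alignment-length cutoffs, the function names, and the file name in the usage note are all hypothetical choices for illustration.

```python
from collections import Counter

def iter_hits(lines):
    """Yield (query, identity, align_len) from tabular BLAST lines, one at a time."""
    for line in lines:
        # Skip blank lines and the comment lines written by -m 9
        if not line.strip() or line.startswith("#"):
            continue
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 12:
            continue  # not a complete 12-column hit line
        # Columns: query, subject, % identity, alignment length, ...
        yield fields[0], float(fields[2]), int(fields[3])

def count_hits(lines, min_identity=90.0, min_align_len=16):
    """Count hits per primer that pass the (assumed) cutoffs."""
    counts = Counter()
    for query, identity, align_len in iter_hits(lines):
        if identity >= min_identity and align_len >= min_align_len:
            counts[query] += 1
    return counts

def non_unique_primers(lines, max_hits=1, **cutoffs):
    """Return primers with more than max_hits passing hits (the self hit counts as one)."""
    counts = count_hits(lines, **cutoffs)
    return sorted(q for q, n in counts.items() if n > max_hits)
```

It could be used as `with open("primers.blast.tsv") as fh: print(non_unique_primers(fh))`. Since the file handle is consumed one line at a time, memory use depends on the number of primers, not on the size of the BLAST output.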
