Hi all,
Recently I have been working with a set of genes for which I need to design appropriate primers.
However, it is still hard for me to obtain homology information for the primers.
For example, I need to design a primer pair for one exon of a gene.
I first generate all possible primers of a predefined length, e.g. 18-30 bp, and then run blastall -p blastn (or megablast) with -e 1 -W 8 to determine whether the primers have homologous sequences. However, with >10000 primers the BLAST output file was larger than 200 MB, which takes a long time to parse with the Bio::SearchIO module and sometimes even exhausts the memory. Moreover, blasting primer sequences of only 18-30 bp is risky, because short sequences sometimes fail for unknown reasons.
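To make the enumeration step concrete, here is a rough Python sketch of it. It is only an illustration: the function names `enumerate_primers` and `write_fasta` and the FASTA header format are made up for this example, and it assumes the exon is available as a plain string.

```python
def enumerate_primers(exon, min_len=18, max_len=30):
    """Yield (start, end, seq) for every window of min_len..max_len bases.

    start is 0-based and end is exclusive, so seq == exon[start:end].
    """
    n = len(exon)
    for length in range(min_len, max_len + 1):
        # No windows are produced when the exon is shorter than `length`.
        for start in range(0, n - length + 1):
            yield start, start + length, exon[start:start + length]


def write_fasta(primers, handle):
    """Write candidates as FASTA; the coordinates go into the header
    (hypothetical naming scheme) so hits can be mapped back to the exon."""
    for start, end, seq in primers:
        handle.write(f">primer_{start}_{end}\n{seq}\n")
```

With these defaults a 200 bp exon already gives 2301 candidates, which is why the number of BLAST queries grows so quickly.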
Another method is to blast the whole exon region with the parameters -e 0.1 -W 11. However, this generates huge output, and it takes a long time to parse the BLAST file and to determine whether each primer region falls into a homologous part.
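The check of whether a primer region falls into a homologous part can be sketched as follows. This is an assumption-laden illustration, not a tested pipeline: it assumes the exon was blasted with tabular output (blastall -m 8, 12 tab-separated columns), the identity and length cutoffs are arbitrary placeholders, and the function names are hypothetical. It also does not remove the exon's hit against its own locus, which would have to be filtered out beforehand.

```python
def read_query_intervals(lines, min_identity=90.0, min_aln_len=15):
    """Collect (q_start, q_end) intervals on the query from tabular BLAST lines.

    Assumed column order (blastall -m 8): query, subject, % identity,
    alignment length, mismatches, gap opens, q.start, q.end, s.start,
    s.end, e-value, bit score. The cutoffs are placeholder values.
    """
    intervals = []
    for line in lines:  # reads line by line, so the file is never held in memory
        if not line.strip() or line.startswith("#"):
            continue
        fields = line.rstrip("\n").split("\t")
        identity = float(fields[2])
        aln_len = int(fields[3])
        q_start, q_end = sorted((int(fields[6]), int(fields[7])))
        if identity >= min_identity and aln_len >= min_aln_len:
            intervals.append((q_start, q_end))
    return intervals


def primer_overlaps(primer_start, primer_end, intervals):
    """True if the 1-based inclusive primer range overlaps any interval."""
    return any(primer_start <= q_end and q_start <= primer_end
               for q_start, q_end in intervals)
```

Passing an open file handle as `lines` keeps memory use flat, since only the intervals that pass the cutoffs are stored.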
So far, I have not found a good method to solve this problem.
If anyone has experienced this issue, could you please tell me how you handled it?
Thanks.
##############################2014.9.2
Although we could first mark the nucleotides that belong to repeat regions using RepeatMasker,
and then use the -F parameter in BLAST to ignore these regions, those repeat regions sometimes do not share much homologous sequence.
This is the only method I have found so far, but it is not perfect.
Hope someone could provid ...