Channel: Post Feed

Make A Custom Blast Library Using The Output Of Another Blast Result


Hi Biostar,

I am working on a microbial gene annotation project and want to BLAST a large number of sequences (say 20,000) against the nr database. Even with a local copy of nr and a decent computer, this will take forever (days, anyway). My sequences are not random, though, so I would like to build a subset of nr that allows much faster BLAST queries.

My sequences are 454 reads of PCR amplicons made with highly degenerate primers targeting a family of proteins, so what I would like to do is the following:

  1. BLAST a set of known sequences against nr.
  2. Take the hits from this query and use them to make a new BLAST database.
  3. BLAST thousands of sequences against this smaller database and enjoy the massive speed gains.
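
For concreteness, here is a rough sketch of what those three steps might look like with the BLAST+ command-line tools, driven from Python. It only builds the command lines; nothing runs until `run_pipeline` is called. Several things are assumptions rather than part of the question: that the known sequences are proteins (hence `blastp`), that the 454 reads are nucleotide (hence `blastx`), that the local nr database can be queried by ID with `blastdbcmd`, and all file names, the E-value cutoff, and the `-max_target_seqs` value.

```python
import subprocess


def build_pipeline(known_fasta, reads_fasta, full_db="nr",
                   subset_name="nr_subset", evalue="1e-5"):
    """Return the BLAST+ command lines for the three steps, in order.

    All file names and thresholds here are illustrative assumptions.
    """
    hit_ids = "hit_ids.txt"               # hypothetical intermediate file
    subset_fasta = subset_name + ".fasta"
    return [
        # Step 1: search the known sequences against the full database and
        # keep only the subject IDs of the hits.
        ["blastp", "-query", known_fasta, "-db", full_db,
         "-evalue", evalue, "-max_target_seqs", "500",
         "-outfmt", "6 sseqid", "-out", hit_ids],
        # Step 2a: pull the hit sequences out of the full database.
        # The ID list can contain repeats, so deduplicate it first
        # (for example with `sort -u`).
        ["blastdbcmd", "-db", full_db, "-entry_batch", hit_ids,
         "-out", subset_fasta],
        # Step 2b: format the extracted sequences as a new protein database.
        ["makeblastdb", "-in", subset_fasta, "-dbtype", "prot",
         "-out", subset_name],
        # Step 3: search the reads against the much smaller database.
        ["blastx", "-query", reads_fasta, "-db", subset_name,
         "-evalue", evalue, "-outfmt", "6", "-out", "reads_vs_subset.tsv"],
    ]


def run_pipeline(commands):
    """Run each command in turn, stopping at the first failure."""
    for cmd in commands:
        subprocess.run(cmd, check=True)


# Example (requires BLAST+ on PATH and a local nr database):
# run_pipeline(build_pipeline("known_family.faa", "reads_454.fna"))
```

One caveat with this approach: E-values depend on database size, so E-values from the subset search will not be directly comparable to those from a search against the full nr.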

Before reinventing the wheel, I was wondering whether anyone knows of (or has already used) a trivially easy way to do this. In the meantime I will try writing a Biopython script to run the BLAST search and use the 'gi' and 'sseqid' values of the hits to make a new FASTA file.
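
As a starting point for the ID-collecting part of such a script, here is a minimal sketch that uses only the Python standard library instead of Biopython. It assumes the first search was saved in the default 12-column tabular format (`-outfmt 6`), where column 2 is `sseqid` and column 11 is the E-value. The function name, the cutoff, and the accessions in the example are made up for illustration.

```python
import csv
import io


def unique_hit_ids(tabular_lines, max_evalue=1e-5):
    """Collect unique subject IDs from default tabular BLAST output.

    Keeps the first-seen order and skips hits weaker than max_evalue.
    """
    seen = {}
    for row in csv.reader(tabular_lines, delimiter="\t"):
        # Skip blank lines and the '#' comment lines written by -outfmt 7.
        if not row or row[0].startswith("#"):
            continue
        sseqid, evalue = row[1], float(row[10])
        if evalue <= max_evalue:
            seen.setdefault(sseqid, None)
    return list(seen)


# Made-up example rows: two queries sharing one subject, plus one weak hit.
EXAMPLE = (
    "q1\tWP_000001.1\t98.5\t200\t3\t0\t1\t200\t1\t200\t1e-100\t380\n"
    "q1\tWP_000002.1\t45.0\t180\t99\t2\t5\t184\t7\t186\t3e-20\t95\n"
    "q2\tWP_000001.1\t90.1\t190\t19\t0\t1\t190\t4\t193\t5e-80\t300\n"
    "q2\tWP_000003.1\t25.0\t60\t45\t1\t10\t69\t30\t89\t0.5\t30\n"
)

ids = unique_hit_ids(io.StringIO(EXAMPLE))
print("\n".join(ids))
```

Written one per line to a text file, these IDs are in the form that `blastdbcmd -entry_batch` expects, so the hit sequences can be pulled from the local nr copy rather than re-downloaded.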

Thanks, zach cp

