
Merging Blastx Hits From Overlapping Bacterial Genome Segments


I ran blastx on a 1 Mbp bacterial genome fragment against the NCBI nr database. Before the search, I split the fragment into 2000 bp pieces with 500 bp overlap, written to a single multi-FASTA file (using splitter from EMBOSS):

splitter -sequence my_contig.fa  -size 2000 -overlap 500
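To make the fragment layout concrete, here is a minimal Python sketch of the window coordinates implied by a 2000 bp size and 500 bp overlap (a step of 1500 bp). The function name `window_coords` is hypothetical, and how the last, shorter window is handled is an assumption that may differ from what splitter actually does at the end of the sequence.

```python
def window_coords(seq_len, size=2000, overlap=500):
    """Return 1-based inclusive (start, end) windows covering a sequence.

    Consecutive windows share `overlap` bases, so each new window
    starts `size - overlap` bases after the previous one.
    """
    if overlap >= size:
        raise ValueError("overlap must be smaller than size")
    if seq_len <= 0:
        return []
    step = size - overlap
    windows = []
    start = 1
    while True:
        # Assumption: the final window is simply truncated at the sequence end.
        end = min(start + size - 1, seq_len)
        windows.append((start, end))
        if end == seq_len:
            break
        start += step
    return windows


if __name__ == "__main__":
    for start, end in window_coords(5000):
        print(start, end)
```

For a 1 Mbp contig this gives roughly 670 fragments, and any feature shorter than the overlap (500 bp) is guaranteed to lie entirely within at least one fragment.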

For the output I chose the tabular BLAST format with comment lines (-m 9).

The next step was to convert the blastx output into GFF3. I have that working, with absolute positions (i.e. positions in the intact contig rather than in the individual fragment).
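For reference, the conversion to absolute coordinates can be sketched roughly as follows. This is not my actual script, just a minimal illustration. It assumes the standard 12-column tabular layout (query, subject, % identity, alignment length, mismatches, gap openings, q. start, q. end, s. start, s. end, e-value, bit score) and assumes that each fragment ID ends in `_start-end`, giving its position in the contig. The function name `hit_to_gff3`, the `protein_match` feature type, and the attribute names are illustrative choices, and attribute values are not escaped here.

```python
import re

# Assumption: fragment IDs look like "<contig>_<start>-<end>", 1-based.
FRAG_ID = re.compile(r"^(?P<name>.+)_(?P<start>\d+)-(?P<end>\d+)$")


def hit_to_gff3(line):
    """Convert one tabular blastx line to a GFF3 line, or None for comments."""
    if not line.strip() or line.startswith("#"):
        return None
    fields = [f.strip() for f in line.rstrip("\n").split("\t")]
    query, subject = fields[0], fields[1]
    qstart, qend = int(fields[6]), int(fields[7])
    evalue, bitscore = fields[10], fields[11]

    match = FRAG_ID.match(query)
    if match is None:
        raise ValueError(f"cannot parse fragment coordinates from {query!r}")
    # Shift from fragment coordinates to contig coordinates.
    offset = int(match.group("start")) - 1

    # In blastx output, q. start > q. end means a reverse-strand hit.
    strand = "+" if qstart <= qend else "-"
    low, high = sorted((qstart, qend))

    attributes = f"Name={subject};bitscore={bitscore}"
    return "\t".join([
        match.group("name"), "blastx", "protein_match",
        str(low + offset), str(high + offset),
        evalue, strand, ".", attributes,
    ])


if __name__ == "__main__":
    example = "contig1_1501-3500\tgi|1|ref|NP_1|\t85.0\t100\t15\t0\t10\t309\t1\t100\t1e-50\t200"
    print(hit_to_gff3(example))
```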

It seems that one ORF / predicted gene is often covered by 2-3 blast hits to the same protein. These hits may or may not overlap. Hence my questions:

  1. What fragment sizes and overlaps are typically used for blastx in this situation?
  2. Is there any advantage to refining the blast hits, say by merging overlapping segments (the E-values would no longer be valid), or by using blast2 (in blastx mode) to compare the DNA sequence spanning the overlapping or almost-touching hits against the protein already detected?
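To illustrate what I mean by merging in question 2, here is a minimal Python sketch that joins hits to the same protein on the same strand when they overlap or lie within a chosen gap. The function name `merge_hits`, the tuple layout, and the `max_gap` parameter are hypothetical. The sketch only looks at contig coordinates and does not check that the subject (protein) coordinates are collinear, and the merged spans carry no valid E-value.

```python
from collections import defaultdict


def merge_hits(hits, max_gap=0):
    """Merge hits given as (subject, strand, start, end) tuples.

    Hits to the same subject on the same strand are joined when they
    overlap, touch, or are separated by at most `max_gap` bases.
    Coordinates are 1-based inclusive contig positions.
    """
    groups = defaultdict(list)
    for subject, strand, start, end in hits:
        groups[(subject, strand)].append((start, end))

    merged = []
    for (subject, strand), spans in groups.items():
        spans.sort()
        cur_start, cur_end = spans[0]
        for start, end in spans[1:]:
            if start <= cur_end + max_gap + 1:
                # Overlapping or close enough: extend the current span.
                cur_end = max(cur_end, end)
            else:
                merged.append((subject, strand, cur_start, cur_end))
                cur_start, cur_end = start, end
        merged.append((subject, strand, cur_start, cur_end))

    # Report in contig order.
    return sorted(merged, key=lambda h: (h[2], h[3]))


if __name__ == "__main__":
    hits = [
        ("P1", "+", 100, 900),
        ("P1", "+", 800, 1500),
        ("P1", "+", 1600, 2000),
        ("P2", "+", 850, 1200),
    ]
    for hit in merge_hits(hits, max_gap=150):
        print(hit)
```

With `max_gap=0` only overlapping or directly adjacent hits are joined. A larger value also bridges the almost-touching hits, at the risk of joining hits that belong to separate features.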
