Hi, I have a DNA sequence (about 388 bp) that I am comparing against GenBank sequences using BLASTX. I understand that BLASTX translates the DNA query in all 6 possible reading frames, but the result is puzzling me: 3 different reading frames show similarity to the same protein (in a conserved region of the Peptidase M1 superfamily). Also, when I look closely at the alignments, the similarities in the 3 frames occur within the same region. The maximum identity is approximately 76%, with an E-value of 2e-11.
Is this "similarity" most likely due to chance?
There are 2 things that make me think this:
1) I am aware that my sequence is very short compared to the >1000 bp M1 peptidase sequence in GenBank.
2) When I look at the reading frames of my translated sequence, there are stop codons spread throughout... or could this be due to sequencing errors?
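In case it helps to reproduce point 2, here is a minimal sketch of how I would count the stop codons in each of the 6 frames. It assumes the standard genetic code (NCBI translation table 1). The function names and the example sequence are made up for illustration, and the frame labels may not match BLASTX's own numbering exactly.

```python
from itertools import product

# Standard genetic code (NCBI translation table 1), codons ordered T, C, A, G.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}

COMPLEMENT = str.maketrans("ACGTN", "TGCAN")


def reverse_complement(dna):
    """Return the reverse complement of an uppercase DNA string."""
    return dna.translate(COMPLEMENT)[::-1]


def translate(dna):
    """Translate complete codons only; unknown codons become 'X', stops '*'."""
    return "".join(
        CODON_TABLE.get(dna[i:i + 3], "X") for i in range(0, len(dna) - 2, 3)
    )


def six_frame_translations(dna):
    """Map frame labels (+1..+3, -1..-3) to their protein translations."""
    dna = dna.upper()
    rc = reverse_complement(dna)
    frames = {}
    for offset in range(3):
        frames[f"+{offset + 1}"] = translate(dna[offset:])
        frames[f"-{offset + 1}"] = translate(rc[offset:])
    return frames


def stop_counts(dna):
    """Count stop codons ('*') in each of the six frames."""
    return {frame: prot.count("*") for frame, prot in six_frame_translations(dna).items()}


if __name__ == "__main__":
    example = "ATGAAATAGCCC"  # hypothetical toy sequence, not my real read
    for frame, protein in six_frame_translations(example).items():
        print(frame, protein, "stops:", protein.count("*"))
```

If the stop codons sit between the aligned segments rather than inside them, and the hits move from one frame to another along the sequence, that pattern would be worth comparing against the BLASTX alignments.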
Thanks for any help!