Protein sequence alignment using Blast

Hi All,

I am trying to run BLAST on protein sequences from two organisms. I downloaded the FASTA files from NCBI. I want to iterate over the sequences in each file and align every sequence in one file against every sequence in the other. I want to do this with a local BLAST installation, but I am getting an error. I would greatly appreciate any suggestions.

'''from Bio.Blast.Applications import NcbiblastxCommandline
help(NcbiblastxCommandline)'''

from Bio.Blast.Applications import NcbiblastpCommandline
from StringIO import StringIO
from Bio.Blast import NCBIXML
from Bio.Seq import Seq
from Bio.SeqRecord import SeqRecord
from Bio import SeqIO
from Bio.Blast import NCBIWWW
import cStringIO

def BlastSeq():
    SC_Fasta = open("sc.fsa","r")
    HS_Fasta = open("hsap.fsa","r")
    blastp = "C:\\Program Files\\NCBI\\blast-2.2.29+\\bin\\blastp"

    record1 = list(SeqIO.parse(SC_Fasta,"fasta"))
    for r1 in record1:
        r1.id
        r1.seq


    record2 = list(SeqIO.parse(HS_Fasta,"fasta"))
    for r2 in record2:
        r2.id
        r2.seq

    for r1 in record1:
        for r2 in record2:            
            output = NcbiblastpCommandline(blastp,query= r1.seq, subject=r2.seq, outfmt=5)()[0]
            blast_result_record = NCBIXML.read(StringIO(output))                   


def main():
    BlastSeq()

main()

Error: Bio.Application.ApplicationError: Command 'C:\Program Files\NCBI\blast-2.2.29+\bin\blastp -outfmt 5 -query    MVKLTSIAAGVAAIAATASATTTLAQSDERVNLVELGVYVSDIRAHLAQYYMFQAAHPTETYPVEVAEAVFNYGDF -subjectHGLQELKAELDAAVLKATGRQILTLRVRLAGAQLSWLYKEATVQEVDVIPEDGAADVRVIISNSAYGKFRKLFPG' returned non-zero exit status -1073741515

I understand that each sequence should be passed to blastp as its own FASTA file rather than as a raw sequence string, but I don't understand how to proceed.
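One possible way to pass each sequence as its own FASTA file is sketched below, using only the standard library. The helper names write_fasta and blastp_args, the file names, and the temporary directory are illustrative assumptions, not part of Biopython or BLAST. The two sequences are short fragments copied from the error message above. The sketch only builds the files and the argument list; it does not run blastp.

```python
import os
import tempfile


def write_fasta(name, seq_id, sequence, directory):
    """Write one sequence to its own FASTA file and return the file path.

    `name` is a hypothetical, filesystem-safe file stem. Real NCBI ids
    often contain "|", which is not allowed in Windows file names.
    """
    path = os.path.join(directory, name + ".fasta")
    with open(path, "w") as handle:
        handle.write(">%s\n%s\n" % (seq_id, sequence))
    return path


def blastp_args(blastp, query_path, subject_path):
    """Build a blastp argument list; -query and -subject take file paths."""
    return [blastp,
            "-query", query_path,
            "-subject", subject_path,
            "-outfmt", "5"]  # 5 = XML output, as in the original call


# Example fragments taken from the error message (assumed for illustration).
workdir = tempfile.mkdtemp()
query_path = write_fasta("query_0", "sc_example", "MVKLTSIAAGVAAIAATASA", workdir)
subject_path = write_fasta("subject_0", "hs_example", "HGLQELKAELDAAVLKATGR", workdir)

args = blastp_args("blastp", query_path, subject_path)
print(args)
```

In the nested loop, the same two paths could presumably be given as query= and subject= to NcbiblastpCommandline in place of r1.seq and r2.seq, since those options expect file names. Separately, the exit status -1073741515 is the signed form of the Windows status 0xC0000135 (a required DLL was not found), so it may also be worth checking that blastp starts on its own from a command prompt.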

