Using Biopython and BLAST+ to automate de novo viral contig sorting

Hello all,

I need some help writing a script that runs a basic nt BLAST on contigs and then saves the requested output to a CSV file for easy sorting. I understand Python but do not practice enough to be very good at it, and since I spend a large portion of my time on this simple task, I am trying to speed it up.

I have a test file of 3 contigs in basic FASTA format, with ">" marking the contig ID, followed by the sequence. I would like my script to read the file into a dict, pass each key/value pair to the BLAST+ wrapper (I found it expects each file to be one search), save the result to the output file, and append the next search result until the file is done.

This is my script so far:

    import sys
    from Bio import SeqIO
    from Bio.Blast.Applications import NcbitblastnCommandline

    #variable input files
    file_of_seq = sys.argv[1]

    #make a dictionary for the sequences to be run using BLAST
    record_dict = SeqIO.to_dict(SeqIO.parse(file_of_seq,"fasta")
    len(record_dict)

The issue I'm running into is that when I run this script from the command line, it doesn't return the length of the dictionary; it just returns a syntax error:

    ~$ python blast.py test_query.fasta
      File "blast.py", line 10
        len(record_dict)
          ^

I do not know why it doesn't return 3 to the console, since that is how many key/value pairs it should be making. So I'm not really sure I'm using Biopython correctly in making the dictionary object. Any advice would be apprec ...
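For reference, here is a minimal sketch of how the workflow described above might look. It rests on a few assumptions. The syntax error most likely comes from the `SeqIO.to_dict(...)` line, which opens two parentheses but closes only one, so Python reports the problem on the following line. A bare `len(record_dict)` also prints nothing when run as a script, so the sketch wraps it in `print`. Instead of the Biopython wrapper, this sketch swaps in a direct `subprocess` call to the `blastn` executable, which should accept a multi-record FASTA query and can write comma-separated hits with `-outfmt 10`. Note too that the script above imports the tblastn wrapper, whereas a nucleotide search against nt would normally use blastn. The function names, the column list, the output file name, and the database name are all hypothetical choices for illustration.

```python
import subprocess
import sys

# Columns requested from BLAST+ (standard outfmt specifiers);
# this particular selection is an assumption, adjust as needed.
FIELDS = ["qseqid", "sseqid", "pident", "length", "evalue", "bitscore"]


def blastn_command(query_fasta, database, out_csv):
    """Build a blastn call that writes comma-separated hits (outfmt 10)."""
    return [
        "blastn",
        "-query", query_fasta,
        "-db", database,
        "-outfmt", "10 " + " ".join(FIELDS),
        "-out", out_csv,
    ]


def count_contigs(fasta_path):
    """Load the FASTA file into a dict keyed by contig ID and return its size."""
    # Imported here so the command builder can be used without Biopython.
    from Bio import SeqIO

    # Both opening parentheses are closed on this line.
    record_dict = SeqIO.to_dict(SeqIO.parse(fasta_path, "fasta"))
    return len(record_dict)


def main(fasta_path, database, out_csv="blast_hits.csv"):
    # print() is needed: a bare expression shows nothing in a script.
    print(count_contigs(fasta_path))
    # check=True raises CalledProcessError if blastn exits with an error.
    subprocess.run(blastn_command(fasta_path, database, out_csv), check=True)


if __name__ == "__main__":
    if len(sys.argv) != 3:
        print("usage: python blast_sketch.py <query.fasta> <database>",
              file=sys.stderr)
    else:
        main(sys.argv[1], sys.argv[2])
```

Assuming a local nucleotide database named `nt` is available, this could be run as `python blast_sketch.py test_query.fasta nt`. Because all contigs go through one `blastn` call, every hit lands in the same CSV, with the contig ID in the first column for sorting.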
