
How to Extract the BLAST XML Results of Desired Genes from a Big BLAST XML File

Dear community,

I have an XML file containing BLAST results for 50,000 genes, with 10 hits for each gene. I want to extract the BLAST results of a set of desired genes from that XML file, and the output file should still be in XML format. The output file should contain the full BLAST result, in XML form, for each desired gene.

I tried the Python script shared by juefish in this post; however, I only get part of the BLAST XML result for one gene instead of the results for my whole list of desired genes. I quote juefish's Python script here to make the discussion easier.

    #!/usr/bin/env python
    import sys
    import os
    import sets
    import Bio
    from sets import Set
    from Bio.Blast import NCBIXML

    # Usage.
    if len(sys.argv) < 2:
        print ""
        print "This program extracts blast results from an xml file given a list of query sequences"
        print "Usage: %s -list file1 -xml file2 > outfile"
        print "-list: list of sequence names"
        print "-xml: blast xml output file"
        print ""
        sys.exit()

    # Parse args.
    for i in range(len(sys.argv)):
        if sys.argv[i] == "-list":
            infile1 = sys.argv[i+1]
        elif sys.argv[i] == "-xml":
            infile2 = sys.argv[i+1]

    fls = [infile1,infile2]
    results_handle = open(fls[1], "r")
    fin1 = open(fls[0],"r")
    geneContigs = Set([])
    #establish list of names of queries to extract from xml file
    for line in fin1:
        temp=line.lstrip('>').split()
        ...
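For discussion, here is a minimal sketch of one alternative way to do the extraction. It is not juefish's method: it skips Biopython and filters the XML tree directly with the standard library's `xml.etree.ElementTree`, so that each kept `<Iteration>` block is written out unchanged. The function names `read_wanted` and `filter_blast_xml` are hypothetical. The sketch assumes that each name in the list file matches the first whitespace-delimited token of `<Iteration_query-def>`, and that the file follows the usual BLAST XML layout with `<Iteration>` elements inside `<BlastOutput_iterations>`. The original DOCTYPE line is not carried over to the output.

```python
import xml.etree.ElementTree as ET


def read_wanted(list_path):
    """Read query names, one per line; a leading '>' is stripped.

    Assumption: only the first whitespace-delimited token of each line
    is the query name.
    """
    wanted = set()
    with open(list_path) as handle:
        for line in handle:
            tokens = line.lstrip(">").split()
            if tokens:
                wanted.add(tokens[0])
    return wanted


def filter_blast_xml(xml_in, wanted, xml_out):
    """Write a copy of xml_in that keeps only the wanted <Iteration> blocks.

    Returns the number of iterations kept.
    """
    root = None
    container = None  # the <BlastOutput_iterations> element
    kept = 0
    for event, elem in ET.iterparse(xml_in, events=("start", "end")):
        if event == "start":
            if root is None:
                root = elem  # first element seen is the document root
            if elem.tag == "BlastOutput_iterations":
                container = elem
        elif elem.tag == "Iteration":
            # A complete <Iteration> has just been parsed.
            query_def = elem.findtext("Iteration_query-def", default="")
            tokens = query_def.split()
            if tokens and tokens[0] in wanted:
                kept += 1
            else:
                # Drop unwanted iterations right away to limit memory use.
                container.remove(elem)
    ET.ElementTree(root).write(xml_out, encoding="utf-8", xml_declaration=True)
    return kept
```

It could be called as `filter_blast_xml("all_results.xml", read_wanted("genes.txt"), "subset.xml")`, where the file names are placeholders. Only the header elements and the kept iterations stay in memory, which should help with a large input, but it is worth checking that downstream tools accept the output without the DOCTYPE line.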
