Dear all,
I have a question about using MEGAN4 to parse a SAM file.
What I want to do is get taxonomic and functional annotation of my raw reads against the nr database. The raw read set is too big (11 million reads in total, 100 bp long each) to blast directly against nr, so I took an indirect approach: first assemble my reads and predict ORFs, for which I can get blast results easily, and then align my reads to the ORFs. Then I want to use MEGAN to parse the alignment of reads to ORFs and thus get the annotation of the raw reads.
Here is what I did exactly:
- first assemble the reads into contigs
- then use MetaGeneMark to find open reading frames (ORFs) whose size is suitable for blasting against the nr database
- blast the ORFs against the nr database
- import the ORF blast result into MEGAN using default parameters, which successfully produces the rma file
- use the Export-Assignments To CSV function of MEGAN4 to generate a synonyms file containing two columns (tab-separated): the first is the ORF name and the second is the taxonomy ID
- use bowtie to align my raw reads to the ORFs and get the SAM file that I want to parse
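
To make the intended read-to-taxonomy lookup concrete, here is a minimal Python sketch of the same idea done outside MEGAN. It is not MEGAN's own SAM import, only an illustration. It assumes the exported file has exactly two tab-separated columns (ORF name, taxonomy ID) and that the ORF names in it match the reference names (RNAME, column 3) in the SAM file. The function names and the tiny example records are made up for illustration.

```python
import csv
import io


def load_orf_to_taxid(handle):
    """Read a two-column, tab-separated file: ORF name, taxonomy ID."""
    mapping = {}
    for row in csv.reader(handle, delimiter="\t"):
        if len(row) < 2:
            continue  # skip blank or malformed lines
        mapping[row[0].strip()] = row[1].strip()
    return mapping


def assign_reads(sam_handle, orf_to_taxid):
    """Yield (read_name, orf_name, taxid) for each mapped SAM record."""
    for line in sam_handle:
        if line.startswith("@"):
            continue  # SAM header line
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 11:
            continue  # a SAM record has 11 mandatory fields
        read_name, flag, orf_name = fields[0], int(fields[1]), fields[2]
        if flag & 0x4 or orf_name == "*":
            continue  # read is unmapped
        taxid = orf_to_taxid.get(orf_name)
        if taxid is not None:
            yield read_name, orf_name, taxid


# Hypothetical example data (not taken from a real run).
example_mapping = "orf_1\t562\norf_2\t1423\n"
example_sam = (
    "@SQ\tSN:orf_1\tLN:300\n"
    "read1\t0\torf_1\t1\t42\t4M\t*\t0\t0\tACGT\tIIII\n"
    "read2\t4\t*\t0\t0\t*\t*\t0\t0\tACGT\tIIII\n"
    "read3\t16\torf_2\t5\t42\t4M\t*\t0\t0\tACGT\tIIII\n"
    "read4\t0\torf_3\t1\t42\t4M\t*\t0\t0\tACGT\tIIII\n"
)

orf_to_taxid = load_orf_to_taxid(io.StringIO(example_mapping))
result = list(assign_reads(io.StringIO(example_sam), orf_to_taxid))
for read_name, orf_name, taxid in result:
    print(read_name, orf_name, taxid, sep="\t")
```

In this sketch, reads that are unmapped (read2) or that hit an ORF with no taxonomy assignment (read4) are simply dropped. With real files, the in-memory strings would be replaced by open file handles.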