I'm trying to learn RECON and am experimenting using chr22. My steps so far, roughly:
- Make a BLAST database from chr22.fa
- BLAST chr22.fa against its own database
- Run MSPCollect.pl (a script provided with RECON) to create an MSP file
- Run recon.pl on the MSP file and a list of sequence IDs
However, BLASTing a sequence against its own database either takes a prohibitively long time or results in 100% self hits. If I remove the self hits, I'm left with a set of alignments that RECON is then happy to work with, but I had to write my own Python script to filter the self hits out.
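To make the filtering step concrete, here is a minimal sketch of that kind of filter. It assumes the BLAST results are in the standard 12-column tabular format (qseqid, sseqid, pident, length, mismatch, gapopen, qstart, qend, sstart, send, evalue, bitscore), and it assumes a "self hit" means the same sequence ID aligned over identical query and subject coordinates. The function names are made up for illustration, and the input format expected by MSPCollect.pl may differ, so treat this as an outline rather than a drop-in step.

```python
import sys


def is_self_hit(fields):
    """Return True when a hit aligns a sequence to itself over the same span.

    Assumes the 12-column tabular layout: columns 0 and 1 are the query and
    subject IDs, columns 6-9 are qstart, qend, sstart, send.
    """
    qseqid, sseqid = fields[0], fields[1]
    qstart, qend, sstart, send = (int(x) for x in fields[6:10])
    return qseqid == sseqid and qstart == sstart and qend == send


def filter_self_hits(lines):
    """Yield only the tabular BLAST lines that are not trivial self hits."""
    for line in lines:
        # Drop blank lines and comment lines (e.g. headers starting with '#').
        if not line.strip() or line.startswith("#"):
            continue
        fields = line.rstrip("\n").split("\t")
        # Drop lines that do not have the assumed 12 columns.
        if len(fields) < 12:
            continue
        if not is_self_hit(fields):
            yield line


if __name__ == "__main__":
    # Read tabular BLAST output on stdin, write the filtered hits to stdout.
    sys.stdout.writelines(filter_self_hits(sys.stdin))
```

Saved under a hypothetical name such as filter_self_hits.py, this would be run as `python filter_self_hits.py < hits.tsv > hits.noself.tsv`. Note that it only drops exact self-alignments; hits between different regions of chr22 are kept, since those are presumably the repeats RECON needs to see.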
By now this is all feeling very convoluted, so my question is: am I even close to doing the BLAST part correctly? The RECON home page says to avoid self hits for performance's sake, but I haven't been able to work out how. Are there any other glaring mistakes?