Exact matching with bowtie, BLAT and BLAST+

I am running bowtie with the following parameters to look for up to, say, 10 exact matches of a 36-base nucleotide sequence against a GRCh37/hg19 index:

$ bowtie -S hg19 -v 0 -k 10 -f sequence.fa > hits.sam

As a sanity check, I sample the 36-base sequence from the same hg19 assembly (using the same FASTA files that were used to build the bowtie index), and I use UCSC BLAT and NCBI BLAST searches to confirm that bowtie returns all of the matches.
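
For reference, the kind of ground truth I am trying to reproduce can be sketched as a brute-force scan of the FASTA files. This is only an illustration, not how bowtie, BLAT or BLAST work internally. The function names (`read_fasta`, `exact_matches`) are made up for this post, and the sketch assumes that case should be ignored (the UCSC hg19 FASTA files are soft-masked) and that both strands should be searched. Positions are reported 1-based and inclusive.

```python
def reverse_complement(seq):
    """Return the reverse complement of a nucleotide string."""
    table = str.maketrans("ACGTNacgtn", "TGCANtgcan")
    return seq.translate(table)[::-1]


def read_fasta(lines):
    """Yield (name, sequence) pairs from an iterable of FASTA lines."""
    name, chunks = None, []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if name is not None:
                yield name, "".join(chunks)
            # Keep only the first word of the header as the record name.
            name = line[1:].split()[0]
            chunks = []
        else:
            chunks.append(line)
    if name is not None:
        yield name, "".join(chunks)


def exact_matches(query, records):
    """Return (name, start, end, strand) for every exact occurrence.

    Overlapping occurrences are reported. Case is ignored because
    soft-masked assemblies store repeats in lowercase (an assumption
    about how the comparison should be done).
    """
    query = query.upper()
    rc = reverse_complement(query)
    # A palindromic query would otherwise be reported twice per site.
    strands = [("+", query)] if rc == query else [("+", query), ("-", rc)]
    hits = []
    for name, seq in records:
        seq = seq.upper()
        for strand, pattern in strands:
            pos = seq.find(pattern)
            while pos != -1:
                hits.append((name, pos + 1, pos + len(pattern), strand))
                pos = seq.find(pattern, pos + 1)
    return hits
```

Calling `exact_matches(query, read_fasta(open("chr1.fa")))` on each chromosome file (the file name is a placeholder) would give a list to compare against the bowtie, BLAT and BLAST output. It holds one chromosome in memory at a time, so it is slow but simple.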

Some questions:

  1. The docs say that bowtie accepts read lengths with an upper bound of 1000 bp. In practice, what is the lower bound of query sequence lengths that it will accept and reliably align?

  2. Is the -oneOff parameter to the BLAT command-line tool used to limit mismatches to 0?

  3. Is there a way to translate accession code hits from an NCBI BLAST search to genomic coordinates (chromosome, start, stop)?

  4. Is there a parameter to limit BLAST+ command-line searches to hits that are the same length as the query sequence, or otherwise to restrict results to exact matches of the query sequence?
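
In case there is no such parameter, the fallback I have in mind for question 4 is to post-filter the BLAST+ output. The sketch below assumes the search was run with `-outfmt 6`, whose default columns are qseqid, sseqid, pident, length, mismatch, gapopen, qstart, qend, sstart, send, evalue and bitscore. The function name `full_length_exact_hits` is hypothetical. It keeps only alignments that cover the whole query with no mismatches or gaps, and it infers the strand from the order of sstart and send.

```python
def full_length_exact_hits(lines, query_length):
    """Filter default 12-column BLAST tabular lines for exact, full-length hits.

    Returns (subject_id, start, end, strand) tuples with 1-based,
    inclusive subject coordinates.
    """
    hits = []
    for line in lines:
        line = line.rstrip("\n")
        # Skip blank lines and the comment lines produced by -outfmt 7.
        if not line or line.startswith("#"):
            continue
        fields = line.split("\t")
        length, mismatch, gapopen = (int(x) for x in fields[3:6])
        qstart, qend, sstart, send = (int(x) for x in fields[6:10])
        covers_query = (qstart == 1 and qend == query_length
                        and length == query_length)
        if covers_query and mismatch == 0 and gapopen == 0:
            # Minus-strand hits are reported with sstart > send.
            strand = "+" if sstart <= send else "-"
            hits.append((fields[1], min(sstart, send), max(sstart, send), strand))
    return hits
```

If the BLAST database were built locally from the same chromosome FASTA files, the subject IDs would presumably already be chromosome names, which might also sidestep question 3, but I have not verified that.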

