Stand-alone utility for filtering a BAM file to specific genomic regions, using Apache Spark.
Build the assembly JAR:
sbt assembly
Then run it with spark-submit:
spark-submit \
--properties-file <spark properties file> \
target/scala-2.11/filter-bam-assembly-1.0.0-SNAPSHOT.jar \
in.bam \
<regions> \
out.bam \
[-c] [--include-unmapped-mates]
<regions>: comma-separated list of genomic regions, in the format accepted by hammerlab/genomic-loci.
-c / --count: print the number of reads output, in addition to writing out.bam.
--include-unmapped-mates: include unmapped reads whose mate-contig and mate-start are set to a value that overlaps <regions>.
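The selection rule described by these options can be sketched as a small predicate. This is an illustration in Python, not the utility's actual Spark/Scala code. The `Read` class, the `keep` and `overlaps` functions, and the 0-based half-open `(contig, start, end)` region tuples are all assumptions made for the example, and the real tool may treat edge cases differently.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple

# Hypothetical region representation: (contig, start, end), 0-based, half-open.
Region = Tuple[str, int, int]


@dataclass
class Read:
    """Minimal stand-in for a BAM record (hypothetical, for illustration only)."""
    contig: Optional[str]            # None when the read itself is unmapped
    start: Optional[int]
    end: Optional[int]
    mate_contig: Optional[str] = None
    mate_start: Optional[int] = None


def overlaps(contig: str, start: int, end: int, regions: List[Region]) -> bool:
    """True if [start, end) on `contig` intersects any of the regions."""
    return any(
        c == contig and start < r_end and end > r_start
        for c, r_start, r_end in regions
    )


def keep(read: Read, regions: List[Region],
         include_unmapped_mates: bool = False) -> bool:
    """Decide whether a read would be written to the output."""
    if read.contig is not None:
        # Mapped read: keep it when its own alignment overlaps a region.
        return overlaps(read.contig, read.start, read.end, regions)
    if (include_unmapped_mates
            and read.mate_contig is not None
            and read.mate_start is not None):
        # Unmapped read: optionally keep it when its mate's position,
        # treated here as a single base, falls inside a region.
        return overlaps(read.mate_contig, read.mate_start,
                        read.mate_start + 1, regions)
    return False
```

With `regions = [("chr1", 100, 200)]`, an unmapped read whose mate starts at `chr1:150` is dropped by default and kept only when `include_unmapped_mates=True`. Counting the reads for which `keep` returns true corresponds to what `-c` reports under this sketch.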