Predict
The predict subcommand is used to predict resistance for a sample from an index.
At its simplest
drprg predict -i reads.fq -x mtb -o outdir
drprg is a bit "new-age" in that it assumes the reads are Nanopore. If they're
Illumina, use the -I/--illumina option.
See Prediction Output documentation for a detailed description of what results/output files and formats to expect.
Required
Index
The index is provided via the -x/--index option. It can either be a path to an index,
or the name of a downloaded index. As with the index
subcommmand, you can specify a version if you don't want to use the latest.
Input reads
A fastq (or fasta) file of the reads you want to predict resistance from - provided via
the -i/--input option. If you have paired reads in two files, simply combine them and
pass the combined file - interleave order doesn't matter. For example
cat r1.fq r2.fq > combined.fq
drprg predict -i combined.fq ...
gzip-compressed files are also accepted.
Optional
Sample name
Identifier to use for your output files. By default, it will be set to the file name
prefix (e.g. name for a fastq named name.fq.gz). Provided via the -s/--sample
option.
Minimum allele frequency
Provided via the -f/--maf option. If an alternate allele has at least this fraction of
the depth, a minor resistance ("r") prediction is made. By default, this is set to 1.0
for Nanopore data (i.e. minor allele detection is off) and 0.1 when using
the --illumina option. For example, if a variant is called as the reference allele for
Illumina reads, but an alternate allele has more than 10% of the depth on that position,
a minor resistance call is made for the alternate allele.
Ignore synonymous
Using the -S/--ignore-synonymous option will prevent synonymous mutations from
appearing as unknown resistance calls. However, any synonymous mutations in the
catalogue will still be considered.
Quick usage
$ drprg predict -h
Predict drug resistance
Usage: drprg predict [OPTIONS] --index <DIR> --input <FILE>
Options:
  -v, --verbose        Use verbose output
  -t, --threads <INT>  Maximum number of threads to use [default: 1]
  -h, --help           Print help (see more with '--help')
Input/Output:
  -x, --index <DIR>      Name of a downloaded index or path to an index
  -i, --input <FILE>     Reads to predict resistance from
  -o, --outdir <DIR>     Directory to place output [default: .]
  -s, --sample <SAMPLE>  Identifier to use for the sample
  -I, --illumina         Sample reads are from Illumina sequencing
Filter:
  -S, --ignore-synonymous     Ignore unknown (off-catalogue) variants that cause a synonymous substitution
  -f, --maf <FLOAT[0.0-1.0]>  Minimum allele frequency to call variants [default: 1]
Full usage
$ drprg predict --help
Predict drug resistance
Usage: drprg predict [OPTIONS] --index <DIR> --input <FILE>
Options:
  -p, --pandora <FILE>
          Path to pandora executable. Will try in src/ext or $PATH if not given
  -v, --verbose
          Use verbose output
  -m, --makeprg <FILE>
          Path to make_prg executable. Will try in src/ext or $PATH if not given
  -t, --threads <INT>
          Maximum number of threads to use
          Use 0 to select the number automatically
          [default: 1]
  -M, --mafft <FILE>
          Path to MAFFT executable. Will try in src/ext or $PATH if not given
  -h, --help
          Print help (see a summary with '-h')
Input/Output:
  -x, --index <DIR>
          Name of a downloaded index or path to an index
  -i, --input <FILE>
          Reads to predict resistance from
          Both fasta and fastq are accepted, along with compressed or uncompressed.
  -o, --outdir <DIR>
          Directory to place output
          [default: .]
  -s, --sample <SAMPLE>
          Identifier to use for the sample
          If not provided, this will be set to the input reads file path prefix
  -I, --illumina
          Sample reads are from Illumina sequencing
Filter:
  -S, --ignore-synonymous
          Ignore unknown (off-catalogue) variants that cause a synonymous substitution
  -f, --maf <FLOAT[0.0-1.0]>
          Minimum allele frequency to call variants
          If an alternate allele has at least this fraction of the depth, a minor resistance ("r") prediction is made. Set to 1 to disable. If --illumina is passed, the default is 0.1
          [default: 1]
      --debug
          Output debugging files. Mostly for development purposes
  -d, --min-covg <INT>
          Minimum depth of coverage allowed on variants
          [default: 3]
  -D, --max-covg <INT>
          Maximum depth of coverage allowed on variants
          [default: 2147483647]
  -b, --min-strand-bias <FLOAT>
          Minimum strand bias ratio allowed on variants
          For example, setting to 0.25 requires >=25% of total (allele) coverage on both strands for an allele.
          [default: 0.01]
  -g, --min-gt-conf <FLOAT>
          Minimum genotype confidence (GT_CONF) score allow on variants
          [default: 0]
  -L, --max-indel <INT>
          Maximum (absolute) length of insertions/deletions allowed
  -K, --min-frs <FLOAT>
          Minimum fraction of read support
          For example, setting to 0.9 requires >=90% of coverage for the variant to be on the called allele
          [default: 0]