Predict
The predict subcommand predicts resistance for a sample using an index.
At its simplest:
drprg predict -i reads.fq -x mtb -o outdir
drprg is a bit "new-age" in that it assumes the reads are Nanopore. If they're Illumina, use the -I/--illumina option.
See the Prediction Output documentation for a detailed description of the output files and formats to expect.
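As a small illustration of the Nanopore default, the following shell sketch builds the command line for either sequencing platform. The predict_cmd helper is hypothetical (it is not part of drprg) and only prints the command rather than running it; reads.fq, mtb and outdir are the placeholders from the example above.

```shell
# Hypothetical helper (not part of drprg): print the predict command for a platform.
# Usage: predict_cmd READS PLATFORM   where PLATFORM is "nanopore" or "illumina"
predict_cmd() {
    reads=$1
    platform=$2
    case "$platform" in
        illumina) tech_flag="--illumina" ;;  # Illumina reads must be flagged explicitly
        nanopore) tech_flag="" ;;            # Nanopore is assumed, so no flag is needed
        *) echo "unknown platform: $platform" >&2; return 1 ;;
    esac
    # $tech_flag is left unquoted on purpose so an empty value adds no argument.
    echo drprg predict -i "$reads" -x mtb -o outdir $tech_flag
}

predict_cmd reads.fq nanopore
predict_cmd reads.fq illumina
```

Pipe the printed line to sh, or copy it into your terminal, to actually run the prediction.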
Required
Index
The index is provided via the -x/--index option. It can either be a path to an index, or the name of a downloaded index. As with the index subcommand, you can specify a version if you don't want to use the latest.
Input reads
A fastq (or fasta) file of the reads you want to predict resistance from, provided via the -i/--input option. If you have paired reads in two files, simply combine them and pass the combined file; interleave order doesn't matter. For example:
cat r1.fq r2.fq > combined.fq
drprg predict -i combined.fq ...
gzip-compressed files are also accepted.
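If the paired files are themselves gzip-compressed, one conservative way to combine them is to decompress both into a single plain fastq. The sketch below uses two tiny made-up reads in a temporary directory so it can be run as-is; the file names are placeholders, and the read count at the end is just a sanity check.

```shell
# Work in a scratch directory so nothing in the current directory is touched.
workdir=$(mktemp -d)

# Two tiny gzip-compressed fastq files standing in for real paired reads.
printf '@read1/1\nACGT\n+\nIIII\n' | gzip -c > "$workdir/r1.fq.gz"
printf '@read1/2\nTGCA\n+\nIIII\n' | gzip -c > "$workdir/r2.fq.gz"

# Decompress both mates to stdout and write one combined, uncompressed file.
gzip -dc "$workdir/r1.fq.gz" "$workdir/r2.fq.gz" > "$workdir/combined.fq"

# fastq records are four lines each, so this is the number of combined reads.
n_reads=$(( $(wc -l < "$workdir/combined.fq") / 4 ))
echo "combined reads: $n_reads"
```

The combined file can then be passed to -i/--input in the same way as in the example above.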
Optional
Sample name
Identifier to use for your output files. By default, it will be set to the file name prefix (e.g. name for a fastq named name.fq.gz). Provided via the -s/--sample option.
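To preview what the default is likely to be, the sketch below derives a prefix from a reads path. The default_sample_name helper is hypothetical, and it assumes the prefix is the file name up to the first '.'; that matches the name.fq.gz example, but drprg's exact rule may differ for other file names.

```shell
# Hypothetical helper (not part of drprg): guess the default sample name.
# Assumption: the prefix is the file name up to the first '.'.
default_sample_name() {
    file_name=${1##*/}                  # drop any leading directories
    printf '%s\n' "${file_name%%.*}"    # drop everything from the first '.' onwards
}

default_sample_name name.fq.gz               # prints: name
default_sample_name /data/run1/name.fq.gz    # prints: name
```

If your file names contain extra dots, or you want a specific identifier, passing -s/--sample explicitly avoids relying on the default.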
Minimum allele frequency
Provided via the -f/--maf option. If an alternate allele has at least this fraction of the depth, a minor resistance ("r") prediction is made. By default, this is set to 1.0 for Nanopore data (i.e. minor allele detection is off) and 0.1 when using the --illumina option. For example, if a variant is called as the reference allele for Illumina reads, but an alternate allele has at least 10% of the depth at that position, a minor resistance call is made for the alternate allele.
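The threshold rule can be sketched as a simple comparison. The is_minor_call helper below is hypothetical and is not drprg's implementation; it only applies the "at least this fraction of the depth" rule to an alternate allele's depth, and it assumes a position with zero depth never produces a call.

```shell
# Hypothetical helper (not drprg code): apply the "at least this fraction" rule.
# Usage: is_minor_call ALT_DEPTH TOTAL_DEPTH MAF
# Returns success (0) when the alternate allele reaches the threshold.
is_minor_call() {
    awk -v alt="$1" -v total="$2" -v maf="$3" \
        'BEGIN { exit !(total > 0 && alt / total >= maf) }'
}

# 12 of 100 reads support the alternate allele.
if is_minor_call 12 100 0.1; then echo "Illumina default (0.1): minor call"; fi
if ! is_minor_call 12 100 1.0; then echo "Nanopore default (1.0): no minor call"; fi
```

With a threshold of 1.0 the alternate allele would need all of the depth, which is why that value effectively switches minor allele detection off.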
Ignore synonymous
Using the -S/--ignore-synonymous option will prevent synonymous mutations from appearing as unknown resistance calls. However, any synonymous mutations in the catalogue will still be considered.
Quick usage
$ drprg predict -h
Predict drug resistance

Usage: drprg predict [OPTIONS] --index <DIR> --input <FILE>

Options:
  -v, --verbose        Use verbose output
  -t, --threads <INT>  Maximum number of threads to use [default: 1]
  -h, --help           Print help (see more with '--help')

Input/Output:
  -x, --index <DIR>      Name of a downloaded index or path to an index
  -i, --input <FILE>     Reads to predict resistance from
  -o, --outdir <DIR>     Directory to place output [default: .]
  -s, --sample <SAMPLE>  Identifier to use for the sample
  -I, --illumina         Sample reads are from Illumina sequencing

Filter:
  -S, --ignore-synonymous     Ignore unknown (off-catalogue) variants that cause a synonymous substitution
  -f, --maf <FLOAT[0.0-1.0]>  Minimum allele frequency to call variants [default: 1]
Full usage
$ drprg predict --help
Predict drug resistance

Usage: drprg predict [OPTIONS] --index <DIR> --input <FILE>

Options:
  -p, --pandora <FILE>
          Path to pandora executable. Will try in src/ext or $PATH if not given

  -v, --verbose
          Use verbose output

  -m, --makeprg <FILE>
          Path to make_prg executable. Will try in src/ext or $PATH if not given

  -t, --threads <INT>
          Maximum number of threads to use

          Use 0 to select the number automatically

          [default: 1]

  -M, --mafft <FILE>
          Path to MAFFT executable. Will try in src/ext or $PATH if not given

  -h, --help
          Print help (see a summary with '-h')

Input/Output:
  -x, --index <DIR>
          Name of a downloaded index or path to an index

  -i, --input <FILE>
          Reads to predict resistance from

          Both fasta and fastq are accepted, along with compressed or uncompressed.

  -o, --outdir <DIR>
          Directory to place output

          [default: .]

  -s, --sample <SAMPLE>
          Identifier to use for the sample

          If not provided, this will be set to the input reads file path prefix

  -I, --illumina
          Sample reads are from Illumina sequencing

Filter:
  -S, --ignore-synonymous
          Ignore unknown (off-catalogue) variants that cause a synonymous substitution

  -f, --maf <FLOAT[0.0-1.0]>
          Minimum allele frequency to call variants

          If an alternate allele has at least this fraction of the depth, a minor resistance ("r") prediction is made. Set to 1 to disable. If --illumina is passed, the default is 0.1

          [default: 1]

      --debug
          Output debugging files. Mostly for development purposes

  -d, --min-covg <INT>
          Minimum depth of coverage allowed on variants

          [default: 3]

  -D, --max-covg <INT>
          Maximum depth of coverage allowed on variants

          [default: 2147483647]

  -b, --min-strand-bias <FLOAT>
          Minimum strand bias ratio allowed on variants

          For example, setting to 0.25 requires >=25% of total (allele) coverage on both strands for an allele.

          [default: 0.01]

  -g, --min-gt-conf <FLOAT>
          Minimum genotype confidence (GT_CONF) score allow on variants

          [default: 0]

  -L, --max-indel <INT>
          Maximum (absolute) length of insertions/deletions allowed

  -K, --min-frs <FLOAT>
          Minimum fraction of read support

          For example, setting to 0.9 requires >=90% of coverage for the variant to be on the called allele

          [default: 0]
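The worked examples in the --min-strand-bias and --min-frs help text can be sketched as two small checks. Both helpers below are hypothetical and are not drprg's implementation: they take plain coverage counts as arguments, and they assume an allele with zero coverage fails the filter.

```shell
# Hypothetical helpers (not drprg code) mirroring the examples in the help text.

# passes_strand_bias FWD_COVG REV_COVG MIN_RATIO
# Succeeds when the weaker strand holds at least MIN_RATIO of the allele's coverage.
passes_strand_bias() {
    awk -v fwd="$1" -v rev="$2" -v min="$3" 'BEGIN {
        total = fwd + rev
        if (total == 0) exit 1            # assumption: no coverage fails the filter
        weaker = (fwd < rev) ? fwd : rev
        exit !(weaker / total >= min)
    }'
}

# passes_frs CALLED_ALLELE_COVG TOTAL_COVG MIN_FRS
# Succeeds when the called allele holds at least MIN_FRS of the coverage.
passes_frs() {
    awk -v called="$1" -v total="$2" -v min="$3" \
        'BEGIN { exit !(total > 0 && called / total >= min) }'
}

# 30 forward and 10 reverse reads: the weaker strand has 25% of the coverage.
if passes_strand_bias 30 10 0.25; then echo "strand bias: pass"; fi
# 80 of 100 reads on the called allele falls short of a 0.9 threshold.
if ! passes_frs 80 100 0.9; then echo "FRS: fail"; fi
```

With the defaults shown above (0.01 and 0), both filters are very permissive, so raising them is a deliberate choice to trade sensitivity for stricter calls.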