#Program parameters and options
-vcf [file]
File path of the isolate vcf. Assume all
variants are PASS
in the QUAL
column, the VCF
file also reqires the AD
field.
Note: In the current implementation, DEploid
only take
the first sample in the VCF file. DEploid
DO NOT handle
multi-allelic variants, nor indels. The FILTER
column will
not be used.
-plaf [file]
File path of population level allele
frequencies (tab-delimited plain text file), for example
CHROM | POS | PLAF |
---|---|---|
Pf3D7_01_v3 | 93157 | 0.0190612159917058 |
Pf3D7_01_v3 | 94422 | 0.135502358766423 |
Pf3D7_01_v3 | 94459 | 0.156294363760064 |
Pf3D7_01_v3 | 94487 | 0.143439298925837 |
-panel [file]
File path of the reference panel
(tab-delimited plain text file), for example
CHROM | POS | 3D7 | Dd2 | Hb3 | 7G8 |
---|---|---|---|---|---|
Pf3D7_01_v3 | 93157 | 0 | 0 | 0 | 1 |
Pf3D7_01_v3 | 94422 | 0 | 0 | 0 | 1 |
Pf3D7_01_v3 | 94459 | 0 | 0 | 0 | 1 |
Pf3D7_01_v3 | 94487 | 0 | 0 | 0 | 1 |
-noPanel
Use population level allele frequency as
prior.
Warning: Flags -panel
and -noPanel
should
not be used together.
‘-exclude [file]’ File path of sites to be excluded (tab-delimited plain text file).
‘-o [string]’ Specify the file name prefix of the output.
‘-k [int]’ Number of strain (default value 5).
‘-nSample [int]’ Number of MCMC samples (default value 800).
‘-rate [int]’ MCMC sample rate (default value 5).
‘-burn [float]’ MCMC burn rate (default value 0.5).
‘-v , -version’ DEploid version.
‘-vcfOut’ Save final halpotypes into a VCF file.
‘-ref [file] -alt [file]’ File path of reference and alternative allele count (tab-delimited plain text file).
Note: In early dEploid
versions (prior to
v0.2-release
), allele counts extracted from the vcf file
are placed in two files, and parsed by flags -ref [file]
and -alt [file]
. Tab-delimited plain text for input. First
and second columns record chromosome and position labels respectively.
Third columns records the reference allele count or alternative allele
count. For example,
CHROM | POS | PG0390.C |
---|---|---|
Pf3D7_01_v3 | 93157 | 85 |
Pf3D7_01_v3 | 94422 | 77 |
Pf3D7_01_v3 | 94459 | 90 |
Pf3D7_01_v3 | 94487 | 79 |
CHROM | POS | PG0390.C |
---|---|---|
Pf3D7_01_v3 | 93157 | 0 |
Pf3D7_01_v3 | 94422 | 0 |
Pf3D7_01_v3 | 94459 | 0 |
Pf3D7_01_v3 | 94487 | 0 |
Warning: Flags -ref
and -alt
should not be
used with -vcf
.
-forbidUpdateProp
Forbid MCMC moves to update
proportions.
-forbidUpdateSingle
Forbid MCMC moves to update
single haplotype.
-forbidUpdatePair
Forbid MCMC moves to update pair
haplotypes.
-exportPostProb
Save the posterior probabilities of
the final iteration of all strains.
-miss [float]
Miss copying probability
-recomb [float]
Constant recombination
probability
-initialP [float ...]
Initialize
proportions.
-p [int]
Output precision (default value
8).
Data exploration, plot the read count ALT
vs
REF
.
library(DEploid)
vcfFile = system.file("extdata", "PG0390-C.test.vcf.gz", package = "DEploid")
PG0390 = extractCoverageFromVcf(vcfFile)
plotAltVsRef( PG0390$refCount, PG0390$altCount )
Plot the histogram of the observed allele frequency within sample.
Load prior information: PLAF and reference panel
plafFile = system.file("extdata", "labStrains.test.PLAF.txt", package = "DEploid")
plaf = extractPLAF(plafFile)
panelFile = system.file("extdata", "labStrains.test.panel.txt", package = "DEploid")
Deconvolute the haplotypes
set.seed(1)
PG0390.deconv = dEploid(paste("-vcf", vcfFile, "-plaf", plafFile, "-panel", panelFile))
prop = PG0390.deconv$Proportions[dim(PG0390.deconv$Proportions)[1],]
expWSAF = t(PG0390.deconv$Haps) %*% prop
Plot the allele frequency within sample (observed in red, expected in blue) against the population level allele frequency.
Plot the history of the MCMC proportion estimates.
Plot the allele frequency within sample, observed against expected.
If you encounter any problem when using dEploid
, please
file a short bug report by using the issue tracker on
GitHub or email joe.zhu (at) well.ox.ac.uk.
Please include the output of dEploid -v
and the platform
you are using dEploid
on in the report. If the problem
occurs while executing dEploid
, please also include the
command you are using and the random seed.
Thank you!
If you use dEploid
with the flag -ibd
,
please cite the following paper:
Zhu, J. S., J. A. Hendry, J. Almagro-Garcia, R. D. Pearson, R. Amato, A. Miles, D. J. Weiss, T. C. D. Lucas, M. Nguyen, P. W. Gething, D. Kwiatkowski, G. McVean, and for the Pf3k Project. (2018) The origins and relatedness structure of mixed infections vary with local prevalence of P. falciparum malaria. eLife, 40845, doi: https://doi.org/10.7554/eLife.40845.
If you use dEploid
in your work, please cite the
program:
Zhu, J. S., J. A. Garcia, G. McVean. (2018) Deconvolution of multiple infections in Plasmodium falciparum from high throughput sequencing data. Bioinformatics 34(1), 9-15. doi: https://doi.org/10.1093/bioinformatics/btx530.
Bibtex record::
@article {Zhu387266, author = {Zhu, Sha Joe and Hendry, Jason A. and Almagro-Garcia, Jacob and Pearson, Richard D. and Amato, Roberto and Miles, Alistair and Weiss, Daniel J. and Lucas, Tim C.D. and Nguyen, Michele and Gething, Peter W. and Kwiatkowski, Dominic and McVean, Gil and ,}, title = {The origins and relatedness structure of mixed infections vary with local prevalence of P. falciparum malaria}, year = {2018}, doi = {10.1101/387266}, publisher = {Cold Spring Harbor Laboratory}, URL = {https://www.biorxiv.org/content/10.1101/387266v1}, eprint = {https://www.biorxiv.org/content/biorxiv/early/2018/08/09/387266.full.pdf}, journal = {bioRxiv} }
@article {Zhubtx530, author = {Zhu, Sha Joe and Almagro-Garcia, Jacob and McVean, Gil}, title = {Deconvolution of multiple infections in {{}} from high throughput sequencing data}, year = {2017}, doi = {10.1093/bioinformatics/btx530}, URL = {https://doi.org/10.1093/bioinformatics/btx530}, journal = {Bioinformatics} }