[DOCKTESTERS] Summary of validation of Dockers Sanger and DKFZ(+Delly)
Junjun Zhang
Junjun.Zhang at oicr.on.ca
Mon Jan 16 10:24:00 EST 2017
Hi,
I’d like to point out that this wiki page is obsolete: https://wiki.oicr.on.ca/display/PANCANCER/Preparing+paired-end+data+for+upload.
It has been replaced by the FINAL SOP for preparing PCAWG sequences here: https://wiki.oicr.on.ca/display/PANCANCER/PCAWG+%28a.k.a.+PCAP+or+PAWG%29+Sequence+Submission+SOP+-+v1.0. Many of us worked on it including Keiran.
Regarding preparation of the test input BAM from a merged / aligned BAM, the first step ("Your starting files must be for a single lane of sequencing in either BAM (unaligned BAM) or FASTQ") of this section is the most relevant: https://wiki.oicr.on.ca/display/PANCANCER/PCAWG+%28a.k.a.+PCAP+or+PAWG%29+Sequence+Submission+SOP+-+v1.0#PCAWG(a.k.a.PCAPorPAWG)SequenceSubmissionSOP-v1.0-5.Prepareyoursequencefiles
Hope this helps,
Junjun
Junjun Zhang
Lead bioinformatician
Informatics and Bio-computing
Ontario Institute for Cancer Research
MaRS Centre
661 University Avenue
Suite 510
Toronto, Ontario, Canada M5G 0A3
Tel: 416-673-8517
Mobile: 647-501-7511
Toll-free: 1-866-678-6427
Twitter: @OICR_news
www.oicr.on.ca
From: <docktesters-bounces+junjun.zhang=oicr.on.ca at lists.icgc.org> on behalf of Christina Yung <christina.yung at oicr.on.ca>
Date: Monday, January 16, 2017 at 9:45 AM
To: Miguel Vazquez <miguel.vazquez at cnio.es>, Francis Ouellette <francis at oicr.on.ca>
Cc: <docktesters at lists.icgc.org>
Subject: Re: [DOCKTESTERS] Summary of validation of Dockers Sanger and DKFZ(+Delly)
Hi Folks,
Here are the instructions for preparing the data as input to BWA-Mem. If you use a different protocol, please share.
https://wiki.oicr.on.ca/display/PANCANCER/Preparing+paired-end+data+for+upload
Christina
On 17-01-16 09:24 AM, Miguel Vazquez wrote:
Dear all,
Let me summarize the status of the testing for Sanger and DKFZ. The validation has been run for two donors for each workflow: DO50311 and DO52140.
Sanger:
----------
Sanger calls only somatic variants. The results are identical for indels and SVs, and almost identical for SNV.MNV and CNV. The discrepancies are reproducible (on the same machine at least), i.e. the same ones are found after running the workflow a second time.
DKFZ:
---------
DKFZ calls somatic and germline variants, except germline CNVs. For both germline and somatic variants the results are identical for SNV.MNV and indels, but show large discrepancies for SV and CNV.
I believe Kortine Kleinheinz and Joachim Weischenfeldt are investigating this issue.
BWA-Mem failed for me and has also failed for Denis Yuen and Jonas Demeulemeester. I believe Denis is investigating this problem further; I haven't had the chance to look into it much myself.
Best
Miguel
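For readers wondering how Common / Extra / Missing counts like the ones in the results below can be derived, here is a minimal sketch. It is not the script used for these tests, which is not shown in this thread. It assumes that each variant is keyed as CHROM:POS:REF:ALT (as the "Example" lines suggest), that "Extra" means present only in the test run and "Missing" means present only in the reference call set, and the function names `variant_keys` and `compare` are made up for illustration.

```python
import gzip


def variant_keys(vcf_path):
    """Return the set of CHROM:POS:REF:ALT keys in a VCF (plain or gzipped)."""
    opener = gzip.open if str(vcf_path).endswith(".gz") else open
    keys = set()
    with opener(vcf_path, "rt") as handle:
        for line in handle:
            # Skip meta-information lines, the column header and blank lines.
            if line.startswith("#") or not line.strip():
                continue
            chrom, pos, _id, ref, alt = line.rstrip("\n").split("\t")[:5]
            keys.add(f"{chrom}:{pos}:{ref}:{alt}")
    return keys


def compare(truth_keys, test_keys, n_examples=3):
    """Compare two key sets.

    Assumption: 'extra' = only in the test run, 'missing' = only in the
    reference (truth) call set.
    """
    common = truth_keys & test_keys
    extra = test_keys - truth_keys
    missing = truth_keys - test_keys
    return {
        "common": len(common),
        "extra": len(extra),
        "missing": len(missing),
        "extra_examples": sorted(extra)[:n_examples],
        "missing_examples": sorted(missing)[:n_examples],
    }
```

Keying on the exact position means that an SV or CNV breakpoint shifted by a single base is counted once as Extra and once as Missing under this scheme.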
---------------------
RESULTS
---------------------
ubuntu at ip-10-253-35-14:~/DockerTest-Miguel$ cat results.txt
Comparison of somatic.snv.mnv for DO50311 using DKFZ
---
Common: 51087
Extra: 0
Missing: 0
Comparison of somatic.indel for DO50311 using DKFZ
---
Common: 26469
Extra: 0
Missing: 0
Comparison of somatic.sv for DO50311 using DKFZ
---
Common: 231
Extra: 44
- Example: 10:20596800:N:<TRA>,10:56066821:N:<TRA>,11:16776092:N:<TRA>
Missing: 48
- Example: 10:119704959:N:<INV>,10:13116322:N:<TRA>,10:47063485:N:<TRA>
Comparison of somatic.cnv for DO50311 using DKFZ
---
Common: 731
Extra: 213
- Example: 10:132510034:N:<DEL>,10:20596801:N:<NEUTRAL>,10:47674883:N:<NEUTRAL>
Missing: 190
- Example: 10:100891940:N:<NEUTRAL>,10:104975905:N:<NEUTRAL>,10:119704960:N:<NEUTRAL>
Comparison of germline.snv.mnv for DO50311 using DKFZ
---
Common: 3850992
Extra: 0
Missing: 0
Comparison of germline.indel for DO50311 using DKFZ
---
Common: 709060
Extra: 0
Missing: 0
Comparison of germline.sv for DO50311 using DKFZ
---
Common: 1393
Extra: 231
- Example: 10:134319313:N:<DEL>,10:134948976:N:<DEL>,10:19996638:N:<DEL>
Missing: 615
- Example: 10:101851839:N:<TRA>,10:101851884:N:<TRA>,10:10745225:N:<DUP>
File not found /mnt/1TB/work/DockerTest-Miguel/tests/DKFZ/DO50311//output//DO50311.germline.cnv.vcf.gz
Comparison of somatic.snv.mnv for DO52140 using DKFZ
---
Common: 37160
Extra: 0
Missing: 0
Comparison of somatic.indel for DO52140 using DKFZ
---
Common: 19347
Extra: 0
Missing: 0
Comparison of somatic.sv for DO52140 using DKFZ
---
Common: 72
Extra: 23
- Example: 10:132840774:N:<DEL>,11:38252019:N:<TRA>,11:47700673:N:<TRA>
Missing: 61
- Example: 10:134749140:N:<DEL>,11:179191:N:<TRA>,11:38252005:N:<TRA>
Comparison of somatic.cnv for DO52140 using DKFZ
---
Common: 275
Extra: 94
- Example: 1:106505931:N:<LOH>,1:109068899:N:<DEL>,1:109359995:N:<DEL>
Missing: 286
- Example: 10:88653561:N:<LOH>,11:179192:N:<LOH>,11:38252006:N:<LOH>
Comparison of germline.snv.mnv for DO52140 using DKFZ
---
Common: 3833896
Extra: 0
Missing: 0
Comparison of germline.indel for DO52140 using DKFZ
---
Common: 706572
Extra: 0
Missing: 0
Comparison of germline.sv for DO52140 using DKFZ
---
Common: 1108
Extra: 1116
- Example: 10:102158308:N:<DEL>,10:104645247:N:<DEL>,10:105097522:N:<DEL>
Missing: 2908
- Example: 10:100107032:N:<TRA>,10:100107151:N:<TRA>,10:102158345:N:<DEL>
File not found /mnt/1TB/work/DockerTest-Miguel/tests/DKFZ/DO52140//output//DO52140.germline.cnv.vcf.gz
Comparison of somatic.snv.mnv for DO50311 using Sanger
---
Common: 156299
Extra: 1
- Example: Y:58885197:A:G
Missing: 14
- Example: 1:102887902:A:T,1:143165228:C:G,16:87047601:A:C
Comparison of somatic.indel for DO50311 using Sanger
---
Common: 812487
Extra: 0
Missing: 0
Comparison of somatic.sv for DO50311 using Sanger
---
Common: 260
Extra: 0
Missing: 0
Comparison of somatic.cnv for DO50311 using Sanger
---
Common: 138
Extra: 0
Missing: 0
Comparison of somatic.snv.mnv for DO52140 using Sanger
---
Common: 87234
Extra: 5
- Example: 1:23719098:A:G,12:43715930:T:A,20:4058335:T:A
Missing: 7
- Example: 10:6881937:A:T,1:148579866:A:G,11:9271589:T:A
Comparison of somatic.indel for DO52140 using Sanger
---
Common: 803986
Extra: 0
Missing: 0
Comparison of somatic.sv for DO52140 using Sanger
---
Common: 6
Extra: 0
Missing: 0
Comparison of somatic.cnv for DO52140 using Sanger
---
Common: 36
Extra: 0
Missing: 2
- Example: 10:11767915:T:<CNV>,10:11779907:G:<CNV>
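To put a single number on each comparison in the listing above, one option is to parse the text and compute Common / (Common + Extra + Missing) per block. The sketch below assumes the layout shown in results.txt (a "Comparison of TYPE for DONOR using WORKFLOW" line followed by Common / Extra / Missing counts); the function names `parse_results` and `concordance` are hypothetical, and the ratio is just one possible summary rather than a metric agreed on in this thread.

```python
import re

HEADER = re.compile(r"^Comparison of (\S+) for (\S+) using (\S+)$")
COUNT = re.compile(r"^(Common|Extra|Missing): (\d+)$")


def parse_results(text):
    """Turn the results.txt listing into one dict per comparison block."""
    records = []
    current = None
    for raw in text.splitlines():
        line = raw.strip()
        header = HEADER.match(line)
        if header:
            current = {
                "type": header.group(1),
                "donor": header.group(2),
                "workflow": header.group(3),
            }
            records.append(current)
            continue
        count = COUNT.match(line)
        # "- Example:" and "File not found" lines match neither pattern
        # and are ignored.
        if count and current is not None:
            current[count.group(1).lower()] = int(count.group(2))
    return records


def concordance(record):
    """Fraction of all observed variants that both call sets share."""
    total = record["common"] + record["extra"] + record["missing"]
    return record["common"] / total if total else float("nan")
```

For example, somatic.sv for DO50311 using DKFZ gives 231 / (231 + 44 + 48), roughly 0.72, while every comparison with Extra and Missing both at 0 gives exactly 1.0.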