[DOCKTESTERS] Summary of validation of Dockers Sanger and DKFZ(+Delly)

Junjun Zhang Junjun.Zhang at oicr.on.ca
Mon Jan 16 10:24:00 EST 2017


Hi,

I’d like to point out that this wiki page is obsolete: https://wiki.oicr.on.ca/display/PANCANCER/Preparing+paired-end+data+for+upload.

It has been replaced by the FINAL SOP for preparing PCAWG sequences here: https://wiki.oicr.on.ca/display/PANCANCER/PCAWG+%28a.k.a.+PCAP+or+PAWG%29+Sequence+Submission+SOP+-+v1.0. Many of us worked on it, including Keiran.

Regarding the preparation of test input BAMs from a merged / aligned BAM, the first step of this section ("Your starting files must be for a single lane of sequencing in either BAM (unaligned BAM) or FASTQ") is the most relevant: https://wiki.oicr.on.ca/display/PANCANCER/PCAWG+%28a.k.a.+PCAP+or+PAWG%29+Sequence+Submission+SOP+-+v1.0#PCAWG(a.k.a.PCAPorPAWG)SequenceSubmissionSOP-v1.0-5.Prepareyoursequencefiles
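
As a rough illustration of the "single lane" requirement, the sketch below lists the read groups declared in a SAM/BAM text header (for example, the output of `samtools view -H`). It assumes that each @RG line corresponds to one lane, which is a common convention but is not stated in the SOP. The function name and the example header are hypothetical.

```python
def read_groups(sam_header: str) -> list[dict[str, str]]:
    """Return one dict of tag -> value for each @RG line in a SAM header."""
    groups = []
    for line in sam_header.splitlines():
        if not line.startswith("@RG"):
            continue
        # Fields after the record type look like "ID:lane1", "PU:run1_1", ...
        fields = line.split("\t")[1:]
        groups.append(dict(f.split(":", 1) for f in fields if ":" in f))
    return groups


# Hypothetical header, for illustration only.
header = (
    "@HD\tVN:1.4\tSO:coordinate\n"
    "@RG\tID:lane1\tSM:tumour\tPU:run1_1\n"
    "@RG\tID:lane2\tSM:tumour\tPU:run1_2\n"
)

for rg in read_groups(header):
    print(rg["ID"], rg.get("PU", "?"))
```

More than one read group in the header suggests the BAM is merged and would need to be split per lane before following the SOP.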

Hope this helps,

Junjun




Junjun Zhang
Lead bioinformatician
Informatics and Bio-computing
Ontario Institute for Cancer Research
MaRS Centre
661 University Avenue
Suite 510
Toronto, Ontario, Canada M5G 0A3

Tel: 416-673-8517
Mobile: 647-501-7511
Toll-free: 1-866-678-6427
Twitter: @OICR_news

www.oicr.on.ca



From: <docktesters-bounces+junjun.zhang=oicr.on.ca at lists.icgc.org> on behalf of Christina Yung <christina.yung at oicr.on.ca>
Date: Monday, January 16, 2017 at 9:45 AM
To: Miguel Vazquez <miguel.vazquez at cnio.es>, Francis Ouellette <francis at oicr.on.ca>
Cc: "docktesters at lists.icgc.org" <docktesters at lists.icgc.org>
Subject: Re: [DOCKTESTERS] Summary of validation of Dockers Sanger and DKFZ(+Delly)


Hi Folks,

Here are the instructions for preparing the data as input to BWA-Mem.  If you use a different protocol, please share.
https://wiki.oicr.on.ca/display/PANCANCER/Preparing+paired-end+data+for+upload

Christina

On 17-01-16 09:24 AM, Miguel Vazquez wrote:
Dear all,

Let me summarize the status of the testing for Sanger and DKFZ. The validation has been run on two donors for each workflow: DO50311 and DO52140.

Sanger:
----------

Sanger calls only somatic variants. The results are identical for indels and SVs, and almost identical for SNV.MNV and CNV. The discrepancies are reproducible (on the same machine at least): the same ones are found after running the workflow a second time.

DKFZ:
---------
DKFZ calls somatic and germline variants, except germline CNVs. For both germline and somatic variants the results are identical for SNV.MNV and indels, but there are large discrepancies for SV and CNV.


I believe Kortine Kleinheinz and Joachim Weischenfeldt are investigating this issue.
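
To put a number on "large discrepancies", one option is a Jaccard-style concordance, Common / (Common + Extra + Missing). The sketch below applies it to the DKFZ SV and CNV counts reported in the RESULTS section of this message. The choice of metric and the function name are assumptions made for illustration, not part of the validation script.

```python
def jaccard(common: int, extra: int, missing: int) -> float:
    """Fraction of all observed calls that are shared by both call sets."""
    total = common + extra + missing
    return common / total if total else 1.0


# (Common, Extra, Missing) counts copied from the RESULTS section.
dkfz_counts = {
    ("DO50311", "somatic.sv"): (231, 44, 48),
    ("DO50311", "somatic.cnv"): (731, 213, 190),
    ("DO50311", "germline.sv"): (1393, 231, 615),
    ("DO52140", "somatic.sv"): (72, 23, 61),
    ("DO52140", "somatic.cnv"): (275, 94, 286),
    ("DO52140", "germline.sv"): (1108, 1116, 2908),
}

for (donor, kind), (common, extra, missing) in dkfz_counts.items():
    print(f"{donor} {kind}: {jaccard(common, extra, missing):.3f}")
```

A value of 1.0 means the two call sets are identical, as is the case for the DKFZ SNV.MNV and indel comparisons.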

BWA-Mem failed for me and has also failed for Denis Yuen and Jonas Demeulemeester. I believe Denis is investigating this problem further; I haven't had the chance to look into it much myself.

Best

Miguel




---------------------
RESULTS
---------------------

ubuntu@ip-10-253-35-14:~/DockerTest-Miguel$ cat results.txt

Comparison of somatic.snv.mnv for DO50311 using DKFZ
---
Common: 51087
Extra: 0
Missing: 0


Comparison of somatic.indel for DO50311 using DKFZ
---
Common: 26469
Extra: 0
Missing: 0


Comparison of somatic.sv for DO50311 using DKFZ
---
Common: 231
Extra: 44
    - Example: 10:20596800:N:<TRA>,10:56066821:N:<TRA>,11:16776092:N:<TRA>
Missing: 48
    - Example: 10:119704959:N:<INV>,10:13116322:N:<TRA>,10:47063485:N:<TRA>


Comparison of somatic.cnv for DO50311 using DKFZ
---
Common: 731
Extra: 213
    - Example: 10:132510034:N:<DEL>,10:20596801:N:<NEUTRAL>,10:47674883:N:<NEUTRAL>
Missing: 190
    - Example: 10:100891940:N:<NEUTRAL>,10:104975905:N:<NEUTRAL>,10:119704960:N:<NEUTRAL>


Comparison of germline.snv.mnv for DO50311 using DKFZ
---
Common: 3850992
Extra: 0
Missing: 0


Comparison of germline.indel for DO50311 using DKFZ
---
Common: 709060
Extra: 0
Missing: 0


Comparison of germline.sv for DO50311 using DKFZ
---
Common: 1393
Extra: 231
    - Example: 10:134319313:N:<DEL>,10:134948976:N:<DEL>,10:19996638:N:<DEL>
Missing: 615
    - Example: 10:101851839:N:<TRA>,10:101851884:N:<TRA>,10:10745225:N:<DUP>

File not found /mnt/1TB/work/DockerTest-Miguel/tests/DKFZ/DO50311//output//DO50311.germline.cnv.vcf.gz

Comparison of somatic.snv.mnv for DO52140 using DKFZ
---
Common: 37160
Extra: 0
Missing: 0


Comparison of somatic.indel for DO52140 using DKFZ
---
Common: 19347
Extra: 0
Missing: 0


Comparison of somatic.sv for DO52140 using DKFZ
---
Common: 72
Extra: 23
    - Example: 10:132840774:N:<DEL>,11:38252019:N:<TRA>,11:47700673:N:<TRA>
Missing: 61
    - Example: 10:134749140:N:<DEL>,11:179191:N:<TRA>,11:38252005:N:<TRA>


Comparison of somatic.cnv for DO52140 using DKFZ
---
Common: 275
Extra: 94
    - Example: 1:106505931:N:<LOH>,1:109068899:N:<DEL>,1:109359995:N:<DEL>
Missing: 286
    - Example: 10:88653561:N:<LOH>,11:179192:N:<LOH>,11:38252006:N:<LOH>


Comparison of germline.snv.mnv for DO52140 using DKFZ
---
Common: 3833896
Extra: 0
Missing: 0


Comparison of germline.indel for DO52140 using DKFZ
---
Common: 706572
Extra: 0
Missing: 0


Comparison of germline.sv for DO52140 using DKFZ
---
Common: 1108
Extra: 1116
    - Example: 10:102158308:N:<DEL>,10:104645247:N:<DEL>,10:105097522:N:<DEL>
Missing: 2908
    - Example: 10:100107032:N:<TRA>,10:100107151:N:<TRA>,10:102158345:N:<DEL>

File not found /mnt/1TB/work/DockerTest-Miguel/tests/DKFZ/DO52140//output//DO52140.germline.cnv.vcf.gz

Comparison of somatic.snv.mnv for DO50311 using Sanger
---
Common: 156299
Extra: 1
    - Example: Y:58885197:A:G
Missing: 14
    - Example: 1:102887902:A:T,1:143165228:C:G,16:87047601:A:C


Comparison of somatic.indel for DO50311 using Sanger
---
Common: 812487
Extra: 0
Missing: 0


Comparison of somatic.sv for DO50311 using Sanger
---
Common: 260
Extra: 0
Missing: 0


Comparison of somatic.cnv for DO50311 using Sanger
---
Common: 138
Extra: 0
Missing: 0


Comparison of somatic.snv.mnv for DO52140 using Sanger
---
Common: 87234
Extra: 5
    - Example: 1:23719098:A:G,12:43715930:T:A,20:4058335:T:A
Missing: 7
    - Example: 10:6881937:A:T,1:148579866:A:G,11:9271589:T:A


Comparison of somatic.indel for DO52140 using Sanger
---
Common: 803986
Extra: 0
Missing: 0


Comparison of somatic.sv for DO52140 using Sanger
---
Common: 6
Extra: 0
Missing: 0


Comparison of somatic.cnv for DO52140 using Sanger
---
Common: 36
Extra: 0
Missing: 2
    - Example: 10:11767915:T:<CNV>,10:11779907:G:<CNV>
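
For anyone wanting to reproduce counts of this shape, here is a minimal sketch of a set-based comparison. It assumes variants are keyed as CHROM:POS:REF:ALT (matching the examples above), that "Extra" means present only in the test output, and that "Missing" means present only in the reference. The actual script behind results.txt may differ, and the function names are hypothetical.

```python
import gzip


def variant_keys(path: str) -> set[str]:
    """Collect CHROM:POS:REF:ALT keys from a plain or gzipped VCF file."""
    opener = gzip.open if path.endswith(".gz") else open
    keys = set()
    with opener(path, "rt") as handle:
        for line in handle:
            # Skip header lines and blank lines.
            if line.startswith("#") or not line.strip():
                continue
            chrom, pos, _id, ref, alt = line.rstrip("\n").split("\t")[:5]
            keys.add(f"{chrom}:{pos}:{ref}:{alt}")
    return keys


def compare(reference: set[str], test: set[str]) -> dict[str, int]:
    """Count shared calls, calls only in test, and calls only in reference."""
    return {
        "Common": len(reference & test),
        "Extra": len(test - reference),
        "Missing": len(reference - test),
    }


# Toy key sets, for illustration only.
reference = {"1:10:A:T", "1:20:C:G"}
test = {"1:10:A:T", "2:5:G:A"}
print(compare(reference, test))
```

With real files, the two sets would come from `variant_keys()` applied to the reference VCF and to the VCF produced by the Docker run.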


