[DOCKTESTERS] Summary of validation of Dockers Sanger and DKFZ(+Delly)

Schlesner, Matthias m.schlesner at Dkfz-Heidelberg.de
Mon Feb 13 09:04:21 EST 2017


Hi Junjun,

Please find attached some QC for DO51087. Apart from the very strong oxoG,
it has no striking problems. There is some GC bias, but if we excluded
donors on that basis, something like 30% of the cohort would be out. OxoG is
also well controlled by the Broad filter, even in such strong cases. I think
this donor could stay in.
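
For readers less familiar with oxoG: these artifacts typically appear as an excess of G>T (equivalently C>A) substitutions. The sketch below is only a crude illustration of that idea, assuming the SNVs are available as (ref, alt) pairs. The function name and the toy input are hypothetical, and this is neither the DKFZ filter nor the Broad filter, both of which rely on more evidence than the substitution type alone.

```python
# Hypothetical helper: a crude oxoG indicator based only on substitution type.
# It is NOT the DKFZ or Broad oxoG filter.
def oxog_like_fraction(snvs):
    """Return the fraction of SNVs that are G>T or C>A.

    `snvs` is an iterable of (ref, alt) single-base pairs.
    """
    oxog_like = {("G", "T"), ("C", "A")}
    total = 0
    hits = 0
    for ref, alt in snvs:
        total += 1
        if (ref.upper(), alt.upper()) in oxog_like:
            hits += 1
    # Avoid division by zero for an empty call set.
    return hits / total if total else 0.0


# Made-up toy input, not data from DO51087.
toy = [("G", "T"), ("C", "A"), ("A", "G"), ("G", "T")]
print(oxog_like_fraction(toy))  # 3 of 4 toy calls are G>T or C>A
```

A high fraction on its own would only be a hint that a closer look at the QC is warranted, not a reason for exclusion.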

Best,
Matthias


Dr. Matthias Schlesner
Division Theoretical Bioinformatics (B080)
Head of Computational Oncology Group
German Cancer Research Center (DKFZ)
Foundation under Public Law
Im Neuenheimer Feld 280
69120 Heidelberg
Germany
office: Berliner Str. 41 (Mathematikon), room 02.MB.116
phone: +49 6221 42-2720
fax:      +49 6221 42-3626
m.schlesner at dkfz.de
www.dkfz.de <http://www.dkfz.de/>
 
Management Board: Prof. Dr. Michael Baumann, Prof. Dr. Josef Puchta
VAT-ID No.: DE143293537





On 2/6/17, 4:24 PM, "Junjun Zhang" <Junjun.Zhang at oicr.on.ca> wrote:

>Hi Matthias,
>
>Thanks for the clarification.
>
>We discussed this donor (DO51087) during today's tech call. Lincoln
>suggested asking you to help review the QC metrics for this donor; if
>appropriate, the donor may be added to the exclusion list.
>
>Just an FYI, you can find more information about this donor here:
>https://docs.google.com/spreadsheets/d/126V4Dke1IvfVZqHLvZPUUeo7PO1Hi8jg8QhxiIDOFR4/edit#gid=1654136615
>(search for DO51087).
>
>Can you please help with this?
>
>Thanks,
>Junjun
>
>
>
>
>On 2017-02-06, 1:28 AM, "Schlesner, Matthias"
><m.schlesner at Dkfz-Heidelberg.de> wrote:
>
>>Hi Junjun,
>>
>>This sample has extreme OxoG, which could not be removed completely by
>>our filter. Hence a huge number of artifacts remain, which blows up the
>>file size.
>>
>>Best,
>>Matthias
>>
>>
>>From: Junjun Zhang <Junjun.Zhang at oicr.on.ca>
>>Date: Monday, February 6, 2017 at 12:52 AM
>>To: Jonas Demeulemeester <Jonas.Demeulemeester at crick.ac.uk>,
>>Miguel Vazquez <miguel.vazquez at cnio.es>
>>Cc: docktesters at lists.icgc.org, joachim.weischenfeldt at bric.ku.dk,
>>"Schlesner, Matthias" <m.schlesner at Dkfz-Heidelberg.de>
>>Subject: Re: [DOCKTESTERS] Summary of validation of Dockers Sanger and
>>DKFZ(+Delly)
>>
>>Hi Miguel and Jonas,
>>
>>I hope the DKFZ pipeline authors (cc'd here) will be able to figure out
>>the differences in the calls for DO52140.
>>
>>Here I have another donor: DO51087. The somatic SNV/MNV DKFZ call set
>>seems to be surprisingly large; the gzipped VCF file is greater than
>>500MB. Here you can find more information about the file:
>>https://dcc.icgc.org/repositories/files/FI500885. You can verify that in
>>GNOS as well:
>>https://gtrepo-dkfz.annailabs.com/cghub/metadata/analysisFull/e1e9062e-35e6-447d-bc61-591e76fbeee0.
>>
>>Matthias, can you please take a look at the VCF? I hope you may be able
>>to spot something abnormal there.
>>
>>Maybe Miguel/Jonas, if you plan to test more donors for the DKFZ
>>pipeline, can you please choose this donor? Tumour aligned BAM is:
>>https://dcc.icgc.org/repositories/files/FI37278, normal aligned BAM is
>>https://dcc.icgc.org/repositories/files/FI37277
>>
>>Thanks,
>>Junjun
>>
>>
>>
>>From: <docktesters-bounces+junjun.zhang=oicr.on.ca at lists.icgc.org> on
>>behalf of Jonas Demeulemeester <Jonas.Demeulemeester at crick.ac.uk>
>>Date: Saturday, February 4, 2017 at 10:14 AM
>>To: Miguel Vazquez <miguel.vazquez at cnio.es>
>>Cc: docktesters at lists.icgc.org
>>Subject: Re: [DOCKTESTERS] Summary of validation of Dockers Sanger and
>>DKFZ(+Delly)
>>
>>Hi Miguel,
>>
>>The comparison was indeed run largely using your scripts.
>>I didn't notice any missteps but you never know of course.
>>Hope they can pinpoint the issue.
>>
>>Cheers,
>>Jonas
>>
>>
>>On 3 Feb 2017, at 19:06, Miguel Vazquez <miguel.vazquez at cnio.es> wrote:
>>
>>Excellent Jonas, this is very useful info.
>>
>>I guess you are using my own scripts for this. The possibility remains
>>that there is a misstep in them regarding Delly. Let's see what comes out
>>of the checks by our friends at DKFZ.
>>
>>Best regards
>>
>>Have a great weekend
>>
>>Miguel
>>
>>On Feb 3, 2017 6:10 PM, "Jonas Demeulemeester"
>><Jonas.Demeulemeester at crick.ac.uk> wrote:
>>Dear all,
>>
>>Also for the DKFZ (+Delly) workflow, I can confirm Miguel's results on
>>samples DO52140 and DO50311.
>>The dockerised pipelines return identical calls for SNV.MNVs and indels
>>but partly different ones for SVs and CNVs, independent of the
>>infrastructure.
>>
>>Best regards,
>>Jonas
>>
>>
>>
>>
>>
>>RESULTS DO52140
>>---------------
>>
>>Comparison of somatic.sv for DO52140 using DKFZ
>>---
>>Common: 72
>>Extra: 23
>>    - Example: 
>>10:132840774:N:<DEL>,11:38252019:N:<TRA>,11:47700673:N:<TRA>
>>Missing: 61
>>    - Example: 10:134749140:N:<DEL>,11:179191:N:<TRA>,11:38252005:N:<TRA>
>>
>>
>>Comparison of germline.sv for DO52140 using DKFZ
>>---
>>Common: 1108
>>Extra: 1116
>>    - Example: 
>>10:102158308:N:<DEL>,10:104645247:N:<DEL>,10:105097522:N:<DEL>
>>Missing: 2908
>>    - Example: 
>>10:100107032:N:<TRA>,10:100107151:N:<TRA>,10:102158345:N:<DEL>
>>
>>
>>Comparison of somatic.snv.mnv for DO52140 using DKFZ
>>---
>>Common: 37160
>>Extra: 0
>>Missing: 0
>>
>>
>>Comparison of germline.snv.mnv for DO52140 using DKFZ
>>---
>>Common: 3833896
>>Extra: 0
>>Missing: 0
>>
>>
>>Comparison of somatic.indel for DO52140 using DKFZ
>>---
>>Common: 19347
>>Extra: 0
>>Missing: 0
>>
>>
>>Comparison of germline.indel for DO52140 using DKFZ
>>---
>>Common: 706572
>>Extra: 0
>>Missing: 0
>>
>>
>>Comparison of somatic.cnv for DO52140 using DKFZ
>>---
>>Common: 275
>>Extra: 94
>>    - Example: 
>>1:106505931:N:<LOH>,1:109068899:N:<DEL>,1:109359995:N:<DEL>
>>Missing: 286
>>    - Example: 10:88653561:N:<LOH>,11:179192:N:<LOH>,11:38252006:N:<LOH>
>>
>>
>>
>>
>>RESULTS DO50311
>>---------------
>>
>>Comparison of somatic.sv for DO50311 using DKFZ
>>---
>>Common: 231
>>Extra: 44
>>    - Example: 
>>10:20596800:N:<TRA>,10:56066821:N:<TRA>,11:16776092:N:<TRA>
>>Missing: 48
>>    - Example: 
>>10:119704959:N:<INV>,10:13116322:N:<TRA>,10:47063485:N:<TRA>
>>
>>
>>Comparison of germline.sv for DO50311 using DKFZ
>>---
>>Common: 1393
>>Extra: 231
>>    - Example: 
>>10:134319313:N:<DEL>,10:134948976:N:<DEL>,10:19996638:N:<DEL>
>>Missing: 615
>>    - Example: 
>>10:101851839:N:<TRA>,10:101851884:N:<TRA>,10:10745225:N:<DUP>
>>
>>
>>Comparison of somatic.snv.mnv for DO50311 using DKFZ
>>---
>>Common: 51087
>>Extra: 0
>>Missing: 0
>>
>>
>>Comparison of germline.snv.mnv for DO50311 using DKFZ
>>---
>>Common: 3850992
>>Extra: 0
>>Missing: 0
>>
>>
>>Comparison of somatic.indel for DO50311 using DKFZ
>>---
>>Common: 26469
>>Extra: 0
>>Missing: 0
>>
>>
>>Comparison of germline.indel for DO50311 using DKFZ
>>---
>>Common: 709060
>>Extra: 0
>>Missing: 0
>>
>>
>>Comparison of somatic.cnv for DO50311 using DKFZ
>>---
>>Common: 731
>>Extra: 213
>>    - Example: 
>>10:132510034:N:<DEL>,10:20596801:N:<NEUTRAL>,10:47674883:N:<NEUTRAL>
>>Missing: 190
>>    - Example: 
>>10:100891940:N:<NEUTRAL>,10:104975905:N:<NEUTRAL>,10:119704960:N:<NEUTRAL>
>>
>>
>>
>>
>>
>>_________________________________
>>Jonas Demeulemeester, PhD
>>Postdoctoral Researcher
>>The Francis Crick Institute
>>1 Midland Road
>>London
>>NW1 1AT
>>
>>T: +44 (0)20 3796 2594
>>M: +44 (0)7482 070730
>>E: jonas.demeulemeester at crick.ac.uk
>>W: www.crick.ac.uk
>>
>>
>>
>>On 26 Jan 2017, at 13:41, Jonas Demeulemeester
>><Jonas.Demeulemeester at crick.ac.uk> wrote:
>>
>>Hi all,
>>
>>I can now confirm Miguel's results with the Sanger workflow on donors
>>DO50311 and DO52140.
>>The calls made by the dockerised version are identical for Indels and SVs
>>and show only small discrepancies for SNV_MNVs and CNVs.
>>The discrepancies seem independent of the system infrastructure, as the
>>numbers of missing/extra variants called are the same as those Miguel
>>reported previously (on DO52140).
>>
>>I've also updated the wiki page accordingly.
>>
>>Best regards,
>>Jonas
>>
>>
>>
>>RESULTS - DO50311
>>------
>>
>>
>>Comparison of cnv for DO50311 using Sanger
>>---
>>Common: 138
>>Extra: 0
>>Missing: 0
>>
>>
>>Comparison of indel for DO50311 using Sanger
>>---
>>Common: 812487
>>Extra: 0
>>Missing: 0
>>
>>
>>Comparison of snv_mnv for DO50311 using Sanger
>>---
>>Common: 156313
>>Extra: 0
>>Missing: 0
>>
>>
>>Comparison of sv for DO50311 using Sanger
>>---
>>Common: 260
>>Extra: 0
>>Missing: 0
>>
>>
>>
>>
>>
>>RESULTS - DO52140
>>------
>>
>>
>>Comparison of cnv for DO52140 using Sanger
>>---
>>Common: 36
>>Extra: 0
>>Missing: 2
>>    - Example: 10:11767915:T:<CNV>,10:11779907:G:<CNV>
>>
>>
>>Comparison of indel for DO52140 using Sanger
>>---
>>Common: 803986
>>Extra: 0
>>Missing: 0
>>
>>
>>Comparison of snv_mnv for DO52140 using Sanger
>>---
>>Common: 87234
>>Extra: 5
>>    - Example: 1:23719098:G,12:43715930:A,20:4058335:A
>>Missing: 7
>>    - Example: 10:6881937:T,1:148579866:G,11:9271589:A
>>
>>
>>Comparison of sv for DO52140 using Sanger
>>---
>>Common: 6
>>Extra: 0
>>Missing: 0
>>
>>
>>
>>
>>
>>
>>For comparison, Miguel's report on DO52140:
>>Report
>>~~~~~~
>>
>>Comparison of somatic.cnv for DO52140 using Sanger
>>---
>>Common: 36
>>Extra: 0
>>Missing: 2
>>    - Example: 10:11767915:T:<CNV>,10:11779907:G:<CNV>
>>
>>
>>Comparison of somatic.indel for DO52140 using Sanger
>>---
>>Common: 803986
>>Extra: 0
>>Missing: 0
>>
>>
>>Comparison of somatic.snv.mnv for DO52140 using Sanger
>>---
>>Common: 87234
>>Extra: 5
>>    - Example: 1:23719098:A:G,12:43715930:T:A,20:4058335:T:A
>>Missing: 7
>>    - Example: 10:6881937:A:T,1:148579866:A:G,11:9271589:T:A
>>
>>
>>Comparison of somatic.sv for DO52140 using Sanger
>>---
>>Common: 6
>>Extra: 0
>>Missing: 0
>>
>>
>>
>>
>>
>>On 16 Jan 2017, at 14:24, Miguel Vazquez <miguel.vazquez at cnio.es> wrote:
>>
>>Dear all,
>>
>>Let me summarize the status of the testing for Sanger and DKFZ. The
>>validation has been run on two donors for each workflow: DO50311 and
>>DO52140.
>>
>>Sanger:
>>----------
>>
>>Sanger calls only somatic variants. The results are identical for Indels
>>and SVs and almost identical for SNV.MNV and CNV. The discrepancies are
>>reproducible (on the same machine at least), i.e. the same ones are found
>>after running the workflow a second time.
>>
>>DKFZ:
>>---------
>>DKFZ calls somatic and germline variants, except germline CNVs. For both
>>germline and somatic variants the results are identical for SNV.MNV and
>>Indels but show large discrepancies for SV and CNV.
>>
>>
>>Kortine Kleinheinz and Joachim Weischenfeldt are, I believe, in the
>>process of investigating this issue.
>>
>>BWA-Mem failed for me and has also failed for Denis Yuen and Jonas
>>Demeulemeester. Denis, I believe, is investigating this problem further.
>>I haven't had the chance to investigate it much myself.
>>
>>Best
>>
>>Miguel
>>
>>
>>
>>
>>---------------------
>>RESULTS
>>---------------------
>>
>>ubuntu at ip-10-253-35-14:~/DockerTest-Miguel$ cat results.txt
>>
>>Comparison of somatic.snv.mnv for DO50311 using DKFZ
>>---
>>Common: 51087
>>Extra: 0
>>Missing: 0
>>
>>
>>Comparison of somatic.indel for DO50311 using DKFZ
>>---
>>Common: 26469
>>Extra: 0
>>Missing: 0
>>
>>
>>Comparison of somatic.sv for DO50311 using DKFZ
>>---
>>Common: 231
>>Extra: 44
>>    - Example: 
>>10:20596800:N:<TRA>,10:56066821:N:<TRA>,11:16776092:N:<TRA>
>>Missing: 48
>>    - Example: 
>>10:119704959:N:<INV>,10:13116322:N:<TRA>,10:47063485:N:<TRA>
>>
>>
>>Comparison of somatic.cnv for DO50311 using DKFZ
>>---
>>Common: 731
>>Extra: 213
>>    - Example: 
>>10:132510034:N:<DEL>,10:20596801:N:<NEUTRAL>,10:47674883:N:<NEUTRAL>
>>Missing: 190
>>    - Example: 
>>10:100891940:N:<NEUTRAL>,10:104975905:N:<NEUTRAL>,10:119704960:N:<NEUTRAL>
>>
>>
>>Comparison of germline.snv.mnv for DO50311 using DKFZ
>>---
>>Common: 3850992
>>Extra: 0
>>Missing: 0
>>
>>
>>Comparison of germline.indel for DO50311 using DKFZ
>>---
>>Common: 709060
>>Extra: 0
>>Missing: 0
>>
>>
>>Comparison of germline.sv for DO50311 using DKFZ
>>---
>>Common: 1393
>>Extra: 231
>>    - Example: 
>>10:134319313:N:<DEL>,10:134948976:N:<DEL>,10:19996638:N:<DEL>
>>Missing: 615
>>    - Example: 
>>10:101851839:N:<TRA>,10:101851884:N:<TRA>,10:10745225:N:<DUP>
>>
>>File not found 
>>/mnt/1TB/work/DockerTest-Miguel/tests/DKFZ/DO50311//output//DO50311.germline.cnv.vcf.gz
>>
>>Comparison of somatic.snv.mnv for DO52140 using DKFZ
>>---
>>Common: 37160
>>Extra: 0
>>Missing: 0
>>
>>
>>Comparison of somatic.indel for DO52140 using DKFZ
>>---
>>Common: 19347
>>Extra: 0
>>Missing: 0
>>
>>
>>Comparison of somatic.sv for DO52140 using DKFZ
>>---
>>Common: 72
>>Extra: 23
>>    - Example: 
>>10:132840774:N:<DEL>,11:38252019:N:<TRA>,11:47700673:N:<TRA>
>>Missing: 61
>>    - Example: 10:134749140:N:<DEL>,11:179191:N:<TRA>,11:38252005:N:<TRA>
>>
>>
>>Comparison of somatic.cnv for DO52140 using DKFZ
>>---
>>Common: 275
>>Extra: 94
>>    - Example: 
>>1:106505931:N:<LOH>,1:109068899:N:<DEL>,1:109359995:N:<DEL>
>>Missing: 286
>>    - Example: 10:88653561:N:<LOH>,11:179192:N:<LOH>,11:38252006:N:<LOH>
>>
>>
>>Comparison of germline.snv.mnv for DO52140 using DKFZ
>>---
>>Common: 3833896
>>Extra: 0
>>Missing: 0
>>
>>
>>Comparison of germline.indel for DO52140 using DKFZ
>>---
>>Common: 706572
>>Extra: 0
>>Missing: 0
>>
>>
>>Comparison of germline.sv for DO52140 using DKFZ
>>---
>>Common: 1108
>>Extra: 1116
>>    - Example: 
>>10:102158308:N:<DEL>,10:104645247:N:<DEL>,10:105097522:N:<DEL>
>>Missing: 2908
>>    - Example: 
>>10:100107032:N:<TRA>,10:100107151:N:<TRA>,10:102158345:N:<DEL>
>>
>>File not found 
>>/mnt/1TB/work/DockerTest-Miguel/tests/DKFZ/DO52140//output//DO52140.germline.cnv.vcf.gz
>>
>>Comparison of somatic.snv.mnv for DO50311 using Sanger
>>---
>>Common: 156299
>>Extra: 1
>>    - Example: Y:58885197:A:G
>>Missing: 14
>>    - Example: 1:102887902:A:T,1:143165228:C:G,16:87047601:A:C
>>
>>
>>Comparison of somatic.indel for DO50311 using Sanger
>>---
>>Common: 812487
>>Extra: 0
>>Missing: 0
>>
>>
>>Comparison of somatic.sv for DO50311 using Sanger
>>---
>>Common: 260
>>Extra: 0
>>Missing: 0
>>
>>
>>Comparison of somatic.cnv for DO50311 using Sanger
>>---
>>Common: 138
>>Extra: 0
>>Missing: 0
>>
>>
>>Comparison of somatic.snv.mnv for DO52140 using Sanger
>>---
>>Common: 87234
>>Extra: 5
>>    - Example: 1:23719098:A:G,12:43715930:T:A,20:4058335:T:A
>>Missing: 7
>>    - Example: 10:6881937:A:T,1:148579866:A:G,11:9271589:T:A
>>
>>
>>Comparison of somatic.indel for DO52140 using Sanger
>>---
>>Common: 803986
>>Extra: 0
>>Missing: 0
>>
>>
>>Comparison of somatic.sv for DO52140 using Sanger
>>---
>>Common: 6
>>Extra: 0
>>Missing: 0
>>
>>
>>Comparison of somatic.cnv for DO52140 using Sanger
>>---
>>Common: 36
>>Extra: 0
>>Missing: 2
>>    - Example: 10:11767915:T:<CNV>,10:11779907:G:<CNV>
>>_______________________________________________
>>docktesters mailing list
>>docktesters at lists.icgc.org
>>https://lists.icgc.org/mailman/listinfo/docktesters
>>
>>
>>The Francis Crick Institute Limited is a registered charity in England
>>and Wales no. 1140062 and a company registered in England and Wales no.
>>06885462, with its registered office at 1 Midland Road London NW1 1AT
>>
>
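
The Common/Extra/Missing reports quoted above can be approximated with plain set operations on variant keys of the form chrom:pos:ref:alt. The following is a minimal sketch, not Miguel's actual comparison scripts. The function names are hypothetical, and it assumes that "Extra" means present only in the dockerised run and "Missing" means present only in the reference call set. The Jaccard index is added here purely as one possible single-number summary; it does not appear in the original reports.

```python
# Hypothetical sketch of a Common/Extra/Missing comparison between two call sets.
# Assumption: "extra" = only in the test (dockerised) run,
#             "missing" = only in the reference call set.
def jaccard_from_counts(common, extra, missing):
    """Jaccard index computed from the three report counts."""
    union = common + extra + missing
    # Two empty call sets are treated as identical.
    return common / union if union else 1.0


def compare_calls(reference_keys, test_keys, n_examples=3):
    """Compare two collections of 'chrom:pos:ref:alt' variant keys."""
    reference, test = set(reference_keys), set(test_keys)
    common = reference & test
    extra = test - reference
    missing = reference - test
    return {
        "common": len(common),
        "extra": len(extra),
        "missing": len(missing),
        # A few sorted keys per category, like the "Example:" lines in the reports.
        "extra_examples": sorted(extra)[:n_examples],
        "missing_examples": sorted(missing)[:n_examples],
        "jaccard": jaccard_from_counts(len(common), len(extra), len(missing)),
    }


# Made-up toy keys, not taken from any donor.
reference = ["1:100:A:G", "1:200:C:T", "2:300:N:<DEL>"]
test = ["1:100:A:G", "2:300:N:<DEL>", "3:400:N:<TRA>"]
print(compare_calls(reference, test))

# Counts reported in the thread for somatic.sv, DO52140, DKFZ: 72 / 23 / 61.
print(round(jaccard_from_counts(72, 23, 61), 3))  # 72 / 156
```

With the somatic.sv counts for DO52140 (Common 72, Extra 23, Missing 61), fewer than half of all SV calls are shared between the two runs, whereas the SNV.MNV and indel sets with Extra 0 and Missing 0 give a value of exactly 1.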

-------------- next part --------------
A non-text attachment was scrubbed...
Name: DO51087.pptx
Type: application/vnd.openxmlformats-officedocument.presentationml.presentation
Size: 617957 bytes
Desc: DO51087.pptx
URL: <https://lists.icgc.org/mailman/private/docktesters/attachments/20170213/22f7e7f0/attachment-0001.pptx>

