[DOCKTESTERS] Sanger extended validation for 1 sample. Only differences in SNV

Keiran Raine kr2 at sanger.ac.uk
Fri Dec 16 06:33:46 EST 2016


Hi Miguel,

Please be aware that we agreed to include the latest version of the algorithms used in the core analysis in the final docker (as there were fixes along the way).

There were several versions of the core SNV caller fixing edge cases.  I don't see these discrepancies being an issue, especially if they aren't marked as 'PASS'.

It's theoretically possible for some of these to be floating-point differences (esp. on a different CPU architecture); these can cause a call to flip between the SNP and SUB output if it is very close to the cut-off.
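
To illustrate the kind of flip I mean, here is a minimal Python sketch. It is not the caller's code: the classify function, the 0.6 cut-off and the rule for choosing SNP or SUB are all made up for the example, and the change in summation order merely stands in for whatever a different CPU or build might do to the arithmetic.

```python
# Hypothetical illustration only: classify() and the 0.6 cut-off are
# assumptions, not the real caller's logic or thresholds.

def classify(contributions, cutoff=0.6):
    """Sum the contributions left to right and compare against the cut-off.

    Returns "SUB" when the sum is strictly above the cut-off, else "SNP".
    """
    total = 0.0
    for value in contributions:
        total += value
    return "SUB" if total > cutoff else "SNP"


# The same three numbers, accumulated in a different order.
run_a = classify([0.1, 0.2, 0.3])  # sum is 0.6000000000000001
run_b = classify([0.3, 0.2, 0.1])  # sum is exactly 0.6

print(run_a, run_b)
```

The two runs see mathematically identical input, yet only one of them lands above the cut-off, so the record ends up in a different output file.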

Sending the physical VCF records (with the VCF header) for these from the relevant run (Missing from the old run, Extra from the new) would allow a confirmation, but as I say, if they aren't marked 'PASS' we wouldn't be planning to do anything about them.
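
As a rough sketch of that check, assuming plain tab-separated VCF text and the chromosome:position:reference:alternate keys from your report, something like the following would list which discrepant records are marked 'PASS'. The function names, the toy records and the LOWQ filter label are hypothetical and are not taken from either run.

```python
# Sketch under stated assumptions: toy VCF text, hypothetical helper names,
# and a made-up "LOWQ" filter label. Not the comparison code used in the report.

def vcf_records(vcf_text):
    """Map 'CHROM:POS:REF:ALT' to the FILTER value for each VCF data line."""
    records = {}
    for line in vcf_text.splitlines():
        if not line or line.startswith("#"):
            continue  # skip blank lines and header lines
        chrom, pos, _id, ref, alt, _qual, flt = line.split("\t")[:7]
        for allele in alt.split(","):  # one key per alternate allele
            records[f"{chrom}:{pos}:{ref}:{allele}"] = flt
    return records


def compare(old_text, new_text):
    """Return (common keys, extra records, missing records)."""
    old, new = vcf_records(old_text), vcf_records(new_text)
    common = old.keys() & new.keys()
    extra = {k: new[k] for k in new.keys() - old.keys()}      # only in new run
    missing = {k: old[k] for k in old.keys() - new.keys()}    # only in old run
    return common, extra, missing


def passing(discrepant):
    """Keys of the discrepant records whose FILTER is exactly 'PASS'."""
    return sorted(k for k, flt in discrepant.items() if flt == "PASS")


HEADER = "##fileformat=VCFv4.1\n#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO\n"
OLD = HEADER + "1\t100\t.\tA\tT\t.\tPASS\t.\n1\t200\t.\tC\tG\t.\tLOWQ\t.\n"
NEW = HEADER + "1\t100\t.\tA\tT\t.\tPASS\t.\n2\t300\t.\tA\tG\t.\tLOWQ\t.\n"

common, extra, missing = compare(OLD, NEW)
print(len(common), passing(extra), passing(missing))
```

If both lists of passing records come back empty for the real files, there would be nothing further for us to follow up.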

Regards,

Keiran Raine
Principal Bioinformatician
Cancer Genome Project
Wellcome Trust Sanger Institute

kr2 at sanger.ac.uk
Tel:+44 (0)1223 834244 Ext: 4983
Office: H104

> On 16 Dec 2016, at 10:50, Miguel Vazquez <miguel.vazquez at cnio.es> wrote:
> 
> Dear Christina and Keiran,
> 
> I've extended the analysis to the Sanger workflow as well. I only have one sample (DO50311), since the second one (DO52140) is still computing; it has been running since December 8 (8 days).
> 
> Keiran, I believe you asked me about how indels and CNVs matched. Is this what you needed?
> 
> Everything matches except for the +1/-14 SNV differences I reported before.
> 
> Best regards
> 
> Miguel
> 
> Report
> ~~~~~
> 
> Comparison of somatic.cnv for DO50311 using Sanger
> ---
> Common: 138
> Extra: 0
> Missing: 0
> 
> 
> Comparison of somatic.indel for DO50311 using Sanger
> ---
> Common: 812487
> Extra: 0
> Missing: 0
> 
> 
> Comparison of somatic.snv.mnv for DO50311 using Sanger
> ---
> Common: 156299
> Extra: 1
>     - Example: Y:58885197:A:G
> Missing: 14
>     - Example: 1:102887902:A:T,1:143165228:C:G,16:87047601:A:C
> 
> 
> Comparison of somatic.sv for DO50311 using Sanger
> ---
> Common: 260
> Extra: 0
> Missing: 0
> 
> 
> *Note: To make indel matching more stringent, I've added the reference allele to the mutation code that I end up comparing. This extends to SNVs as well, so where previously I wrote 16:87047601:C I now write 16:87047601:A:C. The extremely thorough reader will notice that the reports for DKFZ below show the CNV discrepancies without this new format. I introduced it afterwards, but the results have not changed for DKFZ; I've checked.
> 
> On Fri, Dec 16, 2016 at 11:07 AM, Miguel Vazquez <miguel.vazquez at cnio.es> wrote:
> Excuse me, obviously I meant
> 
> For the two samples everything matches perfectly except CNV, where we find some large differences.
> 
> On Fri, Dec 16, 2016 at 11:05 AM, Miguel Vazquez <miguel.vazquez at cnio.es> wrote:
> Hi Christina et al.
> 
> As you asked, I've extended the validation from SNV to indel and CNV, and also to germline calls.
> 
> For the two samples everything matches perfectly except SNV, where we find some large differences.
> 
> Best regards
> 
> Miguel
> 
> Report
> ~~~~~
> 
> Comparison of germline.indel for DO50311 using DKFZ
> ---
> Common: 709060
> Extra: 0
> Missing: 0
> 
> 
> Comparison of germline.snv.mnv for DO50311 using DKFZ
> ---
> Common: 3850992
> Extra: 0
> Missing: 0
> 
> 
> Comparison of somatic.cnv for DO50311 using DKFZ
> ---
> Common: 731
> Extra: 213
>     - Example: 10:132510034:<DEL>,10:20596801:<NEUTRAL>,10:47674883:<NEUTRAL>
> Missing: 190
>     - Example: 10:100891940:<NEUTRAL>,10:104975905:<NEUTRAL>,10:119704960:<NEUTRAL>
> 
> 
> Comparison of somatic.indel for DO50311 using DKFZ
> ---
> Common: 26469
> Extra: 0
> Missing: 0
> 
> 
> Comparison of somatic.snv.mnv for DO50311 using DKFZ
> ---
> Common: 51087
> Extra: 0
> Missing: 0
> 
> 
> Comparison of germline.indel for DO52140 using DKFZ
> ---
> Common: 706572
> Extra: 0
> Missing: 0
> 
> 
> Comparison of germline.snv.mnv for DO52140 using DKFZ
> ---
> Common: 3833896
> Extra: 0
> Missing: 0
> 
> 
> Comparison of somatic.cnv for DO52140 using DKFZ
> ---
> Common: 275
> Extra: 94
>     - Example: 1:106505931:<LOH>,1:109068899:<DEL>,1:109359995:<DEL>
> Missing: 286
>     - Example: 10:88653561:<LOH>,11:179192:<LOH>,11:38252006:<LOH>
> 
> 
> Comparison of somatic.indel for DO52140 using DKFZ
> ---
> Common: 19347
> Extra: 0
> Missing: 0
> 
> 
> Comparison of somatic.snv.mnv for DO52140 using DKFZ
> ---
> Common: 37160
> Extra: 0
> Missing: 0
> 
> 
> 




