[DOCKTESTERS] BWA-Mem validation of DO51057 normal) 0.013% miss-matches, and 3.7% soft-matches, tumor) 0.043% miss-matches, and 4.64% soft-matches

Miguel Vazquez miguel.vazquez at cnio.es
Tue Apr 11 10:39:21 EDT 2017


Hi Christina,

Fortunately all the steps are in scripts we we won't forget how it's done.

What you ask for is actually not really accurate anymore. The process of
reverting the aligned BAM is not what we ended up doing since it leads to a
3% rate of miss-matches. First we though that splitting the problem was
that the BWA was not ran over a single BAM but over a set of smaller BAM
files, but splitting the BAM didn't not address the 3% miss-match ratio. I
think it was Keiran we came to the conclusion that the problem was the
order of the reads inside the BAM. Junjun found that the DKFZ had saved
some of the original unaligned BAM files, which is what we ended up using
and reduced the miss-match rate from 3% to 0.014% and 0.04% for the normal
and tumor BAMs. Our latest discussion was about the order in which those
original BAMs are entered into the BWA to explain those small remaining
percentages; we have gained some insights but we don't plan to get to the
bottom of it.

Bottom line is that there is not process of reverting the aligned BAM into
the unaligned prior to running the BWA, rather the original unaligned BAMs
must be used.

Its a pity that it works this way, however it is not a realistic use-case
anyway to revert an already aligned BAM, is it?  If it's OK with you
Christina we can still document this on the repo. Do you mind writing this
down Jonas? I think you understand this better than I do.

Best regards

Miguel

On Tue, Apr 11, 2017 at 4:18 PM, Christina Yung <christina.yung at oicr.on.ca>
wrote:

> Hi Jonas & Miguel,
>
> After the effort you've put in to identify issues, could you document the
> steps to reproduce the alignment similar to production runs?  In other
> words, as a user who has downloaded a PCAWG aligned BAM, how do I revert
> back to lane BAMs, and input the lane BAMs in the right order to the
> docker?
>
> Denis, will github (https://github.com/ICGC-TCGA-
> PanCancer/Seqware-BWA-Workflow) be appropriate for such instructions?
>
> Thanks,
> Christina
>
> On 4/11/2017 9:10 AM, Miguel Vazquez wrote:
>
> Hi Jonas,
>
> I agree with leaving the matter rest following Lincoln's input last TC.
> There might be a lesson to be learned here, but unless someone here prompts
> us again about this we should move on. About your conclusion on BAM order
> and read order, I agree, it seems like read order is more important than
> BAM order, anyway I think you have a better understanding of this subject
> than I do.
>
> About the test that George is going to conduct, I guess he'll be fine just
> using my version of the scripts. At some point I'll try to incorporate your
> version of the BAM orders into my scripts so our version converge back. Not
> a pressing issue for now I think.
>
> Best regards
>
> Miguel
>
> On Tue, Apr 11, 2017 at 1:51 PM, Jonas Demeulemeester <
> Jonas.Demeulemeester at crick.ac.uk> wrote:
>
>> Hi Miguel,
>>
>> Thanks for the update, that was indeed the order I was looking for.
>> And you’re right, I was too quick, *only the number of matches is
>> exactly identical.*
>> The 24 (normal) and 48 (tumor) mismatches that became soft-matches in my
>> run I guess may be cases where a uniquely mapping and a non-uniquely
>> mapping read are flagged as duplicates and one is removed in your run and
>> the other in mine and the original run.
>> We could have a look at these reads, and try to trace this issue, but as
>> Lincoln mentioned, maybe we should switch our focus to the other
>> containers, and consider BWA-Mem validated given the observed small
>> mismatch rates.
>>
>> It seems as if differences in the sequence of bam file processing only
>> have a small effect on the final result (in this case at least).
>> Could it be that the order of reads within the original bams has a bigger
>> effect than the order of the bams themselves?
>>
>> Thanks,
>> Jonas
>>
>> _________________________________
>> Jonas Demeulemeester, PhD
>> Postdoctoral Researcher
>> The Francis Crick Institute
>> 1 Midland Road
>> London
>> NW1 1AT
>>
>> *T:* +44 (0)20 3796 2594 <+44%2020%203796%202594>
>> M: +44 (0)7482 070730 <+44%207482%20070730>
>> *E:* jonas.demeulemeester at crick.ac.uk
>> *W:* www.crick.ac.uk
>>
>>
>>
>> On 11 Apr 2017, at 12:00, Miguel Vazquez <miguel.vazquez at cnio.es> wrote:
>>
>> Hi Jonas,
>>
>> About the BAM order in the header I have some lines that start with @PG
>> and then have a "CL:" field with the command line, I guess you are
>> referring to those. The order is actually 1,2,3,4,5 and 6, which is the one
>> I used.
>>
>> About the numbers, they are almost the same, yet not entirely the same.
>> There are 442902 miss-matches in yours and 442926 in mine. So it appears
>> that 24 of my miss-matches became soft-matches in yours. I would have
>> expected a move between matches and soft-matches but not with miss-matches
>> and soft-matches. It's a bit odd. I could send you the list of my
>> miss-matches and we can find out which are the ones that moved and find out
>> why.
>>
>> On Tue, Apr 11, 2017 at 11:43 AM, Jonas Demeulemeester <
>> Jonas.Demeulemeester at crick.ac.uk> wrote:
>>
>>> Hi all,
>>>
>>> I’ve completed the testing run of the BWA-Mem docker on PCAWG donor
>>> DO51057.
>>> Briefly, like Miguel’s run, this test used the original unaligned bam
>>> files for DO51057, but feeds them via the JSON file into the docker in a
>>> slightly different order (as recorded in the @PG/@CL lines in the original
>>> mapped PCAWG bams)
>>> Results of the comparison are as follows:
>>>
>>> *Matched normal*:
>>> Lines: 1125172217
>>> Matches: 1083221794
>>> Misses: 143668
>>> Soft: 41806755
>>>
>>> *Tumor:*
>>> Lines: 1010685786
>>> Matches: 963319037 <963%2031%2090%2037>
>>> Misses: 442902
>>> Soft: 46923847
>>>
>>> Which are *exactly the numbers reported by Miguel *(resulting in
>>> *0.043%* and *0.013%* mismatch rates).
>>> The fact that the numbers match exactly comes as a bit of a surprise I
>>> think, but shows that the current pipeline is highly reproducible, even
>>> across platforms.
>>>
>>> @Miguel, could you verify the order of mapping of the different read
>>> groups in the header of your newly mapped bams.
>>> For the original and newly mapped normal bams the order recorded in the
>>> @CL lines is CPCG_0098_Ly_R_PE_517_WG.*3 - 6 - 1 - 4 - 2 - 5*
>>> For the tumor bams the order is: CPCG_0098_Pr_P_PE_500_WG.*5 - 3 - 6 -
>>> 2 - 4 - 1*
>>> Do you observe the same or a different order?
>>> If it’s the same, then the pipeline does some internal reordering and
>>> the order of records in the JSON doesn’t matter.
>>> If not, then the order of the bams doesn’t seem to matter as much (at
>>> least in this case), but maybe rather the order of reads within the bams
>>> (as evidenced by our high error rates previously).
>>>
>>> Looking forward to hearing your thoughts on this!
>>> Jonas
>>>
>>>
>>> _________________________________
>>> Jonas Demeulemeester, PhD
>>> Postdoctoral Researcher
>>> The Francis Crick Institute
>>> 1 Midland Road
>>> London
>>> NW1 1AT
>>>
>>> *T:* +44 (0)20 3796 2594 <+44%2020%203796%202594>
>>> M: +44 (0)7482 070730 <+44%207482%20070730>
>>> *E:* jonas.demeulemeester at crick.ac.uk
>>> *W:* www.crick.ac.uk
>>>
>>>
>>>
>>> On 10 Apr 2017, at 13:15, Miguel Vazquez <miguel.vazquez at cnio.es> wrote:
>>>
>>> Hi all,
>>>
>>> The comparison with the *tumor BAM* for DO51057 has completed with *
>>> rates of miss-maches (**0.043%) and soft-matches (**4.64%) just
>>> slightly higher* *than for the* *normal BAM*. These *numbers are not
>>> definitive* since as you can read from Jonas just below, *there might
>>> still be a discrepancy* in the order in which the BAM where processed.
>>> We'll soon know from Jonas if a different order will fix these rates even
>>> more.
>>>
>>> *Lines*: 1010685786
>>> *Matches*: 963319037 <963%2031%2090%2037>
>>> *Misses*: 442926
>>> *Soft*: 46923823
>>>
>>> Best regards
>>>
>>> Miguel
>>>
>>> On Mon, Apr 10, 2017 at 1:14 PM, Jonas Demeulemeester <
>>> Jonas.Demeulemeester at crick.ac.uk> wrote:
>>>
>>>> Hi all,
>>>>
>>>> I’m currently running the comparison of the BWA-Mem docker reproduced
>>>> bams and the PCAWG ones for DO51057.
>>>> I should be able to send a report some time today.
>>>>
>>>> Miguel, looking at your code, I believe you’re feeding the unaligned
>>>> bams into the pipeline in the order given by the read group lines (@RG) in
>>>> the header of the PCAWG bam.
>>>> I’m using the order recorded in the command line/programs used
>>>> (@CL/@PG) lines of the PCAWG bam, which is often different for whatever
>>>> reason.
>>>> I’m not entirely sure which one is the correct one, but I’m guessing
>>>> the one in the @CL/@PG lines is the actual one as it chronologically
>>>> reiterates the whole procedure ( [align - sort] x N followed by merge +
>>>> flag dups )
>>>> If this is the case, the true % mismatches may be lower still than
>>>> 0.013%, if not, then I should see a higher mismatch rate and the 0.013% is
>>>> due to something else still.
>>>>
>>>> Regarding the soft-matches, I agree with Junjun, we may want to ask the
>>>> people behind the variant callers, but I guess they are probably dealing
>>>> with these multiply-mapping reads internally.
>>>>
>>>> Best,
>>>> Jonas
>>>>
>>>>
>>>> _________________________________
>>>> Jonas Demeulemeester, PhD
>>>> Postdoctoral Researcher
>>>> The Francis Crick Institute
>>>> 1 Midland Road
>>>> London
>>>> NW1 1AT
>>>>
>>>> *T:* +44 (0)20 3796 2594 <+44%2020%203796%202594>
>>>> M: +44 (0)7482 070730 <+44%207482%20070730>
>>>> *E:* jonas.demeulemeester at crick.ac.uk
>>>> *W:* www.crick.ac.uk
>>>>
>>>>
>>>>
>>>> On 9 Apr 2017, at 15:47, Junjun Zhang <Junjun.Zhang at oicr.on.ca> wrote:
>>>>
>>>> Hi Miguel,
>>>>
>>>> This is indeed good news, the mismatch is significantly lower.
>>>>
>>>> Regarding soft matches, thanks for the explanation. I wonder whether it
>>>> has impact (or how much impact) on variant calls, do variant callers take
>>>> into account the information that a read may map to multiple places? Does
>>>> it make adjustment at the time of variant calling? I guess these are
>>>> questions for variant caller authors.
>>>>
>>>> Thanks,
>>>> Junjun
>>>>
>>>>
>>>>
>>>> From: <docktesters-bounces+junjun.zhang=oicr.on.ca at lists.icgc.org> on
>>>> behalf of Miguel Vazquez <miguel.vazquez at cnio.es>
>>>> Date: Thursday, April 6, 2017 at 11:36 AM
>>>> To: Lincoln Stein <lincoln.stein at gmail.com>
>>>> Cc: Francis Ouellette <francis at oicr.on.ca>, Keiran Raine <
>>>> kr2 at sanger.ac.uk>, "docktesters at lists.icgc.org" <
>>>> docktesters at lists.icgc.org>
>>>> Subject: Re: [DOCKTESTERS] BWA-Mem validation of DO51057 (normal BAM
>>>> only). 96.3% matches, 0.013% miss-matches, and 3.7% soft-matches
>>>>
>>>> Hi Lincoln,
>>>>
>>>> Soft-match means that the alignment position in the new BAM is not the
>>>> same is the one in the original BAM, but is included in the list of
>>>> alternative alignments for that read.
>>>>
>>>> For instance, the original bam aligns a read to chr 1 pos 1000, but
>>>> also admits that is could be aligned at chr 2 pos 2000 or chr 3 pos 3000,
>>>> the new bam aligns it at chr 2 pos 2000, which is not the position chosen
>>>> by the original BAM but is in the alternative list. It could also work the
>>>> other way, that the original position is included in the list of
>>>> alternative positions of the new BAM
>>>>
>>>> I hope this was clear.
>>>>
>>>> Best regards
>>>>
>>>> Miguel
>>>>
>>>>
>>>>
>>>>
>>>>
>>>>
>>>> On Thu, Apr 6, 2017 at 4:55 PM, Lincoln Stein <lincoln.stein at gmail.com>
>>>> wrote:
>>>>
>>>>> Hi Miguel,
>>>>>
>>>>> Sounds like a significant achievement! But remind me what a "soft
>>>>> match" is?
>>>>>
>>>>> Lincoln
>>>>>
>>>>> On Thu, Apr 6, 2017 at 10:28 AM, Miguel Vazquez <
>>>>> miguel.vazquez at cnio.es> wrote:
>>>>>
>>>>>> Dear all,
>>>>>>
>>>>>> This is just an advance teaser for the BWA-Mem validation after the
>>>>>> latest changes, it is currently running over the tumor BAM, but the normal
>>>>>> BAM has completed and the *missmatches are two orders of magnitude
>>>>>> lower* than in our two previous attempts. Before further discussion
>>>>>> here are the raw numbers:
>>>>>>
>>>>>> Lines: 1125172217
>>>>>> Matches: 1083221794
>>>>>> *Misses: 143716*
>>>>>> Soft: 41806707
>>>>>>
>>>>>> If my calculation are correct this means 96.3% matches, *0.013%
>>>>>> miss-matches*, and 3.7% soft-matches.
>>>>>>
>>>>>> The fix was two part. First realizing that the input of this process
>>>>>> should not be a single unaligned version of the output BAMs, but several
>>>>>> input BAMs. Breaking down the output bam into it's constituent BAMs, by a
>>>>>> process implemented by Jonas, dit not address the problem unfortunately.
>>>>>> After this first attempt it was pointed out to us, I think by Keiran, that
>>>>>> the order of the reads matter, and so our attempt to work back from the
>>>>>> output BAM was not going to work. Junjun came back to us with the second
>>>>>> part of the fix, he located a subset of original unaligned BAMs in the DKFZ
>>>>>> that we could use. Downloading these BAM files and submitting them to
>>>>>> BWA-Mem in the same order as was specified in the output BAM header
>>>>>> achieved these promising results.
>>>>>>
>>>>>> I will reply this message in a few days with the corresponding
>>>>>> numbers for the other BAM, the tumor, which is currently running.
>>>>>>
>>>>>> Best regards
>>>>>>
>>>>>> Miguel
>>>>>>
>>>>>>
>>>>>>
>>>>>> On Sun, Feb 19, 2017 at 1:43 PM, Miguel Vazquez <
>>>>>> miguel.vazquez at cnio.es> wrote:
>>>>>>
>>>>>>> Dear all,
>>>>>>>
>>>>>>> Great news! The BWA-Mem test on a real PCAWG donor succeed in
>>>>>>> running; achieving an overlap with the original BAM alignment similar to
>>>>>>> the HCC1143 test. The numbers are:
>>>>>>>
>>>>>>> Lines: 1708047647
>>>>>>> Matches: 1589172843
>>>>>>> Misses: 62726130
>>>>>>> Soft: 56148674
>>>>>>>
>>>>>>> Which mean 93% matches, 3.6% miss-matches, and 3.2% soft-matches.
>>>>>>> Compared to the HCC1143 test there are a few percentage points in matches
>>>>>>> that turn into soft-matches (95% and 1.3% to 93% and 3.2%), but the ratio
>>>>>>> of misses is very close 3.6%.
>>>>>>>
>>>>>>> I'm running this test on a second donor.
>>>>>>>
>>>>>>> Best regards
>>>>>>>
>>>>>>> Miguel
>>>>>>>
>>>>>>> On Tue, Feb 14, 2017 at 3:30 PM, Miguel Vazquez <
>>>>>>> miguel.vazquez at cnio.es> wrote:
>>>>>>>
>>>>>>>> Dear colleagues,
>>>>>>>>
>>>>>>>> I'm very happy to say that the BWA-Mem pipeline finished for the
>>>>>>>> HCC1143 data.
>>>>>>>>
>>>>>>>> I think what solved the problem was setting the headers to the
>>>>>>>> unaligned BAM files. I'm currently trying it out with the DO35937 donor,
>>>>>>>> but its too early to say if its working or not.
>>>>>>>>
>>>>>>>> To compare BAM files I've followed some advice that I found on the
>>>>>>>> internet https://www.biostars.org/p/166221/. I will detail them a
>>>>>>>> bit below because I would like some advice as to how appropriate the
>>>>>>>> approach is, but first here are the numbers:
>>>>>>>>
>>>>>>>> *Lines*: 74264390
>>>>>>>> *Matches*: 70565742
>>>>>>>> *Misses*: 2693687
>>>>>>>> *Soft*: 1004961
>>>>>>>>
>>>>>>>>
>>>>>>>> Which means *95% matches, 3.6% miss-matches, and 1.3% soft-matches*.
>>>>>>>> Matches are when the chromosome and position are the same, soft-matches are
>>>>>>>> when they are not the same but the position from one of the alignments is
>>>>>>>> included in the list of alternative positions for the other alignment (e.g
>>>>>>>> XA:Z:15,-102516528,76M,0), and misses are the rest.
>>>>>>>>
>>>>>>>> Here is the detailed process from the start. The comparison script
>>>>>>>> is here https://github.com/mikisvaz/PC
>>>>>>>> AWG-Docker-Test/blob/master/bin/compare_bwa_bam.sh
>>>>>>>>
>>>>>>>> 1) Un-align tumor and normal BAM files, retaining the original
>>>>>>>> aligned BAM files
>>>>>>>> 2) Run BWA-Mem wich produces a file called
>>>>>>>> HCC1143.merged_output.bam with alignments from both tumor and normal
>>>>>>>> 3) use samtools to extract the entries, limited for the first in
>>>>>>>> pair (?), cut the read-name, chromosome, position (??) and extra
>>>>>>>> information (for additional alignments) and sort them. We do this for the
>>>>>>>> original files and for the BWA-Mem merged_output file, but separating tumor
>>>>>>>> and normal entries (marked with the codes 'tumor' and 'normal', I believe
>>>>>>>> from the headers I set when un-aligning them)
>>>>>>>> 4) join the lines by read-name, separately for the tumor and normal
>>>>>>>> pairs of files, and check for matches
>>>>>>>>
>>>>>>>> I've two questions:
>>>>>>>> (?) Is it OK to select only the first in pair, its what the guy in
>>>>>>>> the example did, and it did simplify the code without repeated read-names
>>>>>>>> (??) I guess its OK to only check chromosome and position, the
>>>>>>>> cigar would be necessarily the same.
>>>>>>>>
>>>>>>>> Best regards
>>>>>>>>
>>>>>>>> Miguel
>>>>>>>>
>>>>>>>> On Mon, Jan 16, 2017 at 3:24 PM, Miguel Vazquez <
>>>>>>>> miguel.vazquez at cnio.es> wrote:
>>>>>>>>
>>>>>>>>> Dear all,
>>>>>>>>>
>>>>>>>>> Let me summarize the status of the testing for Sanger and DKFZ.
>>>>>>>>> The validation has been run for two donors for each workflow: DO50311
>>>>>>>>> DO52140
>>>>>>>>>
>>>>>>>>> Sanger:
>>>>>>>>> ----------
>>>>>>>>>
>>>>>>>>> Sanger call only somatic variants. The results are *identical for
>>>>>>>>> Indels and SVs* but *almost identical for SNV.MNV and CNV*. The
>>>>>>>>> discrepancies are reproducible (on the same machine at least), i.e. the
>>>>>>>>> same are found after running the workflow a second time.
>>>>>>>>>
>>>>>>>>> DKFZ:
>>>>>>>>> ---------
>>>>>>>>> DKFZ cals somatic and germline variants, except germline CNVs. For
>>>>>>>>> both germline and somatic variants the results are *identical for
>>>>>>>>> SNV.MNV and Indels* but with *large discrepancies for SV and CNV*.
>>>>>>>>>
>>>>>>>>> Kortine Kleinheinz and Joachim Weischenfeldt are in the process
>>>>>>>>> of investigating this issue I believe.
>>>>>>>>>
>>>>>>>>> BWA-Mem failed for me and has also failed for Denis Yuen and Jonas
>>>>>>>>> Demeulemeester. Denis I believe is investigating this problem further. I
>>>>>>>>> haven't had the chance to investigate this much myself.
>>>>>>>>>
>>>>>>>>> Best
>>>>>>>>>
>>>>>>>>> Miguel
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> ---------------------
>>>>>>>>> RESULTS
>>>>>>>>> ---------------------
>>>>>>>>>
>>>>>>>>> ubuntu at ip-10-253-35-14:~/DockerTest-Miguel$ cat results.txt
>>>>>>>>>
>>>>>>>>> Comparison of somatic.snv.mnv for DO50311 using DKFZ
>>>>>>>>> ---
>>>>>>>>> Common: 51087
>>>>>>>>> Extra: 0
>>>>>>>>> Missing: 0
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> Comparison of somatic.indel for DO50311 using DKFZ
>>>>>>>>> ---
>>>>>>>>> Common: 26469
>>>>>>>>> Extra: 0
>>>>>>>>> Missing: 0
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> Comparison of somatic.sv for DO50311 using DKFZ
>>>>>>>>> ---
>>>>>>>>> Common: 231
>>>>>>>>> Extra: 44
>>>>>>>>>     - Example: 10:20596800:N:<TRA>,10:5606682
>>>>>>>>> 1:N:<TRA>,11:16776092:N:<TRA>
>>>>>>>>> Missing: 48
>>>>>>>>>     - Example: 10:119704959:N:<INV>,10:131163
>>>>>>>>> 22:N:<TRA>,10:47063485:N:<TRA>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> Comparison of somatic.cnv for DO50311 using DKFZ
>>>>>>>>> ---
>>>>>>>>> Common: 731
>>>>>>>>> Extra: 213
>>>>>>>>>     - Example: 10:132510034:N:<DEL>,10:205968
>>>>>>>>> 01:N:<NEUTRAL>,10:47674883:N:<NEUTRAL>
>>>>>>>>> Missing: 190
>>>>>>>>>     - Example: 10:100891940:N:<NEUTRAL>,10:10
>>>>>>>>> 4975905:N:<NEUTRAL>,10:119704960:N:<NEUTRAL>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> Comparison of germline.snv.mnv for DO50311 using DKFZ
>>>>>>>>> ---
>>>>>>>>> Common: 3850992
>>>>>>>>> Extra: 0
>>>>>>>>> Missing: 0
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> Comparison of germline.indel for DO50311 using DKFZ
>>>>>>>>> ---
>>>>>>>>> Common: 709060
>>>>>>>>> Extra: 0
>>>>>>>>> Missing: 0
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> Comparison of germline.sv for DO50311 using DKFZ
>>>>>>>>> ---
>>>>>>>>> Common: 1393
>>>>>>>>> Extra: 231
>>>>>>>>>     - Example: 10:134319313:N:<DEL>,10:134948
>>>>>>>>> 976:N:<DEL>,10:19996638:N:<DEL>
>>>>>>>>> Missing: 615
>>>>>>>>>     - Example: 10:101851839:N:<TRA>,10:101851
>>>>>>>>> 884:N:<TRA>,10:10745225:N:<DUP>
>>>>>>>>>
>>>>>>>>> File not found /mnt/1TB/work/DockerTest-Migue
>>>>>>>>> l/tests/DKFZ/DO50311//output//DO50311.germline.cnv.vcf.gz
>>>>>>>>>
>>>>>>>>> Comparison of somatic.snv.mnv for DO52140 using DKFZ
>>>>>>>>> ---
>>>>>>>>> Common: 37160
>>>>>>>>> Extra: 0
>>>>>>>>> Missing: 0
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> Comparison of somatic.indel for DO52140 using DKFZ
>>>>>>>>> ---
>>>>>>>>> Common: 19347
>>>>>>>>> Extra: 0
>>>>>>>>> Missing: 0
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> Comparison of somatic.sv for DO52140 using DKFZ
>>>>>>>>> ---
>>>>>>>>> Common: 72
>>>>>>>>> Extra: 23
>>>>>>>>>     - Example: 10:132840774:N:<DEL>,11:382520
>>>>>>>>> 19:N:<TRA>,11:47700673:N:<TRA>
>>>>>>>>> Missing: 61
>>>>>>>>>     - Example: 10:134749140:N:<DEL>,11:179191
>>>>>>>>> :N:<TRA>,11:38252005:N:<TRA>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> Comparison of somatic.cnv for DO52140 using DKFZ
>>>>>>>>> ---
>>>>>>>>> Common: 275
>>>>>>>>> Extra: 94
>>>>>>>>>     - Example: 1:106505931:N:<LOH>,1:10906889
>>>>>>>>> 9:N:<DEL>,1:109359995:N:<DEL>
>>>>>>>>> Missing: 286
>>>>>>>>>     - Example: 10:88653561:N:<LOH>,11:179192:
>>>>>>>>> N:<LOH>,11:38252006:N:<LOH>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> Comparison of germline.snv.mnv for DO52140 using DKFZ
>>>>>>>>> ---
>>>>>>>>> Common: 3833896
>>>>>>>>> Extra: 0
>>>>>>>>> Missing: 0
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> Comparison of germline.indel for DO52140 using DKFZ
>>>>>>>>> ---
>>>>>>>>> Common: 706572
>>>>>>>>> Extra: 0
>>>>>>>>> Missing: 0
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> Comparison of germline.sv for DO52140 using DKFZ
>>>>>>>>> ---
>>>>>>>>> Common: 1108
>>>>>>>>> Extra: 1116
>>>>>>>>>     - Example: 10:102158308:N:<DEL>,10:104645
>>>>>>>>> 247:N:<DEL>,10:105097522:N:<DEL>
>>>>>>>>> Missing: 2908
>>>>>>>>>     - Example: 10:100107032:N:<TRA>,10:100107
>>>>>>>>> 151:N:<TRA>,10:102158345:N:<DEL>
>>>>>>>>>
>>>>>>>>> File not found /mnt/1TB/work/DockerTest-Migue
>>>>>>>>> l/tests/DKFZ/DO52140//output//DO52140.germline.cnv.vcf.gz
>>>>>>>>>
>>>>>>>>> Comparison of somatic.snv.mnv for DO50311 using Sanger
>>>>>>>>> ---
>>>>>>>>> Common: 156299
>>>>>>>>> Extra: 1
>>>>>>>>>     - Example: Y:58885197:A:G
>>>>>>>>> Missing: 14
>>>>>>>>>     - Example: 1:102887902:A:T,1:143165228:C:G,16:87047601:A:C
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> Comparison of somatic.indel for DO50311 using Sanger
>>>>>>>>> ---
>>>>>>>>> Common: 812487
>>>>>>>>> Extra: 0
>>>>>>>>> Missing: 0
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> Comparison of somatic.sv for DO50311 using Sanger
>>>>>>>>> ---
>>>>>>>>> Common: 260
>>>>>>>>> Extra: 0
>>>>>>>>> Missing: 0
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> Comparison of somatic.cnv for DO50311 using Sanger
>>>>>>>>> ---
>>>>>>>>> Common: 138
>>>>>>>>> Extra: 0
>>>>>>>>> Missing: 0
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> Comparison of somatic.snv.mnv for DO52140 using Sanger
>>>>>>>>> ---
>>>>>>>>> Common: 87234
>>>>>>>>> Extra: 5
>>>>>>>>>     - Example: 1:23719098:A:G,12:43715930:T:A,20:4058335:T:A
>>>>>>>>> Missing: 7
>>>>>>>>>     - Example: 10:6881937:A:T,1:148579866:A:G,11:9271589:T:A
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> Comparison of somatic.indel for DO52140 using Sanger
>>>>>>>>> ---
>>>>>>>>> Common: 803986
>>>>>>>>> Extra: 0
>>>>>>>>> Missing: 0
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> Comparison of somatic.sv for DO52140 using Sanger
>>>>>>>>> ---
>>>>>>>>> Common: 6
>>>>>>>>> Extra: 0
>>>>>>>>> Missing: 0
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> Comparison of somatic.cnv for DO52140 using Sanger
>>>>>>>>> ---
>>>>>>>>> Common: 36
>>>>>>>>> Extra: 0
>>>>>>>>> Missing: 2
>>>>>>>>>     - Example: 10:11767915:T:<CNV>,10:11779907:G:<CNV>
>>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>
>>>>>>
>>>>>
>>>>>
>>>>> --
>>>>> *Lincoln Stein*
>>>>>
>>>>> Scientific Director (Interim), Ontario Institute for Cancer Research
>>>>> Director, Informatics and Bio-computing Program, OICR
>>>>> Senior Principal Investigator, OICR
>>>>> Professor, Department of Molecular Genetics, University of Toronto
>>>>>
>>>>> <http://goog_1828306398/>
>>>>> *Ontario Institute for Cancer Research*
>>>>> MaRS Centre
>>>>> 661 University Avenue
>>>>> Suite 510
>>>>> Toronto, Ontario
>>>>> Canada M5G 0A3
>>>>>
>>>>> Tel: 416-673-8514
>>>>> Mobile: 416-817-8240
>>>>> Email: lincoln.stein at gmail.com
>>>>> Toll-free: 1-866-678-6427
>>>>> Twitter: @OICR_news
>>>>>
>>>>> *Executive Assistant*
>>>>> *Melisa Torres*
>>>>> Tel:  647-259-4253
>>>>> Email: melisa.torres at oicr.on.ca <stacey.quinn at oicr.on.ca>
>>>>> www.oicr.on.ca
>>>>>
>>>>> This message and any attachments may contain confidential and/or
>>>>> privileged information for the sole use of the intended recipient. Any
>>>>> review or distribution by anyone other than the person for whom it was
>>>>> originally intended is strictly prohibited. If you have received this
>>>>> message in error, please contact the sender and delete all copies.
>>>>> Opinions, conclusions or other information contained in this message may
>>>>> not be that of the organization.
>>>>>
>>>>> _______________________________________________
>>>>> docktesters mailing list
>>>>> docktesters at lists.icgc.org
>>>>> https://lists.icgc.org/mailman/listinfo/docktesters
>>>>>
>>>>>
>>>> _______________________________________________
>>>> docktesters mailing list
>>>> docktesters at lists.icgc.org
>>>> https://lists.icgc.org/mailman/listinfo/docktesters
>>>>
>>>>
>>>> The Francis Crick Institute Limited is a registered charity in England
>>>> and Wales no. 1140062 and a company registered in England and Wales no.
>>>> 06885462, with its registered office at 1 Midland Road London NW1 1AT
>>>>
>>>> _______________________________________________
>>>> docktesters mailing list
>>>> docktesters at lists.icgc.org
>>>> https://lists.icgc.org/mailman/listinfo/docktesters
>>>>
>>>>
>>>
>>> The Francis Crick Institute Limited is a registered charity in England
>>> and Wales no. 1140062 and a company registered in England and Wales no.
>>> 06885462, with its registered office at 1 Midland Road London NW1 1AT
>>>
>>> _______________________________________________
>>> docktesters mailing list
>>> docktesters at lists.icgc.org
>>> https://lists.icgc.org/mailman/listinfo/docktesters
>>>
>>>
>>
>> The Francis Crick Institute Limited is a registered charity in England
>> and Wales no. 1140062 and a company registered in England and Wales no.
>> 06885462, with its registered office at 1 Midland Road London NW1 1AT
>>
>> _______________________________________________
>> docktesters mailing list
>> docktesters at lists.icgc.org
>> https://lists.icgc.org/mailman/listinfo/docktesters
>>
>>
>
>
> _______________________________________________
> docktesters mailing listdocktesters at lists.icgc.orghttps://lists.icgc.org/mailman/listinfo/docktesters
>
>
>
> _______________________________________________
> docktesters mailing list
> docktesters at lists.icgc.org
> https://lists.icgc.org/mailman/listinfo/docktesters
>
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <https://lists.icgc.org/mailman/private/docktesters/attachments/20170411/d320d71a/attachment-0001.html>


More information about the docktesters mailing list