[DOCKTESTERS] Preliminary results for overlap between testing and original VCF

Miguel Vazquez miguel.vazquez at cnio.es
Fri Nov 18 08:48:07 EST 2016


Hi again

I've done some more investigating and it turns out that I was
ignoring the quite obvious 'FILTER' tag. Silly me. Filtering now for
mutations that 'PASS', I get:

Comparison
----------
Total original (dkfz): 16090
Total this: 16088
Common: 16088
Missing: 2. Example: 10:86361665:T, 3:168842417:G
Extra: 0. Example:

Not a perfect match, but very close!!!!
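
For reference, the comparison above can be sketched roughly as follows.
This is a minimal Python sketch, not the actual Rbbt script: the names
mutation_keys and compare and the toy records are made up for
illustration, and it assumes tab-separated VCF lines with mutations keyed
as CHROM:POS:ALT, as in the examples above. Whether the original calls
also need the 'PASS' filter is an assumption left open here, so only the
new VCF is filtered.

```python
def mutation_keys(vcf_lines, pass_only=True):
    """Collect 'CHROM:POS:ALT' keys from tab-separated VCF lines."""
    keys = set()
    for line in vcf_lines:
        if not line.strip() or line.startswith("#"):
            continue  # skip blank lines and header lines
        fields = line.rstrip("\n").split("\t")
        chrom, pos, alt, flt = fields[0], fields[1], fields[4], fields[6]
        if pass_only and flt != "PASS":
            continue  # drop records that failed one or more filters
        for allele in alt.split(","):
            keys.add(f"{chrom}:{pos}:{allele}")
    return keys


def compare(original, this):
    """Return the common, missing and extra mutation keys."""
    return {
        "common": original & this,
        "missing": original - this,
        "extra": this - original,
    }


# Toy records (hypothetical, not taken from the real files).
original_lines = [
    "1\t100\t.\tG\tT\t.\tPASS\tSOMATIC\n",
    "2\t200\t.\tA\tC\t.\tPASS\tSOMATIC\n",
]
this_lines = [
    "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO\n",
    "1\t100\t.\tG\tT\t.\tPASS\tSOMATIC\n",
    "1\t725971\t.\tG\tT\t.\tRE;BL;TAC\tSOMATIC\n",
]

original = mutation_keys(original_lines, pass_only=False)
result = compare(original, mutation_keys(this_lines))
unfiltered = compare(original, mutation_keys(this_lines, pass_only=False))

for name in ("common", "missing", "extra"):
    print(f"{name.capitalize()}: {len(result[name])}. "
          f"Example: {', '.join(sorted(result[name])[:4])}")
```

Without the filter (pass_only=False) the non-PASS record shows up as an
extra mutation, which is the effect seen in the first comparison.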

Best

Miguel


On Fri, Nov 18, 2016 at 2:06 PM, Miguel Vazquez <miguel.vazquez at cnio.es>
wrote:

> Dear Francis and friends,
>
> Given that Francis was eager to see some initial estimates of how well the
> tests were doing in terms of overlap, I have made some progress. Let me
> show you some of my initial results.
>
> For sample DO50311 with the pipeline from DKFZ (using Delly first to
> produce the BEDPE file) I get the following result:
>
>
>> Comparison
>> ----------
>> Total original (dkfz): 16090
>> Total this: 51087
>> Common: 16090
>> Missing: 0. Example:
>> Extra: 34997. Example: 1:10157:C, 1:725511:A, 1:725971:T, 1:726707:A
>>
>>
> This means that the original VCF contains 16K mutations, all of which
> are found in our new VCF (this); however, our new file contains 35K extra
> mutations. Some examples of the extra mutations are listed above. Going
> back to our VCF, here is a sample line:
>
> #CHROM  POS     ID  REF  ALT  QUAL  FILTER                          INFO                            FORMAT     CONTROL            TUMOR
> 1       725971  .   G    T    .     RE;BL;TAC;HSDEPTH;SBAF;FRQ;VAF  SOMATIC;SNP;AF=0.02,0.03;MQ=57  GT:DP:DP4  0/0:115:60,53,2,0  0/0:114:49,62,0,3
>
> I take it this is a good result. Finding all the reported mutations is a
> great sign, I think, and the extra mutations must be due to a filtering
> step that we need to account for. *I hope someone can point out from the
> VCF line above what it is that I need to use for the filtering.*
>
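
One way to inspect that sample line is to pair each value with its
header column. This is a minimal Python sketch: it assumes the fields are
tab-separated, as the VCF format specifies (the whitespace in this email
may not preserve the tabs), and the variable names are only illustrative.
In VCF, FILTER is 'PASS' when a record passed all filters, '.' when no
filters were applied, and otherwise a semicolon-separated list of the
filters that failed.

```python
header = "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO\tFORMAT\tCONTROL\tTUMOR"
record = (
    "1\t725971\t.\tG\tT\t.\t"
    "RE;BL;TAC;HSDEPTH;SBAF;FRQ;VAF\t"
    "SOMATIC;SNP;AF=0.02,0.03;MQ=57\t"
    "GT:DP:DP4\t0/0:115:60,53,2,0\t0/0:114:49,62,0,3"
)

# Map column names to the values of this record.
columns = header.lstrip("#").split("\t")
row = dict(zip(columns, record.split("\t")))

# 'PASS' means no filter failed; '.' means filters were not applied.
passes = row["FILTER"] == "PASS"
if row["FILTER"] in ("PASS", "."):
    failed_filters = []
else:
    failed_filters = row["FILTER"].split(";")

print("Passes:", passes)
print("Failed filters:", failed_filters)
```
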
> I took the *VCF files from a file I have named
> 'preliminary_final_release.snvs.tgz' from May 30*, which contains VCF
> files with the merged results from all callers. I simply subset the lines
> for each caller, in this case dkfz. Also, the files are listed by aliquot,
> so I have to translate the donor to an aliquot ID. I've scripted this
> quickly using my Rbbt framework, but I'll rewrite it all in bash and add it
> to my repo of testing scripts at https://github.com/mikisvaz/PCAWG-Docker-Test
>
> Summary of my progress
> -----------------------------------
>
> - Pipelines: DKFZ (works), Sanger (doesn't work. fixed?), BWA-Mem (not
> integrated; missing data-preparation step), Broad (??)
> - Donor integration: GNOS (works), ICGC (works)
> - Comparison: DKFZ (missing filtering?), rest (waiting)
>
> I have everything scripted so I can iterate a list of donors and download
> the data, run pipelines, erase data, compare results.
>
> Missing things on my ToDo list
> -------------------------------------------
>
> - Integrate BWA-Mem by incorporating the initial step to de-align the BAM
> files
> - Find a programmatic way to access the bundle-id files for each donor
> from the ICGC data portal; right now I have to go to the web page
> - Add filtering step to DKFZ and other pipelines as they become usable.
> - Change the scripting of the comparison to bash and add it to
> https://github.com/mikisvaz/PCAWG-Docker-Test
>
> Best regards to all
>
> Miguel
>
>
>
> On Tue, Nov 8, 2016 at 3:40 PM, Francis Ouellette <francis at oicr.on.ca>
> wrote:
>
>>
>> Anybody else on our poll for next call?
>> Looks like Friday at 11:00. I will close poll later today.
>>
>>
>> @bffo
>>
>>
>>
>>
>>
>>
>> _______________________________________________
>> docktesters mailing list
>> docktesters at lists.icgc.org
>> https://lists.icgc.org/mailman/listinfo/docktesters
>>
>>
>