From Denis.Yuen at oicr.on.ca Wed Mar 1 10:26:01 2017
From: Denis.Yuen at oicr.on.ca (Denis Yuen)
Date: Wed, 1 Mar 2017 15:26:01 +0000
Subject: [DOCKTESTERS] Thanks!
Message-ID: <26d1914151c94301bcc761ef88aaa011@oicr.on.ca>
Hi,
Just wanted to say thanks to Miguel and Jonas for keeping the workflow testing data page up-to-date.
https://wiki.oicr.on.ca/display/PANCANCER/Workflow+Testing+Data
As we work on new versions or on debugging, it is invaluable to know which versions of the workflows have worked outside OICR. Thanks!
Denis Yuen
Senior Software Developer
Ontario Institute for Cancer Research
MaRS Centre
661 University Avenue
Suite 510
Toronto, Ontario, Canada M5G 0A3
Toll-free: 1-866-678-6427
Twitter: @OICR_news
www.oicr.on.ca
This message and any attachments may contain confidential and/or privileged information for the sole use of the intended recipient. Any review or distribution by anyone other than the person for whom it was originally intended is strictly prohibited. If you have received this message in error, please contact the sender and delete all copies. Opinions, conclusions or other information contained in this message may not be that of the organization.
From christina.yung at oicr.on.ca Mon Mar 6 09:43:44 2017
From: christina.yung at oicr.on.ca (Christina Yung)
Date: Mon, 6 Mar 2017 08:43:44 -0600
Subject: [DOCKTESTERS] Fwd: PCAWG-TECH Author Form
In-Reply-To: <6d28eeef5e2d47a1ba82ea42a3013ff8@oicr.on.ca>
References: <6d28eeef5e2d47a1ba82ea42a3013ff8@oicr.on.ca>
Message-ID: <6a9200eb-1317-e2cd-d1c6-372defff08c4@oicr.on.ca>
An HTML attachment was scrubbed...
URL:
From Junjun.Zhang at oicr.on.ca Fri Mar 10 15:51:53 2017
From: Junjun.Zhang at oicr.on.ca (Junjun Zhang)
Date: Fri, 10 Mar 2017 20:51:53 +0000
Subject: [DOCKTESTERS] Thanks!
Message-ID:
Dear Docktesters,
George Mihaiescu, cloud architect of the Collaboratory at OICR, plans to run some bioinformatics workflows to test the Collab environment.
I thought this would be a good opportunity to get extra help testing the PCAWG dockerized workflows.
Miguel, Denis and others, which workflows and datasets do you think would be good for George to run?
Thanks,
Junjun
From miguel.vazquez at cnio.es Sat Mar 11 10:57:23 2017
From: miguel.vazquez at cnio.es (Miguel Vazquez)
Date: Sat, 11 Mar 2017 16:57:23 +0100
Subject: [DOCKTESTERS] Thanks!
In-Reply-To:
References:
Message-ID:
Hi Junjun,
I think Jonas has been using my scripts to run some of the tests. Maybe
George could try them as well; it should be very easy for him to try the
Sanger, Delly+DKFZ, BWA-Mem, and BiasFilter workflows.
https://github.com/mikisvaz/PCAWG-Docker-Test
He would just need to update the tokens for DACO access, and the scripts
will take care of downloading the BAM files, running the workflows and
evaluating the results.
The documentation there is reasonably up to date, but if this sounds good
then perhaps he could contact me and I could walk him through the details.
Best regards
Miguel
From George.Mihaiescu at oicr.on.ca Sat Mar 11 11:00:10 2017
From: George.Mihaiescu at oicr.on.ca (George Mihaiescu)
Date: Sat, 11 Mar 2017 16:00:10 +0000
Subject: [DOCKTESTERS] Thanks!
In-Reply-To:
Message-ID:
Sure, I'll give it a try and report later.
Thank you,
George Mihaiescu
Senior Cloud Architect
Ontario Institute for Cancer Research
MaRS Centre
661 University Avenue
Suite 510
Toronto, Ontario
Canada M5G 0A3
Email: George.Mihaiescu at oicr.on.ca
Toll-free: 1-866-678-6427
Twitter: @OICR_news
www.oicr.on.ca
From Jonas.Demeulemeester at crick.ac.uk Sat Mar 11 18:15:14 2017
From: Jonas.Demeulemeester at crick.ac.uk (Jonas Demeulemeester)
Date: Sat, 11 Mar 2017 23:15:14 +0000
Subject: [DOCKTESTERS] Thanks!
In-Reply-To:
References: ,
Message-ID: <570FCD5C-E577-4CBA-A741-7ADC562CFB65@crick.ac.uk>
Hi George,
Yup, I've been running the PCAWG dockers mainly using Miguel's set of scripts.
Give them a go and if you run into issues, just let us know!
Cheers,
Jonas
The Francis Crick Institute Limited is a registered charity in England and Wales no. 1140062 and a company registered in England and Wales no. 06885462, with its registered office at 1 Midland Road London NW1 1AT
From Junjun.Zhang at oicr.on.ca Sun Mar 12 23:45:14 2017
From: Junjun.Zhang at oicr.on.ca (Junjun Zhang)
Date: Mon, 13 Mar 2017 03:45:14 +0000
Subject: [DOCKTESTERS] Thanks!
In-Reply-To: <570FCD5C-E577-4CBA-A741-7ADC562CFB65@crick.ac.uk>
References:
<570FCD5C-E577-4CBA-A741-7ADC562CFB65@crick.ac.uk>
Message-ID:
Thanks Miguel and Jonas for your help here!
Do you have any updates on the latest testing? Please feel free to add them to the wiki: https://wiki.oicr.on.ca/display/PANCANCER/2017-03-13+PCAWG-TECH+Teleconference
Regards,
Junjun
From George.Mihaiescu at oicr.on.ca Mon Mar 13 00:12:09 2017
From: George.Mihaiescu at oicr.on.ca (George Mihaiescu)
Date: Mon, 13 Mar 2017 04:12:09 +0000
Subject: [DOCKTESTERS] Thanks!
In-Reply-To:
Message-ID:
Hi,
I've started Sanger on DO50398 and it's been running for more than 24 hours. It is currently at "Workflow step succeeded: s58_bbAllele_merge_59".
I just started a second run on a different VM on the same donor, just to compare run times.
The VM used has 8 cores, 48 GB of RAM and a 1.1 TB disk. I'll send some monitoring graphs when it finishes the workflow, but I have no idea how to check its correctness.
Give me a list of donors and what workflows you want me to run and I'll try to schedule them tomorrow.
George
From mikisvaz at gmail.com Mon Mar 13 07:53:03 2017
From: mikisvaz at gmail.com (Miguel Vazquez)
Date: Mon, 13 Mar 2017 12:53:03 +0100
Subject: [DOCKTESTERS] Thanks!
In-Reply-To:
References:
Message-ID:
Hi George,
The Sanger workflow is very lengthy; it takes about two weeks in my tests.
About correctness, my scripts also cover that part. Even if you are not
using them, they may help clarify how we do it. The idea is to take each of
the output files produced (SNV_MNV, Indel, SV, and CNV, for both germline
and somatic) and compare it with the result uploaded to GNOS (not all
pipelines produce all files). This is the relevant part of the run_batch.sh
script:
https://github.com/mikisvaz/PCAWG-Docker-Test/blob/master/bin/run_batch.sh#L42-L46
The bin/compare_result_type.sh script takes care of downloading the
correct file from GNOS and running the comparison. The comparison itself is
simple since all files are VCFs: it consists of extracting each variant as
chromosome, position, reference allele and alternative allele, and
measuring the overlap between the two sets.
https://github.com/mikisvaz/PCAWG-Docker-Test/blob/master/bin/compare_result_type.sh
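For anyone not using the scripts, the comparison described above can be sketched roughly as follows. This is only an illustration in Python, not the code in bin/compare_result_type.sh: the function names (variant_keys, overlap_summary) and the tiny in-memory VCF lines are made up for the example, and splitting multi-allelic ALT fields into one key per allele is an assumption about how one might want to count them.

```python
from typing import Dict, Iterable, Set, Tuple

# (chromosome, position, reference allele, alternative allele)
Variant = Tuple[str, int, str, str]


def variant_keys(vcf_lines: Iterable[str]) -> Set[Variant]:
    """Collect one key per ALT allele from the data lines of a VCF."""
    keys: Set[Variant] = set()
    for line in vcf_lines:
        line = line.rstrip("\n")
        if not line or line.startswith("#"):
            continue  # skip blank lines, meta lines and the column header
        fields = line.split("\t")
        if len(fields) < 5:
            continue  # not a valid data line
        chrom, pos, _id, ref, alts = fields[:5]
        # Assumption: a multi-allelic record counts as one variant per allele.
        for alt in alts.split(","):
            keys.add((chrom, int(pos), ref, alt))
    return keys


def overlap_summary(new: Set[Variant], reference: Set[Variant]) -> Dict[str, int]:
    """Count variants shared by both sets and those unique to each."""
    return {
        "shared": len(new & reference),
        "only_new": len(new - reference),
        "only_reference": len(reference - new),
    }


# Hypothetical example data standing in for a test run and the GNOS result.
header = ["##fileformat=VCFv4.1", "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO"]
new_vcf = header + [
    "1\t100\t.\tA\tT\t.\tPASS\t.",
    "1\t200\t.\tG\tC,T\t.\tPASS\t.",
    "2\t300\t.\tC\tG\t.\tPASS\t.",
]
reference_vcf = header + [
    "1\t100\t.\tA\tT\t.\tPASS\t.",
    "1\t200\t.\tG\tC\t.\tPASS\t.",
    "3\t50\t.\tT\tA\t.\tPASS\t.",
]

summary = overlap_summary(variant_keys(new_vcf), variant_keys(reference_vcf))
print(summary)
```

With real files, the lines would come from the VCFs themselves (for example via gzip.open(path, "rt") for compressed files) rather than from in-memory lists.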
About which donors to test: DO52140 is one that Jonas and I have both
tested, so it could be interesting to get a third opinion. Alternatively,
any other donor could be interesting, to see if something new comes up. I'm
not sure which option is best.
Miguel
From George.Mihaiescu at oicr.on.ca Mon Mar 13 09:43:59 2017
From: George.Mihaiescu at oicr.on.ca (George Mihaiescu)
Date: Mon, 13 Mar 2017 13:43:59 +0000
Subject: [DOCKTESTERS] Thanks!
In-Reply-To:
Message-ID:
Hi Miguel,
I've started the test by running "bin/run_test.sh Sanger DO50398", so I guess that with just one workflow running it should complete in less than two weeks.
Because I'm running in Collaboratory, I've changed the "get_icgc_donor.sh" script to use a docker container that has the icgc client inside and pulls data from Collaboratory. No "bam.bas" file is downloaded, just a ".bam" and a ".bam.bai" file; I'm not sure if this is an issue.
Looking at "bin/compare_result_type.sh", it seems to use the gnos client to pull down the existing VCF files for comparison, but I think we store those files in Collaboratory as well, so I'll work with Junjun to adapt the script for this.
I think I initially tried to run the DKFZ workflow, but it complained about having to run Delly first, so I abandoned this for now.
I'll set up a new VM and run "run_batch.sh" on donor DO52140.
George
From: Miguel Vazquez >
Date: Monday, March 13, 2017 at 6:53 AM
To: George Mihaiescu >
Cc: Junjun Zhang >, Jonas Demeulemeester >, "docktesters at lists.icgc.org" >
Subject: Re: [DOCKTESTERS] Thanks!
Hi George,
The Sanger workflow is very lengthy, it takes about two weeks in my tests.
About correctness, my scripts also cover that part, if you are not using them they might still help you to clarify how we do it. The idea is to take each of the output files produced: SNV_MNV, Indel, SV, and CNV, for both germline and somatic and compare it with the result uploaded to GNOS (not all pipelines produce all files). This is the relevant part in the run_batch.sh script:
https://github.com/mikisvaz/PCAWG-Docker-Test/blob/master/bin/run_batch.sh#L42-L46
The bin/compare_result_type.sh script will take care of downloading the correct file from GNOS and running the comparison. The comparison itself is simple since all files are VCFs: it consists of extracting the variants in terms of chromosome, position, reference and alternative allele, and measuring the overlaps.
https://github.com/mikisvaz/PCAWG-Docker-Test/blob/master/bin/compare_result_type.sh
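The comparison described above can be illustrated with a minimal Python sketch. This is not the code in compare_result_type.sh; the function names (variant_keys, overlap_summary) and the sample records are hypothetical, and it assumes plain tab-separated VCF text with multi-allelic ALT values split on commas.

```python
from typing import Dict, Iterable, Set, Tuple

# A variant is identified by (chromosome, position, reference allele, alternative allele).
Variant = Tuple[str, int, str, str]


def variant_keys(vcf_lines: Iterable[str]) -> Set[Variant]:
    """Collect (CHROM, POS, REF, ALT) keys from the lines of a VCF file."""
    keys: Set[Variant] = set()
    for line in vcf_lines:
        # Skip blank lines and header lines ("##meta" and "#CHROM ...").
        if not line.strip() or line.startswith("#"):
            continue
        chrom, pos, _id, ref, alts = line.rstrip("\n").split("\t")[:5]
        # One key per alternative allele of a multi-allelic record.
        for alt in alts.split(","):
            keys.add((chrom, int(pos), ref, alt))
    return keys


def overlap_summary(new: Set[Variant], reference: Set[Variant]) -> Dict[str, int]:
    """Count variants shared by both sets and those unique to each."""
    return {
        "common": len(new & reference),
        "only_new": len(new - reference),
        "only_reference": len(reference - new),
    }


# Made-up records for illustration only (not real donor data).
new_vcf = [
    "##fileformat=VCFv4.1",
    "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO",
    "1\t100\t.\tA\tT\t.\tPASS\t.",
    "1\t200\t.\tG\tC,T\t.\tPASS\t.",
    "2\t300\t.\tC\tG\t.\tPASS\t.",
]
reference_vcf = [
    "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO",
    "1\t100\t.\tA\tT\t.\tPASS\t.",
    "1\t200\t.\tG\tC\t.\tPASS\t.",
    "3\t50\t.\tT\tA\t.\tPASS\t.",
]

summary = overlap_summary(variant_keys(new_vcf), variant_keys(reference_vcf))
print(summary)
```

For real files the same functions could be fed an open file handle (or gzip.open(path, "rt") for compressed VCFs) in place of the in-memory lists.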
About which donors to test: DO52140 is one that Jonas and I have both tested, and it could be interesting to get a third opinion. Any other donor could also be interesting, to see if something new comes up. I'm not sure which option is best.
Miguel
On Mon, Mar 13, 2017 at 5:12 AM, George Mihaiescu wrote:
Hi,
I've started Sanger on DO50398 and it's been running for more than 24 hours, currently at "Workflow step succeeded: s58_bbAllele_merge_59"
I just started a second run on a different VM on the same donor, just to compare run times.
The VM used has 8 cores, 48 GB of RAM and a 1.1 TB disk. I'll send some monitoring graphs when it finishes the workflow, but I have no idea how to check its correctness.
Give me a list of donors and what workflows you want me to run and I'll try to schedule them tomorrow.
George
From: Junjun Zhang
Date: Sunday, March 12, 2017 at 10:45 PM
To: Jonas Demeulemeester, George Mihaiescu
Cc: Miguel Vazquez, Denis Yuen, "docktesters at lists.icgc.org"
Subject: Re: [DOCKTESTERS] Thanks!
Thanks Miguel and Jonas for your help here!
Do you have any update on the latest testing? Please feel free to add any updates to the wiki: https://wiki.oicr.on.ca/display/PANCANCER/2017-03-13+PCAWG-TECH+Teleconference
Regards,
Junjun
From: Jonas Demeulemeester
Date: Saturday, March 11, 2017 at 7:15 PM
To: George Mihaiescu
Cc: Miguel Vazquez, Junjun Zhang, Denis Yuen, "docktesters at lists.icgc.org"
Subject: Re: [DOCKTESTERS] Thanks!
Hi George,
Yup, I've been running the PCAWG dockers mainly using Miguel's set of scripts.
Give them a go and if you run into issues, just let us know!
Cheers,
Jonas
On 11 Mar 2017, at 17:00, George Mihaiescu wrote:
Sure, I'll give it a try and report later.
Thank you,
George Mihaiescu
Senior Cloud Architect
Ontario Institute for Cancer Research
MaRS Centre
661 University Avenue
Suite 510
Toronto, Ontario
Canada M5G 0A3
Email: George.Mihaiescu at oicr.on.ca
Toll-free: 1-866-678-6427
Twitter: @OICR_news
www.oicr.on.ca
From: Miguel Vazquez
Date: Saturday, March 11, 2017 at 10:57 AM
To: Junjun Zhang
Cc: Denis Yuen, Jonas Demeulemeester, George Mihaiescu, "docktesters at lists.icgc.org"
Subject: Re: [DOCKTESTERS] Thanks!
Hi Junjun,
I think Jonas has been using my scripts to run some of the tests. Maybe George could try them as well; it should be very easy for him to try the Sanger, Delly+DKFZ, BWA-Mem, and BiasFilter workflows.
https://github.com/mikisvaz/PCAWG-Docker-Test
He would just need to update the tokens for DACO access and the scripts will take care of downloading the BAM files, running the workflows and evaluating the result.
The documentation there is reasonably up to date, but if this sounds good then perhaps he could contact me and I could walk him through the details.
Best regards
Miguel
On Fri, Mar 10, 2017 at 9:51 PM, Junjun Zhang wrote:
Dear Docktesters,
George Mihaiescu, cloud architect of the Collaboratory at OICR, plans to run some bioinformatics workflows to test the Collab environment.
I thought this would be a good opportunity to get some extra help testing the PCAWG dockerized workflows.
Miguel, Denis and others, what workflows / datasets do you think would be good for George to run?
Thanks,
Junjun
From mikisvaz at gmail.com Mon Mar 13 09:52:03 2017
From: mikisvaz at gmail.com (Miguel Vazquez)
Date: Mon, 13 Mar 2017 14:52:03 +0100
Subject: [DOCKTESTERS] Thanks!
In-Reply-To:
References:
Message-ID:
Hi George,
Answers inline
On Mon, Mar 13, 2017 at 2:43 PM, George Mihaiescu <
George.Mihaiescu at oicr.on.ca> wrote:
> Hi Miguel,
>
> I've started the test by running "bin/run_test.sh Sanger DO50398", so I
> guess with just one workflow running it should complete faster than two
> weeks.
>
I think it still should take a long time. My scripts will run one workflow
after another.
>
> Because I'm running in Collaboratory I've changed the "get_icgc_donor.sh"
> script to use a docker container that has the icgc client inside and pull
> data from Collaboratory. There is no "bam.bas" file downloaded, just a
> ".bam" and a ".bam.bai" files, not sure if this is an issue.
>
>
I wondered the same thing the first time I did this, but this file is
produced by the pipeline. There was some problem with this that was dealt
with by the developers and updated in the docker, so I think you won't
have a problem.
> By looking at the "bin/compare_result_type.sh" it looks like it's using
> the gnos client to pull down the existing VCF files for comparison reasons,
> but I think we store those files in Collaboratory as well, so I'll work
> with Junjun to adapt the script for this.
>
>
Let me know if you need any help
> I think I initially tried to run the DKFZ workflow, but it complained
> about having to run Delly first, so I abandoned this for now.
>
Yes, if you look at run_batch.sh you will see that when using DKFZ it
will always run Delly first. Delly prepares some files that the DKFZ
workflow needs, namely ones related to copy number, I believe.
>
> I'll set up a new VM and run the "run_batch.sh" on the DO52140 donor.
>
Remember that you will need to add the relevant hash keys for the different
files in etc/donor_files.csv. It's a bit tedious right now: you need to
go to the ICGC DCC and find these codes manually for the files you need.
Ask me if you need help. Once you have them all, you can run all the
workflows for that donor and evaluate the results.
https://github.com/mikisvaz/PCAWG-Docker-Test/blob/master/etc/donor_files.csv
Regards
Miguel
From miguel.vazquez at cnio.es Mon Mar 13 11:22:23 2017
From: miguel.vazquez at cnio.es (Miguel Vazquez)
Date: Mon, 13 Mar 2017 16:22:23 +0100
Subject: [DOCKTESTERS] Help needed with DKFZ BiasFilter. Validation of DO52140. 100% match is wrong!
Message-ID:
Dear all,
I just learnt that the DKFZ BiasFilter is NOT the OXOG filter workflow,
which means *I checked for the wrong thing in this validation!* I'm sorry
for the confusion.
Right now I pass the BAM files and the consensus.vcf (SNV_MNV) downloaded
from GNOS to the BiasFilter and compare the resulting VCF with the
consensus, looking at the set of mutations carrying the OXOGFAIL flag.
This apparently is not the comparison to make. *What is it that I need to
compare? Is it the bPcr and bSeq flags?*
A first look at those flags unfortunately does show quite a few
discrepancies on both donors (DO52140 and DO35937) for both flags. For
instance, for DO35937 we find 11 mutations flagged bPcr in the new
result, while the consensus.vcf flags only one of them. Something similar
happens with bSeq.
Can you please confirm this so I can come back with a full report?
Kind regards, and sorry again for the confusion.
Miguel
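A per-flag comparison of the kind asked about above could be sketched as follows. This is only an illustration under stated assumptions: that bPcr and bSeq appear as semicolon-separated entries in the FILTER column of both VCFs, that variants are matched on chromosome, position, reference and alternative allele, and that the helper names (flagged_variants, compare_flag) and the sample records are hypothetical. It does not settle which comparison is the correct one for the BiasFilter.

```python
from typing import Dict, List, Set, Tuple

Variant = Tuple[str, int, str, str]  # (CHROM, POS, REF, ALT)


def flagged_variants(vcf_lines: List[str], flag: str) -> Set[Variant]:
    """Return the variants whose FILTER column contains `flag`.

    Assumption: flags are semicolon-separated entries in column 7 (FILTER).
    """
    flagged: Set[Variant] = set()
    for line in vcf_lines:
        if not line.strip() or line.startswith("#"):
            continue  # skip blank and header lines
        fields = line.rstrip("\n").split("\t")
        chrom, pos, ref, alt, filters = fields[0], fields[1], fields[3], fields[4], fields[6]
        if flag in filters.split(";"):
            flagged.add((chrom, int(pos), ref, alt))
    return flagged


def compare_flag(new_lines: List[str], reference_lines: List[str], flag: str) -> Dict[str, object]:
    """Count variants carrying `flag` in both files, only the new one, or only the reference."""
    new = flagged_variants(new_lines, flag)
    ref = flagged_variants(reference_lines, flag)
    return {
        "flag": flag,
        "both": len(new & ref),
        "only_new": len(new - ref),
        "only_reference": len(ref - new),
    }


# Made-up records for illustration only (not real donor data).
consensus = [
    "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO",
    "1\t100\t.\tC\tA\t.\tbPcr\t.",
    "1\t200\t.\tG\tT\t.\tPASS\t.",
    "2\t300\t.\tC\tA\t.\tbSeq\t.",
]
rerun = [
    "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO",
    "1\t100\t.\tC\tA\t.\tbPcr\t.",
    "1\t200\t.\tG\tT\t.\tbPcr;bSeq\t.",
    "2\t300\t.\tC\tA\t.\tPASS\t.",
]

for flag in ("bPcr", "bSeq"):
    print(compare_flag(rerun, consensus, flag))
```

Reporting the three counts separately for each flag makes it clear whether a discrepancy comes from extra flags in the new result or from flags missing relative to the consensus.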
On Mon, Feb 27, 2017 at 7:30 PM, Miguel Vazquez
wrote:
> Dear friends,
>
> I've performed the first test with the DKFZ BiasFilter and got a perfect
> match. There are 55 variants annotated with OXOGFAIL and they are the same
> in the input VCF file (consensus SNV/MNV VCF for that donor) and the output
> of the BiasFilter. I'm running the test on a second donor.
>
> Best regards
>
> Miguel
>
From christina.yung at oicr.on.ca Mon Mar 13 11:48:39 2017
From: christina.yung at oicr.on.ca (Christina Yung)
Date: Mon, 13 Mar 2017 10:48:39 -0500
Subject: [DOCKTESTERS] Help needed with DKFZ BiasFilter. Validation of DO52140. 100% match is wrong!
In-Reply-To:
References:
Message-ID:
An HTML attachment was scrubbed...
URL:
From George.Mihaiescu at oicr.on.ca Mon Mar 13 12:57:14 2017
From: George.Mihaiescu at oicr.on.ca (George Mihaiescu)
Date: Mon, 13 Mar 2017 16:57:14 +0000
Subject: [DOCKTESTERS] Thanks!
In-Reply-To:
Message-ID:
Junjun told me this would provide value to the testing process, so I would like to kick off a test of the BWA_mem docker.
Can somebody provide some quick instructions and the location of the unaligned BAM files that have already been used?
Also, is there a list somewhere of the steps involved in each workflow, so I can get an idea of how far along a run is?
For example, is s58_cgpPindel_pin2vcf_95 three steps from the finish, or 50?
Thank you,
George