[DOCKTESTERS] Broad dockers

Gordon Saksena gsaksena at broadinstitute.org
Thu Oct 13 11:57:37 EDT 2016


1) I tested under Cromwell 0.19.3.  I will likely move over to 0.21 soon.
I used wdltool 0.4.
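
For anyone reproducing this setup, here is a minimal sketch of how the two
tools might be invoked on the tokens task. It only builds the command
lines and does not execute anything. The jar paths are hypothetical local
filenames, and the positional "run <wdl> <inputs.json>" form is an
assumption based on the Cromwell 0.19-era command line; later releases may
differ. The WDL and inputs filenames are the ones in the repo links quoted
below.

```python
from typing import List

# Hypothetical local jar locations; adjust to wherever the jars were downloaded.
CROMWELL_JAR = "cromwell-0.19.3.jar"
WDLTOOL_JAR = "wdltool-0.4.jar"

# File names taken from the pcawg repo links in the quoted message.
WDL_FILE = "taskdef.tokens.wdl"
INPUTS_FILE = "inputtest_http_refdata.tokens.json"


def validate_command(wdltool_jar: str, wdl_path: str) -> List[str]:
    """Build the wdltool command that syntax-checks a WDL file."""
    return ["java", "-jar", wdltool_jar, "validate", wdl_path]


def run_command(cromwell_jar: str, wdl_path: str, inputs_path: str) -> List[str]:
    """Build the Cromwell single-workflow command (assumed positional inputs)."""
    return ["java", "-jar", cromwell_jar, "run", wdl_path, inputs_path]


if __name__ == "__main__":
    # Print the commands so they can be reviewed before running by hand.
    for cmd in (
        validate_command(WDLTOOL_JAR, WDL_FILE),
        run_command(CROMWELL_JAR, WDL_FILE, INPUTS_FILE),
    ):
        print(" ".join(cmd))
```

Each list can be passed to subprocess.run(cmd, check=True) once the jar
paths are confirmed.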

2) At the moment I have almost 20 dockers listed, including some that are
filters and some that have known buggy algorithms.  More discussion is
needed, probably both internally and externally, to develop a more stable
list and map out timelines.  Let me discuss this internally at Broad first
before sharing the list.

The intent is to allow users to both A) reproduce the PCAWG work, and B)
apply the algorithms to their own new projects.

A) (reproducibility) could be mostly served by one big docker, as was run
in production, but there are currently issues around a) distributing the
aggregated panel of normals for MuTect and dRanger, b) some GPL code that
needs to be unlinked from the mini-bam creator, and c) the MuTect-ContEst
rescue code that needs to be patched in.  The one big docker is not ideal
for B) (re-application), due to additional issues: a) users will want bug
fixes, b) users would need to wait for algorithms they don't care about,
c) the SV caller algorithms demand a ton of RAM and CPU time on certain
samples, making them bad neighbors on cheap VMs, d) some algorithms (e.g.
aggregators, filters) have not been dockerized yet and would delay the
release of others, and e) even portions licensed for free commercial use
would be bundled under the more restrictive GATK license.

3) The GATK dual-license model is documented here:
https://software.broadinstitute.org/gatk/download/licensing.php
Something like it will probably also apply to certain portions that do not
use GATK.  We have no plans to use DRM.

Gordon

On Thu, Oct 13, 2016 at 10:46 AM, Denis Yuen <Denis.Yuen at oicr.on.ca> wrote:

> Hi,
> Thanks for the heads-up, we'll look into these containers and see what we
> can do in terms of testing them and getting them posted to the Dockstore.
>
> I do have a few questions:
> 1) What versions of Cromwell and wdl4s were this portion of the pipeline
> tested with?
> 2) This release is for the tokens portion of the pipeline. How many
> portions of the pipeline do you anticipate will be available in the end?
> (And do you have a handy graph/chart that we can use to describe this?)
> 3) For post-publication, are you thinking about a dual-license model
> (something like
> https://www.quora.com/What-is-the-best-license-to-apply-on-an-open-source-project-that-I-intend-to-sell-commercially-as-well
> ) or are you thinking about some form of DRM in the Docker image?
>
>
> ------------------------------
> *From:* docktesters-bounces+denis.yuen=oicr.on.ca at lists.icgc.org
> [docktesters-bounces+denis.yuen=oicr.on.ca at lists.icgc.org] on behalf of
> Gordon Saksena [gsaksena at broadinstitute.org]
> *Sent:* October 12, 2016 3:11 PM
> *To:* docktesters
> *Cc:* Gad Getz
> *Subject:* [DOCKTESTERS] Broad dockers
>
> Hi,
>
> I have approval for the Broad PCAWG dockers to go on the staging
> (non-password protected) portion of Dockstore.  The Github Repo and
> DockerHub image have permissions granted to folks who sent me their github
> and docker usernames.  This should be adequate for the initial rounds of
> testing.
>
> The tokens portion of the pipeline should be ready for folks to give it a
> spin.  The non-protected reference files are passed in via http in the
> inputs file, and it only needs the normal BAM.  It takes about 5 hours on a
> full size file using a couple cores.  Other portions will be added as they
> are split out and tested.
>
> While the current permissions setup is fine for testing, something
> different will be needed for post publication.  Many of the algorithms are
> (or will be) free only for educational or research use, but require a
> separate license for commercial use.  We should discuss how that use case
> can be supported, and whether it has implications on testing.
>
>
> https://hub.docker.com/r/broadinstitute/pcawg_tokens/
>
> https://github.com/broadinstitute/pcawg
> https://github.com/broadinstitute/pcawg/blob/master/tasks/tokens/taskdef.tokens.wdl
> https://github.com/broadinstitute/pcawg/blob/master/tasks/tokens/inputtest_http_refdata.tokens.json
>
> Gordon
>