[DOCKTESTERS] Help request: get donor file information from ICGC DCC programmatically

Junjun Zhang Junjun.Zhang at oicr.on.ca
Mon Apr 17 09:03:17 EDT 2017


Hi Miguel,

That should work. For the export, you can add suitable filters if you like, so that it only returns the files of interest, for example only PCAWG files of a certain type.
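A filtered export URL might be built along these lines. This is only a sketch: the filter field names `study` and `fileFormat` and the values `PCAWG` and `VCF` are assumptions about how the portal encodes filters and should be checked against the API documentation, and `build_export_url` is a hypothetical helper, not part of any ICGC tool.

```python
import json
from urllib.parse import quote

# Export endpoint as used in the curl command quoted later in this thread.
EXPORT_URL = "https://dcc.icgc.org/api/v1/repository/files/export"


def build_export_url(filters):
    """Return the export URL with `filters` JSON-encoded and percent-escaped."""
    # Compact separators keep the query string short; an empty dict
    # encodes to %7B%7D, which is the unfiltered export.
    encoded = quote(json.dumps(filters, separators=(",", ":")), safe="")
    return f"{EXPORT_URL}?filters={encoded}"


# ASSUMPTION: "study" and "fileFormat" are illustrative field names only.
pcawg_vcf = {"file": {"study": {"is": ["PCAWG"]}, "fileFormat": {"is": ["VCF"]}}}
print(build_export_url(pcawg_vcf))
```

Passing an empty dict reproduces the unfiltered `filters=%7B%7D` query, so the same helper covers both cases.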

Cheers,
Junjun


From: <docktesters-bounces+junjun.zhang=oicr.on.ca at lists.icgc.org> on behalf of Miguel Vazquez <miguel.vazquez at cnio.es>
Date: Monday, April 17, 2017 at 6:39 AM
To: Denis Yuen <Denis.Yuen at oicr.on.ca>, Jonas Demeulemeester <Jonas.Demeulemeester at crick.ac.uk>
Cc: "docktesters at lists.icgc.org" <docktesters at lists.icgc.org>, Dusan Andric <Dusan.Andric at oicr.on.ca>
Subject: Re: [DOCKTESTERS] Help request: get donor file information from ICGC DCC programmatically

Hi Jonas, Denis, and Dusan,

Jonas, the release_may2016 file you mention does indeed have some of the GNOS IDs we would need, but unfortunately not all. For instance, the consensus VCF file does not seem to be there.

Denis, Dusan. I think I figured out how to do this. I download the full tsv export with

curl -X GET --header 'Accept: text/tsv' 'https://dcc.icgc.org/api/v1/repository/files/export?filters=%7B%7D'

then get all the files for a donor, and for each file download the associated JSON info with

curl -X GET --header 'Accept: application/json' "https://dcc.icgc.org/api/v1/repository/files/$file"

There I can find the file name along with the IDs I need to download it.
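The two-step lookup described above could be scripted roughly as follows. This is a sketch under stated assumptions, not a tested client: the TSV column names (`File ID`, `ICGC Donor`) and the JSON keys (`objectId`, `fileCopies`, `fileName`, `repoDataBundleId`) are guesses at the response layout and need checking against real output, and all helper names are hypothetical.

```python
import csv
import io
import json
from urllib.request import Request, urlopen

API = "https://dcc.icgc.org/api/v1/repository/files"


def fetch(url, accept):
    """GET `url` with the given Accept header and return the body as text."""
    with urlopen(Request(url, headers={"Accept": accept})) as resp:
        return resp.read().decode("utf-8")


def file_ids_for_donor(tsv_text, donor, id_col="File ID", donor_col="ICGC Donor"):
    """Return the file IDs of the export rows that belong to `donor`."""
    # ASSUMPTION: the default column names are placeholders for the real headers.
    rows = csv.DictReader(io.StringIO(tsv_text), delimiter="\t")
    return [row[id_col] for row in rows if row[donor_col] == donor]


def download_ids(info):
    """Pull the file name and download IDs out of one file's JSON record."""
    # ASSUMPTION: these keys are guesses at the JSON layout.
    return [
        {
            "file_name": file_copy.get("fileName"),
            "object_id": info.get("objectId"),
            "bundle_id": file_copy.get("repoDataBundleId"),
        }
        for file_copy in info.get("fileCopies", [])
    ]


def donor_files(donor, fetch=fetch):
    """Run both steps: the full export, then one JSON request per file."""
    tsv = fetch(f"{API}/export?filters=%7B%7D", "text/tsv")
    result = {}
    for file_id in file_ids_for_donor(tsv, donor):
        info = json.loads(fetch(f"{API}/{file_id}", "application/json"))
        result[file_id] = download_ids(info)
    return result
```

The `fetch` argument is injectable so the parsing can be tried on saved responses before making one request per file against the live portal.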

Best

Miguel

On Wed, Apr 12, 2017 at 5:38 PM, Denis Yuen <Denis.Yuen at oicr.on.ca> wrote:

Hi,


As I understand it, there is an API for the portal http://docs.icgc.org/portal/api-endpoints/#/

I'm also going to forward this to Dusan, who may be able to point you to a more specific endpoint or otherwise in the right direction.

________________________________
From: docktesters-bounces+denis.yuen=oicr.on.ca at lists.icgc.org on behalf of Miguel Vazquez <miguel.vazquez at cnio.es>
Sent: April 11, 2017 1:01:14 PM
To: docktesters at lists.icgc.org; Junjun Zhang
Subject: [DOCKTESTERS] Help request: get donor file information from ICGC DCC programmatically

Dear friends,

For our upcoming testing work on the filters, I think we will be using the SNV and SV files that were submitted by the different providers and comparing them with the resulting final VCFs. To access these I was planning to use GNOS or the ICGC client, for which we will need to specify the "Object ID" or "Submitted Bundle ID" found on the file pages of ICGC (e.g. https://dcc.icgc.org/repositories/files/FI384359). For the normal and tumor BAMs and for the consensus VCF I was going to the site and extracting them manually, but that is tedious and error-prone, and it is the only truly manual step in the scripts. It would be great to automate it.

Is there a programmatic way to find all the files for a donor, with their names, and then to get the "Object ID" and "Submitted Bundle ID" for any of them?

Best regards

Miguel
