[DOCKTESTERS] Summary of validation of Dockers Sanger and DKFZ(+Delly)

Junjun Zhang Junjun.Zhang at oicr.on.ca
Mon Jan 23 08:26:24 EST 2017


Hi Miguel,

I will need to take a closer look at this. What I do know is that the important requirements include (but are not limited to) the following: input BAMs must be lane-level, i.e., each BAM contains only one read group ID; and some parts of the read group line in the BAM header must be properly formed, though I don't remember the details here. Many of us should remember that we had to rehead a fairly large number of BAMs because of header issues.
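The lane-level requirement can be checked mechanically. The following is a minimal sketch, not part of the SOP: the function names count_read_groups and is_lane_level are made up for illustration, and they only assume that the SAM header text arrives on stdin (for example from samtools view -H).

```shell
# Count the @RG records in SAM header text read from stdin.
# Header fields are tab-separated, so the record type is field 1.
count_read_groups() {
    awk -F'\t' '$1 == "@RG" { n++ } END { print n + 0 }'
}

# Succeed only if the header on stdin declares exactly one read group,
# i.e. the BAM looks lane-level in the sense described above.
is_lane_level() {
    [ "$(count_read_groups)" -eq 1 ]
}
```

Possible usage, assuming samtools is installed: samtools view -H initial.bam | is_lane_level && echo "one read group". This only counts @RG lines; it does not verify that the line itself is well formed.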

Cheers
Junjun

On Jan 23, 2017, at 8:10 AM, Miguel Vazquez <miguel.vazquez at cnio.es> wrote:

Junjun,

The section you refer to on preparing the unaligned version of the BAM seems to be the same as when I first went through it. The command line featured there reads

cat initial.bam | bamreset exclude=QCFAIL,SECONDARY,SUPPLEMENTARY resetheadertext=header.sam md5=1 md5filename=cleaned.bam.md5 > cleaned.bam

Since some of the options deal with setting the header, computing the md5, etc., I've stripped it down to:

cat initial.bam | bamreset exclude=QCFAIL,SECONDARY,SUPPLEMENTARY > cleaned.bam

Do you think there is any reason that this might not work? I don't see how the header info would have any effect on the process. It would be good to rule this out as a source of the problem.
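One way to rule the header in or out is to inspect the @RG lines that actually end up in cleaned.bam. The sketch below is only illustrative: missing_rg_tags is a hypothetical helper, and the list of tags to check is supplied by the caller, since the exact set the pipelines need is defined by the SOP and not restated in this thread. The SAM specification itself only mandates ID on an @RG line.

```shell
# For each @RG line in SAM header text on stdin, print the read group ID
# followed by any of the requested tags (given as arguments) that are absent.
# Prints nothing for read groups that carry every requested tag.
missing_rg_tags() {
    awk -F'\t' -v want="$*" '
        $1 == "@RG" {
            n = split(want, tags, " ")
            id = "(no ID)"
            missing = ""
            # Pick up the read group ID for reporting.
            for (f = 2; f <= NF; f++)
                if (substr($f, 1, 3) == "ID:") id = substr($f, 4)
            # Look for each requested tag among the TAG:value fields.
            for (i = 1; i <= n; i++) {
                found = 0
                for (f = 2; f <= NF; f++)
                    if (substr($f, 1, 3) == tags[i] ":") found = 1
                if (!found) missing = missing " " tags[i]
            }
            if (missing != "") print id ":" missing
        }'
}
```

Possible usage, assuming samtools is installed and with an illustrative tag list: samtools view -H cleaned.bam | missing_rg_tags ID SM LB PU. Running the same check on initial.bam shows whether the stripped-down bamreset call changed anything in the read group line.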

Best

Miguel



On Mon, Jan 16, 2017 at 4:24 PM, Junjun Zhang <Junjun.Zhang at oicr.on.ca> wrote:
Hi,

I'd like to point out that this wiki page is obsolete: https://wiki.oicr.on.ca/display/PANCANCER/Preparing+paired-end+data+for+upload.

It has been replaced by the FINAL SOP for preparing PCAWG sequences here: https://wiki.oicr.on.ca/display/PANCANCER/PCAWG+%28a.k.a.+PCAP+or+PAWG%29+Sequence+Submission+SOP+-+v1.0. Many of us worked on it including Keiran.

Regarding preparation of a test input BAM from a merged / aligned BAM, the first step ("Your starting files must be for a single lane of sequencing in either BAM (unaligned BAM) or FASTQ") of this section is the most relevant: https://wiki.oicr.on.ca/display/PANCANCER/PCAWG+%28a.k.a.+PCAP+or+PAWG%29+Sequence+Submission+SOP+-+v1.0#PCAWG(a.k.a.PCAPorPAWG)SequenceSubmissionSOP-v1.0-5.Prepareyoursequencefiles

Hope this helps,

Junjun







