[DOCKTESTERS] BWA-Mem update
Miguel Vazquez
mikisvaz at gmail.com
Mon Apr 10 12:34:43 EDT 2017
Hi George,
You can find my scripts as usual at
https://github.com/mikisvaz/PCAWG-Docker-Test
The repo now includes the JSON files that Junjun sent us for three donors,
and scripts to download the unaligned BAMs, run the test, and compare the
result.
Try:
1- update the repo
2- bin/download_unaligned.sh DO51057
3- bin/run_bwa_test.sh DO51057
4- bin/compare_bwa_bam.sh \
     tests/BWA-Mem/DO51057/normal/output/DO51057.\[TYPE\].merged_output.bam \
     data/DO51057/normal.bam
5- bin/compare_bwa_bam.sh \
     tests/BWA-Mem/DO51057/tumor/output/DO51057.\[TYPE\].merged_output.bam \
     data/DO51057/tumor.bam
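The steps above could be chained for any donor along these lines. This is only a sketch: the function names and the DRY_RUN variable are made up for illustration and are not part of the repo, and it assumes you run it from the root of an already updated checkout.

```shell
#!/bin/sh
# Sketch only: chains the download, run and compare steps for one donor.
# bwa_test_commands, run_bwa_test and DRY_RUN are hypothetical names.
set -eu

# Print the commands for one donor, one per line, in the order listed above.
bwa_test_commands() {
    local donor="$1" sample
    printf 'bin/download_unaligned.sh %s\n' "$donor"
    printf 'bin/run_bwa_test.sh %s\n' "$donor"
    for sample in normal tumor; do
        # The brackets around TYPE stay backslash-escaped, as in the steps above.
        printf 'bin/compare_bwa_bam.sh tests/BWA-Mem/%s/%s/output/%s.\\[TYPE\\].merged_output.bam data/%s/%s.bam\n' \
            "$donor" "$sample" "$donor" "$donor" "$sample"
    done
}

# With DRY_RUN=1 (the default) the commands are only printed;
# with DRY_RUN=0 each one is also executed.
run_bwa_test() {
    local donor="$1" cmd
    bwa_test_commands "$donor" | while IFS= read -r cmd; do
        printf '+ %s\n' "$cmd"
        if [ "${DRY_RUN:-1}" = "0" ]; then
            eval "$cmd"
        fi
    done
}
```

Calling `run_bwa_test DO51057` prints the four commands so they can be checked first; `DRY_RUN=0 run_bwa_test DO51057` runs them.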
Jonas has a slightly different set of scripts that might also be a bit more
correct, so perhaps you can wait for his input.
Best
Miguel
On Mon, Apr 10, 2017 at 6:11 PM, George Mihaiescu <
George.Mihaiescu at oicr.on.ca> wrote:
> Hi,
>
> I would like to run the BWA-mem dockerized workflow in the Collaboratory
> environment, but I need some help in order to do this:
>
> - A ready-to-run script or instructions
> - The input files: single file or multiple files, whatever the script
> needs as an input
> - The donor ID, preferably the same donor that was already used in
> order to prove the reproducibility of the results
>
> I can start the workflow on a large VM in order to speed up the result.
>
> Also, I'm currently running the DKFZ workflow on DO50398 because I've
> already run the Sanger workflow on it, and I want to compare the run
> times of the two workflows on the same data set.
>
> Thank you,
> George
>
>
> From: Miguel Vazquez <mikisvaz at gmail.com>
> Date: Wednesday, March 22, 2017 at 2:08 PM
> To: Jonas Demeulemeester <Jonas.Demeulemeester at crick.ac.uk>
> Cc: Keiran Raine <kr2 at sanger.ac.uk>, Junjun Zhang <Junjun.Zhang at oicr.on.ca>,
> George Mihaiescu <George.Mihaiescu at oicr.on.ca>,
> "docktesters at lists.icgc.org" <docktesters at lists.icgc.org>
> Subject: Re: [DOCKTESTERS] BWA-Mem update
>
> Thanks Jonas for this information.
>
> I hope that someone here can suggest what to try next. Perhaps the
> version issue that Jonas points out is the key.
>
> I just want to add that, as I told Jonas earlier, my own tests using the
> new split BAM files also gave 3% mismatches.
>
> Best regards
>
> Miguel
>
> On Wed, Mar 22, 2017 at 6:56 PM, Jonas Demeulemeester <
> Jonas.Demeulemeester at crick.ac.uk> wrote:
>
>> Hi all,
>>
>> A brief update on the BWA-Mem docker tests.
>> I prepared normal + tumor lane-level unaligned BAMs for DO503011 and ran
>> the BWA-Mem workflow for normal and tumor separately.
>> Doing the comparison however, I am still getting 3% of reads that are
>> aligned differently (see below for a few examples).
>> However, when checking the headers of the original and newly mapped bam
>> files (attached) I noticed that the original is mapped using a different
>> version of BWA and SeqWare.
>> I’m hoping the mapping differences can be ascribed to this.
>>
>> Is there a list available somewhere detailing which samples were mapped
>> using which versions?
>> That way we could select a relevant test sample without having to sort
>> through the headers of all different bams.
>>
>> Best wishes,
>> Jonas
>>
>>
>>
>>
>>
>> newly aligned:
>>
>> ID                                  flag  chr         pos
>> HS2000-1012_275:7:1101:17411:15403  99    3           112743126
>> HS2000-1012_275:7:1101:17411:15403  147   3           112743376
>> HS2000-1012_275:7:1101:11883:83640  99    16          28672999
>> HS2000-1012_275:7:1101:11883:83640  147   16          28673223
>> HS2000-1012_275:7:1101:16576:28476  163   GL000238.1  21309
>> HS2000-1012_275:7:1101:16576:28476  83    GL000238.1  21664
>>
>> vs the original:
>>
>> ID                                  flag  chr         pos
>> HS2000-1012_275:7:1101:17411:15403  99    8           54944243
>> HS2000-1012_275:7:1101:17411:15403  147   8           54944493
>> HS2000-1012_275:7:1101:11883:83640  163   16          28464362
>> HS2000-1012_275:7:1101:11883:83640  83    16          28464586
>> HS2000-1012_275:7:1101:16576:28476  99    126124549
>> HS2000-1012_275:7:1101:16576:28476  147   126124903
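A figure like the 3% above could be computed from two listings of this kind roughly as follows. This is a sketch under assumptions, not the comparison script used in the tests: it assumes each listing is a tab-separated file holding the first four SAM columns (ID, flag, chr, pos), for example from `samtools view -F 2304 file.bam | cut -f 1-4` so that only primary alignments remain, and the function name is made up. Reads are matched on ID plus mate number, because the same mate can carry a different flag in the two alignments.

```shell
# Sketch only: placement_mismatch_pct is a hypothetical helper.
# Usage: placement_mismatch_pct NEW.tsv ORIG.tsv
# Each file: tab-separated ID, flag, chr, pos, one primary alignment per line.
placement_mismatch_pct() {
    awk -F '\t' '
        # Key a record by read ID plus mate number
        # (SAM flag bit 0x40 set means first in pair).
        function key(id, flag) { return id "/" (int(flag / 64) % 2 ? 1 : 2) }

        # First file: remember where each mate was placed.
        NR == FNR { place[key($1, $2)] = $3 ":" $4; next }

        # Second file: compare placements for mates present in both files.
        {
            k = key($1, $2)
            if (k in place) {
                total++
                if (place[k] != $3 ":" $4) diff++
            }
        }

        END {
            if (total == 0) { print "no shared reads" > "/dev/stderr"; exit 1 }
            printf "%d of %d shared reads placed differently (%.1f%%)\n",
                   diff, total, 100 * diff / total
        }
    ' "$1" "$2"
}
```

Only chromosome and position are compared here; strand and CIGAR differences would need the flag and further columns.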
>>
>>
>> _________________________________
>> Jonas Demeulemeester, PhD
>> Postdoctoral Researcher
>> The Francis Crick Institute
>> 1 Midland Road
>> London
>> NW1 1AT
>>
>> T: +44 (0)20 3796 2594
>> M: +44 (0)7482 070730
>> E: jonas.demeulemeester at crick.ac.uk
>> W: www.crick.ac.uk
>>
>> The Francis Crick Institute Limited is a registered charity in England
>> and Wales no. 1140062 and a company registered in England and Wales no.
>> 06885462, with its registered office at 1 Midland Road London NW1 1AT
>>
>
>