TY - JOUR
T1 - Facilitated sequence assembly using densely labeled optical DNA barcodes
T2 - A combinatorial auction approach
AU - Dvirnas, Albertas
AU - Pichler, Christoffer
AU - Stewart, Callum L.
AU - Quaderi, Saair
AU - Nyberg, Lena K.
AU - Müller, Vilhelm
AU - Bikkarolla, Santosh Kumar
AU - Kristiansson, Erik
AU - Sandegren, Linus
AU - Westerlund, Fredrik
AU - Ambjörnsson, Tobias
N1 - Funding Information:
EuroNanoMedII Grant: “Nanofluidics for ultrafast diagnosis of bacterial infections – NanoDiaBac.” EK was supported by The Knut and Alice Wallenberg Foundation. LS was supported by Swedish Research Council-Medicine and Health, grant K2013-99X-22208-01-5. FW, TA, EK and LS were supported by The Erling-Persson Family Foundation.
Publisher Copyright:
© 2018 Dvirnas et al. This is an open access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.
PY - 2018/3/9
Y1 - 2018/3/9
N2 - The output from whole genome sequencing is a set of contigs, i.e. short non-overlapping DNA sequences (sizes 1-100 kilobasepairs). Piecing the contigs together is an especially difficult task for previously unsequenced DNA, and may not be feasible due to factors such as the lack of sufficient coverage or larger repetitive regions which generate gaps in the final sequence. Here we propose a new method for scaffolding such contigs. The proposed method uses densely labeled optical DNA barcodes from competitive binding experiments as scaffolds. On these scaffolds we position theoretical barcodes which are calculated from the contig sequences. This allows us to construct longer DNA sequences from the contig sequences. This proof-of-principle study extends previous studies which use sparsely labeled DNA barcodes for scaffolding purposes. Our method applies a probabilistic approach that allows us to discard “foreign” contigs from mixed samples with contigs from different types of DNA. We satisfy the contig non-overlap constraint by formulating the contig placement challenge as a combinatorial auction problem. Our exact algorithm for solving this problem reduces computational costs compared to previous methods in the combinatorial auction field. We demonstrate the usefulness of the proposed scaffolding method both for synthetic contigs and for contigs obtained using Illumina sequencing for a mixed sample with plasmid and chromosomal DNA.
UR - http://www.scopus.com/inward/record.url?scp=85043765928&partnerID=8YFLogxK
U2 - 10.1371/journal.pone.0193900
DO - 10.1371/journal.pone.0193900
M3 - Article
C2 - 29522539
AN - SCOPUS:85043765928
VL - 13
JO - PLoS ONE
JF - PLoS ONE
SN - 1932-6203
IS - 3
M1 - e0193900
ER -