University of Virginia Department of Computer Science

Friday, March 21, 2008
Shahid H. Bokhari
Visiting Scholar
OSU
Host: John Knight
OLSSON 009, 2:30 PM

A parallel graph decomposition algorithm for DNA sequencing with nanopores

ABSTRACT

With the potential availability of nanopore devices that can sense the bases of translocating single-stranded DNA (ssDNA), it is likely that “reads” of length ~10^5 will be available in large numbers and at high speed. We address the problem of complete DNA sequencing using such reads.

We assume that ~10^2 copies of a DNA sequence are split into single strands that break into randomly sized pieces as they translocate the nanopore in arbitrary orientations. The nanopore senses and reports each individual base that passes through, but all information about orientation and complementarity of the ssDNA subsequences is lost. Random errors in the reads create further complications.

We have developed an algorithm that addresses these issues. It can be considered an extreme variation of the well-known Eulerian path approach. It searches over a space of de Bruijn graphs until it finds one in which (a) the impact of errors is eliminated and (b) both possible orientations of the two ssDNA sequences can be identified separately and unambiguously.
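For readers unfamiliar with the Eulerian path approach mentioned above, the following is a minimal sketch of the standard textbook de Bruijn graph construction, not the algorithm presented in the talk. All names (reverse_complement, kmer_counts, de_bruijn_edges) and the parameters k and min_count are illustrative assumptions. The sketch adds each read's reverse complement because strand information is unavailable, and it drops rare k-mers as a crude stand-in for error suppression. It does not model reads that are reversed without being complemented, and it does not search over a space of graphs.

```python
from collections import Counter, defaultdict

# Translation table for complementing DNA bases.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA string over A, C, G, T."""
    return seq.translate(COMPLEMENT)[::-1]


def kmer_counts(reads, k):
    """Count every k-mer in each read and in its reverse complement.

    Both strands are counted because, in this simplified model, a read
    carries no information about which strand it came from.
    """
    counts = Counter()
    for read in reads:
        for strand in (read, reverse_complement(read)):
            for i in range(len(strand) - k + 1):
                counts[strand[i:i + k]] += 1
    return counts


def de_bruijn_edges(reads, k, min_count=1):
    """Build a de Bruijn graph: nodes are (k-1)-mers, edges are k-mers.

    k-mers seen fewer than min_count times are discarded (hypothetical
    threshold; rare k-mers are more likely to contain read errors).
    """
    graph = defaultdict(set)
    for kmer, n in kmer_counts(reads, k).items():
        if n >= min_count:
            graph[kmer[:-1]].add(kmer[1:])
    return dict(graph)


if __name__ == "__main__":
    # Two overlapping toy reads; real reads would be far longer.
    toy_reads = ["ACGTTG", "CGTTGA"]
    for node, successors in sorted(de_bruijn_edges(toy_reads, k=3, min_count=2).items()):
        print(node, "->", sorted(successors))
```

In the classical approach, a sequence is then recovered by finding an Eulerian path through such a graph; the choice of k and of the count threshold determines which graph is built.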

We describe a parallel implementation of this algorithm on the Cray Multithreaded Architecture (MTA-2) supercomputer, whose architecture is ideally suited to this “unstructured” problem.

Joint work with Jon R. Sauer, Eagle R&D, Boulder.

Biography:

Dr. Bokhari is a Fellow of the IEEE and of the ACM. He is also an ISI Highly Cited Researcher in Computer Science. He was with the Department of Electrical Engineering, University of Engineering & Technology, Lahore, Pakistan, from 1980 to 2006. He has been associated with the Institute for Computer Applications in Science & Engineering (ICASE) at NASA Langley Research Center in Hampton, Virginia, where he spent a total of nearly seven years as a visiting scientist or consultant over the period 1978-1998. He is currently an independent researcher and a Visiting Scholar at the Dept. of Biomedical Informatics at Ohio State. Other institutions that he has been associated with as a researcher include the Universities of Colorado, Stuttgart, and Vienna, and the Electrotechnical Laboratory in Tsukuba.

Dr. Bokhari received the BSc degree in Electrical Engineering from the University of Engineering & Technology, Lahore, Pakistan, in 1974 and the MS and PhD degrees in Electrical and Computer Engineering from the University of Massachusetts, Amherst, in 1976 and 1978, respectively. His research interests include parallel and distributed computing, and he is currently applying his expertise in these areas to computational biology and bioinformatics. He is particularly interested in parallel algorithms on massively multithreaded machines, such as the Cray MTA/XMT.

Reception – 4:00 p.m. Olsson Hall room 228E


