ASCI Database
Articles by A Bashir
Total Records (3) for A Bashir
  S Sindi, E Helman, A Bashir and B. J. Raphael

Motivation: Structural variants, including duplications, insertions, deletions and inversions of large blocks of DNA sequence, are an important contributor to human genome variation. Measuring structural variants in a genome sequence is typically more challenging than measuring single nucleotide changes. Current approaches for structural variant identification, including paired-end DNA sequencing/mapping and array comparative genomic hybridization (aCGH), do not identify the boundaries of variants precisely. Consequently, most reported human structural variants are poorly defined and not readily compared across different studies and measurement techniques.

Results: We introduce Geometric Analysis of Structural Variants (GASV), a geometric approach for identification, classification and comparison of structural variants. This approach represents the uncertainty in measurement of a structural variant as a polygon in the plane, and identifies measurements supporting the same variant by computing intersections of polygons. We derive a computational geometry algorithm to efficiently identify all such intersections. We apply GASV to sequencing data from nine individual human genomes and several cancer genomes. We obtain better localization of the boundaries of structural variants, distinguish genetic from putative somatic structural variants in cancer genomes, and integrate aCGH and paired-end sequencing measurements of structural variants. This work presents the first general framework for comparing structural variants across multiple samples and measurement techniques, and will be useful for studies of both genetic structural variants and somatic rearrangements in cancer.
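The polygon idea in this abstract can be illustrated with a minimal Python sketch. It is not the GASV implementation. It assumes a simplified deletion model in which a mate pair maps to reference positions x < y, the fragment length lies between lmin and lmax, and the breakpoints (a, b) satisfy a >= x, b <= y and lmin <= (a - x) + (y - b) <= lmax, which describes a trapezoid in the (a, b) plane. All function names and numeric values are hypothetical, and the intersection uses standard Sutherland-Hodgman clipping of convex polygons rather than the algorithm derived in the paper.

```python
def breakpoint_region(x, y, lmin, lmax):
    """Trapezoid of feasible (a, b) breakpoints for one mate pair.

    Assumed model: a >= x, b <= y, lmin <= (a - x) + (y - b) <= lmax.
    Vertices are returned in counter-clockwise order.
    """
    return [(x, y - lmin), (x, y - lmax), (x + lmax, y), (x + lmin, y)]


def _side(c1, c2, p):
    """Cross product; >= 0 when p lies on or to the left of the edge c1 -> c2."""
    return (c2[0] - c1[0]) * (p[1] - c1[1]) - (c2[1] - c1[1]) * (p[0] - c1[0])


def _crossing(p, q, c1, c2):
    """Point where segment p-q crosses the infinite line through c1 and c2."""
    dp, dq = _side(c1, c2, p), _side(c1, c2, q)
    t = dp / (dp - dq)  # dp and dq differ in sign, so the denominator is nonzero
    return (p[0] + t * (q[0] - p[0]), p[1] + t * (q[1] - p[1]))


def clip_convex(subject, clipper):
    """Sutherland-Hodgman intersection of two convex counter-clockwise polygons."""
    output = list(subject)
    for i in range(len(clipper)):
        c1, c2 = clipper[i], clipper[(i + 1) % len(clipper)]
        points, output = output, []
        if not points:
            break  # nothing left to clip: the polygons are disjoint
        for j in range(len(points)):
            p, q = points[j - 1], points[j]
            p_in = _side(c1, c2, p) >= 0
            q_in = _side(c1, c2, q) >= 0
            if q_in:
                if not p_in:
                    output.append(_crossing(p, q, c1, c2))
                output.append(q)
            elif p_in:
                output.append(_crossing(p, q, c1, c2))
    return output


def polygon_area(poly):
    """Shoelace formula; returns 0.0 for an empty polygon."""
    total = 0.0
    for i in range(len(poly)):
        (x1, y1), (x2, y2) = poly[i], poly[(i + 1) % len(poly)]
        total += x1 * y2 - x2 * y1
    return abs(total) / 2.0


# Two hypothetical mate pairs that plausibly span the same deletion.
region_1 = breakpoint_region(x=1000, y=5000, lmin=200, lmax=400)
region_2 = breakpoint_region(x=1100, y=5050, lmin=200, lmax=400)
shared = clip_convex(region_1, region_2)

# A hypothetical mate pair elsewhere on the chromosome.
region_3 = breakpoint_region(x=20000, y=30000, lmin=200, lmax=400)
unrelated = clip_convex(region_1, region_3)

print("shared breakpoint area:", polygon_area(shared))
print("unrelated breakpoint area:", polygon_area(unrelated))
```

Under this model, a non-empty intersection means the two measurements are consistent with a single pair of breakpoints, and the intersection is smaller than either input polygon, which is how combining measurements narrows the breakpoint location.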



  K. J McKernan, H. E Peckham, G. L Costa, S. F McLaughlin, Y Fu, E. F Tsung, C. R Clouser, C Duncan, J. K Ichikawa, C. C Lee, Z Zhang, S. S Ranade, E. T Dimalanta, F. C Hyland, T. D Sokolsky, L Zhang, A Sheridan, H Fu, C. L Hendrickson, B Li, L Kotler, J. R Stuart, J. A Malek, J. M Manning, A. A Antipova, D. S Perez, M. P Moore, K. C Hayashibara, M. R Lyons, R. E Beaudoin, B. E Coleman, M. W Laptewicz, A. E Sannicandro, M. D Rhodes, R. K Gottimukkala, S Yang, V Bafna, A Bashir, A MacBride, C Alkan, J. M Kidd, E. E Eichler, M. G Reese, F. M De La Vega and A. P. Blanchard

We describe the genome sequencing of an anonymous individual of African origin using a novel ligation-based sequencing assay that enables a unique form of error correction that improves the raw accuracy of the aligned reads to >99.9%, allowing us to accurately call SNPs with as few as two reads per allele. We collected several billion mate-paired reads yielding ~18x haploid coverage of aligned sequence and close to 300x clone coverage. Over 98% of the reference genome is covered with at least one uniquely placed read, and 99.65% is spanned by at least one uniquely placed mate-paired clone. We identify over 3.8 million SNPs, 19% of which are novel. Mate-paired data are used to physically resolve haplotype phases of nearly two-thirds of the genotypes obtained and produce phased segments of up to 215 kb. We detect 226,529 intra-read indels, 5590 indels between mate-paired reads, 91 inversions, and four gene fusions. We use a novel approach for detecting indels between mate-paired reads that are smaller than the standard deviation of the insert size of the library and discover deletions in common with those detected with our intra-read approach. Dozens of mutations previously described in OMIM and hundreds of nonsynonymous single-nucleotide and structural variants in genes previously implicated in disease are identified in this individual. There is more genetic variation in the human genome still to be uncovered, and we provide guidance for future surveys in populations and cancer biopsies.
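The abstract does not describe how indels smaller than the insert-size standard deviation are detected. One statistical intuition for why this is possible at high clone coverage: the mean of n independent clone spans has standard error sigma / sqrt(n), so a shift well below sigma can stand out when many clones span the same locus. The Python sketch below illustrates only that arithmetic. The library mean and standard deviation (1500 bp and 150 bp) and the function names are hypothetical; only the figure of roughly 300 spanning clones comes from the abstract.

```python
import math


def standard_error(sigma, n):
    """Standard error of the mean span over n independent clones."""
    return sigma / math.sqrt(n)


def shift_z_score(observed_spans, library_mean, library_sigma):
    """Z-score of the mean mapped span at one locus against the library mean.

    A clearly positive score suggests a deletion in the sample (mates map
    farther apart on the reference); a clearly negative one suggests an
    insertion.
    """
    n = len(observed_spans)
    mean_span = sum(observed_spans) / n
    return (mean_span - library_mean) / standard_error(library_sigma, n)


# Hypothetical library parameters; 300 matches the ~300x clone coverage above.
LIBRARY_MEAN, LIBRARY_SIGMA, CLONES = 1500.0, 150.0, 300

# Idealised locus where every spanning clone maps 30 bp longer than expected.
spans = [LIBRARY_MEAN + 30.0] * CLONES

print("standard error:", standard_error(LIBRARY_SIGMA, CLONES))
print("z-score, one clone:", shift_z_score(spans[:1], LIBRARY_MEAN, LIBRARY_SIGMA))
print("z-score, all clones:", shift_z_score(spans, LIBRARY_MEAN, LIBRARY_SIGMA))
```

With these assumed values a 30 bp shift is a fifth of one standard deviation for a single clone, but several standard errors once 300 clones are averaged. Real data would show span variation between clones and would need a test that accounts for it.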

