Highly accurate and sensitive short read aligner

Authors: MEHMET YAĞMUR GÖK, SEZER GÖREN UĞURDAĞ, CEM ÜNSALAN, MAHMUT ŞAMİL SAĞIROĞLU

Abstract: Next-generation sequencing generates large numbers of short reads from DNA, and the resulting data are difficult to process and store. Efficient sequence alignment and mapping techniques are therefore needed in bioinformatics, since alignment and mapping are basic steps in genetic data analysis. The Smith-Waterman (SW) algorithm, a well-known dynamic programming algorithm, is often used for this purpose. In this work, we propose to utilize Phred quality scores in Gotoh's affine gap model to increase the accuracy and sensitivity of the SW algorithm. Because hardware platforms such as FPGAs and GPUs are commonly used to solve computationally expensive problems, we build a hybrid PC-FPGA system in which the SW algorithm based on the affine gap model with Phred quality scores is implemented on the FPGA and a read compressor is implemented on the host PC. We compare our method with state-of-the-art systems such as Bowtie, BWA, and the Kim-Olson FPGA-based system in terms of sensitivity, accuracy, and speed. In extensive experiments, our proposed method was more sensitive and accurate than the other solutions.
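
The abstract does not state how the Phred quality scores enter the scoring function, so the following Python sketch is only an illustration of the general idea, not the authors' FPGA design. It implements Smith-Waterman local alignment with Gotoh's affine gap recurrences and, as an assumption, scales each substitution score by the probability that the read base was called correctly, 1 - 10^(-Q/10). The function names, the weighting scheme, and all score parameters are hypothetical.

```python
def phred_to_error_prob(q):
    """Standard Phred definition: Q = -10 * log10(P_error)."""
    return 10 ** (-q / 10)


def sw_affine_phred(read, quals, ref, match=2.0, mismatch=-3.0,
                    gap_open=-5.0, gap_extend=-1.0):
    """Best local alignment score of `read` against `ref`.

    Gotoh's three-matrix recurrence is used for affine gaps:
    H = best score ending at (i, j), E = ending in a gap along the
    reference axis, F = ending in a gap along the read axis.
    gap_open is the cost of the first gap position, gap_extend the
    cost of each further position.
    ASSUMPTION: the substitution score is multiplied by the confidence
    of the read base, (1 - P_error). The paper may weight differently.
    """
    n, m = len(read), len(ref)
    neg_inf = float("-inf")
    H = [[0.0] * (m + 1) for _ in range(n + 1)]
    E = [[neg_inf] * (m + 1) for _ in range(n + 1)]
    F = [[neg_inf] * (m + 1) for _ in range(n + 1)]
    best = 0.0
    for i in range(1, n + 1):
        # Confidence of read base i, derived from its Phred score.
        weight = 1.0 - phred_to_error_prob(quals[i - 1])
        for j in range(1, m + 1):
            # Extend an existing gap or open a new one.
            E[i][j] = max(E[i][j - 1] + gap_extend, H[i][j - 1] + gap_open)
            F[i][j] = max(F[i - 1][j] + gap_extend, H[i - 1][j] + gap_open)
            base_score = match if read[i - 1] == ref[j - 1] else mismatch
            diagonal = H[i - 1][j - 1] + weight * base_score
            # Local alignment: scores are floored at zero.
            H[i][j] = max(0.0, diagonal, E[i][j], F[i][j])
            best = max(best, H[i][j])
    return best
```

Under this assumed weighting, a mismatch at a low-quality base costs less than the same mismatch at a high-quality base, so likely sequencing errors are less prone to break an otherwise good alignment. For example, `sw_affine_phred("ACGTA", [40, 40, 2, 40, 40], "ACTTA")` scores higher than the same call with all qualities set to 40.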

Keywords: Alignment, short read, FPGA, Smith-Waterman, genome, sensitivity, accuracy
