A short reads alignment algorithm oriented to massive data

  • Gao Yang Li
  • Kai Wang
  • Yu kun Zeng
  • Guang ri Quan*
  • *Corresponding author for this work
  • School of Computer Science and Technology (School of Software), Harbin Institute of Technology Weihai

Research output: Chapter in Book/Report/Conference proceeding › Chapter › peer-review

Abstract

DNA sequencing technology has developed rapidly in recent years, with both sequencing throughput and read length increasing. New features such as paired-end sequencing have also emerged, so a sequence alignment algorithm designed for this new type of DNA data is of great value. This paper proposes such an algorithm. In place of the Smith-Waterman algorithm, it uses a local alignment algorithm oriented to sparse mutations to accelerate seed extension. Rather than aligning short reads one by one, the software groups all reads with similar seeds to accelerate seed location. The algorithm is evaluated on human genome reference sequences and short-read sequencing data from GenBank (40x coverage) and compared with Bowtie2 in terms of speed and accuracy. The results show that it has significant advantages in alignment speed and space overhead on large-scale data.
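
The idea of grouping reads by seed so that each seed is located only once can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not specify the seed definition or the index structure, so the sketch assumes that a seed is the first k bases of a read, that "similar" means identical, and that the reference is indexed with a plain k-mer hash table. All function names are hypothetical.

```python
from collections import defaultdict


def build_kmer_index(reference, k):
    """Map every k-mer in the reference to the list of its start positions."""
    index = defaultdict(list)
    for pos in range(len(reference) - k + 1):
        index[reference[pos:pos + k]].append(pos)
    return index


def group_reads_by_seed(reads, k):
    """Bucket read ids by their first k bases (assumed seed definition)."""
    groups = defaultdict(list)
    for read_id, seq in enumerate(reads):
        if len(seq) >= k:
            groups[seq[:k]].append(read_id)
    return groups


def locate_seeds(reference, reads, k):
    """Look up each distinct seed once and share the hits across its group."""
    index = build_kmer_index(reference, k)
    hits = {}
    for seed, read_ids in group_reads_by_seed(reads, k).items():
        positions = index.get(seed, [])  # one lookup per group, not per read
        for read_id in read_ids:
            hits[read_id] = positions
    return hits


# Toy data, for illustration only.
reference = "ACGTACGTTAGC"
reads = ["ACGTT", "ACGTA", "TAGC", "GGGG"]
hits = locate_seeds(reference, reads, k=4)
```

In this toy example the first two reads share the seed `ACGT`, so the index is queried once for both. The candidate positions would then be passed to a seed-extension step, which the sketch leaves out.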

Original language: English
Title of host publication: Current Trends in Computer Science and Mechanical Automation Vol.1
Subtitle of host publication: Selected Papers from CSMA2016
Publisher: de Gruyter
Pages: 49-57
Number of pages: 9
ISBN (Electronic): 9783110584974
ISBN (Print): 9783110584967
State: Published - 9 Jan 2018
Externally published: Yes

Keywords

  • Alignment tool
  • Local alignment algorithm
  • Next-Generation Sequencing
