Nanopore sequencing can read much longer sequences of biological molecules than other sequencing methods, which has led to advances in genomic analysis such as the gapless human genome assembly. By analyzing the raw electrical signals that nanopores produce, existing tools can map reads without translating them into DNA characters, allowing for quick and efficient analysis of sequencing data. However, raw signals often contain errors caused by noise and by mistakes introduced during signal processing, which limits the overall accuracy of raw signal analysis. Our goal in this work is to detect and correct these errors in order to improve the accuracy of raw signal analyses. To this end, we propose Nemo-HMM, a model based on a hidden Markov model (HMM) that identifies nanopore signal dynamics, accurately corrects errors, and slightly reduces noise. Our evaluation on E. coli datasets using the state-of-the-art raw signal analysis tool RawHash2 shows that Nemo-HMM consistently improves overall mapping accuracy and increases the total number of mapped reads by mapping reads that could not be mapped previously.
Simon is an undergraduate Computer Science major in his final year at UMD. He's been involved with on-campus research for 3 years, and he is interested in improving genome and transcriptome assembly methods.

