kmers.cpp (1.16 kB)
Fast kmer counting table algorithm using perfect hash function: C++ pseudo-code integration into R using Rcpp API
Counting kmers (substrings of length k in DNA
sequence data) is an essential component of many methods in bioinformatics,
including data preprocessing for de novo assembly, repeat detection, and
sequencing coverage estimation.
We propose a simple algorithm that counts kmers using a perfect hash
table, implemented in C++ and exported to R through the Rcpp API.
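
The idea of a perfect hash for kmers can be illustrated with a minimal sketch. It is not the contents of kmers.cpp; it assumes the common 2-bit encoding (A=0, C=1, G=2, T=3), under which every kmer maps to a unique index in a table of 4^k entries, so no collisions can occur. The names `base_code` and `count_kmers`, the limit of k <= 12, and the choice to restart the window at non-ACGT characters are all assumptions made for illustration.

```cpp
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical helper: map a nucleotide to a 2-bit code,
// or return -1 for any other character (for example 'N').
inline int base_code(char c) {
    switch (c) {
        case 'A': case 'a': return 0;
        case 'C': case 'c': return 1;
        case 'G': case 'g': return 2;
        case 'T': case 't': return 3;
        default:            return -1;
    }
}

// Hypothetical function: count every kmer of length k in seq.
// The result has 4^k entries; the index of a kmer is its base-4
// encoding, which acts as a perfect (collision-free) hash.
// Assumption: k is limited to 1..12 to keep the table small;
// an empty vector is returned for any other k.
std::vector<std::uint32_t> count_kmers(const std::string& seq, unsigned k) {
    if (k == 0 || k > 12) return {};

    const std::size_t table_size = std::size_t(1) << (2 * k);  // 4^k
    const std::size_t mask = table_size - 1;  // keeps the last k bases
    std::vector<std::uint32_t> counts(table_size, 0);

    std::size_t hash = 0;  // rolling 2-bit encoding of the current window
    unsigned valid = 0;    // valid bases seen since the last reset, capped at k

    for (char c : seq) {
        const int code = base_code(c);
        if (code < 0) {    // ambiguous base: restart the window
            hash = 0;
            valid = 0;
            continue;
        }
        hash = ((hash << 2) | static_cast<std::size_t>(code)) & mask;
        if (valid < k) ++valid;
        if (valid == k) ++counts[hash];
    }
    return counts;
}
```

Because the hash is updated by shifting in one base at a time, the whole sequence is processed in a single pass. To call such a function from R, one option would be to place the Rcpp attribute `// [[Rcpp::export]]` above it and compile the file with `Rcpp::sourceCpp()`; the return type might need adjusting (for example to an integer vector) depending on how the counts should appear in R.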