Strong nucleosomes of mouse genome including recovered centromeric sequences

<div><p>Recently discovered strong nucleosomes (SNs), characterized by visibly periodic DNA sequences, have been found to concentrate in the centromeres of <i>Arabidopsis thaliana</i> and in the transient meiotic centromeres of <i>Caenorhabditis elegans</i>. To find out whether this affiliation of SNs with centromeres is a more general phenomenon, we studied the SNs of <i>Mus musculus</i>. The publicly available genome sequences of mouse, as well as of practically all other eukaryotes, do not include the centromere regions, which are difficult to assemble because of the large amount of repeat sequences in the centromeres and pericentromeric regions. We recovered those missing sequences using data from MNase-seq experiments in mouse embryonic stem cells, in which the sequence of DNA inside nucleosomes, including the missing regions, was determined by 100-bp paired-end sequencing. Nucleosome sequences that do not match the published genome sequence would largely belong to the centromeres. By evaluating SN densities in centromeric and non-centromeric regions, we conclude that mouse SNs concentrate in the centromeres of the telocentric mouse chromosomes, at a density approximately 3.9 times higher than in the rest of the genome. The remaining non-centromeric SNs reside mainly in introns and intergenic regions, in particular in retrotransposons. The centromeric involvement of SNs opens new horizons for studies of chromosome and centromere structure.</p></div>
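<div><p>The density comparison described above can be illustrated with a minimal sketch. It assumes that SN density is simply the SN count divided by the sequence length of a compartment, and that the excess is the ratio of the two densities; the abstract does not spell out the exact procedure. The <code>Region</code> class, the <code>fold_enrichment</code> function, and all numbers below are hypothetical placeholders, not data from the study.</p></div>

```python
from dataclasses import dataclass


@dataclass
class Region:
    """A genomic compartment with an SN count and a total length in bp."""
    name: str
    sn_count: int
    length_bp: int

    def density(self) -> float:
        """Return the number of SNs per base pair."""
        if self.length_bp <= 0:
            raise ValueError("length_bp must be positive")
        return self.sn_count / self.length_bp


def fold_enrichment(target: Region, background: Region) -> float:
    """Ratio of the SN density in `target` to the SN density in `background`."""
    background_density = background.density()
    if background_density == 0:
        raise ValueError("background density is zero; ratio is undefined")
    return target.density() / background_density


# Placeholder values for illustration only (NOT taken from the study).
centromeric = Region("sequences not matching the assembly", sn_count=30, length_bp=2_000_000)
rest = Region("assembled genome", sn_count=50, length_bp=10_000_000)

ratio = fold_enrichment(centromeric, rest)
print(f"SN density ratio ({centromeric.name} vs {rest.name}): {ratio:.2f}")
```

<div><p>Normalizing by sequence length matters here because the recovered centromeric sequences and the assembled genome differ greatly in size, so raw SN counts alone would not be comparable.</p></div>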