Plasmodium 3D7 Genome Assembly With PacBio Data and CLEAR
This fileset contains data, code, and IPython Notebooks demonstrating
how to assemble the 80% AT-rich Plasmodium 3D7 genome, which is hard to assemble.
Start by reading ''P3D7_CLEAR_NOTE.pdf'' to learn what this dataset is about.
Before using the two main data files, ''preads.fa.gz'' and
''pr_pr_strigent.m4.gz'', you will have to decompress them with "gunzip".
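If "gunzip" is not available, the same decompression can be sketched in Python using only the standard library. The helper name gunzip_keep is hypothetical and not part of this fileset, and the sketch assumes the two files sit in the current working directory. Unlike plain "gunzip", it keeps the original .gz files.

```python
import gzip
import shutil
from pathlib import Path


def gunzip_keep(gz_path):
    """Decompress gz_path next to itself, keep the .gz file, return the output path.

    Hypothetical helper for illustration; it is not part of this fileset.
    """
    gz_path = Path(gz_path)
    out_path = gz_path.with_suffix("")  # "preads.fa.gz" -> "preads.fa"
    with gzip.open(gz_path, "rb") as src, open(out_path, "wb") as dst:
        shutil.copyfileobj(src, dst)  # stream in chunks; the files may be large
    return out_path


# File names are taken from this README; their location (the current
# directory) is an assumption. Missing files are skipped, not treated as errors.
for name in ("preads.fa.gz", "pr_pr_strigent.m4.gz"):
    if Path(name).exists():
        print("wrote", gunzip_keep(name))
    else:
        print("not found, skipping:", name)
```

Run it from the directory that holds the two .gz files; the decompressed ''preads.fa'' and ''pr_pr_strigent.m4'' are written alongside them.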
The IPython Notebooks ''P3D7_CLEAR_NOTE.ipynb'' and ''P3D7_CLEAR_NOTE_P2.ipynb''
can be run to generate the assembly if you have a suitable computational environment.
PDF versions of the notebooks are also included: ''P3D7_CLEAR_NOTE.pdf''
and ''P3D7_CLEAR_NOTE_P2.pdf''. Some of the requirements for reproducing the
computational experiment are described in ''P3D7_CLEAR_NOTE.pdf''.
Thanks to Carsten Russ (Broad Institute) and Sarah Volkmann (Harvard School of Public Health) for providing the DNA sample.