Protist Ribosomal Reference database (PR2) - SSU rRNA gene database

posted on 11.09.2017, 12:37 by Daniel Vaulot
The Protist Ribosomal Reference database (PR2) provides unique access to eukaryotic small sub-unit (SSU) ribosomal RNA and DNA sequences, with curated taxonomy. The database mainly consists of nuclear-encoded protistan sequences. However, metazoans, land plants, macrosporic fungi and eukaryotic organelles (mitochondrion, plastid and others) are also included because they are useful for the analysis of high-throughput sequencing data sets. Introns and putative chimeric sequences have also been carefully checked. The taxonomic assignment of each sequence consists of eight unique taxonomic fields.

The original web site (http://ssu-rrna.org/pr2) is currently offline, so we provide updated versions of PR2 here as flat files that can be used for annotating metabarcodes.

Current version : 4.62 (version 15 on Figshare)
Last update : 11 September 2017


- pr2_version_4.x_mothur.zip contains two files for use with Qiime or Mothur.
* pr2....fasta contains all sequences in fasta format, with the accession number on the description line
* pr2....tax contains the taxonomy of each sequence, separated from the accession number by a tab

- pr2_version_4.x_UTAX.zip contains one fasta file with the accession number of the sequence and its full taxonomy on the description line, in the UTAX format. It is suitable for use with USEARCH and VSEARCH.

- pr2_version_4.x_taxo_long.zip contains one fasta file with the accession number of the sequence, the name of the sequence and its full taxonomy on the description line. It is suitable for building a local database for BLAST searches.

- pr2_version_4.x_metadata.zip contains a tab-separated file with all the metadata from GenBank as well as the annotations made in the PR2 database.

- PR2 version notes.docx contains the revision history

- PR2 versions.xls is a condensed list of the different versions
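
As an illustration of how the paired fasta and taxonomy files described above could be read together, here is a minimal Python sketch. It is not an official PR2 tool. The accessions, sequences and taxon names are made up, the helper names are hypothetical, and the semicolon-separated taxonomy string is an assumption based on the usual Mothur taxonomy layout; check the actual files before relying on it.

```python
import io

# Made-up sample data standing in for the two files of the Mothur archive.
FASTA_TEXT = """>ACC0001
ACGTACGTAC
GTACGT
>ACC0002
TTGACCA
"""

# Assumed layout: accession, a tab, then levels separated by semicolons.
TAX_TEXT = (
    "ACC0001\tEukaryota;Alveolata;Dinoflagellata;\n"
    "ACC0002\tEukaryota;Archaeplastida;Chlorophyta;\n"
)


def read_fasta(handle):
    """Return {accession: sequence}; the accession is the first word after '>'."""
    parts_by_accession = {}
    accession = None
    for line in handle:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            accession = line[1:].split()[0]
            parts_by_accession[accession] = []
        elif accession is not None:
            # Sequences may be wrapped over several lines.
            parts_by_accession[accession].append(line)
    return {acc: "".join(parts) for acc, parts in parts_by_accession.items()}


def read_taxonomy(handle):
    """Return {accession: [level, ...]} from a tab-separated taxonomy file."""
    taxonomy = {}
    for line in handle:
        line = line.rstrip("\n")
        if not line:
            continue
        accession, lineage = line.split("\t", 1)
        # Drop the empty item left by a trailing semicolon, if any.
        taxonomy[accession] = [level for level in lineage.split(";") if level]
    return taxonomy


sequences = read_fasta(io.StringIO(FASTA_TEXT))
taxonomy = read_taxonomy(io.StringIO(TAX_TEXT))

# Accessions present in the fasta file but absent from the taxonomy file.
missing = sorted(set(sequences) - set(taxonomy))
```

With real data, the two `io.StringIO` objects would be replaced by file handles opened on the unzipped files. An empty `missing` list indicates that every sequence has a taxonomy entry.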

- Qiime only uses 7 taxonomic levels by default, whereas PR2 sequences are annotated with eight.
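
One possible way to deal with a tool that expects fewer taxonomic levels is to shorten each taxonomy string before use. The sketch below simply keeps the first seven levels and drops the deepest one. This is an assumption made for illustration, not a recommendation from the PR2 authors; the function name, the semicolon separator and the placeholder level names are all hypothetical.

```python
def truncate_taxonomy(lineage, max_levels=7, sep=";"):
    """Keep at most `max_levels` levels of a separator-delimited lineage.

    Assumes the lineage is written from the highest to the lowest level,
    so truncation removes the deepest levels first.
    """
    levels = [level for level in lineage.split(sep) if level]
    return sep.join(levels[:max_levels]) + sep


# Placeholder names for the eight taxonomic fields.
eight_levels = "L1;L2;L3;L4;L5;L6;L7;L8;"
seven_levels = truncate_taxonomy(eight_levels)
```

Whether to drop the deepest level or another one depends on the analysis, so the choice made here should be adapted as needed.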


Daniel VAULOT, Laure GUILLOU and Fabrice NOT
DIPO team, Plankton Group, UMR 7144 CNRS-UPMC
Station Biologique,
Place G. Tessier
29680 Roscoff FRANCE
email: vaulot@sb-roscoff.fr


- Tristan Biard
- Margot Tragin
- Bente Edvardsen


Guillou, L., Bachar, D., Audic, S., Bass, D., Berney, C., Bittner, L., Boutte, C. et al. 2013. The Protist Ribosomal Reference database (PR2): a catalog of unicellular eukaryote Small Sub-Unit rRNA sequences with curated taxonomy. Nucleic Acids Res. 41:D597–604.

Edvardsen, B., Egge, E.S. & Vaulot, D. 2016. Diversity and distribution of haptophytes revealed by environmental sequencing and metabarcoding – a review. Perspect. Phycol. in press.

Tragin, M., Lopes dos Santos, A., Christen, R. & Vaulot, D. 2016. Diversity and ecology of green microalgae in marine systems: an overview based on 18S rRNA gene sequences. Perspect. Phycol. in press.

Note : The PhytoRef (16S plastid database) is available here : https://figshare.com/articles/PhytoREF_a_reference_database_of_the_plastidial_16S_rRNA_gene_of_photosynthetic_eukaryotes_with_curated_taxonomy/4689826