Ecological Archives E094-045-D1
Giovanni Strona, Maria Lourdes D. Palomares, Nicolas Bailly, Paolo Galli, Kevin D. Lafferty. 2013. Host range, host ecology, and distribution of more than 11 800 fish parasite species. Ecology 94:544. http://dx.doi.org/10.1890/12-1419.1
Metadata
Class I. Data set descriptors
A. Data set title: Fish Parasite Ecology Database
B. Data set identification code: -
C. Data set description
The data set includes 38008 records describing the distribution of 11802 fish parasite species belonging to the major helminth groups (Acanthocephala, Cestoda, Monogenea, Nematoda, Trematoda) on 4650 host species. Each entry of the data set is paired with major ecological, biogeographical, and phylogenetic features (maximum length, growth rate, life span, age at maturity, trophic level, habitat preference, range size, taxonomy) of the corresponding host species.
Principal Investigators:
Giovanni Strona, Department of Biotechnology and Biosciences, University of Milano-Bicocca, Milan 20126, Italy.
Kevin D. Lafferty, U.S. Geological Survey, Western Ecological Research Center, Marine Science Institute, University of California, Santa Barbara, California 93106, USA.
Abstract: The present data set includes 38008 fish parasite records (for Acanthocephala, Cestoda, Monogenea, Nematoda, Trematoda) compiled from scientific literature, Internet databases, and museum collections, paired with the corresponding host ecological, biogeographical, and phylogenetic traits (maximum length, growth rate, life span, age at maturity, trophic level, habitat preference, geographical range size, taxonomy). The data focus on host features, because specific parasite traits are not consistently available across records. For this reason, the data set is intended as a flexible framework able to extend the principles of ecological niche modeling to the host–parasite system, providing researchers with the data to model parasite niches based on their distribution in host species and the associated host features. In this sense, the database offers a framework for testing general ecological, biogeographical, and phylogenetic hypotheses based on the identification of hosts as parasite habitat. Potential applications of the data set include, for example, the investigation of species–area relationships or of the taxonomic distribution of host specificity. The provided host–parasite list is the one currently used by the Fish Parasite Ecology Software Tool (FishPEST, http://purl.oclc.org/fishpest), a website that allows researchers to model several aspects of the relationships between fish parasites and their hosts. The database is intended for researchers who wish to have more freedom to analyze the data than is currently possible with FishPEST. However, we recommend that readers who have not seen FishPEST use it as a starting point for interacting with the database.
D. Key words: FishPEST; host range; host specificity; parasite species richness.
Class II. Research origin descriptors
A. Overall project description
World list of helminth fish parasites and their hosts compiled from published sources, with ecological data of host species.
B. Research Motivation
Fish parasitologists seek to understand how host ecological features may affect the distribution of parasites (Kennedy 2009). Although it is commonly assumed that host ecological features are important in shaping parasite communities, there is no agreement on which factors are most relevant (see, for example, Vignon and Sasal 2010, Paterson et al. 2012). This may be due to heterogeneity in the data sets used in different analyses, to biases related to unequal sampling, or to lack of data. The database provided here is meant to offer fish parasitologists a single large collection of parasite data that can be used to test the relevance of ecological factors at different biogeographical, environmental, and phylogenetic levels. These data have already been implemented in FishPEST (Strona and Lafferty 2012a,b,c), and are provided here in order to make them available for tailored analyses.
C. General Methodology
Parasitological data were obtained from literature, museum collections, and Internet databases. Our choice of sources was a trade-off between the size of the checklists under consideration and their availability in electronic format (which helped ensure precision in the compilation of the database). In particular, we used the following sources: Hewitt and Hine (1972), Williams and Bunkley-Williams (1996), Holland and Kennedy (1997), Kohn and Cohen (1998), Gibson et al. (2005), Kohn et al. (2006), Salgado-Maldonado (2006), Cohen and Kohn (2008), Salgado-Maldonado (2008), Strona et al. (2009), Lichtenfels et al. (2012), and Harris et al. (2008). Parasite scientific names were validated according to Catalogue of Life (Bisby et al. 2012) and WoRMS (Appeltans et al. 2012), while host scientific names were validated according to FishBase (Froese and Pauly 2012). Invalid synonyms were replaced with the corresponding current valid names. Any ambiguous record, and any record at a supraspecific level, was excluded. When available, geographical information of host–parasite records was used to identify the distribution of each parasite species according to major biogeographical regions (see Fig. 1 for region names and boundaries).
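The name-standardization step described above can be illustrated with a minimal Python sketch. It is not the authors' script: the function name `standardize`, the synonym map, the set of valid names, and all species names are hypothetical placeholders, and the rule used to detect supraspecific records (anything that is not a plain binomial) is a simplifying assumption. Parasite names that cannot be matched are flagged rather than dropped, in line with the validation notes in section D.

```python
def standardize(records, synonyms, valid_names):
    """Replace synonyms, drop supraspecific records, flag unvalidated names.

    records     : iterable of (parasite_name, host_name) pairs
    synonyms    : dict mapping an invalid synonym to its current valid name
    valid_names : set of names found in the reference catalogues
    All three inputs are hypothetical stand-ins for the real sources.
    """
    cleaned = []
    for parasite, host in records:
        parts = parasite.split()
        # Assumption: a species-level record is a two-word name whose
        # second word is not an open-nomenclature marker such as "sp.".
        if len(parts) != 2 or parts[1].rstrip(".") in {"sp", "spp"}:
            continue
        name = synonyms.get(parasite, parasite)
        cleaned.append((name, host, name in valid_names))
    return cleaned


# Placeholder names, for illustration only.
raw = [
    ("Parasitus vetus", "Hostus unus"),    # synonym of a valid name
    ("Parasitus sp.", "Hostus unus"),      # not identified to species
    ("Parasitus", "Hostus duo"),           # genus-level record
    ("Parasitus ignotus", "Hostus duo"),   # absent from the catalogues
]
synonym_map = {"Parasitus vetus": "Parasitus primus"}
catalogue = {"Parasitus primus"}
result = standardize(raw, synonym_map, catalogue)
```

In this toy input, two records are excluded as supraspecific, one synonym is replaced, and one record is kept with its parasite name marked as unvalidated.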
FishBase Species Ecology Matrices (Froese and Pauly 2012) were used to obtain ecological information for the host species (see http://www.fishbase.org/manual/Key%20Facts.htm for additional details). Data in the Species Ecology Matrices were extracted using a script based on the Python HTML/XML parser Beautiful Soup (<http://www.crummy.com/software/BeautifulSoup/>). Of the fish ecology parameters available from the Species Ecology Matrices, our data set includes: maximum length (maxL), growth rate (rate at which the asymptotic length is approached, K), life span (Y), age at first maturity (Ym), and trophic level (T). Habitat information (i.e., preference for freshwater, brackish, or marine environment) for each host species was also collected from FishBase.
Geographical range was calculated for about 70% of the host species included in the database using point data retrieved from the Ocean Biogeographic Information System (OBIS, Vanden Berghe 2007). Range size for each species was estimated using one measure of area of occupancy (AOO) and two measures of extent of occurrence (latitudinal and longitudinal range, respectively). Areas of occupancy were calculated as follows: for each species, we plotted all available point records on a global 1×1° latitude/longitude grid and then counted the number of grid cells where the species is known to occur. Latitudinal range (Lat) was calculated as the difference between the maximum and minimum latitude of species occurrence (in degrees). Longitudinal range (Lon) was calculated as the difference between the maximum and minimum longitude of species occurrence (in degrees).
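The three range-size measures can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the original code: the function name `range_size` and the example points are hypothetical, grid cells are assumed to be aligned on whole degrees (each point is assigned to a cell by flooring its coordinates), and longitudinal range is taken as a plain maximum minus minimum, so a range spanning the 180° meridian would be overestimated.

```python
import math


def range_size(points):
    """Return (AOO, Lat, Lon) for (latitude, longitude) pairs in decimal degrees."""
    points = list(points)
    if not points:
        raise ValueError("at least one occurrence point is required")
    # Assumption: 1x1 degree cells aligned on whole degrees.
    cells = {(math.floor(lat), math.floor(lon)) for lat, lon in points}
    lats = [lat for lat, _ in points]
    lons = [lon for _, lon in points]
    aoo = len(cells)                   # number of occupied grid cells
    lat_range = max(lats) - min(lats)  # degrees of latitude
    lon_range = max(lons) - min(lons)  # degrees of longitude
    return aoo, lat_range, lon_range


# Hypothetical occurrence points, for illustration only.
example_points = [(10.2, 20.7), (10.9, 20.1), (12.5, 25.0), (-3.0, 18.5)]
aoo, lat_range, lon_range = range_size(example_points)
# -> aoo = 3, lat_range = 15.5, lon_range = 6.5
```

The first two points fall in the same cell, so four records yield an AOO of three cells.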
D. Data Limitations and Potential Enhancements
Selecting data sources is a major issue in assembling comprehensive data sets. We left out some valuable checklists that were not in electronic form, such as that provided by Moravec (2001). Thus, despite its remarkable size, the data set provided here is far from complete and should be considered as a framework for organizing fish parasite data and making them available to researchers. We will keep the present data set synchronized with that used by FishPEST, whose internal information can be easily extended through a simple interface that users can access to submit their own host–parasite records. These records will be validated with the same procedure as above prior to being added to the internal database. Therefore, we encourage readers not only to use this database but also to add to it. Moreover, we are currently working to systematically extract data from sources that are not readily available in electronic format, such as Love and Moser's checklist (1983).
Another important point concerns data validation. Our procedures left around 20000 validated host–parasite records and around 40000 host–parasite records where only host names were validated. Thus, about half of the records included in the database were validated for the names of the hosts but not for those of the parasites. This does not mean that these records are incorrect; it indicates only that the unvalidated parasite species were not present in either Catalogue of Life or WoRMS. Note that both of these sources are constantly updated with new entries. The number of parasite species remaining to be validated is therefore likely to decrease with future updates of these databases.
A major limitation of our list is the absence of any indication of larval versus adult stages for the listed parasite species. This is due to the heterogeneity of the original data sources, which made it virtually impossible to automate a categorization process for the different life stages. However, because larvae are difficult to identify, most of the records referring to larval stages in the original sources were not identified to species, and were therefore not included in the final list. Additionally, biological information for the parasites can often be used to infer the life stage of the records under consideration (for example, all the Monogenea, which are monoxenous parasites, reach adulthood on their fish hosts). Nonetheless, we hope that future improvements in the available information will lead to a formal distinction of parasite life stages.
Class III. Data set status and accessibility
A. Status
Latest update: April 2012
Latest metadata update: There have been no alterations to the metadata subsequent to first publication.
B. Accessibility
Contact person: Giovanni Strona, Department of Biotechnology and Biosciences, University of Milano-Bicocca, Milan, 20126, Italy; giovanni.strona@unimib.it.
Copyright restrictions: None.
Proprietary restrictions: None.
Costs: None.
Class IV. Data structural descriptors
COMMUNITY DATA
A. Data Set File
Identity: FPEDB.csv
Size: 38008 records, 5,656,782 Bytes.
Format and storage mode: ASCII text, comma delimited.
Header information: The first row of the file contains the variable names. See section B below for detailed descriptions of the column contents.
Alphanumeric attributes: Mixed.
Special characters/fields: If no information is available for a given record, this is indicated by 'na'.
Authentication procedures: MD5 Checksum for the file: 88b66dcf0be2832e9682a73305d925ad
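As a minimal sketch of how the file descriptors above might be used in practice, the following Python code checks the MD5 checksum and reads the comma-delimited file while mapping 'na' to a missing value. The function names (`md5_of_bytes`, `parse_records`, `load_fpedb`) and the sample rows are hypothetical; the file name, the 'na' code, the ASCII encoding, and the checksum are taken from the descriptors above. All values are kept as strings, so numeric columns would still need converting.

```python
import csv
import hashlib
import io

EXPECTED_MD5 = "88b66dcf0be2832e9682a73305d925ad"  # checksum stated in this section
MISSING = "na"                                     # missing-value code stated in this section


def md5_of_bytes(data):
    """Return the hexadecimal MD5 digest of a bytes object."""
    return hashlib.md5(data).hexdigest()


def parse_records(text):
    """Parse comma-delimited text with a header row; map 'na' to None."""
    reader = csv.DictReader(io.StringIO(text))
    return [
        {key: (None if value == MISSING else value) for key, value in row.items()}
        for row in reader
    ]


def load_fpedb(path="FPEDB.csv"):
    """Read the data file, refusing to parse it if the checksum differs."""
    with open(path, "rb") as handle:
        raw = handle.read()
    if md5_of_bytes(raw) != EXPECTED_MD5:
        raise ValueError("MD5 mismatch: file differs from the published version")
    # The file is described as ASCII text.
    return parse_records(raw.decode("ascii"))


# Hypothetical two-row sample using a subset of the real column names.
sample = "ID,P_T,MaxL\n00000,A,na\n00001,A*,35.5\n"
rows = parse_records(sample)
```

Record IDs are left as strings here so that leading zeros (for example 00000) are preserved.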
B. Variable information
Variable name | Variable definition | Units | Storage type | Variable codes and definitions | Missing value codes |
ID | Record ID code (starting from 00000) | na | Integer | na | na |
P_T | Parasite taxonomic group | na | Character | A = Acanthocephala; | na |
P_F | Parasite family | na | Character | na | na |
P_SP | Parasite species | na | Character | na | na |
H_C | Host class | na | Character | na | na |
H_O | Host order | na | Character | na | na |
H_F | Host family | na | Character | na | na |
H_SP | Host species | na | Character | na | na |
GEO | Biogeographical region | na | Character | AFR = Africa; | na |
MaxL | Maximum host body length | cm | Fixed point | na | na |
K | Host growth rate (K) | 1/years | Fixed point | na | na |
Y | Host life span | years | Fixed point | na | na |
Ym | Host age at first maturity | years | Fixed point | na | na |
T | Host trophic level | na | Fixed point | na | na |
F | Freshwater | na | Integer | 0 = no; 1 = yes | na |
B | Brackish | na | Integer | 0 = no; 1 = yes | na |
M | Marine | na | Integer | 0 = no; 1 = yes | na |
AOO | Area of occupancy | number of 1×1° grid cells | Integer | na | na |
LAT | Latitudinal range | degrees | Fixed point | na | na |
LON | Longitudinal range | degrees | Fixed point | na | na |
Note: Asterisks beside the parasite taxonomic group characters indicate that the corresponding records were validated for the names of the hosts but not for those of the parasites (see Class II – D).
Additional details about fish ecology variables (MaxL, K, Y, Ym, and T) are available from FishBase documentation at <http://www.fishbase.org/manual/Key%20Facts.htm>.
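As a hedged illustration of how the asterisk flag and the species columns might be combined, the Python sketch below counts the number of distinct host species per parasite species (host range, one of the key words of this data set), optionally restricted to records with validated parasite names. The function names and the records are hypothetical placeholders; only the column names P_T, P_SP, and H_SP come from the table above. The sketch assumes the asterisk is stored in the P_T field itself, and it accepts the asterisk on either side of the group code.

```python
from collections import defaultdict


def split_group(p_t):
    """Split a P_T value into (group code, parasite_name_validated)."""
    code = p_t.strip()
    # Assumption: an asterisk anywhere in the field marks an unvalidated name.
    return code.replace("*", ""), "*" not in code


def host_range(records, validated_only=False):
    """Count distinct host species (H_SP) per parasite species (P_SP)."""
    hosts = defaultdict(set)
    for rec in records:
        _, validated = split_group(rec["P_T"])
        if validated_only and not validated:
            continue
        hosts[rec["P_SP"]].add(rec["H_SP"])
    return {parasite: len(host_set) for parasite, host_set in hosts.items()}


# Placeholder records, for illustration only.
records = [
    {"P_T": "A", "P_SP": "Parasitus primus", "H_SP": "Hostus unus"},
    {"P_T": "A", "P_SP": "Parasitus primus", "H_SP": "Hostus duo"},
    {"P_T": "A", "P_SP": "Parasitus primus", "H_SP": "Hostus unus"},
    {"P_T": "A*", "P_SP": "Parasitus secundus", "H_SP": "Hostus unus"},
]
all_ranges = host_range(records)
validated_ranges = host_range(records, validated_only=True)
```

Using sets means that repeated host–parasite pairs, such as the same association reported from several regions, are counted once.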
Acknowledgments
Any use of trade, product, or firm names in this publication is for descriptive purposes only and does not imply endorsement by the U.S. government.
Literature cited
Appeltans, W., P. Bouchet, G. A. Boxshall, K. Fauchald, D. P. Gordon, B. W. Hoeksema, G. C. B. Poore, R. W. M. van Soest, S. Stöhr, T. C. Walter, and M. J. Costello. 2012. World Register of Marine Species, accessed 15/02/12. http://www.marinespecies.org.
Bisby, F., Y. Roskov, A. Culham, T. Orrell, D. Nicolson, L. Paglinawan, N. Bailly, W. Appeltans, P. Kirk, T. Bourgoin, G. Baillargeon, and D. Ouvrard. 2012. Species 2000 and ITIS Catalogue of Life, accessed 15/02/12. http://www.catalogueoflife.org/col/.
Cohen, S., and A. Kohn. 2008. South American Monogenea - list of species, hosts and geographical distribution from 1997 to 2008. Zootaxa 1924:1–42.
Froese, R., and D. Pauly. 2012. FishBase. World Wide Web electronic publication, accessed 15/02/12. http://www.fishbase.org.
Gibson, D. I., R. A. Bray, and E. A. Harris. 2005. Host-parasite database of the Natural History Museum, London, accessed 15/02/12. http://www.nhm.ac.uk/.
Harris, P. D., A. P. Shinn, J. Cable, T. A. Bakke, and J. E. Bron. 2008. GyroDb: gyrodactylid monogeneans on the web. Trends in Parasitology 24:109–111.
Hewitt, G. C., and P. M. Hine. 1972. Checklist of parasites of New Zealand fishes and of their hosts. New Zealand Journal of Marine and Freshwater Research 6:69–114.
Holland, C. V., and C. R. Kennedy. 1997. A checklist of parasitic helminth and crustacean species recorded in freshwater fish from Ireland. Biology and Environment: Proceedings of the Royal Irish Academy 3:225–243.
Kennedy, C. R. 2009. The ecology of parasites of freshwater fishes: the search for patterns. Parasitology 136:1653–1662.
Kohn, A., and S. Cohen. 1998. South American Monogenea - list of species, hosts and geographical distribution. International Journal for Parasitology 28:1517–1554.
Kohn, A., S. Cohen, and G. Salgado-Maldonado. 2006. Checklist of Monogenea parasites of freshwater and marine fishes, amphibians and reptiles from Mexico, Central America and Caribbean. Zootaxa 1289:1–114.
Lichtenfels, R., E. P. Hoberg, and P. A. Pilitt. 2012. U.S. National Parasite Collection. U.S. Department of Agriculture, Agricultural Research Service, Biosystematics and National Parasite Collection Unit. Beltsville Maryland, accessed 15/02/12. http://www.anri.barc.usda.gov/bnpcu/parasrch.asp.
Love, M. S., and M. Moser. 1983. A checklist of parasites of California, Oregon and Washington marine and estuarine fishes. NOAA Tech. Rep. NMFS SSRF-777, Seattle, Washington, USA.
Moravec, F. 2001. Checklist of the metazoan parasites of fishes of the Czech Republic and the Slovak Republic (1873-2000). Academia, Prague, Czech Republic.
Paterson, R. A., C. R. Townsend, D. M. Tompkins, and R. Poulin. 2012. Ecological determinants of parasite acquisition by exotic fish species. Oikos 121:1889–1895.
Salgado-Maldonado, G. 2006. Checklist of helminth parasites of freshwater fishes from Mexico. Zootaxa 1324:1–357.
Salgado-Maldonado, G. 2008. Helminth parasites of freshwater fish from Central America. Zootaxa 1915:29–53.
Strona, G., F. Stefani, and P. Galli. 2009. Monogenoidean parasites of Italian marine fish: an updated checklist. Italian Journal of Zoology 77:419–437.
Strona, G., and K. D. Lafferty. 2012a. How to catch a parasite: Parasite niche modeler (PaNic) meets Fishbase. Ecography 35:481–486.
Strona, G., and K. D. Lafferty. 2012b. FishPEST: an innovative software suite for fish parasitologists. Trends in Parasitology 28:123.
Strona, G., and K. D. Lafferty. 2012c. Predicting what helminth parasites a fish species should have using Parasite Co-Occurrence Modeler (PaCo). Journal of Parasitology (in press). DOI: 10.1645/GE-3147.1.
Vanden Berghe, E. 2007. The Ocean Biogeographic Information System: web pages, accessed 15/02/12. http://www.iobis.org.
Vignon, M., and P. Sasal. 2010. Multiscale determinants of parasite abundance: A quantitative hierarchical approach for coral reef fishes. International Journal for Parasitology 40:443–451.
Williams, J. E. H., and L. Bunkley-Williams. 1996. Parasites of offshore big game fishes of Puerto Rico and the western Atlantic. Puerto Rico Department of Natural and Environmental Resources and the University of Puerto Rico, San Juan, Puerto Rico, 383 p.