Normalization of Mendeley reader counts for impact assessment

A different number of citations can be expected for publications in different subject categories and publication years. For this reason, the citation-based normalized indicator Mean Normalized Citation Score (MNCS) is used in bibliometrics. Mendeley is one of the most important sources of altmetrics data; Mendeley reader counts reflect the impact of publications in terms of readership. Since a significant influence of publication year and discipline has also been observed for Mendeley reader counts, reader impact should not be estimated without normalization. In this study, all articles and reviews of the Web of Science core collection with a publication year of 2012 (and a DOI) are used to normalize their Mendeley reader counts. A new indicator that determines the normalized reader impact, the Mean Normalized Reader Score (MNRS), is obtained and compared with the MNCS. The MNRS enables us to compare the impact a paper has had on Mendeley across subject categories and publication years. Comparisons at the journal and university levels show that the MNRS and MNCS correlate more strongly across 9,601 journals than across 76 German universities.


Introduction
Estimating the citation impact of scientists, research groups, and institutes in different disciplines and time periods faces the problem that discipline and time period influence the citation impact of publications independently of their quality. Normalization for both factors started in the mid-1980s (Schubert & Braun, 1986). Only with normalized values did it become possible to assess the citation impact of entities such as researchers or universities across disciplines and time periods. In the calculation of a normalized impact value for a publication, the total number of citations of the publication is counted (times cited).
The number of times cited is compared with the citation impact of publications with the same publication year, subject category, and document type (expected impact of the reference set).
Although other methods have been developed in recent years (e.g., normalization on the side of the citing publications; Zitt & Small, 2008), this method is the most established and most widely used in bibliometrics.
In recent years, impact evaluation in scientometric research has been carried out not only on the basis of citations but also on the basis of alternative metrics (altmetrics) (Borrego, 2014; Mohammadi & Thelwall, 2014; Torres-Salinas, Cabezas-Clavijo, & Jiménez-Contreras, 2013; Priem, 2013; Priem, 2014). Altmetrics open up the possibility of assessing the impact of research faster than with citations. Moreover, altmetrics seem suitable for determining the impact of research in a broader manner than citations do (Aguinis, Shapiro, Antonacopoulou, & Cummings, 2014; Bar-Ilan et al., 2012; Bornmann, 2014; Dinsmore, Allen, & Dolby, 2014; Hammarfelt, 2014; Priem, Taraborelli, Groth, & Neylon, 2010). While citations quantify only the impact of research on science, altmetrics could be able to quantify the impact of research on all aspects of society, including science. Current scientometric research investigates whether this hope is more than a working hypothesis.
Data from Mendeley are among the most important sources for altmetrics: "Mendeley is both a citation management tool and social network for scholars with over two million users" (Rodgers & Barbrow, 2013, p. 12). One basic assumption behind the use of such data in an evaluative context is that a Mendeley user who adds a publication to his/her library can be counted as a reader of the publication. Indeed, the results of Mohammadi, Thelwall, and Kousha (in press) show that "82% of the Mendeley users had read or intended to read at least half of the bookmarked publications in their personal libraries." Therefore, Mendeley counts are seen as a very promising means of quantifying the size of the readership of a paper inside as well as outside of science. Furthermore, a Mendeley reader can be seen as a precursor to a citer, as Mendeley users often include a publication in their library when they intend to cite it in a forthcoming manuscript. However, each Mendeley user is counted as one reader, although he or she may cite the publication multiple times or not at all.
Several studies have shown that the Mendeley reader impact, like the citation impact (although there are differences between the two), varies across scientific disciplines (Jeng, He, & Jiang, 2015; Thelwall & Maflahi, 2015; Zahedi, Costas, & Wouters, 2014; Zahedi & van Eck, 2014). In some disciplines, papers are read more often on average (or are more frequently included in users' Mendeley libraries) than in others. These variations are not specific to Mendeley data but also occur in other altmetrics sources, e.g., Twitter counts (Haustein, Costas, & Larivière, 2015). Moreover, publications with different document types and publication years receive different numbers of Mendeley readers (Haustein & Larivière, 2014). Therefore, in almost the same manner as for citation counts, Mendeley reader counts should be normalized with respect to publication year, document type, and scientific discipline before an interpretation is attempted. The aim of this study is to apply the most established normalization method in bibliometrics to the field of altmetrics and to propose a normalization scheme for Mendeley reader counts (very recently, a similar approach, which focuses on country comparisons only, has been suggested by Fairclough & Thelwall, 2015).

Methods: Description of the Data Set
It is common practice in scientometrics to evaluate the impact of articles and reviews.
Other document types are usually not included in evaluative bibliometrics (Moed, 2005). In total, 9,352,424 reader counts were found for the articles and 1,335,764 reader counts for the reviews. For 118,167 articles (10.4%) and 4,348 reviews (6.7%), the paper was found in Mendeley but without any readers. Papers indexed by Mendeley without any readers may originate from a former reader who removed the paper from his/her library or closed his/her Mendeley account. If a Mendeley user provides too little bibliographic data for a paper in his/her library, he/she is not counted as a reader either, because there is insufficient information to link him/her to a Mendeley database entry. Also, Mendeley adds papers without any readers to its database in the first place from publisher feeds. Therefore, such papers could either be excluded from this study or, if they are included, the papers not found in Mendeley should also be counted as papers with zero readers. We tested both approaches and found no significant differences with regard to the scope of this study. In the end, we decided to include both the papers with zero readers and the papers we did not find via the Mendeley API, counting the latter as having zero readers. This is consistent with the way uncited papers are handled in bibliometric databases. The requests to the Mendeley API were made from December 11 to 23, 2014. All data in this study are based on a partial copy of our in-house database (last updated on November 23, 2014), supplemented with the Mendeley reader counts.
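As a minimal sketch of this decision (our own illustrative code, not the authors' actual pipeline; the function name and data structures are assumptions), papers that were not found via the Mendeley API are simply assigned a reader count of zero, analogous to uncited papers in citation databases:

```python
def reader_counts_with_zeros(wos_dois, mendeley_counts):
    """Return a Mendeley reader count for every WoS DOI.

    wos_dois: DOIs of the WoS publication set (assumed input).
    mendeley_counts: dict mapping DOI -> reader count for papers
        that were found via the Mendeley API.
    Papers not found in Mendeley are treated as having zero readers.
    """
    return {doi: mendeley_counts.get(doi, 0) for doi in wos_dois}
```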

Differences in reader impact between subject categories
Like the citation distribution (Albarrán, Crespo, Ortuño, & Ruiz-Castillo, 2011; Rodríguez-Navarro, 2011; Seglen, 1992), the reader distribution is skewed across subject categories, as shown in Figure 1 for articles and Figure 2 for reviews. The reader distribution across the categories ranges from 0.22 readers per paper in "Poetry" (not shown in Fig. 1; see Table A.1 in the Appendix) to 27.94 readers per paper in "Evolutionary Biology" (WoS category "ht" in Fig. 1). The 30 most highly populated WoS categories (12% of the 251 categories) comprise 50% of the readership of the articles studied here.
For reviews, the highest number of readers per paper (85.22) is found in the WoS category "Psychology, Experimental" (WoS category "vx" in Fig. 2), while the lowest (non-zero) number (0.27) is in "Literature" (not shown in Fig. 2; see Table A.1 in the Appendix). Fifty percent of the readers are found in the 18 (7.5% of the 239 categories) most highly populated WoS categories.
Usually, reviews are cited more often than articles. This pattern seems to hold for Mendeley reader counts as well: Figures 1 and 2 as well as Table A.1 in the Appendix show that reviews are also read more often than articles, on average. The overall average number of readers per paper is 8.25 for articles and 20.56 for reviews. This shows the need for a normalization of Mendeley reader counts with respect to subject categories and document types. Thus, the normalization procedure is carried out separately for articles and reviews.

Normalization of reader impact
For normalization purposes in bibliometrics, the citation impact of a focal paper is compared with the expected citation impact. The expected value is the average citation impact of papers with the same discipline, publication year, and document type as the paper in question. Sometimes, publications of different document types are considered together in the calculation of the expected value (as in the Leiden Ranking) and sometimes separately (as in InCites from Thomson Reuters, SciVal from Elsevier, and the Institutions Ranking published by the SCImago group). The paper set determining the expected value is referred to as the reference set. The ratio of observed to expected citations is the Normalized Citation Score (NCS). Currently, the NCS is the established standard in bibliometrics for normalizing citation impact. An NCS of 1 for a publication indicates an average citation impact. An NCS of 1.5 can be interpreted as a citation impact that is 50% higher than the average (Waltman, van Eck, van Leeuwen, Visser, & van Raan, 2011a, 2011b).
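Expressed as a formula (in our own notation, paraphrasing the cited definitions rather than reproducing them verbatim):

$$\mathrm{NCS}_i = \frac{c_i}{e_{f,y,d}}$$

where $c_i$ is the number of citations of paper $i$ and $e_{f,y,d}$ is the mean number of citations of the publications in the reference set, i.e., those with the same field $f$, publication year $y$, and document type $d$.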
If a paper has been assigned (e.g., by a database provider such as Thomson Reuters) to more than one subject category, the average of all its NCS values is used (resulting in a mean NCS). The assignment of papers to multiple subject categories leads to an average value over all papers of a publication year that differs from 1. To alleviate this disadvantage, one can employ fractional counting (Waltman & van Eck, 2014), multiplicative counting (Herranz & Ruiz-Castillo, 2012), or full counting with a scaling of all NCS values (Haunschild & Bornmann, submitted). We decided to use the multiplicative counting method, although other counting methods could be used as well.
We do not expect our study to lead to different conclusions if other reasonable counting methods were used. Following the definition of the MNCS, we propose a Mean Normalized Reader Score (MNRS) that uses the multiplicative counting method for papers assigned to multiple subject categories.
Our normalization procedure for Mendeley reader impact starts with the calculation of the average number of Mendeley readers per paper ($\rho_c$) in each WoS category (cf. Fig. 1 and Table A.1 in the Appendix):

$$\rho_c = \frac{1}{N_c} \sum_{i=1}^{N_c} R_{ic}$$

Here, $R_{ic}$ denotes the raw Mendeley reader count of paper $i$, which has been assigned to WoS category $c$, and $N_c$ is the number of papers assigned to WoS category $c$. Afterwards, the raw Mendeley reader count is divided by the average number of Mendeley readers per paper in WoS category $c$ ($\rho_c$), yielding the normalized reader score (NRS) for paper $i$ in subject category $c$:

$$\mathrm{NRS}_{ic} = \frac{R_{ic}}{\rho_c}$$
The average value over all NRS values equals exactly one (due to the multiplicative counting method).
Since we include only papers that were published in 2012, the publication year does not have to be included in the normalization procedure. The overall reader impact of a specific aggregation level (e.g., researcher, institute, or country) can be analyzed on the basis of the mean value over the paper set. This results in the mean NRS (MNRS) for the paper set.
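A minimal Python sketch of this procedure could look as follows. This is our own illustrative reimplementation, not the authors' code; the `papers` data structure (a mapping from paper ID to reader count and assigned WoS categories) is an assumption:

```python
from collections import defaultdict

def category_means(papers):
    """Average Mendeley readers per paper (rho_c) for each WoS category.

    papers: dict mapping paper ID -> (reader_count, list of WoS categories).
    Under multiplicative counting, a paper assigned to several categories
    contributes to the mean of each of them. Run separately for articles
    and reviews, as the normalization treats document types separately.
    """
    totals, counts = defaultdict(float), defaultdict(int)
    for reader_count, categories in papers.values():
        for c in categories:
            totals[c] += reader_count
            counts[c] += 1
    return {c: totals[c] / counts[c] for c in totals}

def mnrs(paper_ids, papers, rho):
    """MNRS of a paper set: mean NRS over all (paper, category) pairs."""
    scores = []
    for pid in paper_ids:
        reader_count, categories = papers[pid]
        for c in categories:
            scores.append(reader_count / rho[c])  # NRS_ic = R_ic / rho_c
    return sum(scores) / len(scores)
```

Computing `category_means` over the complete 2012 publication set and then `mnrs` over, e.g., the papers of a university yields the indicator proposed here; by construction, the MNRS over the complete publication set equals one.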
As an illustrative example, we show a step-by-step calculation of the NRS for the article with DOI 10.1061/(asce)co.1943-7862.0000464. We recorded a reader count of 8 for this article.
The article is classified in the WoS subject categories "Construction & Building Technology" (fa), "Engineering, Civil" (im), and "Engineering, Industrial" (ij). The average reader counts for these WoS subject categories are 9.30, 7.51, and 11.69, respectively. Therefore, we obtain NRS_fa = 0.86, NRS_im = 1.07, and NRS_ij = 0.68. This paper has a reader impact slightly above average in the category "im" but below average in the categories "fa" and "ij." Using the multiplicative counting method, papers assigned to multiple categories do not have a single impact value. For example, if this paper belongs to the publication set of a country and the MNRS is calculated for this set, the paper is considered not once but three times (with a potentially different impact value in each subject category) in the calculation of the average NRS value. Table 1 shows the minimum NRS values a paper must reach in order to belong to the top 1% and top 10% of papers. The differences between the NRS thresholds for reviews and articles are smaller for the top 10% than for the top 1%.
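Returning to the illustrative example above, the three NRS values can be reproduced in a few lines (a hypothetical check; the category means are those quoted in the text):

```python
# Worked example: reader count of 8, three WoS subject categories.
rho = {"fa": 9.30, "im": 7.51, "ij": 11.69}
nrs = {c: round(8 / mean, 2) for c, mean in rho.items()}
# nrs -> {'fa': 0.86, 'im': 1.07, 'ij': 0.68}
```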

Normalized reader impact of journals
Papers from 9,563 of the 12,334 WoS journals of 2012 are covered among the papers found in Mendeley. Table 2 summarizes the MNRS and MNCS values of these journals (the MNCS is also based on the multiplicative counting method and the same data set as the MNRS). The MNRS correlates strongly with the MNCS, as indicated by a Spearman rank correlation coefficient of 0.70, which constitutes a large correlation (for the interpretation of correlation coefficients, see Cohen, 1988). Detailed journal-level results are presented in Table 3.
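The rank correlation can be computed, for instance, with SciPy (a generic sketch; the per-journal MNRS and MNCS vectors are assumed inputs, in the same journal order):

```python
from scipy.stats import spearmanr

def rank_correlation(mnrs_values, mncs_values):
    """Spearman rank correlation between journal MNRS and MNCS values."""
    rho, p_value = spearmanr(mnrs_values, mncs_values)
    return rho, p_value
```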

Normalized reader impact of universities
Obviously, the papers published by the universities have been read more frequently than cited. As with other normalized indicators, an MNRS between 0.8 and 1.2 should be regarded as an average impact, while values above 1.2 and below 0.8 should be regarded as above-average and below-average impact, respectively.
In addition to normalization methods that are based on average reader scores (MNRS), one can also use percentile-based approaches. Percentile-based approaches produce more robust normalized scores because they are not based on average values (of citations or readers) (Wilsdon et al., 2015). The percentile impact is the proportion of papers in a reference set that are cited or read equally often or less often than the paper in question (Bornmann, Leydesdorff, & Mutz, 2013). We took a first step in the direction of calculating percentiles by determining the MNRS values that a paper needs in order to be among the top 1% and top 10% most-read papers.
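A percentile impact in this sense could be computed as follows (an illustrative sketch of the definition cited above, not code from the cited studies):

```python
def percentile_impact(reader_count, reference_counts):
    """Percentage of papers in the reference set that are read
    equally often or less often than the paper in question."""
    at_or_below = sum(1 for r in reference_counts if r <= reader_count)
    return 100.0 * at_or_below / len(reference_counts)
```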
What are the limitations of our study? Our strategy of retrieving Mendeley reader counts via the API using the DOI can be seen as a limitation. However, as we found the vast majority of papers (94.8% of articles and 96.6% of reviews) with this method, we expect no major influence of this retrieval strategy on our results. Another limitation of our study is the exclusion of articles and reviews without a DOI. This reduces the number of publications from 1,390,504 to 1,198,184; therefore, 86.2% of the articles and reviews of the WoS core collection of 2012 are included in this study. Under the assumption that publications with and without a DOI are distributed similarly across high- and low-impact publications, this will not alter our results significantly. This study is not intended to provide reader counts as reference values for later use. The main aim is to explore an established method from bibliometrics in the realm of altmetrics and to propose a method of normalizing Mendeley reader counts in order to judge the reader impact of individual publications as well as of aggregates of publications.
The normalization procedure proposed in this study can in principle also be applied to other altmetrics data, such as tweets and blog posts, as it relies on an external classification system assigned to individual papers (or journals where the paper was published). However, normalization with respect to other altmetrics data requires that a large proportion of the publication set is covered by the specific type of altmetrics source. This is the case for Mendeley reader counts, but this might not be the case for other sources of altmetrics data.
The interpretation of Mendeley reader counts seems to be more problematic than the interpretation of citation counts. Many scientists do not read papers in the Mendeley application or on the web interface. Often, scientists add a paper to their Mendeley library when they intend to read it. Although there are reasons other than reading it later to include a paper in one's Mendeley library, it has been proposed to interpret Mendeley reader counts as the number of citers-to-be (see above).

Conclusions
In this study, we have proposed a method for the normalization of Mendeley reader counts that is based on an established normalization method for citation counts. The method appears to be able to normalize reader counts: a rather high correlation was found between the MNRS and the MNCS values.
Since the MNRS has been derived from a well-known and accepted variant used in bibliometrics, the new indicator seems to be conceptually well justified.