Ranking and mapping of universities and research-focused institutions worldwide: The third release of excellencemapping.net

Bornmann, Stefaner, de Moya Anegón, and Mutz (2014a) have introduced a web application (www.excellencemapping.net) which combines established academic ranking lists (e.g. the Academic Ranking of World Universities) with spatial visualization approaches. The application visualizes the performance of institutions worldwide within specific subject areas (journal sets from Scopus) as institutional ranking lists and on custom tile-based maps. The application uses Scopus data, which are also the basis of the SCImago Institutions Ranking. The second, enhanced version of excellencemapping.net is described in Bornmann, Stefaner, de Moya Anegón, and Mutz (2014b). In this version, the effect of covariates (such as the perceived corruption in the country in which a research institution is located) on two performance metrics (best paper rate and best journal rate) is examined and visualized. This paper describes the third version, which is based on new data.


Introduction
In a list of the most prominent rankings, Hazelkorn (2013) names 11 different international rankings. The most important sources of data used for the various rankings are abstract and citation databases of peer-reviewed literature (primarily Scopus, which is provided by Elsevier, and the Web of Science, WOS, from Thomson Reuters). Publication and citation data are used to make statements about the productivity and the citation impact of institutions. Bornmann, et al. (2014a) have introduced a web application (www.excellencemapping.net) which combines established academic ranking lists (e.g. the Academic Ranking of World Universities) with spatial visualization approaches. The application visualizes the performance of scientific institutions worldwide (universities or research-focused institutions) within specific subject areas (Scopus journal sets) as ranking lists and on custom tile-based maps. The second, enhanced version of the application is described in Bornmann, et al. (2014b). In this version, the effect of covariates (such as the perceived corruption in the country in which an institution is located) on two bibliometric metrics (best paper rate and best journal rate) is visualized. A covariate-adjusted ranking and mapping of the institutions was generated in which the single covariates are held constant. This paper describes the third version, which is similar to the second but is based on new data.

This paper has been developed from the paper presented at the 10th International Conference on Webometrics, Informetrics and Scientometrics & 15th COLLNET Meeting, held Sept. 3–5, 2014 at Ilmenau University of Technology, Ilmenau, Germany.

Methods
The study is based on Scopus data, which are also used annually for the SCImago Institutions Ranking (http://www.scimagoir.com/). In order to include reliable data in the application in terms of geo-coordinates (Bornmann, Leydesdorff, Walch-Solimena, & Ettl, 2011) and performance metrics (Waltman, et al., 2012), we selected only those institutions which have published at least 500 papers (articles, reviews and conference papers) between 2007 and 2011 in a certain subject area (Scopus journal set)1. In other words, institutions with fewer than 500 papers in a category are not included in the application. Furthermore, only subject areas offering at least 50 institutions are considered (Arts and Humanities, e.g., is not considered). This threshold ensures a sufficiently large number of institutions for a worldwide comparison. We applied the full counting method (Vinkler, 2010) to assign papers from Scopus to institutions: if an author of a paper is associated with an institution, the paper is assigned to this institution (with a weight of 1). Two indicators are used to measure the performance of the institutions. The first indicator is named the "best paper rate". It is the proportion of an institution's publications which belong to the 10% most frequently cited publications in their subject area and publication year. The second indicator, named the "best journal rate", is the proportion of publications which an institution publishes in the most influential journals worldwide. The most influential journals are those ranked in the first quartile (top 25%) of their subject areas as ordered by the SCImago Journal Rank (SJR) indicator. While the "best paper rate" informs about the long-term citation success of publications, the "best journal rate" reflects the ability of an institution to publish in reputable journals.
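The best paper rate can be sketched in code as follows. This is only an illustration under stated assumptions: the paper does not spell out how the top-10% boundary is computed per subject area and publication year, so the nearest-rank percentile convention (and counting ties at the boundary as "top") used here is a hypothetical choice, and the data layout is invented for the example.

```python
from collections import defaultdict

def top10_thresholds(papers):
    """Citation count marking the top-10% boundary per (subject area, year).

    `papers` is a list of dicts with keys: institution, area, year, citations.
    Nearest-rank 90th percentile is an illustrative convention, not
    necessarily the one used by excellencemapping.net.
    """
    by_cell = defaultdict(list)
    for p in papers:
        by_cell[(p["area"], p["year"])].append(p["citations"])
    thresholds = {}
    for cell, cites in by_cell.items():
        cites.sort()
        k = int(0.9 * len(cites))  # nearest-rank index of the 90th percentile
        thresholds[cell] = cites[min(k, len(cites) - 1)]
    return thresholds

def best_paper_rate(papers, institution):
    """Proportion of an institution's papers at or above the top-10% boundary."""
    thr = top10_thresholds(papers)
    own = [p for p in papers if p["institution"] == institution]
    top = sum(1 for p in own if p["citations"] >= thr[(p["area"], p["year"])])
    return top / len(own) if own else 0.0
```

The best journal rate would be computed analogously, replacing the citation threshold by membership of the publishing journal in the first SJR quartile of its subject area.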
Table 1 shows the number of institutions which are included as datasets for the 17 Scopus subject areas in excellencemapping.net. Out of the total of 27 subject areas in Scopus, only those which have at least 50 institutions worldwide are selected for the web application. For example, 541 institutions within the subject area of chemistry were included in the analyses. The mean best paper rate for these institutions is .12 (12%) and the mean best journal rate is .57 (57%). The citation impact of the publications considered in the web application refers to the time period from publication to the beginning of 2014.

Table 1. Number of institutions included in the analyses for 17 different subject areas. The mean best paper rate/best journal rate is the mean over the institutions within a subject area.

The data were analysed using a generalized linear mixed model for binomial data, which takes the hierarchical structure of the data into account and properly estimates the standard errors (Bornmann, et al., 2014a; Mutz & Daniel, 2007).
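The key practical effect of the mixed model's random intercepts is that institutional rates are pulled toward the grand mean, with small institutions pulled more strongly. The following is not the paper's generalized linear mixed model but a much simpler beta-binomial-style shrinkage estimate that illustrates the same idea; `prior_strength` is a hypothetical tuning constant, not a value from the paper.

```python
def shrunk_rate(successes, n, grand_rate, prior_strength=100.0):
    """Shrink an observed institutional rate toward the grand mean.

    successes: number of top papers, n: total papers,
    grand_rate: mean rate over all institutions in the subject area.
    A GLMM random intercept has a similar pulling-toward-the-mean effect;
    this closed-form estimate is only an illustration of shrinkage.
    """
    return (successes + prior_strength * grand_rate) / (n + prior_strength)
```

Note how an institution with 5 top papers out of 10 (observed rate 0.50) is pulled much closer to a grand mean of 0.12 than one with 50 out of 100, reflecting the lower reliability of small samples.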
The web application was implemented using modern web technologies and OpenStreetMap2 data provided through MapBox3. It is based on the JavaScript frameworks backbone.js4, jQuery5 and d3.js6.

Results
To be able to explain the performance differences among the institutions in the regression model, we included covariates such as the proportion of papers from one institution which were produced in an international collaboration. The covariates are also utilized to create a covariate-adjusted ranking of the institutions. In the application, the user can select a subject area for visualization. Below the subject area selection window, there is another selection window for the covariate (for selecting the number of residents, for example). If a covariate is selected, the probability of (i) publishing in reputable journals (best journal rate) or (ii) publishing most-frequently cited papers (best paper rate) is visualized adjusted (controlled) for the selected covariate. The displayed performance of institutions can then be interpreted as if all institutions had the same value (reference point) for the selected covariate. We z-transformed each covariate over the whole data set (mean M = 0 and standard deviation SD = 1). Thus, the average probability is the value at which the selected covariate equals 0, i.e. its mean. The z-transformation allows the comparison of results from models with and without the covariates.
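The z-transformation described above is straightforward; a minimal sketch (using the population standard deviation, though a sample standard deviation would work equally well):

```python
import statistics

def z_transform(values):
    """Standardize a covariate to mean 0 and standard deviation 1.

    After the transformation, a covariate value of 0 corresponds to the
    mean of the original covariate over the whole data set.
    """
    m = statistics.mean(values)
    s = statistics.pstdev(values)  # population SD
    return [(v - m) / s for v in values]
```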
One can select one of the two performance indicators (best paper rate or best journal rate) below the selection windows for the subject area and the covariates. For each indicator, the application shows the residuals from the regression analysis (random effects) converted to probabilities. To obtain values on the original scale for both performance indicators (i.e. the proportion of papers in the top range or published in the top journals), the intercept was added to the residuals. One can tick "Show statistically significant results only" to reduce the number of mapped institutions in a subject area to those which differ statistically significantly in their performance from the mean value. The map on the left-hand side of the web application shows a node for each institution in the selected subject area (e.g. Physics and Astronomy). One can move the map to different regions worldwide and zoom in (or out). Map details appear only at sufficiently deep zoom levels. The node size for each institution is proportional to the number of papers in the selected subject area. For example, the Centre National de la Recherche Scientifique (CNRS) has the largest node (in Europe) in Physics and Astronomy. As many nodes overlap in larger cities (e.g. Paris), one can select all the nodes in a certain region. These institutions are then listed on the right-hand side of the application under "Your selection". The colour of the nodes indicates the indicator values for the institutions using a colour scale from blue through grey to red: if the institutional indicator value is greater than the mean (expected) value across all institutions in a subject area, its node has a blue tint; nodes with red colours mark institutions with indicator values lower than the mean; grey nodes indicate values close to the expected value.
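The back-transformation of a random effect to a probability follows directly from the logit link of the binomial model: add the model intercept and apply the inverse logit. A minimal sketch:

```python
import math

def residual_to_probability(random_effect, intercept):
    """Convert a random effect on the logit scale back to a probability.

    `intercept` is the model intercept, i.e. the logit of the average
    rate; a random effect of 0 therefore maps back to the average rate.
    """
    return 1.0 / (1.0 + math.exp(-(intercept + random_effect)))
```

For instance, with an intercept equal to the logit of a mean best paper rate of 0.12, an institution with a random effect of 0 gets exactly the average probability of 0.12.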
All institutions which are taken into account in the regression model for a subject area (see the section "Institutional scores") are displayed on the right-hand side of excellencemapping.net. The name, the corresponding country, and the number of papers ("Papers") are shown for each institution. Furthermore, the best journal rate or the best paper rate is listed ("Indicator value"). The wider the confidence interval of the probability, the less reliable the estimate. If the confidence interval for an institution does not overlap with the mean proportion across all institutions (the mean is the short line in the middle of "Indicator value"), the institution has a statistically significantly higher (or lower) best paper or best journal rate than the average across all institutions (α = 0.165). The listed institutions can be sorted (in descending or ascending order in the case of numbers). Thus, the top- or worst-performing institutions in a subject area can be identified by clicking on "Indicator value". Clicking on "Papers" puts the institutions with a high paper output at the top of the list (or at the end). In Biochemistry, Genetics and Molecular Biology, for example, CNRS is the institution with the highest paper output between 2007 and 2011; in terms of the best paper rate, the Broad Institute of MIT and Harvard is the best-performing institution. The column farthest to the right (named "Δ rank") in the section "Institutional Scores" shows by how many rank places an institution goes up (green, arrow pointing upwards) or down (red, arrow pointing downwards) if one selects a certain covariate. For example, the Institute for High Energy Physics (RUS) improves its position by 5 places in the Physics and Astronomy subject area compared to the ranking which does not take the covariate "corruption perception index" into account. The ranking differences in this column always relate to all the institutions included.
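The Δ rank column compares an institution's position in the unadjusted and the covariate-adjusted ranking. A minimal sketch of this comparison (the function and data names are hypothetical; ties are broken by sort order here for simplicity):

```python
def delta_ranks(unadjusted, adjusted):
    """Rank change per institution between two score dicts (higher score = better).

    A positive Δ means the institution moves up when the covariate is
    held constant, matching the green upward arrows in the "Δ rank" column;
    a negative Δ matches the red downward arrows.
    """
    def ranks(scores):
        ordered = sorted(scores, key=scores.get, reverse=True)
        return {inst: i + 1 for i, inst in enumerate(ordered)}
    r_before, r_after = ranks(unadjusted), ranks(adjusted)
    return {inst: r_before[inst] - r_after[inst] for inst in unadjusted}
```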
The ranking differences therefore do not change if one looks only at the statistically significant results.
If a covariate has been selected, one can, for example, sort the institutions by Δ rank. This puts the institutions which benefit most from the covariate at the top of the list. Using the search field on the right, one can find a specific institution in a subject area. In order to identify the institutions of a country, one can click on "Country". The institutions are then sorted by country and, within a country, by the indicator value. The section "Your selection" is intended for the comparison of institutions which are of interest to the user. If the confidence intervals of two institutions do not overlap, they differ statistically significantly at the 5% level (in the best paper or best journal rate, respectively). For example, in Physics and Astronomy, Stanford University and the Helmholtz Association are shown without overlap (publication years 2007 to 2011).
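The link between the α = 0.165 intervals and the 5% pairwise test above can be sketched as follows: non-overlap of two roughly 83.5% confidence intervals approximates a two-sided test at the 5% level. The paper derives its intervals from the mixed model; the normal-approximation intervals for simple proportions below are only an illustration of the overlap logic, and the z value is an approximate quantile.

```python
import math

Z_835 = 1.39  # approx. standard normal quantile for an 83.5% CI (alpha = 0.165)

def proportion_ci(rate, n, z=Z_835):
    """Normal-approximation confidence interval for a proportion."""
    half = z * math.sqrt(rate * (1 - rate) / n)
    return (rate - half, rate + half)

def differ_significantly(rate_a, n_a, rate_b, n_b):
    """Non-overlap of two ~83.5% CIs approximates a pairwise 5% test."""
    lo_a, hi_a = proportion_ci(rate_a, n_a)
    lo_b, hi_b = proportion_ci(rate_b, n_b)
    return hi_a < lo_b or hi_b < lo_a
```

With large paper counts, even a best-paper-rate difference of ten percentage points is clearly significant, while small differences between small institutions are not.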
The institutions in "Your selection" can be sorted by each heading. These institutions are also marked on the map on the left-hand side with a black border. The section "Your selection" thus links institutional lists and institutional maps. For the institutional comparison, one can select institutions in the list or on the map. By clicking on "Clear", one starts a new comparison of institutions.

Discussion
Following Bornmann and Leydesdorff (2011), our underlying statistics (multi-level regression models) are analytically oriented. They enable (1) the estimation of statistically more appropriate values for the performance indicators than the observed values; (2) the calculation of confidence intervals which can be interpreted as measures of the reliability of the institutional performance; (3) the comparison of an institution with an "average" institution in a specific subject area; (4) the direct comparison of two or more institutions; and (5) an adjusted view of institutional research performance by taking covariates into account when mapping and ranking institutions. For example, with our application it is possible to look at the performance of institutions worldwide in countries with the same financial background (that is, the same GDP). This highlights institutions showing a relatively high performance despite a difficult financial situation in their country (Bornmann, 2014).

Conclusions
The third, substantially enhanced version of the web application visualizes institutional performance as ranking lists and on custom tile-based maps. The effect of covariates (such as the perceived corruption in a country) on two performance metrics (best paper rate and best journal rate) is visualized. The web application can be used, e.g., by students, (postdoctoral) researchers and people working in the area of science policy to identify scientifically interesting institutions worldwide.