By Ruth A. Pagell*
(27 Mar 2024) I spent two months away from Ruth’s Rankings collaborating with a friend on a book chapter on open access that ended up being about open everything. During that time, CWTS Leiden started the new year with a new version of its rankings, using the new hot bibliometric platform OpenAlex rather than Web of Science data for the metrics. The original ranking is labelled Traditional and the OpenAlex model is Open. This article introduces open-access tools used to create OpenAlex and compares university ranking results between the Traditional and Open versions. The article ends with a question about the future of bibliometrics and rankings. See Appendix A for information about the tools mentioned in the article. See Appendix B for data tables.
CWTS LEIDEN OpenAlex MODEL
As reported in the May 2017 News Update, CWTS issued 10 principles for the responsible use of university rankings. Principle Four was that the data used for the rankings should be transparent and available to users (Waltman, Wouters, & van Eck). In September 2023, I reported that CWTS was working on transitioning to a new model. On January 30, 2024, it released the OpenAlex model, which involves participation with other players in the open bibliometrics industry (see Appendix A). CWTS is using similar standards for inclusion as with the Traditional rankings. It has applied OpenAlex to Scientific Impact, Collaboration, and Open Access.
In launching the new ranking based on open and reproducible data, CWTS asked whether these data could provide “a transparency and rigor absent from traditional ranking systems.”
Posts in the CWTS blog, Leiden Madtrics, explain the rationale, process, and players involved in creating the open version of the rankings. Much of the curating to map from OpenAlex to CWTS focused on identifying the research organizations and their affiliates. Additional work is still needed (Van Eck, Visser & Waltman).
CWTS has been more proactive than other ranking organizations in making its curated data available for downloading, and the new data are available as well (Van Eck). An advantage of the Open version is that it allows entities to check the listing of their affiliations. OpenAlex uses PubMed and Crossref for author names, but name authentication is a recognized weakness.
Those of us familiar with university bibliometric rankings know that the bibliometrics are derived from the subscription services Clarivate’s Web of Science or Elsevier’s Scopus, which use curated lists of journals for inclusion. For their research, scholars were also using Microsoft Academic Graph (MAG), a free source of bibliometric data and manipulation tools. It was discontinued at the end of 2021 (Chawla, 2021).
OurResearch launched OpenAlex on January 1, 2022, as one tool to fill the void. The initial dataset that CWTS used for its Open version includes 6.8 million articles or reviews based on the Traditional criteria and it used the existing 2023 list of 1411 universities.
COMPARISONS using Traditional and Open Rankings
Those readers interested in the place of bibliometrics in open science should follow the CWTS blog. Articles have been written looking at the makeup of the OpenAlex dataset. An example is a comparison of OpenAlex with Web of Science and Scopus (Culbert). Since this is Ruth’s Rankings, I of course looked at university rankings, spending too much time playing with comparisons of the Traditional and Open rankings.
All CWTS Leiden tables have three columns: the total number of articles overall in the category, the total number of articles on the topic, and the proportion of the articles on the topic. There is no composite score.
Scientific Impact Indicator
The basic indicator is Scientific Impact. The three columns include the number of articles in a four-year time period, the number of articles in the top percent (I use 10%) of their field, and the proportion of articles in the top 10%. See Appendix B for the full table. Table 1 compares the top universities in the two versions. Number one in each category did not change, although the rank changed.
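As a minimal sketch of how the three columns relate, the proportion is simply the second column divided by the first. The class name, field names, and numbers below are invented for illustration and are not taken from the Leiden Ranking data.

```python
from dataclasses import dataclass


@dataclass
class ImpactRow:
    """One row of a Scientific Impact style table (hypothetical names)."""
    university: str
    total_articles: int   # articles in the four-year period
    top10_articles: int   # articles in the top 10% of their field

    @property
    def top10_proportion(self) -> float:
        # Share of the university's articles that fall in the top 10%.
        if self.total_articles == 0:
            return 0.0
        return self.top10_articles / self.total_articles


# Invented numbers, for illustration only.
row = ImpactRow("Example University", total_articles=8000, top10_articles=960)
print(f"{row.university}: {row.top10_proportion:.1%}")
```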
Open Access Indicators
Listed below are definitions for the categories used in the rankings and a comment on the top 20 institutions. Gold, Hybrid, Bronze, and Green are the standard open access categories used by other bibliometric aggregators.
OVERALL – The proportion of all publications that are open access
- Traditional: 18 of the top 20 are from the UK, 14 of the bottom 20 are from India, and four are from Iran
- Open: 17 of the top 20 are from the UK
GOLD – Publications in open-access journals
- Traditional: 14 are in medical or life science journals, eight of which are in Poland.
- Open: 13 are in medical or life science journals, seven are from East Asia and six are from Poland.
HYBRID – Publications in a subscription journal that are open with a license for the article to be reused
- Both: All are in Europe, with the Netherlands having more than half in both versions
BRONZE – Publications in a subscription journal that are open without a license for the article to be reused.
- Traditional: 12 from the US and 7 from France.
- Open: 19 of the 20 are from the US
GREEN – Publications in a subscription journal that are open in a repository rather than in the journal itself
- Traditional: 18 are from the UK.
- Open: 16 out of 20 are from the UK
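The category definitions above can be read as a simple decision rule. The sketch below is one hedged reading of them, not the method used by CWTS or OpenAlex: the parameter names, the "closed" label for publications that are not open, and the order in which the checks are applied are all assumptions made for illustration.

```python
def oa_category(in_oa_journal: bool, open_at_journal: bool,
                has_reuse_license: bool, in_repository: bool) -> str:
    """Map one publication to a category, following the definitions above."""
    if in_oa_journal:
        return "gold"        # published in an open-access journal
    if open_at_journal:
        # Subscription journal, but this article is open on the journal site.
        return "hybrid" if has_reuse_license else "bronze"
    if in_repository:
        return "green"       # open only through a repository copy
    return "closed"          # assumed label for everything else


# Invented sample of four publications, for illustration only.
sample = [
    oa_category(True, True, True, False),
    oa_category(False, True, False, False),
    oa_category(False, False, False, True),
    oa_category(False, False, False, False),
]

# OVERALL: proportion of all publications that are open access.
overall = sum(c != "closed" for c in sample) / len(sample)
print(sample, overall)
```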
TABLE 2 shows the ranges in percentages of articles that are open for each category. The ranges vary based on the type of open access.
Since I always start with who is on top, I decided to look at the performance of a sample of universities ranked in the middle. The main takeaway is not the difference between the two rankings but the differences among universities’ performance across different shades of open. For example, Korea’s Sungkyunkwan University ranks 832 overall in the Open version and 852 in the Traditional version, with just a two percent difference in the overall score. It places 216 in Open Gold. See Table 2 in Appendix B.
For my final comparison, I used the Scientific Impact rankings. I examined the differences in rankings between CWTS and other world rankings, using total articles and proportions of a university’s publications in the top 10 percent of articles in their field. Since CWTS does not have a composite score and its default ranking is the number of articles, the top universities by proportion ranked more closely to the other world rankings. See Table 3.
CONCLUSION: In doing research for the book chapter, I felt as if quality and authentication had been sacrificed in the rush to open everything. That is not the case in this application. CWTS used the same criteria as its Traditional model, offering a way to make the underlying data more accessible to the institutions that are included. They can manipulate the data in ways that are important to them.
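To see why the choice of column matters, here is a small sketch that orders the same universities by number of articles and then by proportion in the top 10 percent. The names and numbers are invented for illustration and do not come from any ranking.

```python
# (name, total articles, articles in the top 10%): invented values
universities = [
    ("Large U", 20000, 2000),
    ("Mid U", 9000, 1350),
    ("Small U", 3000, 600),
]

# Default-style ordering: by number of articles.
by_size = sorted(universities, key=lambda u: u[1], reverse=True)

# Alternative ordering: by proportion of articles in the top 10%.
by_proportion = sorted(universities, key=lambda u: u[2] / u[1], reverse=True)

print([u[0] for u in by_size])
print([u[0] for u in by_proportion])
```

In this invented sample the two orderings are exact opposites, which is the extreme version of the effect described above.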
Both OpenAlex and CWTS Leiden recognize the need for more curation. Work is needed on author authentication. As we can see in Appendix A, many organizations have to work together. To use tools such as OpenAlex for measuring quality, a new infrastructure, with many players with unfamiliar names, is needed. Expanding access to the scholarly and higher education communities is important. But many questions remain as other rankers are looking at other options:
What does this mean for Clarivate and Scopus and for the traditional university rankings organizations? New initiatives are already in place. In December, the French Ministry of Higher Education and Research (MESR) agreed to a multi-year partnership with OpenAlex (French). In January 2024, Webometrics switched to ROR for its list of institutions, eliminating all the institutions not registered with ROR. OpenAlex sees the possibility of working with other rankings: “We think rankings should be built in a transparent and reproducible way, for both practical and ethical reasons” (Priem). CWTS sees the possibility of expanding its dataset from OpenAlex in the future (Waltman email).
Q: Will the tools drive the changes rather than the content?
Q: Where does AI fit into this picture?
Q: Many users of rankings never go beyond who is number one. Will this lead to more misinformation or misunderstood information?
Q: How do we open access to institutions that cannot afford to produce the content that allows them into the rankings and maintain a level of quality? “Questions of quality often intersect with issues of underrepresented places and communities, leading to biases that we are trying to push against. When a scholarly knowledge graph excludes research based on quality, it very often leads to perpetuating these biases, which is at odds with our mission of inclusivity.” (Portnoy)
RESOURCES:
For a teaching guide about OpenAlex and the transition from MAG, see HKUST’s “Research Bridge” (Gu).
Relevant Ruth’s Rankings’ posts
News Flash (26 May 2017), https://librarylearningspace.com/ruths-rankings-news-flash-cwts-leiden-new-ranking-new-ranking-principles/
News Flash (26 June 2018), https://librarylearningspace.com/ruths-rankings-news-flash-cwts-leiden-new-ranking-new-ranking-principles/
News Flash (Mar 2019), CWTS Leiden 2019 adds new metrics for open access and gender, https://librarylearningspace.com/ruths-rankings-news-flash-2019-03-cwts-leiden-2019-adds-new-metrics-for-open-access-and-gender//
RR 56 Part 2 (27 Sept 2023) https://librarylearningspace.com/rr-56-part-2-new-metrics-for-old-ranking-covering-qs-nature-and-cwts-leiden/
Aguillo, I. F. Webometrics January 2024 update, email
Author disambiguation (Nov 2023). OpenAlex’s technical documentation is included. https://docs.OpenAlex.org/api-entities/authors/author-disambiguation
Chawla, D.S. (15 June 2021). Microsoft Academic Graph is being discontinued. What’s next? Nature Index News https://www.nature.com/nature-index/news/microsoft-academic-graph-discontinued-whats-next
Culbert, J.H. (29 Jan 2024). Reference Coverage Analysis of OpenAlex compared to Web of Science and Scopus, https://arxiv.org/abs/2401.16359
French Ministry of Higher Education and Research partners with OpenAlex to develop fully open bibliographic tool (15 Feb 2024). Blog. https://www.ouvrirlascience.fr/french-ministry-of-higher-education-and-research-partners-with-OpenAlex-to-develop-a-fully-open-bibliographic-tool/
Gu, J. (22 Feb 2022). Open Alex: Open database of Papers, Authors, Institutions, and more. Research Bridge, HKUST https://library.hkust.edu.hk/sc/OpenAlex/
Priem, J., Piwowar, H., & Orr, R. (17 June 2022). OpenAlex https://arxiv.org/abs/2205.01833; full version https://arxiv.org/ftp/arxiv/papers/2205/2205.01833.pdf
Portnoy, J. (2 Feb 2024). OpenAlex support, email
Van Eck, N.J. (30 Jan 2024). CWTS Leiden Open Edition 2023 - Data, https://zenodo.org/records/10579113
Van Eck, N.J., Visser, M. & Waltman, L. (30 Jan 2024). Opening up the CWTS Leiden ranking: Toward a decentralized and open model for data curation. Leiden Madtrics. https://www.leidenmadtrics.nl/articles/opening-up-the-cwts-leiden-ranking-toward-a-decentralized-and-open-model-for-data-curation
Waltman, L., Wouters, P., & van Eck, N.J. (17 May 2017). Ten principles for the responsible use of university rankings. Blog archive. https://www.cwts.nl/blog?article=n-r2q274
Waltman, et al. (30 Jan 2024). Introducing the Leiden Ranking Open Edition. https://www.leidenmadtrics.nl/articles/introducing-the-leiden-ranking-open-edition
Waltman (3 Mar 2024) email
Thanks to Ludo Waltman (CWTS Leiden), Jason Priem (OurResearch), Jason Portnoy (OpenAlex), and Isidro F. Aguillo for sharing information about open access relative to their organizations. One other ranker who did not want to be named said that they were looking at open options.
Ruth’s Rankings
A list of Ruth’s Rankings and News Updates is here.
*Ruth A. Pagell is emeritus faculty librarian at Emory University. After working at Emory, she was the founding librarian of the Li Ka Shing Library at Singapore Management University and then adjunct faculty teaching in the Library and Information Science Program at the University of Hawaii. She has written and spoken extensively on various aspects of librarianship, including contributing articles to ACCESS. ORCID: https://orcid.org/0000-0003-3238-9674