Ruth’s Rankings 3. Bibliometrics: What We Count and How We Count

By Ruth A. Pagell1

(4 September 2014) Looking back at the Asian university rankings in Table 1 in Ruth’s Rankings 1, each top 10 list is somewhat different.  All of these rankings include some quantitative research metrics, ranging from a high of 100% for the ranking to a low of less than 20%.  In order to understand the rankings it is important to know what is measured and how it is measured.

Using Bibliometric Methodology

Eugene Garfield introduced Science Citation Index in 1955. By 1963, he recognized the importance of citations in the evaluation of publications. Pritchard (1969) coined the term “bibliometrics” to mean the application of quantitative analysis and statistics to scholarly outputs, such as journal articles, citation counts, and journal impact. September 1978 marked the debut of the journal Scientometrics, which publishes the highest number of articles on the topic. Hood and Wilson (2001) examined the similarities and differences among bibliometrics, scientometrics, infometrics and informetrics. Electronic access to publications from ISI, now Web of Science, and the appearance of Elsevier’s Scopus enabled the worldwide comparison of institutions, journals and authors.

Counting publications looks straightforward:

How many articles are attributed to an institution?

How many times have other articles cited those articles?

Which of these articles are in the top 1% of their fields?

Which citations are in high-impact journals?

This article focuses on the highlighted metrics and the variety of ways the global rankings count them at an institutional level.  Publications and citations are output measures of quantity.  Database providers and researchers continue to add new metrics that quantitatively measure quality.  The next article, Ruth’s Rankings 4, will look at the companies supplying the underlying data, Thomson Reuters and Elsevier/Scopus.  After that, we will examine the different rankings and their individual methodologies.

Publications

Publications are the underlying metric, even in those rankings that do not use them directly in their methodology.

Variables for Publications

To understand the differences among the rankings, we need to ask the following questions:

What is the definition of a qualifying publication? According to my count, I have written over 100 publications from 1997 through 2014, including a couple of books, scholarly articles, practical articles, columns, news and proceedings, plus material in a university repository. Ranking exercises cover only articles in scholarly journals, using selected time periods that range from a rolling 11 years to the current year.

Is the number of publications size-dependent or size-independent? In the United States, the University of Michigan has over 4,000 faculty and the Massachusetts Institute of Technology has fewer than 1,000. Not surprisingly, over the past five years, Michigan published more scholarly articles than MIT. The total is dependent on the size of the faculty. Calculating articles per faculty member, a size-independent measure, MIT has over three times as many articles per faculty member.
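The difference between the two kinds of measure can be sketched in a few lines of Python. The institution names, faculty counts and article counts below are hypothetical placeholders, chosen only to mirror the pattern described above (a larger institution with more total articles, a smaller one with more articles per faculty member). They are not the actual figures for Michigan or MIT.

```python
from dataclasses import dataclass


@dataclass
class Institution:
    name: str
    faculty: int   # hypothetical head count
    articles: int  # hypothetical five-year article count

    def articles_per_faculty(self) -> float:
        """Size-independent measure: output divided by faculty size."""
        return self.articles / self.faculty


# Illustrative numbers only, not real data.
large = Institution("Large University", faculty=4000, articles=30000)
small = Institution("Small Institute", faculty=1000, articles=24000)

# Size-dependent ordering: total articles.
by_total = sorted([large, small], key=lambda i: i.articles, reverse=True)

# Size-independent ordering: articles per faculty member.
by_ratio = sorted([large, small], key=lambda i: i.articles_per_faculty(), reverse=True)

print([i.name for i in by_total])
print([i.name for i in by_ratio])
```

With these made-up inputs the two orderings disagree, which is the point: the same pair of institutions can swap places depending on which measure a ranker chooses.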

How are multiple authors from multiple institutions handled?  The globalization of higher education in the past twenty years has changed the trends in academic publishing. Early versions of the citation indexes only presented the first author.  Today multiple authors, not only from multiple institutions but also from multiple countries, write one article and list authors’ names alphabetically.   An article on body-mass index for Asian populations published in The Lancet ten years ago lists 28 authors from 4 western countries and 13 Asia-Pacific countries from over 20 institutions.
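Two common ways of crediting a multi-institution article are whole counting (each institution gets full credit) and fractional counting (one credit is split among the institutions). The sketch below assumes an equal split among distinct institutions. Individual rankers may weight authors or institutions differently, and the function name and sample data are hypothetical.

```python
from collections import defaultdict


def count_articles(articles, method="whole"):
    """Credit articles to institutions.

    articles: list of lists; each inner list holds the institution names
              attached to one article's authors.
    method:   "whole" gives every distinct institution a credit of 1;
              "fractional" splits a single credit equally among them.
    """
    credit = defaultdict(float)
    for affiliations in articles:
        institutions = set(affiliations)  # one credit per institution, not per author
        if not institutions:
            continue  # skip records with no affiliation data
        share = 1.0 if method == "whole" else 1.0 / len(institutions)
        for inst in institutions:
            credit[inst] += share
    return dict(credit)


# Hypothetical data: one four-institution article and one single-institution article.
articles = [
    ["Univ A", "Univ B", "Univ C", "Univ D"],
    ["Univ A"],
]

whole = count_articles(articles, "whole")
fractional = count_articles(articles, "fractional")

print(whole)
print(fractional)
```

Under whole counting the credits sum to more than the number of articles; under fractional counting they sum to exactly the number of articles.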

The base for the rankings is faculty at the institutional level.  This raises two other questions.  In library jargon, is authority control applied to institutional names?  How is faculty defined?

Citations

Even using the most generous of citation sources, Google Scholar, over half my publications are uncited, some of the citations are self-citations, my most highly cited work is a book, which does not count, and others are in journals that are not considered scholarly or “high impact”.

Variables for Citations

The body of works considered for citations is dependent on the definitions above of qualifying articles and distribution among multiple institutions. The most important variable in citation counting is the subject area, because citation volumes differ greatly from field to field. For example, out of 232 categories, Biochemistry and Molecular Biology has over three million citations; category 229, Ethnic Studies, has eight thousand. Information & Library Science comes in at 179 with over 71,000 citations (from WOS 2013 JCR, accessed September 1, 2014). A criticism of the earliest global rankings was that they did not factor in these differences.

To account for these differences rankers now apply normalization algorithms.   Another way that rankers handle these differences is to include separate ranking by broad subject areas.
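One simple form of normalization divides each paper's citation count by the average for its field and then averages those ratios. The sketch below assumes that approach. It is not any particular ranker's algorithm; real methods usually also adjust for publication year and document type. The field baselines and paper records are hypothetical.

```python
def normalized_citation_score(papers, field_baseline):
    """Mean of (actual citations / expected citations for the paper's field).

    A score of 1.0 means the papers are cited at the average rate
    for their fields; above 1.0 means above average.
    """
    if not papers:
        raise ValueError("at least one paper is required")
    ratios = [p["citations"] / field_baseline[p["field"]] for p in papers]
    return sum(ratios) / len(ratios)


# Hypothetical average citations per paper in each field.
field_baseline = {"Biochemistry": 30.0, "Library Science": 5.0}

# Hypothetical papers from one institution.
papers = [
    {"field": "Biochemistry", "citations": 30},     # exactly at field average
    {"field": "Library Science", "citations": 10},  # twice the field average
]

score = normalized_citation_score(papers, field_baseline)
print(score)
```

In raw counts the biochemistry paper looks three times stronger than the library science paper; after normalization the library science paper is the stronger performer relative to its field.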

OTHER METRICS

Publications and citations are the two main indicators, derived from data in either Thomson Reuters’ Web of Science or Elsevier’s Scopus.   These two vendors report more sophisticated metrics and are adding visualization tools.  The ranking organizations are also calculating different modifications to publications and citations.

Highly Cited Papers

The number of times scholarly publications cite an article is not enough for some rankers. In order to clearly identify excellence, “highly cited papers” may be a separate category. Being highly cited is dependent on both the broad topic and the publication date. The most highly cited Chinese paper in chemistry, published in 2009, has 1,772 citations, while the most highly cited Chinese paper in social studies, published in 2005, has 413 (from Essential Science Indicators, September 2014).

Journal Impact

A metric not widely used in rankings but often used by institutions and departments is journal impact. When evaluating candidates for hiring, promotion and tenure, the evaluators are interested not only in the number of papers and citations but also in the quality of the journals in which the articles appear.

Journal impact scores serve as quality metrics in predefined fields by comparing citation rates over a fixed time using a variety of metrics.  The impact for an individual journal is dependent on the other journals in the field, the metrics and the number of years covered.

The journal Impact Factor is the average number of times that articles published in the journal in the previous two years were cited in the JCR year.

The Impact Factor is calculated by dividing the number of citations received in the JCR year by articles from the two previous years by the total number of articles the journal published in those two years. An Impact Factor of 4.4 means that, on average, the articles published one or two years ago were cited 4.4 times in the JCR year.
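The arithmetic can be sketched as follows. The function and parameter names are hypothetical, and the journal's counts are made up so that the result matches the 4.4 in the example above.

```python
def impact_factor(citations_in_jcr_year, articles_in_prior_two_years):
    """Citations received in the JCR year by articles from the two
    previous years, divided by the number of those articles."""
    if articles_in_prior_two_years <= 0:
        raise ValueError("the journal must have published at least one article")
    return citations_in_jcr_year / articles_in_prior_two_years


# Hypothetical journal: 250 articles over the two previous years,
# cited 1,100 times in total during the JCR year.
jif = impact_factor(citations_in_jcr_year=1100, articles_in_prior_two_years=250)
print(round(jif, 1))
```

Because the result is an average, it says nothing about how the citations are distributed: a few heavily cited articles can produce the same Impact Factor as many moderately cited ones.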

Table 1: Comparing Journal impact for Library and Information Science publications

Using data from Thomson Reuters Essential Science Indicators, which includes 11 years of data, the Chinese Academy of Sciences has over 226,000 articles with 2.4 million citations, which is 10.6 cites per paper. Over 3,000 of these are highly cited articles, in the top one percent for their fields. For comparison, the University of Tokyo has about 76,000 articles with 1.16 million citations, which is 15.3 cites per paper. Of these, about 1,200 are highly cited. The article, citation and highly cited totals are size-dependent data.
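The cites-per-paper figures can be reproduced from the rounded totals quoted above. Because the text gives those totals as approximations ("over", "about"), the derived ratios are approximate too. The share of highly cited articles is an added illustration of a size-independent ratio and is not a figure reported in the source data.

```python
# Rounded totals as quoted in the text (approximations, not exact counts).
institutions = {
    "Chinese Academy of Sciences": {
        "articles": 226_000, "citations": 2_400_000, "highly_cited": 3_000,
    },
    "University of Tokyo": {
        "articles": 76_000, "citations": 1_160_000, "highly_cited": 1_200,
    },
}


def ratios(record):
    """Return (cites per paper, share of articles that are highly cited)."""
    cites_per_paper = record["citations"] / record["articles"]
    highly_cited_share = record["highly_cited"] / record["articles"]
    return cites_per_paper, highly_cited_share


for name, record in institutions.items():
    cpp, share = ratios(record)
    print(f"{name}: {cpp:.1f} cites per paper, {share:.1%} highly cited")
```

On totals the Chinese Academy of Sciences leads, while on cites per paper the University of Tokyo does, another case of size-dependent and size-independent measures pointing in different directions.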

h-index

The h-index is usually associated with an individual author. However, it may be applied in institutional rankings. The h-index is based on a list of publications ranked in descending order by the times cited count from a particular source: it is the largest number h such that h of the publications have each been cited at least h times. For the years 2009-2014, the h-index for the University of Tokyo, based on 1,898 publications in the area of neuroscience, is 42. For the same time and subject, the h-index for the National University of Singapore, based on 656 papers, is 28.

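Some researchers may have one great article, while most of their articles are not cited or hardly cited. The researcher below has an h-index of five. The fifth most highly cited article is cited 5 times, so five articles have at least five citations each, but there are not six articles with at least six citations each.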
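The calculation can be sketched in Python. The citation counts below are hypothetical, chosen only to show one heavily cited article and an h-index of five; they are not the figures from Table 2.

```python
def h_index(citation_counts):
    """Largest h such that h publications have at least h citations each."""
    ranked = sorted(citation_counts, reverse=True)  # most cited first
    h = 0
    for rank, cites in enumerate(ranked, start=1):
        if cites >= rank:
            h = rank   # the article at this rank still has enough citations
        else:
            break      # every later article has even fewer citations
    return h


# Hypothetical researcher: one great article, the rest hardly cited.
citations = [120, 9, 7, 6, 5, 3, 1, 0, 0]
print(h_index(citations))
```

The single article with 120 citations does not raise the h-index on its own; only the number of articles that clear the threshold matters.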

Table 2: Calculating the h-index

CONCLUSION

Suppose that in 2010 four authors wrote an article about bibliometrics. They are from different institutions in different countries. The article, published in a journal in the lowest quartile of the LIS category, has been cited 20 times. The LIS category has fewer citations in general. Author one’s institution has under 1,000 faculty while author two’s institution has over 2,000 faculty.

The same institution can be number one or not even in the top ten depending on whether rankers count this article once or four times, prorate it among institutions, calculate it on a per-faculty basis, give it a bonus for being in a lower-cited field but penalize it for being in a lower-quality journal, and on what other metrics are included. When evaluating rankings, consider the metrics that are important to you and your institution.

References

Pagell, R. A. 2014. Bibliometrics and University Research Rankings Demystified for Librarians. In Chen, C. and Larsen, R. (eds.), Library and Information Science: Trends and Research. Springer. (Open Access)

Hood, W.W. and Wilson, C.S. 2001.  The Literature of bibliometrics, scientometrics and informetrics.  Scientometrics, Vol. 52, No. 2, pp. 291-314.

Pritchard, A. 1969. Statistical bibliography or bibliometrics? Journal of Documentation, Vol. 25, pp. 348-349.

Ruth’s Rankings

  1. Introduction: Unwinding the Web of International Research Rankings
  2. A Brief History of Rankings and Higher Education Policy
  3. Bibliometrics: What We Count and How We Count
  4. The Big Two: Thomson Reuters and Scopus
  5. Comparing Times Higher Education (THE) and QS Rankings
  6. Scholarly Rankings from the Asian Perspective 
  7. Asian Institutions Grow in Nature
  8. Something for Everyone
  9. Expanding the Measurement of Science: From Citations to Web Visibility to Tweets
  10. Do-It-Yourself Rankings with InCites 
  11. U S News & World Report Goes Global
  12. U-Multirank: Is it for “U”?
  13. A Look Back Before We Move Forward
  14. SciVal – Elsevier’s research intelligence –  Mastering your metrics
  15. Analyzing 2015-2016 Updated Rankings and Introducing New Metrics
  16. The much maligned Journal Impact Factor
  17. Wikipedia and Google Scholar as Sources for University Rankings – Influence and popularity and open bibliometrics
  18. Rankings from Down Under – Australia and New Zealand
  19. Rankings from Down Under Part 2: Drilling Down to Australian and New Zealand Subject Categories
  20. World Class Universities and the New Flagship University: Reaching for the Rankings or Remodeling for Relevance
  21. Flagship Universities in Asia: From Bibliometrics to Econometrics and Social Indicators
  22. Indian University Rankings – The Good the Bad and the Inconsistent
  23. Are Global Higher Education Rankings Flawed or Misunderstood?  A Personal Critique
  24. Malaysia Higher Education – “Soaring Upward” or Not?
  25. THE Young University Rankings 2017 – Generational rankings and tips for success
  26. March Madness –The rankings of U.S universities and their sports
  27. Reputation, Rankings and Reality: Times Higher Education rolls out 2017 Reputation Rankings
  28. Japanese Universities:  Is the sun setting on Japanese higher education?
  29. From Bibliometrics to Geopolitics:  An Overview of Global Rankings and the Geopolitics of Higher Education edited by Ellen Hazelkorn 

1 Ruth A. Pagell is currently teaching in the Library and Information Science Program at the University of Hawaii. Before joining UH, she was the founding librarian of the Li Ka Shing Library at Singapore Management University. She has written and spoken extensively on various aspects of librarianship, including contributing articles to ACCESS.