By K.T. Lam
(* First published in ACCESS: Asia’s Newspaper on Electronic Information Products & Services, No.82, September 2012)
The Hong Kong University of Science and Technology (HKUST) has recently launched the Scholarly Publications Portal (http://spfind.ust.hk), providing a one-stop platform for researchers and information seekers to discover the wealth of research output produced by HKUST researchers.
The portal is the result of a joint project between the Library and the Office of the Vice-President of Research and Graduate Studies, funded by a Hong Kong Government grant for knowledge transfer and harvesting. The goals of the project are to create a complete index of publications authored by HKUST researchers and to establish Author Profiles that integrate researchers’ publications, bibliometrics (e.g. citation counts and h-indexes) and research interests in a single interface.
As of July 2012, over 47,000 publications by HKUST researchers were indexed, including journal articles (59 percent), conference papers (37 percent), book chapters (3 percent) and books (1 percent). For current faculty, publications produced before they joined the University are also covered in the Index. About 14 percent of the publications were published in the 2010s and 9 percent before 1990.
Over 450 Author Profiles of current faculty members were released to the public in June 2012 after a two-month review, during which each profile owner was given an opportunity to preview their profile, report discrepancies and submit missing publications for inclusion.
Creating the Index
One of the earliest challenges in creating this index was acquiring the lists of publications by HKUST researchers. One method adopted was to manually search Web of Science and Scopus for records with authors affiliated with HKUST. This helped obtain the lists of publications produced after the researchers joined the University.
The second method was to ask the University’s Office of Contract and Grant Administration for access to its database. This allowed the Library to enrich the Index with data on publications supported by the University’s research grants between 2001 and 2010. In 2011, the University established the Faculty Online Reporting System (FORS), in which faculty members are required to submit lists of their publications for the year. With FORS, the Library is able to obtain publication lists for the Index on an ongoing basis.
To enable the creation of Author Profiles, the Library also needed to obtain researchers’ publications from before they joined the University. The Library obtained the curricula vitae (CVs) of two-thirds of current faculty members, which had been submitted to FORS in the 2011 submission. With these CVs in hand, the Library loaded the pre-affiliation publication data into the Index.
Searching Publications
The Scholarly Publications portal is openly accessible on the internet. Users can search by keywords in Author, Title, Subject, Summary and ISBN/ISSN. They can also browse publications by Author and Subject. Search results can be filtered by facets on Format, Author, Subject, Journal and Publication period. The results list can also be exported by email or as a CSV file, or transferred directly to RefWorks.
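As a rough illustration of how a keyword search with facet filters might be expressed against a Solr index, the sketch below builds a query string from standard Solr parameters (q, fq, facet, facet.field, wt). The field names "format" and "author_facet" and the function build_search_params are assumptions made for this example; the portal's actual index schema is not described in this article.

```python
from urllib.parse import urlencode


def build_search_params(keywords, filters=None, facet_fields=("format", "author_facet")):
    """Build a Solr query string for a keyword search with facet filters.

    Field names are hypothetical; only the Solr parameter names are standard.
    """
    params = [("q", keywords), ("wt", "json"), ("facet", "true")]
    # Ask Solr to return value counts for each facet field.
    for field in facet_fields:
        params.append(("facet.field", field))
    # Each selected facet value becomes a filter query (fq) on a quoted phrase.
    for field, value in (filters or {}).items():
        escaped = value.replace("\\", "\\\\").replace('"', '\\"')
        params.append(("fq", f'{field}:"{escaped}"'))
    return urlencode(params)


# Hypothetical search: keyword "graphene", limited to journal articles.
query_string = build_search_params("graphene", {"format": "Journal Article"})
```

Keeping each facet selection in its own fq parameter, rather than folding it into q, lets the search engine narrow the result set without affecting relevance ranking of the keywords.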
Rich features are available in the publication record (see Figure 1), including the display of times-cited counts from Web of Science and Scopus, links to view full text, the availability of author profiles, as well as the suggestion of related publications.
Author Profiles
The process of designing and creating the Author Profile interface began in May 2011. The target was to attach all publication entries belonging to an author to a profile page specifically created for this author.
By attaching publication data to the author profile, value-added analysis of the author’s publications can be conducted. Under the Publication Tab (see Figure 2), you can limit the publications by document type, examine what was published before or after the author joined the University, view what was published in a range of years, and sort and export some or all of the publication data to email, Excel and RefWorks.
Further analysis is offered by the Bibliometrics Tab (see Figure 3), in which the author’s overall citation counts and h-indexes in different databases are compiled or linked to. Thanks to web service tools provided by Scopus and Web of Science, the Library is able to match the publications under each author profile against these two databases and calculate the corresponding times-cited counts and h-indexes. You can also click on links to view the author’s bibliometrics and publication details reported in Scopus, Google Scholar and ResearcherID.
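For readers unfamiliar with the h-index, the following minimal sketch shows how it can be computed from a list of times-cited counts: an author has index h if h of their publications have been cited at least h times each. The function name and the sample counts are hypothetical and are not taken from the portal.

```python
def h_index(citation_counts):
    """Return the largest h such that h publications have at least h citations each."""
    ranked = sorted(citation_counts, reverse=True)
    h = 0
    for rank, cites in enumerate(ranked, start=1):
        if cites >= rank:
            h = rank  # the top `rank` papers all have at least `rank` citations
        else:
            break
    return h


# Hypothetical times-cited counts for one author's matched publications.
counts = [25, 8, 5, 3, 3, 0]
print(h_index(counts))  # prints 3
```

Because times-cited counts differ between databases, the same author will generally have a different h-index in each source, which is why the values are reported per database.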
The Research Interests Tab displays the author’s areas of research. It is currently quite limited and future work is needed to expand its functionality. Further analyses could also be added to the Author Profile in the future; options include patents, esteem measures and research projects.
De-duplication and Name Disambiguation
Two other technical issues surfaced very quickly in the course of the project: de-duplication of publication data and disambiguation of author names.
As the Index was created from multiple publication lists, an effective mechanism for record de-duplication was essential. Author names posed a related problem: the name of an author may appear differently in the author’s various publications, and publishers usually abbreviate or re-format author names in their publications. Name variation was a headache for author searching as well as for creating author profiles. The Library devised algorithms and developed programs to address these issues. With a computer-aided workflow, these problems were successfully resolved.
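The article does not describe the Library's algorithms, so the following is only an illustrative sketch of one common approach: reduce each record to a normalized match key, then group records that share a key for review. The functions dedup_key and name_key, the sample titles and the placeholder names are all hypothetical.

```python
import re
import unicodedata
from collections import defaultdict


def _ascii_fold(text):
    """Strip accents and lower-case, so 'Müller' and 'Muller' compare equal."""
    decomposed = unicodedata.normalize("NFKD", text)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch)).lower()


def dedup_key(title, year):
    """Match key for a publication: title words without case, accents or punctuation, plus year."""
    words = re.findall(r"[a-z0-9]+", _ascii_fold(title))
    return (" ".join(words), int(year))


def name_key(name):
    """Reduce 'Surname, Given' or 'Given Surname' to (surname, first initial)."""
    name = _ascii_fold(name)
    if "," in name:
        surname, given = (part.strip() for part in name.split(",", 1))
    else:
        parts = name.split()
        surname, given = parts[-1], " ".join(parts[:-1])
    initial = given[:1] if given[:1].isalpha() else ""
    return (re.sub(r"[^a-z]", "", surname), initial)


# Hypothetical records loaded from two different publication lists.
records = [
    {"title": "Photonic Crystal Fibres: A Review.", "year": 2009, "author": "Chan, T. M."},
    {"title": "PHOTONIC CRYSTAL FIBRES - a review", "year": 2009, "author": "Tai Man Chan"},
    {"title": "Another Paper", "year": 2010, "author": "Chan, Tai Man"},
]

groups = defaultdict(list)
for rec in records:
    groups[dedup_key(rec["title"], rec["year"])].append(rec)
duplicates = {key: recs for key, recs in groups.items() if len(recs) > 1}
```

A shared key only flags candidates. Two different people can share a surname and initial, and names written surname-first without a comma would be keyed incorrectly by this sketch, so a human check of each candidate group is still needed.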
System Design and Metadata
As with many of the HKUST Library’s digital library projects, the Scholarly Publications system was built on open-source software, namely DSpace and VuFind. With its rich metadata features, DSpace was adopted to host the publication metadata. These metadata are mirrored to VuFind, which is also open-source software but is designed for library catalogues, featuring the Solr search engine and facet-based browsing. The Library has customized VuFind substantially so that it can serve as the portal for Scholarly Publications.
Publication metadata is stored in DSpace using the Dublin Core metadata schema, while in VuFind it is stored in MARC format. The Library developed a mapping table between the two schemas so that metadata can be mirrored from DSpace to VuFind.
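The Library's mapping table is not reproduced in this article. As a hedged illustration of what such a crosswalk can look like, the sketch below maps a handful of qualified Dublin Core elements to MARC tags and subfields, loosely following commonly used Dublin Core to MARC conventions. The table DC_TO_MARC, the function to_marc_fields and the sample record are assumptions for this example only.

```python
# Hypothetical crosswalk: qualified Dublin Core element -> (MARC tag, subfield code).
DC_TO_MARC = {
    "dc.title": ("245", "a"),                 # title statement
    "dc.contributor.author": ("700", "a"),    # added entry, personal name
    "dc.subject": ("653", "a"),               # uncontrolled index term
    "dc.description.abstract": ("520", "a"),  # summary
    "dc.identifier.issn": ("022", "a"),       # ISSN
    "dc.identifier.isbn": ("020", "a"),       # ISBN
    "dc.date.issued": ("260", "c"),           # date of publication
}


def to_marc_fields(dc_record):
    """Convert {dc_element: [values]} into (tag, subfield, value) tuples ordered by tag."""
    fields = []
    for element, values in dc_record.items():
        target = DC_TO_MARC.get(element)
        if target is None:
            continue  # elements without a mapping are skipped in this sketch
        tag, code = target
        for value in values:
            fields.append((tag, code, value))
    return sorted(fields, key=lambda field: field[0])


# Hypothetical Dublin Core record as it might be exported from a repository.
sample = {
    "dc.title": ["An Example Title"],
    "dc.contributor.author": ["Chan, Tai Man", "Lee, Siu Ming"],
    "dc.rights": ["unmapped in this sketch"],
}
marc_fields = to_marc_fields(sample)
```

Keeping the mapping in a single table, separate from the conversion code, means new elements can be mirrored by adding one line rather than changing the program logic.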
Sustainability
Like many other databases, the Index and Author Profiles need ongoing maintenance and updating. The Library has developed computer tools for batch loading publication data, refreshing the bibliometrics, identifying duplicate records and performing name disambiguation. As many of these tasks still require human intervention, ongoing provision of staff and student helpers to maintain the database is a must for long-term sustainability.
What about the HKUST Institutional Repository?
HKUST Library was the very first adopter of the concept of open access and institutional repository (IR) in Asia. It launched its own IR (http://repository.ust.hk) using DSpace in early 2003, creating a repository of openly accessible documents produced by HKUST researchers that covers not just publications but also theses, presentations and working papers. The IR has grown slowly (currently 7,000 records). This is because HKUST’s IR is not an index: a record is created only if the document is allowed to be self-archived. In addition, many of the deposited documents are pre-publication versions. Because of these restrictions, the IR does not fully reflect the total scholarly output of HKUST researchers.
While documents in the HKUST IR are still heavily used on the internet (over 900,000 downloads since October 2004, including 28,000 in June 2012), the Library saw the need to expand its scholarly publications support. This joint project with the University’s research office gave the Library the opportunity to address the IR’s limitations.
Conclusions
The Scholarly Publications database supplements the HKUST IR by providing a one-stop portal for a complete listing of scholarly output by HKUST researchers. It increases the visibility of faculty members’ publications and their expertise by exposing them to internet search engines. Together with the features in Author Profiles, such as bibliometrics and research interests, it provides a readily available infrastructure for the creation of value-added services, to be used by the University’s knowledge analysis and transfer projects. For more information and enquiries about this portal, send an email to [email protected]. (The author, K.T. Lam, is Associate University Librarian, HKUST Library)
Figure 1 – Record display showing times-cited counts, links to view full-text, available author profiles and related publications