Whatever purpose you are using them for, it is important to consider the inherent limitations of citation indicators - it is easy to produce misleading figures without intending to.
The usual sources of bibliometric data are focused on journal articles. They may therefore be less useful in disciplines that rely less on journal publishing, such as the arts, humanities, social sciences, computing science and engineering. Even disciplines that do concentrate on journal articles may have significantly different publication and citation practices. Some indicators can be normalised to allow comparisons across disciplines, but this is only ever approximate, and may be challenging for multidisciplinary work. Different disciplines also take different approaches to collaboration and multiple authorship; it may be appropriate to apply fractional counting to multi-authored papers.
Be alert to the mix of publications in your data, be careful when drawing comparisons between different disciplines, be aware that useful data will not always be available for all disciplines, and use normalised indicators where possible.
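The idea behind normalised indicators can be sketched in a few lines. The function and the baseline figures below are illustrative assumptions, not real data or the output of any particular database: the principle is simply that a paper's raw citation count is divided by the average for papers of the same field and year, so a score of 1.0 means "cited about as often as is typical here".

```python
def normalised_citation_score(citations, field_year_baseline):
    """Divide a paper's citations by the average citations per paper
    for its field and publication year.

    A score of 1.0 means "typical for this field and year"; cross-field
    comparisons should use the score, never the raw count.
    """
    return citations / field_year_baseline if field_year_baseline else 0.0

# Hypothetical baselines: average citations per paper by (field, year).
baselines = {("history", 2018): 2.5, ("cell biology", 2018): 25.0}

# The same raw count of 5 citations is above average in one field
# and well below average in the other.
history_score = normalised_citation_score(5, baselines[("history", 2018)])      # 2.0
biology_score = normalised_citation_score(5, baselines[("cell biology", 2018)]) # 0.2
```

Real normalisation schemes also have to decide how to classify multidisciplinary papers, which is one reason the results are only ever approximate.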
- Publication type
Review articles generally attract more citations than original research articles, so it is usually recommended to exclude them from citation-oriented bibliometric analyses. When calculating citation figures for a researcher, try to consider only their research papers, and filter out material such as editorials and book reviews, as these will skew the figures. For some research outputs, such as practice-based material or policy documents, citation analysis is unlikely to be productive at all.
- Different meanings of citations
Citations do not always signify the same thing. A paper may be cited as the fundamental basis of a later work, or as a cursory reference in a long list of loosely related prior research; either of these will show up as a single citation in a database and be counted in the same way with the same perceived value. Negative citations also exist - a particularly contentious paper may be frequently cited in order to dismiss its findings, or merely to remark upon the controversy.
It is not usually possible to avoid counting negative citations, and in some fields they may be more frequent than in others. Be aware of their existence, and look critically at any strange outliers.
- Number of authors
Where a document has multiple authors, most databases and indexes will attribute the publication and its associated citation data to every author. This means a single publication will count once for each author, regardless of whether they were the sole author or one name in a long list; and once for each author's institution, regardless of whether that institution contributed one author or several. Many existing metrics do not take account of shared contributions, and this can give misleading results, particularly in highly collaborative fields.
Methods exist to weight contributions, such as fractional weighting or author ordering. However, weighting relies on a clear sense of what any individual has contributed, and this information is rarely available. Without a fixed standard, conventions vary substantially between fields or journals, and weighting for shared authorship must be interpreted cautiously.
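Fractional counting, the simplest of these weighting methods, credits each author 1/n of a paper with n authors. The sketch below uses made-up author names and assumes equal shares per author; as noted above, real contributions are rarely equal, so this is a convention rather than a measurement.

```python
def fractional_counts(papers):
    """Credit each author 1/n of a paper with n authors.

    `papers` is a list of author lists; returns total fractional
    paper counts per author. Whole counting would instead give
    every listed author 1.0 per paper.
    """
    totals = {}
    for authors in papers:
        share = 1.0 / len(authors)
        for author in authors:
            totals[author] = totals.get(author, 0.0) + share
    return totals

# Hypothetical records: one solo paper and one three-author paper.
papers = [["Ahmed"], ["Ahmed", "Baptiste", "Chen"]]
totals = fractional_counts(papers)
# Ahmed: 1 + 1/3; Baptiste and Chen: 1/3 each, versus 2, 1 and 1
# under whole counting.
```

The same idea can be applied to citations rather than papers; either way, the choice between whole and fractional counting can change rankings substantially in collaborative fields.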
- Time since publication
Citations continually accrue over time, so older papers will tend to have more citations than newer ones. Similarly, older researchers usually have more papers and have had more time to accrue citations to them, so an indicator like the h-index will be misleading when comparing researchers at different stages of their careers.
Where possible, compare papers against those of a comparable age, or using a fixed time period after publication.
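The h-index mentioned above is simple to compute, and a small worked example makes its age dependence concrete. The two citation lists below are invented for illustration: the "established" researcher's higher index reflects more papers and more elapsed time as much as any difference in per-paper impact.

```python
def h_index(citation_counts):
    """Largest h such that h of the papers have at least h citations each."""
    counts = sorted(citation_counts, reverse=True)
    h = 0
    for rank, cites in enumerate(counts, start=1):
        if cites >= rank:
            h = rank  # this paper still clears the threshold
        else:
            break
    return h

# Hypothetical careers. The early-career list may contain equally
# strong work, but has had less time to accumulate papers and citations.
early_career = [12, 7, 3, 1]
established = [40, 25, 18, 9, 8, 6, 5, 4]

early_h = h_index(early_career)   # 3
established_h = h_index(established)  # 6
```

This is one reason fixed citation windows (eg citations within three years of publication) are often preferred when comparing across career stages.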
- Author background
The personal background of the author can affect both their publication rates and their citation rates. Full-time researchers will tend to produce more papers than those on part-time contracts, those who have recently had career breaks, or those with substantial non-research responsibilities (teaching, administration, etc). There is a known "Matthew effect" - the rich get richer - meaning that more citations tend to be accrued by more prominent, well-established, researchers. Individual factors may also lead to existing large-scale systemic biases being reflected in (or even reinforced by) the citation indicators.
These effects are hard to normalise for, so indicators of individual productivity should be treated with care.
- Data sources
Different citation databases will give different results, because they index a different range of material in different ways. Some have better coverage of monographs, reports, and conference proceedings than others; some omit specific journals. Some have better coverage of non-English language sources than others.
Comparisons should always be based on the same data source, and where possible using data gathered at the same time. Always be cautious of benchmarking against citation indicators from an unclear source.
- New and alternative metrics
There has been a sharp growth in recent years in various commercial "altmetrics" services. These services often draw on similar source data (eg numbers of tweets or download figures) but interpret and present it in different ways. Depending on what indicators are used, they can show scholarly interest (eg Mendeley bookmarking), media interest (eg news stories), or public interest (eg social media activity). They can also be used to identify the use of research in policy documents or other official publications which may not appear in the conventional citation databases. Spikes in activity may come if a piece of work is particularly contentious, timely, or simply on a topic that catches the public imagination. It is harder to gather standardised and comprehensive new metrics than it is traditional citation data.
In general, it is best to treat figures from these metrics as broad indicators - high activity tells us that there is something interesting there, but the details should be examined before drawing conclusions. They should never be used to quote a single numeric "score" for ranking a paper or author.