UCL Institute for Sustainable Heritage


GLAMCoW (Galleries, Libraries, Archives, & Museums Collections on Wikipedia) Toolkit

The GLAMCoW toolkit helps inform decision-making around open access collections within Galleries, Libraries, Archives, & Museums.

Painting by Hugh Hughes: Cattle in mountain landscape

16 September 2022

The GLAMCoW (Galleries, Libraries, Archives, & Museums Collections on Wikipedia) Toolkit was developed by the UCL Institute for Sustainable Heritage, in partnership with the National Library of Wales and UCL Special Collections. The main aim is to provide a tool that helps inform decision-making around open access collections within GLAM institutions.

Recent years have seen a general move in society towards open access, and cultural institutions have increasingly needed to devote resources to digitisation and digital asset management. Many benefits of this move have been identified, including new fundraising opportunities and connection to a much wider audience. As part of their open access policies, many institutions are aligning with, or even developing close strategic partnerships with, large open access platforms such as Wikipedia, which can further enhance the value of collections by connecting their items to linked open data.

However, there are also challenges. One of these is the lack of a consistent way of measuring the impact and benefits of open access initiatives. This matters because institutions are often asked to provide evidence of the efficacy of these projects. There is therefore a clear need for tailored, quantitative metrics, as well as a toolkit that enables their use within the sector.

The tool built during the project computes a quantitative metric of the engagement impact that collection items have, or could have, when embedded in relevant Wikipedia entries, which the tool itself suggests. The metric combines four components that act as indicators of engagement impact: the page views of the Wikipedia entries, the relevance of the collection item within those entries, the completeness of the entries, and the uniqueness of the image within the entries. To account for the differing objectives of different institutions, the components can be weighted differently.
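As an illustration only, the weighting of the four components might be sketched as follows. The exact formula used by the toolkit is not stated here, so this sketch assumes that each component has already been normalised to the range 0 to 1 and that the metric is a weighted average. The names Components and impact_score are hypothetical and are not part of the toolkit.

```python
from dataclasses import dataclass


@dataclass
class Components:
    """Four engagement indicators for one item / Wikipedia entry pair.

    Assumption: every value is already normalised to the range [0, 1].
    """
    page_views: float
    relevance: float
    completeness: float
    uniqueness: float


def impact_score(components: Components, weights: dict[str, float]) -> float:
    """Weighted average of the four components (hypothetical formula)."""
    values = {
        "page_views": components.page_views,
        "relevance": components.relevance,
        "completeness": components.completeness,
        "uniqueness": components.uniqueness,
    }
    total_weight = sum(weights[name] for name in values)
    if total_weight <= 0:
        raise ValueError("weights must sum to a positive number")
    weighted_sum = sum(weights[name] * values[name] for name in values)
    return weighted_sum / total_weight


# An institution that cares mostly about reach might weight page views higher.
equal = {"page_views": 1, "relevance": 1, "completeness": 1, "uniqueness": 1}
reach = {"page_views": 3, "relevance": 1, "completeness": 1, "uniqueness": 1}
item = Components(page_views=0.8, relevance=0.5, completeness=0.6, uniqueness=1.0)
print(impact_score(item, equal), impact_score(item, reach))
```

Dividing by the sum of the weights keeps the score in the same 0 to 1 range whatever weights an institution chooses, which makes scores comparable across weighting schemes.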

Institutions can then use the results of the impact metric to weigh the tradeoffs between effort and potential engagement. To make the calculations accessible, the results can be generated and analysed using the accompanying toolkit.

This project, led by Dr Scott Allan Orr, received funding from Research England’s Higher Education Innovation Fund in 2022, which was administered through UCL Innovation and Enterprise.


The toolkit is made up of two distinct applications:

1)    GLAMCoW Data Collection - this first application takes as input either a collection that is already published on Wikimedia or a file containing metadata about a collection that is not yet on Wikimedia. It uses named-entity recognition and natural language processing to identify relevant Wikipedia entries for each collection item and calculates the components outlined above. The output is a downloadable .csv file that lists, for each collection item, the relevant Wikipedia entries and their component values. This data can be analysed in a tool such as Microsoft Excel or with the second application of the toolkit.

Instructions on how to install the toolkit and detailed usage instructions can be found here.

2)    GLAMCoW Data Exploration - the second application takes the output of the Data Collection step as input and provides tailored ways of exploring the data. It supports three modes of analysis:
a)    Individual analysis - aimed at exploring the results of an individual collection item
b)    Aggregate analysis - aimed at exploring the aggregated results of the entire collection
c)    Effort analysis - provides a prioritisation-based effort analysis to identify tradeoffs of work allocation and potential impact return.

Instructions on how to install the toolkit and detailed usage instructions can be found here.
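The idea behind the effort analysis mode can be illustrated with a short sketch. It is not the toolkit's implementation: it assumes that each item costs the same effort to process and that items are prioritised by descending impact score. The function name cumulative_impact is hypothetical.

```python
def cumulative_impact(scores: list[float]) -> list[tuple[int, float]]:
    """Return (items processed, share of total impact captured) pairs.

    Assumptions: every item takes equal effort, and items are processed
    in order of descending impact score.
    """
    ranked = sorted(scores, reverse=True)
    total = sum(ranked)
    if total <= 0:
        raise ValueError("total impact must be positive")
    curve = []
    running = 0.0
    for count, score in enumerate(ranked, start=1):
        running += score
        curve.append((count, running / total))
    return curve


# Example with four hypothetical impact scores.
for count, share in cumulative_impact([0.9, 0.1, 0.5, 0.5]):
    print(f"top {count} item(s): {share:.0%} of total impact")
```

A curve of this kind shows how much of the potential engagement is captured by working on only the highest-ranked items, which is the tradeoff between work allocation and impact return that the effort analysis is meant to expose.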


Image ABOVE: ‘Cattle in mountain landscape’ painted by Hugh Hughes (1790-1863) from the Framed Works of Art collection at the National Library of Wales