How Fast is My Supercomputer?
Application benchmarking is a crucial activity in the UK’s path to Exascale, ensuring future systems are understood so that scientific applications can take advantage of the new opportunities offered.
10 November 2022
Background
Benchmarking commonly requires manual work and relies on knowledge held by a few individuals. It is vital to measure benchmark performance rigorously and systematically to enhance transparency and enable reproducibility. UCL ARC is leading this collaboration of UK universities to provide the tooling to automate the collection and analysis of benchmark data. The project provides a suite of benchmarks representative of UK exascale science, together with a framework for configuring and running those benchmarks on diverse architectures and post-processing the results.
Benchmarking can mean different things. You may have bought a new computer and want to find out how fast it is: you run something (a benchmark) whose performance you know on other computers and compare the results. Or you might have made an improvement to the software you are developing and want to know by how much: you run a benchmark with the improved software and compare the result against the previous version.
The problem
Benchmarking, especially the kind where we are testing new computer hardware, often relies on manual labour and hidden knowledge about build systems and run configurations, because these tend to become intertwined during application development. Running a benchmark often depends on having access to the person who knows how to build it, run it, and analyse the results. This makes comparing the performance of new HPC systems particularly difficult: for each benchmark to be run (and usually you want to run many), that person has to learn the workings of the new system. The ExCALIBUR program is setting up multiple testbeds of experimental hardware that need to be benchmarked transparently and reproducibly, with a set of benchmarks that represents the scientific workload of an exascale computer.
What we did
We want to treat benchmarking as research software in its own right. We have developed a benchmarking framework that separates the scientific functionality of a benchmark from its low-level implementation. The aim is to configure the framework on a new system once, and then be able to run any benchmark within the framework at the press of a button. The build systems of scientific software can be complex, so this is not an easy task. To manage it we use the Spack package manager to handle build dependencies and automate builds, and the ReFrame framework to abstract the low-level interactions with the HPC system. Both are powerful, state-of-the-art tools used in HPC centres worldwide.
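To make this concrete, here is a minimal sketch of what a benchmark can look like when ReFrame's built-in Spack build system is used. The benchmark class, Spack spec, and output patterns below are illustrative assumptions (a generic STREAM-style kernel), not taken from the framework's repository.

```python
import reframe as rfm
import reframe.utility.sanity as sn


@rfm.simple_test
class StreamBenchmark(rfm.RegressionTest):
    """Illustrative STREAM-style benchmark; names and spec are assumptions."""

    valid_systems = ['*']           # resolved against the ReFrame site configuration
    valid_prog_environs = ['*']
    build_system = 'Spack'          # delegate the build to the Spack package manager
    executable = 'stream_c.exe'
    num_tasks = 1

    @run_before('compile')
    def set_spack_spec(self):
        # Spack resolves dependencies and builds the code for the current system
        self.build_system.specs = ['stream +openmp']   # hypothetical spec

    @sanity_function
    def assert_validated(self):
        # The run is only considered successful if the output validates
        return sn.assert_found(r'Solution Validates', self.stdout)

    @performance_function('MB/s')
    def triad_bandwidth(self):
        # Extract the Triad bandwidth figure as the performance metric
        return sn.extractsingle(r'Triad:\s+(\S+)', self.stdout, 1, float)
```

With a ReFrame site configuration in place for the target machine, a test like this can then be launched with something like `reframe -c stream_benchmark.py -r`, leaving Spack and ReFrame to handle the build and the interaction with the scheduler.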
The outcomes
We have deployed the benchmarking framework on many HPC systems in the UK that are being used by the research community, and included several popular scientific application benchmarks. To make the framework better known in the community, we have started a multi-institution collaboration within the UK to:
- raise awareness of the impact of benchmarking on the success of ExCALIBUR projects
- deploy the benchmarking framework on relevant UK HPC hardware
- publish benchmarking results for a selection of applications
- foster benchmark contributions to the repository by ExCALIBUR projects
- extend the benchmarking framework with new Performance Portability metrics
- develop representative linear solver benchmarks
The collaboration is funded to continue well into 2023, when final results will be reported.
“This has been a tremendous success.” - Dr Jeremy Yates, Innovation & Technology Director, DiRAC (4 April 2022)
Links
- ExCALIBUR program
- Benchmarking framework on GitHub
- The UCL Adaptable Cluster Project (ExCALIBUR Interconnect Demonstrator)
- DiRAC