‘Excel cannot complete this task’:
Drawing parallels between UK Test and Trace and health economic models
Chief Scientific Officer, BresMed
Senior Health Economist, BresMed
Senior Health Economist, BresMed
Senior Statistician, BresMed
You will have likely seen recently that inappropriate use of Microsoft Excel® has led to a major issue with the UK’s Test and Trace system, with the reporting of almost 16,000 positive COVID-19 cases being delayed for over a week due to CSV row number limitations.1 The heart of the issue is the use of Excel to transfer data between one system and another – and, more generally, continuing to use outdated software versions and antiquated methods to transfer data, rather than creating systems that truly talk to each other and operate within the same framework (i.e. that are capable of record storage as well as the powerful analytical analysis required to perform country-wide pandemic modelling).
The more we continue to stretch Excel’s capabilities, the more we will see the kinds of errors we have seen with Test and Trace.
The learnings from this apply equally to health economic models. The continued use of Excel for health economic modelling is understandable given its current adequacy. Yet this is not a sustainable paradigm in the long term, with the ever-increasing need for complexity in statistical approaches and the modelling methods to accommodate them. The problems that arise from the ‘overuse’ of Excel are well known by this point; research suggests that 50% of operational spreadsheets used within large businesses contain some sort of material error2, and the prevalence of errors within spreadsheets is widespread enough to warrant the existence of consultancies dedicated to resolving Excel issues.3 The more we continue to stretch Excel’s capabilities, the more we will see the kinds of errors we have seen with Test and Trace.
More rapid, more complex
A key example is that of the cost-effectiveness models used in health technology appraisal. These are typically built in Excel, as preferred by the pharmaceutical industry and, historically, by leading health technology assessment (HTA) bodies such as the National Institute for Health and Care Excellence (NICE) and the Scottish Medicines Consortium (SMC). However, the changing demands of HTA have called into question whether Excel is the best option, given that other software packages provide increased flexibility and the capacity to perform within-model statistical analysis.
Recently, more complex HTA submissions have explored numerous analytical ‘add-ons’, many of which involve conducting volumes of statistical analysis and then feeding this through to the economic analysis.4, 5 Some of these add-ons require extensive coding in Visual Basic® for Applications (VBA) to implement – thereby removing any benefit of using Excel. Examples of complex models include those that capture novel therapies lacking randomized, controlled evidence for clinical effectiveness, such as chimeric antigen receptor (CAR) T-cell therapies. Given the increasingly rapid HTA turnaround – fuelled by public pressure to get new treatments to market more quickly – the modelling platform in such complex cases needs to be fast, transparent, and flexible to changes in modelling assumptions or supporting datasets, all whilst retaining user-friendliness. The question is: does Excel still fulfil these criteria?
shiny yet see-through
Arguably the greatest barrier to the widespread use of non-Excel health economic models is the learning curve associated with alternative platforms, which requires time commitments to follow learning pathways and results in a perceived lack of transparency. One alternative to Excel for health economics and outcomes research (HEOR) modelling is the statistical programming language R. An open-source platform, R has a large and active community of statisticians and data scientists at the cutting edge of their fields. Indeed, R is already well integrated into the HEOR landscape due to its role as a statistical analysis software – for example, GlaxoSmithKline has recently announced that they are planning to move their statistical work to R.6 Because of this, a multitude of R software packages exists to perform the analyses required by HTA bodies. Many of these packages are published in the peer-reviewed Journal of Statistical Software (e.g. flexsurv7) and are subject to continual public scrutiny due to being open source.
We believe that R is well positioned to take over
as the de facto modelling software in HEOR
R has also been central to the movement towards reproducible and accessible reporting, with packages like officer, rmarkdown, shiny and bookdown able to interlace statistics, modelling and reporting to elevate the transparency and reproducibility of HEOR outcomes. We therefore believe that R is well positioned to take over as the de facto modelling software in HEOR – after all, NICE already accepts R as a standard software.8
There are concerns around the accessibility of models to non-technical users with little or no coding experience. To address these, cost-effectiveness modelling can be performed using statistical software like R alongside a user-friendly front-end for data visualization and interrogation, using the shiny package. R explicitly allows statistical analysis within a cost-effectiveness model, all in one place. This end-to-end functionality improves the transparency of the whole process, from patient data to cost-effectiveness estimates; everything is happening in one place, in sequence. And by setting up the model to be deployed as a user-interactive application that presents its own quality control test results, the user does not have to interact with the code itself to run the model, unless they want to – traditionally a significant barrier to adopting Excel alternatives.
Fit for purpose?
Excel’s limitations as a cost-effectiveness analysis software mean that for many contemporary decision problems involving large data inputs and complex statistical analysis, there is always a significant risk of error and inefficiency. In addition, many live, within-model statistical analyses – like propensity score matching or survival regressions – cannot be feasibly performed within Excel, due to the need for maximum likelihood estimation. Excel cannot offer solutions to these issues, which will only get worse with growing demand for complex analysis and within-model statistical analysis, increases in the size or frequency of datasets to be analysed, and tightening time limits within HTA.
The solutions that R offers go beyond avoiding the inconvenience and inefficiency of copying and pasting from statistical software into Excel. A well-designed HEOR model with integrated analysis facilitates almost instantaneous updates to results when making common changes to modelling scenarios, such as incorporating additional data-cuts, or changes in strategy involving complex analyses like network meta-analyses and propensity score matching between trials. While not a silver bullet, adopting R as the norm rather than the exception is one way for the HTA world to avoid the fundamental issues that arise from an over-reliance on Excel.9
The UK Test and Trace fiasco has made painfully apparent the risks of using antiquated systems to (mis)manage a country’s response to a global catastrophe. But its silver lining may come in the lessons that we can take from its analytical failings – notably, that an adequate but outdated solution is only ‘good enough’ until it isn’t.
1. Staines R. Excel spreadsheet blamed for UK’s COVID-19 test and trace woes. 2020. Available at: https://pharmaphorum.com/news/excel-spreadsheet-blamed-for-uks-covid-19-test-and-trace-woes/. Accessed: 16 October 2020.
3. Kobie N. Meet the Excel warriors saving the world from spreadsheet disaster. 2020. Available at: https://www.wired.co.uk/article/spreadsheet-excel-errors. Accessed: 21 October 2020.
4. National Institute for Health and Care Excellence (NICE). Axicabtagene ciloleucel for treating diffuse large B-cell lymphoma and primary mediastinal large B-cell lymphoma after 2 or more systemic therapies. Technology appraisal guidance [TA559]. 2019. Available at: https://www.nice.org.uk/guidance/ta559. Accessed: 20 October 2020.
5. National Institute for Health and Care Excellence (NICE). Upadacitinib for treating moderate to severe rheumatoid arthritis [ID1400]. In development [GID-TA10389]. 2018. (Updated: 5 May 2020) Available at: https://www.nice.org.uk/guidance/indevelopment/gid-ta10389. Accessed: 22 October 2020.
6. Nicholls A and Rimler M. R Studio Blog. Using R to drive agility in clinical reporting: Questions and answers. 2020. Available at: https://blog.rstudio.com/2020/10/08/driving-agility-in-clinical-reporting-q-a/. Accessed: 16 October 2020.
8. National Institute for Health and Care Excellence. Guide to the processes of technology appraisal. The appraisal process. 2014. (Updated: 30 May 2018) Available at: https://www.nice.org.uk/process/pmg19/chapter/the-appraisal-process. Accessed: 16 October 2020.