Track and Trace – Why we must maintain a laser-like focus when it comes to processing and protecting data

Authors / contacts: Karim Derrick


The use of technology and artificial intelligence (AI) has been thrown into even sharper focus since the advent of Covid-19, as governments across the globe seek to utilise all available tools to fight the pandemic.

Yet in this understandable rush for solutions in what is clearly a life and death matter, there remain real concerns about data protection and civil liberties.

Track and Trace apps are effectively surveillance apps, tracking users in order to track diseases. But how do we balance privacy with our desire to be free of infection? The NHS is unique in the world for many reasons, one of which is its dataset: but again, how do we balance patient privacy with our national desire to be free of infection? As the immediate threat of Covid-19 retreats, we explore these competing priorities amid increasing scrutiny.

A trial of the NHS contact-tracing app began on the Isle of Wight on 5 May, to great fanfare, with plans to roll out the tool across the rest of the UK at the beginning of June. Manual tracing was meant to launch in England in mid-May, in tandem with an app that would automatically alert users if they had been in contact with another user with coronavirus. The manual element launched in England on 28 May, and it appears the app is still being trialled on the Isle of Wight.

Overall, the sense is that the government has yet to leverage the huge potential for technology to automate at least part of the process. And epidemiologists say 60% of the population would need to use the app for it to be successful.

Contact tracing apps work by automatically alerting people, via text, that they may be at risk of having contracted the virus because someone they have been close to has been diagnosed with it. The manual alternative is to expect people who have tested positive for coronavirus to remember everyone they were in direct contact with in the 48 hours before their symptoms started. For many, such proximity will have been with strangers, meaning the manual method will inevitably be flawed.

The one third of contacts who, it is claimed, have not been traced using the manual method presumably excludes everyone unknown to the infected individual, meaning the true proportion of contacts who go untraced is likely much greater than one third.

Data model

In order to develop the app, NHSX originally opted to use a centralised data model. Using Bluetooth, the app works continuously in the background of a person’s phone, storing anonymised identifiers collected from other devices a user comes into contact with. That information is stored within the app until a person develops potential Covid-19 symptoms.

This means that the matching process, which identifies which phones to send alerts to, takes place on a server, and that all of the data about your movements is captured centrally, around the clock. Whilst this provides government with an aggregated view of infections, it does so by arguably enabling an unprecedented level of surveillance.

The original NHSX approach is in direct contrast with the decentralised approach recommended by Apple and Google, which the government is now choosing to adopt. A decentralised approach sees the matches take place on user handsets, without the need for a server and without our movements being stored centrally. Both Apple and Google argue that their method provides better privacy for users because it limits the ability of a hacker or any state authority to track users via a server.
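The difference between the two models can be illustrated with a minimal sketch, loosely modelled on the decentralised design Apple and Google recommend. All class and function names here are illustrative, not the real Exposure Notification API: phones broadcast random, rotating identifiers; each handset keeps what it hears locally; and when a user is diagnosed, only that user's own broadcast identifiers are published, so the matching happens on-device.

```python
import secrets

# Illustrative sketch of decentralised exposure matching (names are
# hypothetical, not the real Apple/Google API).

class Handset:
    def __init__(self):
        self.my_ids = []        # random identifiers this phone has broadcast
        self.heard_ids = set()  # identifiers picked up from nearby phones

    def broadcast_id(self):
        # Broadcast a fresh random identifier; nothing links it to the user.
        rid = secrets.token_hex(16)
        self.my_ids.append(rid)
        return rid

    def hear(self, rid):
        # Identifiers received over Bluetooth are stored locally only.
        self.heard_ids.add(rid)

    def check_exposure(self, published_diagnosis_ids):
        # Matching happens on the handset: compare locally stored
        # identifiers against the list published for diagnosed users.
        # No central server ever learns who was near whom.
        return bool(self.heard_ids & set(published_diagnosis_ids))

# Two phones meet; later, alice tests positive and uploads only her own
# broadcast identifiers to a public list.
alice, bob = Handset(), Handset()
bob.hear(alice.broadcast_id())
published = alice.my_ids  # the only data that leaves the device
print(bob.check_exposure(published))  # True: bob is alerted locally
```

In the centralised model, by contrast, `heard_ids` would be uploaded to a government server, which would perform the intersection itself and so could reconstruct the contact graph.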

Apple have previously suggested that they will not publish the app so long as it continues to store data centrally.

Interestingly, the German government also recently performed a U-turn on centralisation in favour of a “strongly decentralised approach”. And according to the BBC, the UK’s original approach is also directly at odds with Switzerland, Estonia and Austria’s Red Cross, as well as a pan-European group called DP3T, all of which are pursuing decentralised designs.

Big brother fears

Of course this issue feeds into concerns about whether the government is playing ‘fast and loose’ with our data and whether Covid-19 is being used as an excuse not to take proper data precautions.

Yet there are examples of NHS data mining that both process and protect data.

Unprecedented research published at the beginning of May showed how coronavirus discriminates in its deadliness between different sections of the public, providing a level of detail that is simply not possible anywhere else in the world.

The research was carried out by studying the medical records of 17 million individuals who were registered with GPs in England, along with 5,683 Covid-attributable deaths. The findings, amongst other things, pointed to the impact of the virus on people by sex, socio-economic and racial groups. Yet it wasn’t the results that were perhaps the most interesting part of this research, it was how the results were obtained.

The team of epidemiologists and data scientists, led by Ben Goldacre, a clinician and data scientist at the University of Oxford, under the name the OpenSAFELY Collective, did not try to copy the medical records or move the records for processing. Instead, its coders wrote new software so that they could conduct their analysis within the existing data centre.

From idea to publication it took just 42 days.
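The principle behind the OpenSAFELY approach, that the analysis travels to the data rather than the data to the analysts, can be sketched in a few lines. This is a simplified, hypothetical illustration of the idea, not OpenSAFELY's actual codebase: study code runs inside the secure environment, and only aggregated, non-identifying results ever leave it.

```python
from collections import Counter

# Hypothetical sketch of "analysis goes to the data": raw records stay
# inside the secure environment; only aggregates are returned.

def run_inside_data_centre(records, analysis):
    # Stand-in for the secure environment: the raw records never leave;
    # only the analysis output (aggregate counts) is passed back out.
    return analysis(records)

def deaths_by_age_band(records):
    # Aggregate Covid-attributable deaths per age band; the output
    # contains no individual-level data.
    bands = Counter()
    for r in records:
        if r["covid_death"]:
            bands[r["age_band"]] += 1
    return dict(bands)

# Toy records, standing in for the GP dataset.
records = [
    {"age_band": "70-79", "covid_death": True},
    {"age_band": "70-79", "covid_death": True},
    {"age_band": "30-39", "covid_death": False},
]
print(run_inside_data_centre(records, deaths_by_age_band))  # {'70-79': 2}
```

The design choice matters: because researchers only ever receive aggregates, no one has to be trusted with a copy of 17 million medical records.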


It was made possible because the health secretary, Matt Hancock, issued the group with COPI notices. These are Control of Patient Information notices, which permit people working within or on behalf of the NHS to process health data in connection with Covid-19. Effectively, the OpenSAFELY team was acting on behalf of the NHS.





The obvious clout of the group – independent academics with years of experience and immense credibility, having worked in some of the world’s most trusted institutions – helped.

The nuts and bolts were handled by the Phoenix Partnership, a well-established British clinical software company behind SystmOne, a patient-record system used by GPs. The fact that the company didn’t require its own copies of patients’ data, and left a log of every action it took, made it all the easier to trust. It was clear from the outset that its only gain was to get positive results and improve connectivity within the healthcare setting.

The broad adoption of the OpenSAFELY methodology would have big implications. Electronic-health-records systems would cease to be mere stores of data, and would start to become active pieces of the infrastructure underpinning medical research, shifting with the needs of science. 

What the OpenSAFELY team has shown is that it is possible to get meaningful results without copying and/or centralising data and without asking anyone to trust them with a large, sensitive data set. It can be done.

There can be little doubt that data holds the power to help manage and eliminate Covid-19. However, it is the unintended consequences of data falling into the wrong hands that are the biggest concern. There is widespread concern that the pandemic will prove to be a precursor to a covert erosion of data privacy controls. Nevertheless, as has been shown, when it comes to data, processing and protection are not mutually exclusive. They are two sides of the same coin, and in the rush to protect our health, we must all maintain a laser-like focus when it comes to processing and protecting data.
