Open-source contact tracing, part 1
One of the responses to the COVID-19 pandemic consists of identifying contacts of infected people so they can be informed about the risk; that will allow them to search for medical care, if needed. This is laborious work if it is done manually, so a number of applications have been developed to help with contact tracing. But they are causing debates about their effectiveness and privacy impacts. Many of the applications were released under open-source licenses. Here, we look at the principles of these applications and the software frameworks used to build them; part two will look into some applications in more detail, along with the controversies (especially related to privacy) around these tools.
COVID-19 tracing principles
The main goal of COVID-19 tracing applications is to notify users if they have recently been in contact with an infected person, so that they can isolate themselves or seek out testing. The creation of the applications is usually supported by governments, with the development performed by health authorities and research institutions. The Wikipedia page for COVID-19 apps lists (as of early June 2020) at least 38 countries with such applications in use or under development, and at least eight framework initiatives.
The applications trace the people that the user has been in close physical proximity to (at a distance of around one meter) for a significant period (for example, 15 minutes). The complete tracing system usually consists of an application for mobile phones and the server software.
For the distance measurement and detecting the presence of other users, GPS and Bluetooth are the technical solutions used in practice. GPS only appears in a small number of projects because it does not have enough precision, especially inside buildings. It also does not work in enclosed spaces like underground parking and subways.
Most countries have chosen to develop a distance measurement using Bluetooth, generally the Bluetooth Low Energy (BLE) variant, which uses less energy than the classical version. This is important as the distance measurement is done by mobile phones, and so Bluetooth will need to be active most of the time.
The Bluetooth protocol was not designed for these kinds of tasks, though, so research has been done on ways to measure distance accurately. A report [PDF] from the Pan-European Privacy-Preserving Proximity Tracing project shows that it is possible to measure distance using BLE signal strength, specifically received signal strength indication (RSSI). In a contact-tracing system using Bluetooth, the distance measurement is made by the two phones communicating using a specific message format. Since the formats differ between applications, communication is only guaranteed to work if both phones are using the same application.
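To give an idea of how signal strength can be turned into a distance, here is a minimal sketch using the log-distance path-loss model, a common textbook approximation. It is not taken from the PEPP-PT report; the function names, the reference RSSI at one meter (-59 dBm), and the path-loss exponent (2.0) are all assumptions for illustration, and real values depend on the phone models and the environment.

```python
from statistics import median


def estimate_distance_m(rssi_dbm, rssi_at_1m_dbm=-59.0, path_loss_exponent=2.0):
    """Estimate the distance in meters from one RSSI reading.

    Log-distance path-loss model:
        rssi = rssi_at_1m - 10 * n * log10(distance)
    solved for distance. Both default parameters are assumed values.
    """
    return 10 ** ((rssi_at_1m_dbm - rssi_dbm) / (10 * path_loss_exponent))


def estimate_from_samples(rssi_samples_dbm):
    """Use the median of several readings to dampen RSSI noise."""
    return estimate_distance_m(median(rssi_samples_dbm))


if __name__ == "__main__":
    # Hypothetical readings collected during one scan window.
    readings = [-63.0, -66.0, -65.0, -70.0, -64.0]
    print(f"estimated distance: {estimate_from_samples(readings):.2f} m")
```

RSSI is noisy in practice (bodies, walls, and pockets all attenuate the signal), which is why the sketch takes the median of several readings rather than trusting a single one.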
Centralized versus decentralized
Storing and communicating contacts is the main functionality of COVID-19 tracing applications. Two main approaches exist, centralized and decentralized, though applications may mix ideas from both models.
To understand the difference between those two types of systems, we need to take a look at how user identification works. Each user obtains a random global ID number, either from the central authority in the centralized approach or generated by the application in decentralized systems. Since this is the global identification for the user, it reveals their identity. To preserve privacy, this global ID is never exchanged with peers (i.e. other phones) when registering the encounters, though it may be known by the server. Instead, the global ID is used as a seed to generate temporary IDs using a cryptographic hash function (like SHA-256), or an HMAC, taking as an input the global ID and a changing value, like an increasing number or an identification of a time slot. Temporary IDs change frequently, for example every 15 minutes, and they are broadcast for the other users to register.
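The derivation of temporary IDs can be sketched in a few lines of Python. This is an illustration of the general idea only, not the scheme of any particular application: the 32-byte global ID, the 16-byte truncation, the big-endian slot encoding, and all of the function names are assumptions.

```python
import hashlib
import hmac
import secrets

SLOT_SECONDS = 15 * 60  # temporary IDs change every 15 minutes (example value)


def new_global_id():
    """Generate a random global ID; it is never broadcast to peers."""
    return secrets.token_bytes(32)


def time_slot(unix_time):
    """Map a Unix timestamp to the number of its 15-minute slot."""
    return unix_time // SLOT_SECONDS


def temporary_id(global_id, slot):
    """Derive the temporary ID for one slot with HMAC-SHA256.

    The global ID is the HMAC key and the slot number is the message;
    the result is truncated to 16 bytes (an assumed size) for broadcast.
    """
    mac = hmac.new(global_id, slot.to_bytes(8, "big"), hashlib.sha256)
    return mac.digest()[:16]


if __name__ == "__main__":
    gid = new_global_id()
    for slot in range(3):
        print(slot, temporary_id(gid, slot).hex())
```

Without the global ID, an observer cannot link two temporary IDs to the same phone, while anybody who does hold the global ID can regenerate every temporary ID derived from it.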
Centralized systems use a single server (usually controlled by the health authorities), which generates and stores the global IDs of users. When a user is infected, their contact log is uploaded to the health authorities. People who have been in contact then get notified. The technical solutions vary, from manual operation to one that is automated in the application. That process is handled by the health authorities; the user application just receives a result.
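As a rough sketch of the centralized model, the server below issues the global IDs and can therefore map any temporary ID in an uploaded contact log back to a registered user. The class, its methods, the "contact handle" stored per user, and the HMAC-based derivation are all hypothetical; real deployments differ in how they identify and notify users.

```python
import hashlib
import hmac
import secrets


class CentralServer:
    """Toy model of a server that generates and stores global IDs."""

    def __init__(self):
        self._users = {}  # global ID -> contact handle (assumed structure)

    def register(self, contact_handle):
        """Issue a new global ID and remember whom it belongs to."""
        global_id = secrets.token_bytes(32)
        self._users[global_id] = contact_handle
        return global_id

    @staticmethod
    def temporary_id(global_id, slot):
        """Assumed derivation: truncated HMAC-SHA256 over the slot number."""
        mac = hmac.new(global_id, slot.to_bytes(8, "big"), hashlib.sha256)
        return mac.digest()[:16]

    def resolve(self, temp_id, slot):
        """Find the registered user who broadcast temp_id in this slot."""
        for global_id, handle in self._users.items():
            candidate = self.temporary_id(global_id, slot)
            if hmac.compare_digest(candidate, temp_id):
                return handle
        return None

    def contacts_to_notify(self, contact_log):
        """Resolve an infected user's log of (temp_id, slot) entries."""
        handles = {self.resolve(temp_id, slot) for temp_id, slot in contact_log}
        handles.discard(None)
        return handles
```

The sketch makes the privacy trade-off visible: the server learns who met whom, which is exactly what the decentralized designs try to avoid.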
Decentralized systems rely on the user's phone to generate both the global and temporary IDs. In those systems, the global ID may also change periodically. When a user is infected, they should upload their temporary IDs, or the information needed to generate them, to a common server. Other users download the shared infection data and their applications search for a contact. This paper [PDF] provides details of a few different decentralized protocols.
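The matching step on the phone can be sketched as follows, assuming that infected users publish a seed together with the range of time slots it covers. The data layout, the function names, and the HMAC-based derivation are assumptions for illustration; the protocols described in the paper differ in their details.

```python
import hashlib
import hmac
from typing import Iterable, List, Set, Tuple


def temporary_id(seed: bytes, slot: int) -> bytes:
    """Assumed derivation: truncated HMAC-SHA256 over the slot number."""
    mac = hmac.new(seed, slot.to_bytes(8, "big"), hashlib.sha256)
    return mac.digest()[:16]


def matching_slots(
    heard_ids: Set[bytes],
    published: Iterable[Tuple[bytes, int, int]],
) -> List[int]:
    """Return the slots in which this phone heard an infected user.

    heard_ids holds the temporary IDs recorded locally; published holds
    (seed, first_slot, last_slot) entries downloaded from the server.
    Everything is computed on the phone, so the server never sees the
    local contact log.
    """
    matches = set()
    for seed, first_slot, last_slot in published:
        for slot in range(first_slot, last_slot + 1):
            if temporary_id(seed, slot) in heard_ids:
                matches.add(slot)
    return sorted(matches)


if __name__ == "__main__":
    infected_seed = b"a" * 32
    other_seed = b"b" * 32
    heard = {temporary_id(infected_seed, 5), temporary_id(other_seed, 7)}
    print(matching_slots(heard, [(infected_seed, 0, 10)]))
```

Here the phone heard two devices, but only one of them belongs to a user who reported an infection, so only slot 5 is returned.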
The main difference between the two approaches is in who generates the IDs, whether the central server knows them and the identities associated with them, and who calculates the infection risk. Both solutions need a central server to exchange lists of IDs for infected people.
Frameworks
Development of a tracing system first requires a contact-tracing protocol and then an application. Applications are typically developed by the government agencies, and they use one of the existing frameworks (protocols) or create a new one. A number of such frameworks have been developed; most of them have at least part of their code released as open source. Here we look at some of those that are, or have been, used in deployed applications.
Temporary Contact Numbers Protocol
The first framework released was the Temporary Contact Numbers Protocol (TCN), which was initially developed by Covid Watch, then maintained by the TCN Coalition. The source code for this decentralized framework, including the protocol and reference implementation, is available under the Apache software license.
Devices using TCN broadcast randomly generated, temporary contact IDs; at the same time, the devices record all of the received ones. The Covid Watch white paper [PDF] gives an overview of the protocol. The actual implementation uses numbers derived from periodically changed keys (the TCN project README provides the cryptographic details), to minimize the data set that needs to be sent when a person is infected. All of the keys that are generated by the user's device are sent to the central server only if the user gets infected.
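The idea of deriving contact numbers from changing keys can be illustrated with a simple hash chain. This is a deliberately simplified sketch and not the TCN derivation itself (see the project README for that); the domain-separation labels, the 16-byte output size, and the function names are assumptions.

```python
import hashlib


def next_key(key):
    """Advance the chain one step; the hash cannot be run backward."""
    return hashlib.sha256(b"ratchet" + key).digest()


def contact_number(key):
    """Derive the broadcast contact number from the current key."""
    return hashlib.sha256(b"contact-number" + key).digest()[:16]


def contact_numbers_from(start_key, count):
    """Regenerate count consecutive contact numbers from one key."""
    numbers = []
    key = start_key
    for _ in range(count):
        numbers.append(contact_number(key))
        key = next_key(key)
    return numbers


if __name__ == "__main__":
    first_key = bytes(32)  # a fixed key for the demonstration only
    everything = contact_numbers_from(first_key, 5)
    # Publishing a later key reveals only the numbers from that point on.
    later_key = next_key(next_key(first_key))
    print(contact_numbers_from(later_key, 3) == everything[2:])
```

With such a chain, a report only needs to contain one key and a count rather than every broadcast number, and the numbers generated before the published key stay private because the hash is one-way.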
The TCN framework allows for variations in the implementation; for example, whether or not a central health authority needs to verify an infection report. TCN is (or was) used in a number of tracing applications, including the US-based CoEpi and German ito.
Decentralized Privacy-Preserving Proximity Tracing
Decentralized Privacy-Preserving Proximity Tracing (DP-3T or DP3T) is a decentralized protocol, similar to TCN. It was developed by a number of European research institutions. Its white paper [PDF] describes the algorithm in detail and gives an overview of the security challenges.
The generation of seed values and temporary identifications is performed by the phone, which also computes the risk of infection. The phone only downloads the parameters needed to determine the infection risk (e.g. duration of contact, signal strength) from the health authorities. DP-3T includes a set of additional features to increase privacy. All phones running the application upload dummy contact data to minimize the risk of revealing infected users. It also has an option to allow the infected users to split and edit the report, for example to exclude certain days or time periods.
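A local risk computation driven by downloaded parameters might look like the sketch below. It is hypothetical and much simpler than what a real application would do: the parameter names, the fixed scan interval, and the rule of simply counting strong readings are all assumptions, not the DP-3T algorithm.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RiskParameters:
    """Values the phone would download from the health authorities."""

    min_rssi_dbm: float  # readings at least this strong count as "close"
    min_duration_s: int  # close-contact time needed to flag a risk


def close_contact_seconds(rssi_readings_dbm, params, scan_interval_s=60):
    """Sum the time covered by readings that are strong enough.

    Each reading is assumed to represent one scan interval during which
    a temporary ID of an infected user was heard.
    """
    strong = sum(1 for rssi in rssi_readings_dbm if rssi >= params.min_rssi_dbm)
    return strong * scan_interval_s


def is_at_risk(rssi_readings_dbm, params, scan_interval_s=60):
    """Decide locally whether the user should be notified."""
    seconds = close_contact_seconds(rssi_readings_dbm, params, scan_interval_s)
    return seconds >= params.min_duration_s


if __name__ == "__main__":
    params = RiskParameters(min_rssi_dbm=-65.0, min_duration_s=15 * 60)
    print(is_at_risk([-60.0] * 15, params))
```

Because only the thresholds come from the server, the health authorities can tune them as knowledge about the disease improves, without the phone ever uploading its contact log.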
DP-3T source code is available under the Mozilla Public License 2.0 and the project includes a work-in-progress implementation using the Exposure Notification API.
Exposure Notification API
The Exposure Notification framework was released in April 2020 by Google and Apple. It seems to be inspired by TCN and DP-3T, and has many similarities with them. It uses the same type of periodically changing keys (the cryptographic specification [PDF] gives the details).
The protocol that was part of the application in TCN, DP-3T, or other frameworks, is implemented in Android and iOS, then provided as the Exposure Notification API [PDF]. It includes proximity detection and logging of the encountered keys, but not the notification of an infection; that part needs to be implemented in the application itself. The Exposure Notification API can be used only from applications provided by public health authorities. The specifications are available, but the source code of the implementation is not. The Google terms [PDF] include some specific requirements for the applications, including: only one application per country, that a public health authority must be responsible for all of the data stored, and that no advertising is allowed in the application.
A reference application for Android and an example server implementation are both available as source code under the Apache license. Since the release of the framework, applications that were ported to it include the Italian Immuni (source code under AGPL 3.0) and the Polish ProteGo Safe (source code under GPL 3.0). Another example is Covid Watch, which was one of the original supporters of TCN, but its application replaced TCN with the Exposure Notification framework in May 2020.
The Exposure Notification API solves one problem that many independent applications have encountered (the BlueTrace paper [PDF] describes the problem on page 7): on iOS, Bluetooth location measurement only works if the application is in the foreground. Since the French application does not use the API, the French government has asked Apple to allow background Bluetooth for other applications.
Applications and beyond
In this article, we explained the purpose of contact-tracing applications, described the technology they use, and gave the reasons they work this way. In the second article, which is coming soon, we will look deeper into some specific applications (that use existing frameworks or develop their own protocols). We will look at how they work, but also cover their open-source availability. Finally, we will consider the controversies and open questions about the deployment of these applications.
| Index entries for this article | |
|---|---|
| GuestArticles | Rybczynska, Marta |
