Ganesh Mani: Data is the best vaccine


Data has been compared to crude oil, as it has been fueling our information economy over the last several decades. Today, it appears that data can serve as an essential oil or even a vaccine, as we nurse our society and economy back to health.

We seem to be fighting time and tribalism, along with an invisible, tricky virus. The arsenal of tanks, fast-attack submarines and F-35 fighter jets the U.S. amassed in anticipation of a kinetic war is powerless in this bio battle.

What is showing promise, however, is data from the front lines of the battles in many global cities, along with its creative, protective uses. Epidemiologists are studying transmission rates, incubation periods and infection density. Clinicians are eager to be informed by data on treatment protocols, such as ventilator settings and patient positioning, alongside their outcomes, and by summary statistics from observational studies (in lieu of double-blind, randomized trials) of older, previously approved drugs. Policymakers want data relating to nonpharmaceutical interventions (e.g., the duration of school closings) in other cities to help guide them.

How can we use the data vaccine and why is it important?

As we enter the next phase of the battle, testing, both for the virus and for antibodies, is finally becoming more widely available. Until individuals can access tests easily, aggregate community testing may be possible by sampling wastewater for the presence of the virus and for changes in its concentration over time.

Combining at-home saliva tests, quick-screen antigen tests (which may need to be followed up by a more reliable test), antibody tests and self-reported survey data from Facebook and Google now makes a reasonable picture of disease prevalence in the community possible. Cellphone data can estimate foot traffic at commercial establishments, while aggregate credit card data can take the pulse of economic activity. With all of this data, it will be possible to control subsequent infection waves in a community.

How do we nudge asymptomatic folk to get tested? Post-testing, how do we get virus-positive individuals to accurately remember and report their contacts at a very stressful time in their lives? Financial and care-related incentives can perhaps help elicit reliable data from individuals. All jurisdictions must be urged and incentivized to report standardized data, including demographics, so that it can be aggregated to get a regional or national picture. Models need to ingest accurate, normalized data to make trustworthy predictions.

Good data can also help protect us from the related infodemic: an overabundance of information, much of it questionable or patently false, but circulated for fear-mongering or other nefarious purposes.

The holy grail is a safe and effective biological vaccine. But, as we wait, the data vaccine can also help in planning the testing, manufacture and distribution of multiple vaccine candidates that are expected to succeed. While the biological vaccine acts on cells, the widely applicable data vaccine can be used to inform the “atoms,” “bits” and “cells” world, and is arguably more versatile and less risky to use.

We missed the opportunity globally to deploy a unified, allied army in this war; however, region-specific data can still be artfully utilized.

Are there risks in using the data vaccine?

If data is interpreted and used correctly, the risks are minimal. However, care must be taken to understand the “data supply chain” — the selection, collection, organization, transport, storage and flow of data, keeping in mind its provenance and vintage.

Monitoring social distancing and employee behavior, and tracing the contacts of virus-positive individuals, raises the specter of an Orwellian society. If a smartphone app is used to monitor and collect data, it is helpful if everybody is on the same app or if the multiple apps in use are interoperable and permit data sharing. Incentives, as well as guarantees that data will be sunsetted after its intended use, may be key to enhancing data collection.

Technology exists to make data anonymous to humans but identifiable to downstream bots for a specific purpose (e.g., to notify the smartphones of people who may have come in contact with a person who just tested positive).

What is the role of Western Pennsylvania?

We are the land of Jonas Salk, who pioneered the polio vaccine. People outside the region marvel at our transformation from Steel City to an “Eds and Meds” nucleus. We have some of the world’s leading clinicians plus data and artificial intelligence experts in the region. This is an opportunity to set an example for the rest of the world on how the data vaccine, which is here now, can be utilized. It can also set the stage for how our region’s version of the biological vaccine, PittCoVacc, is tested and hopefully widely deployed.

Let us inoculate ourselves against the belief that we have to choose between economic prosperity and the health of individuals. Yes, there are trade-offs; policies informed by data can naturally guide these trade-offs, with the realization that these decisions are not one-time and static. As new data ebbs and flows, so must best practices, with vigilance throttled up or down as required.

Ganesh Mani is a data science expert, adjunct faculty member at Carnegie Mellon University and former president of the local chapter of The IndUS Entrepreneurs.