The governments, health-care professionals, and industries scrambling to address the Covid-19 pandemic have some powerful allies in the battle to minimize the toll on public health and the global economy: big data and predictive analysis in combination with artificial intelligence and an arsenal of thermal sensors.
Covid-19 is caused by a coronavirus from the same family of viruses associated with severe acute respiratory syndrome (SARS) and the common cold. Because it is a novel virus to which humans had no prior immunity, its early impact has been devastating. Months after the first reports came in from China’s Hubei province, testing in most countries remained sporadic at best, leaving populations around the globe uncertain of the actual number of cases in their midst and unsure how to respond to the danger or even gauge its scope. It wasn’t long before experts in AI and data analysis recognized the potential for these technologies to support the work of epidemiologists and government crisis response teams.
Data analysis and mathematics, together with physics, enable an in-depth understanding of natural processes. Data science pioneers have already had an impact on public health, deploying data collection and analysis to help slow the spread of earlier outbreaks. One of the earliest applications of data analysis to public health came in 1854, during a cholera outbreak in London. John Snow, one of the first data-driven epidemiologists, geospatially analyzed the deaths that occurred in London and thus was able to isolate the source of the disease. Relying on his analysis, authorities were able to target their interventions and rapidly check the outbreak’s spread.
By running models through data analysis systems, researchers are able to approximate how trends might progress. An example is the SIR model, an epidemiological model that computes the theoretical number of people infected with a contagious illness in a closed population over time. The model uses coupled equations analyzing the number of susceptible people, S(t); number of people infected, I(t); and number of people who have recovered, R(t).
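For reference, the coupled equations are usually written in the form below. The symbols β (transmission rate), γ (recovery rate), and N (total population) do not appear in the text above; they are the conventional notation and are assumed here for illustration.

```latex
\begin{aligned}
\frac{dS}{dt} &= -\frac{\beta\, S(t)\, I(t)}{N}, \\
\frac{dI}{dt} &= \frac{\beta\, S(t)\, I(t)}{N} - \gamma\, I(t), \\
\frac{dR}{dt} &= \gamma\, I(t), \qquad S(t) + I(t) + R(t) = N .
\end{aligned}
```

The three derivatives sum to zero, which is the mathematical statement that the population is closed: people only move between compartments.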
One of the simplest SIR models is the Kermack-McKendrick model, the foundation upon which many other compartmental models were based. In this context, I found an analysis [1] published in early March by Ettore Mariotti, a graduate research fellow at Università degli Studi di Padova, to be quite interesting.
Consider an island — our system — that people can neither leave nor enter. Every individual on the island can be in one of the following states at a given time: “Susceptible,” “Infected,” and “Recovered” (hence, the acronym SIR). With a certain probability, people who have never had the disease (S) can become ill and infected (I) for a certain period before they recover (R). In the case of Covid-19, it is appropriate to extend the model with an additional state, “Exposed,” to include people who have the virus but are not yet infectious (SEIR model; Figure 1).
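The island thought experiment can be sketched in a few lines of Python. This is a minimal illustration, not the analysis cited above: the function name simulate_seir, the forward Euler integration scheme, and every parameter value (beta, sigma, gamma, the population of 10,000) are assumptions chosen only to make the example run.

```python
def simulate_seir(population, initial_infected, beta, sigma, gamma, days, dt=0.1):
    """Integrate a basic SEIR model with a fixed-step forward Euler scheme.

    beta  : transmission rate per day (assumed, illustrative)
    sigma : rate of moving from Exposed to Infected (1 / incubation period)
    gamma : recovery rate (1 / infectious period)
    Returns one (S, E, I, R) tuple per day, starting with day 0.
    """
    s = float(population - initial_infected)
    e, i, r = 0.0, float(initial_infected), 0.0
    steps_per_day = round(1 / dt)
    history = [(s, e, i, r)]
    for _ in range(days):
        for _ in range(steps_per_day):
            new_exposed = beta * s * i / population * dt  # S -> E
            new_infectious = sigma * e * dt               # E -> I
            new_recovered = gamma * i * dt                # I -> R
            s -= new_exposed
            e += new_exposed - new_infectious
            i += new_infectious - new_recovered
            r += new_recovered
        history.append((s, e, i, r))
    return history


if __name__ == "__main__":
    # Hypothetical parameters for a closed "island" of 10,000 people.
    run = simulate_seir(population=10_000, initial_infected=1,
                        beta=0.5, sigma=1 / 5, gamma=1 / 7, days=180)
    peak_day = max(range(len(run)), key=lambda d: run[d][2])
    print(f"Infections peak around day {peak_day}")
```

Because nobody enters or leaves the island, the four compartments always sum to the starting population, which is a convenient sanity check on any implementation of the model.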
This model considers two factors: the dynamics of the virus and the interaction of individuals. (The latter is very complex and benefits from the tools described here.) With this information in hand, it is possible to define the R0 parameter, which represents the average number of people whom one infected person can be expected to infect in a population with no prior immunity.
Let’s suppose, for example, that Person A is sick and that our system has an R0 = 2, meaning that A will infect two people. Those two people will, in turn, infect four people, who will infect another two people each (so 4 × 2 = 8) and so on. This highlights the fact that the spread of the disease is multiplicative rather than additive. R0 can capture three basic scenarios (Figure 2).
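The doubling in this example is easy to reproduce. The sketch below assumes an idealized outbreak in which every infected person infects exactly R0 others and the supply of susceptible people never runs out; the function names are hypothetical.

```python
def cases_per_generation(r0, generations):
    """New infections in each generation, starting from one index case."""
    return [r0 ** n for n in range(generations + 1)]


def cumulative_cases(r0, generations):
    """Total infections up to and including the given generation."""
    return sum(cases_per_generation(r0, generations))


print(cases_per_generation(2, 3))  # [1, 2, 4, 8]
print(cumulative_cases(2, 10))     # 2047
```

After only 10 generations, a single case has grown to more than 2,000 under these assumptions, which is what multiplicative spread means in practice.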
The closure of schools, gyms, theaters, restaurants, and other public venues decreases the number of social interactions, thus lowering R0. If R0 > 1, the disease spreads; if R0 = 1, the number of cases holds steady; if R0 < 1, the disease eventually disappears. Because the virus has strained public health resources to the breaking point, reducing R0 below 1 has been critical, and governments have imposed draconian constraints on people’s mobility in an attempt to do so during the coronavirus outbreak.
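To see why fewer social interactions lower R0, it helps to use a common textbook decomposition that this article does not spell out and that is assumed here: R0 is roughly the number of contacts per day, times the probability of transmission per contact, times the number of days a person stays infectious. All of the numbers below are hypothetical and serve only to show the mechanism.

```python
def basic_reproduction_number(contacts_per_day, transmission_probability,
                              infectious_days):
    """Textbook approximation: R0 = contacts x probability x duration."""
    return contacts_per_day * transmission_probability * infectious_days


def outbreak_trend(r0):
    """Classify the outbreak according to the threshold at R0 = 1."""
    if r0 > 1:
        return "spreads"
    if r0 < 1:
        return "dies out"
    return "holds steady"


# Hypothetical values: closures cut daily contacts from 10 to 4.
baseline = basic_reproduction_number(10, 0.03, 7)
restricted = basic_reproduction_number(4, 0.03, 7)

print(f"baseline   R0 = {baseline:.2f}: {outbreak_trend(baseline)}")
print(f"restricted R0 = {restricted:.2f}: {outbreak_trend(restricted)}")
```

In this sketch, the virus itself is unchanged between the two scenarios; only the contact rate differs, and that alone is enough to move R0 from above 1 to below it.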
It is important to note that R0 measures the potential transmission of a disease, not the rate at which the disease spreads. Consider the ubiquitous nature of influenza viruses, which have an R0 of only about 1.3. A high R0 is a cause for concern but not a cause for panic.
R0 is an average, so it can be influenced by factors such as the number of “super-spreaders” in a given population. A super-spreader is an infected individual who infects an unexpectedly large number of people. Super-spreader events occurred during the SARS and MERS epidemics as well as the current pandemic. Such events are not necessarily a bad sign, however, because they may indicate that fewer people are perpetuating an epidemic. Super-spreaders may also be easier to identify and contain because their symptoms are likely to be more severe.
In short, R0 is a moving target. Tracking each case and the transmission of the disease is extremely difficult, so estimating R0 is complex and challenging. Estimates often change with the availability of new data.
To help authorities bring R0 under control, AI can be combined with GPS location data from mobile phones to build analytical models that predict which neighborhoods are more likely to see cases and where intervention is most urgently needed.
During an epidemic, clinical data can be highly variable in quality and consistency; false-positive test results are one such complication. Even so, big data and AI can be employed to check compliance with quarantines, and machine learning can be used for drug research.
The coronavirus response in Asia provides many examples of interventions implemented through the use of digital technologies. Drones equipped with smart scanners and cameras provide the ability to detect those who do not comply with quarantine measures and to check people’s body temperature. China and Taiwan have employed intelligent cameras for this purpose.
Hong Kong-based AI technology company SenseTime has developed a platform that can detect fever by scanning people’s faces even if they are wearing a medical mask. SenseTime’s contactless temperature-detection software has been implemented in subway stations, schools, and public centers in Beijing, Shanghai, and Shenzhen.
Alibaba, meanwhile, has developed an AI-based system for Covid-19 diagnosis that allows the detection of new coronavirus cases with an accuracy rate of up to 96% by means of computed tomography (CT) scans.
New York-based Graphen is collaborating with Columbia University researchers to define the canonical form of each gene localization of the virus and identify the exact variant(s). The researchers are using Graphen’s Ardi AI platform, which mimics the functions of the human brain, to store the mutation data and visualize them. A typical visualization maps a virus against a set of viruses possessing the same genome sequence. Virus-related information, including location, gender, and age of those affected, can be seen by clicking the corresponding nodes.
Big data, meanwhile, has been widely used to improve surveillance systems in order to map the spread of the virus.
The acquisition and processing of big data have required new methodologies and technologies for collection and analysis. In particular, we can distinguish four methodologies for big-data analysis:
Alibaba also developed an app, Alipay Health Code, that uses the big data made available by the Chinese health-care system to indicate who can or cannot access public spaces.
BlueDot, a Toronto-based startup with a platform built around artificial intelligence, has developed intelligent systems to enable automatic monitoring and prediction of the spread of infectious diseases. The BlueDot platform was used and its efficacy borne out during the SARS epidemic.
Notably, in December 2019, BlueDot also raised the alarm about the potential severity of the coronavirus, and again, its models proved correct. Among the tools used by BlueDot are natural-language–processing techniques.
Insilico Medicine (Rockville, Maryland) is another company focused on disease prevention through artificial intelligence. The company is developing and applying next-generation AI and deep-learning approaches to every step of the drug discovery and drug development process. Insilico recently used its system to analyze molecules that might be suitable for fighting the novel coronavirus and is able to share the results. As this issue went to press, the company was curating a database of information for use in vaccine development.
Aside from the effects on health, Covid-19 has dealt a devastating body blow to the global economy. Here, too, big data and AI can help analyze the impact and formulate appropriate responses. Satellite analysis technologies, for example, have helped WeBank researchers identify the industries most affected in China, such as steel. The analysis showed that production at China’s steel mills had dropped to a minimum of 29% of capacity early in the epidemic but had recovered to 76% of capacity by Feb. 9 (Figure 3).
The researchers then looked at other types of production and commercial activities using AI. One approach was simply to count the cars in large parking lots. This analysis showed that, as of Feb. 10, Tesla’s car production in Shanghai had fully recovered, whereas tourism venues, such as Shanghai Disneyland, remained closed.
By analyzing GPS data, it was possible to identify which people were commuting. The software then counted the number of commuters in each city and compared the figures at the start of the Chinese New Year holiday in 2019 with those on the corresponding date in 2020. In both years, commuter volume dropped off at the start of the holiday, but in 2020, normal volume did not resume after the holiday as it had in 2019.
As activity slowly recovered, WeBank researchers calculated that by March 10, 2020, about 75% of the workforce had returned to work. Projecting from these curves, the researchers concluded that most Chinese workers, with the exception of those in Wuhan, would return to work by the end of March.
Those attempting to respond to the coronavirus challenge have powerful tools at their disposal, and solutions that prove their value during the crisis could well become standard practice after it is resolved.