Claude Shannon's information theory holds that as the information available about an event increases, the uncertainty about that event decreases. By that dictum, a highly uncertain event like COVID-19 offers a great deal of information to be gained. South Korea and Taiwan were severely affected by MERS and SARS in the past. That experience led them to invest heavily in digital infrastructure, and they reaped the benefits during COVID-19 by being among the least affected countries. The Indian Government also needs to focus on data infrastructure to reduce such uncertainties in the future.
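Shannon's idea can be written compactly. In the sketch below, X stands for the uncertain event and Y for the data observed about it; both symbols are introduced here for illustration and do not come from the sources cited in this article.

```latex
H(X) = -\sum_{x} p(x)\,\log_2 p(x), \qquad
I(X;Y) = H(X) - H(X \mid Y) \ge 0
\;\Longrightarrow\; H(X \mid Y) \le H(X)
```

Here H(X) is the uncertainty (entropy) of the event and I(X;Y) is the information the data carries about it. Because that information is never negative, observing data cannot increase the average uncertainty about the event.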
Infrastructure such as roads, water, and electricity supply enables us to perform economic, social, and physical interactions. There is a need to establish sound data infrastructure, comprising both tangible and intangible resources, which will augment the services of physical infrastructure. A good data infrastructure would have provisions for creating data assets, data curation, and data storage. This vision requires the protection of individual privacy and national security.
The starting point of creating data assets is data collection, and the quality of the data collected at this point determines the outcome. The Chief Statistician of India, Pravin Srivastava, also agrees that data needs to be collected reliably and then collated. Traditional methods such as manual door-to-door data collection are nearly impossible during COVID-19. To enable hassle-free data collection during a pandemic, we need to use modern data collection methods, with a particular focus on non-intrusive collection. Poland has augmented its data collection with computer-assisted web interviews, personal interviews, and telephone interviews, an approach India can replicate. Individuals create a digital footprint every moment, and sentiment analysis and GIS analysis are useful tools for tracing it. The spread of COVID-19 could have been tracked through social media, as was done after the Nepal earthquake, when individuals marked themselves safe on Facebook.
During COVID-19, contact tracing would have been more manageable if mobile big data had supplemented the data from the Aarogya Setu app. The app's data gives an incomplete picture because not everyone in India has a smartphone and not everyone uses the app regularly. It currently has 150 million users in India, which is roughly one-tenth of India’s population. The age group that contracted the most COVID-19 infections in India differs from the worldwide trend, but it took us a long time to realize this. The initial strategy of a nationwide lockdown was not backed by proper data, resulting in enormous economic and social losses. Reliable data was required to make timely and targeted decisions.
Next, we need proper data storage systems and adequate channels for data distribution. The storage system should be fault-tolerant, efficient in space and capacity planning, scalable, and flexible, and it must contain multi-level provisions for data security. It should also have large memory and low latency so that it can quickly process very large datasets such as mobile big data. In Balu Gopalakrishnan v. State of Kerala, the Kerala Government was accused of breaching privacy by allowing Sprinklr, a US-based company, to collect and process medical data to contain COVID-19. In its defense, the Kerala Government argued that government-owned entities were technically unequipped to store such voluminous data. Such inadequacies can only be addressed by building data infrastructure. NIC has already set up MeghRaj to help the Government with e-governance and to optimize ICT spending through cloud initiatives. Its capacity can be increased to store and back up the data related to COVID-19. Another bottleneck is that data is stored in silos, usually with different agencies under various government ministries. These silos need to be connected seamlessly, with a focus on the cloud, so that the data can be used whenever required.
Data curation is as vital as data collection and storage, and it involves data cleansing and data analysis. Curation can be done through partnerships between social scientists and data scientists, and between private individuals and public agencies. Such collaboration could have avoided inaccurate predictions like the one made by V.K. Paul, a NITI Aayog member, who claimed in a presentation that with the lockdown in place, India would have zero cases by May 16. Data analysis would also entail anonymizing individual details, observing trends in the geospatial spread, and monitoring vulnerable sections of the population so that the pandemic can be mitigated. China has used high technology and big data analysis to augment government policymaking, giving it an edge in fighting COVID-19.
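The anonymization and geospatial trend analysis described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the record fields (`name`, `phone`, `district`, `week`), the function name, and the suppression threshold of 5 are hypothetical and are not taken from any government system or from the sources cited here.

```python
from collections import Counter

# Assumed threshold: groups smaller than this are hidden because very
# small counts could single out individuals. The value is illustrative.
MIN_CELL_SIZE = 5


def anonymized_district_counts(records, min_cell_size=MIN_CELL_SIZE):
    """Return case counts per (district, week) with no personal identifiers.

    Only the district and week fields are read; names and phone numbers
    in the input never reach the output.
    """
    counts = Counter((r["district"], r["week"]) for r in records)
    # Suppress small cells before the table is shared.
    return {key: n for key, n in counts.items() if n >= min_cell_size}


# Hypothetical case records, for illustration only.
records = (
    [{"name": f"person-{i}", "phone": "x", "district": "Pune", "week": 20}
     for i in range(6)]
    + [{"name": f"other-{i}", "phone": "x", "district": "Kochi", "week": 20}
       for i in range(2)]
)

print(anonymized_district_counts(records))  # {('Pune', 20): 6}
```

Aggregating before sharing means that analysts see only the spread by place and time, not the people behind it. Dropping identifiers and suppressing small groups is a basic safeguard rather than a guarantee of anonymity, so a real system would need stronger, formally reviewed protections.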
An essential aspect of enabling data infrastructure is ensuring the data privacy of individuals. Although data collection and processing are the best weapons in the Government’s arsenal in its fight against COVID-19 and similar pandemics, they should not come at the cost of citizens’ privacy and dignity. The recent cases involving Cambridge Analytica and the leak of Aadhaar details in Gujarat and Jharkhand have raised concerns about data protection and privacy. The Government’s stance on collecting, storing, and transferring COVID-19 data has been obscure and lacking in transparency. Data collection needs to be based on informed and meaningful consent from individuals, and any breach of data privacy should carry penalties for unlawful data processing. The Government should spell out which agencies will hold the data and how it will be used. The correct approach would be to find patterns in the data through proper analysis rather than to store the data for a long time for future use.
With rising global temperatures and growing proximity to wildlife, we are likely to witness more pandemics like COVID-19, which will threaten human existence and cause substantial economic losses. One way to combat these uncertainties is to build a robust data infrastructure founded on proper data collection methods, analysis tools, data dissemination, and provisions for data privacy.
References:
- https://www.stateofopendata.od4d.net/chapters/issues/data-infrastructure.html
- https://timesofindia.indiatimes.com/blogs/voices/data-privacy-aarogya-setu-covid-19-app/
- https://cio.economictimes.indiatimes.com/news/strategy-and-management/why-redefining-data-storage-is-essential-for-digital-india/71832667
- https://globalfreedomofexpression.columbia.edu/cases/balu-gopalakrishnan-v-state-of-kerala-and-ors/
- https://www.teradata.com/Blogs/How-China-Used-Advanced-Analytics-During-the-Pandemic
- https://www.business-standard.com/article/economy-policy/covid-19-to-structurally-change-collection-of-data-for-official-statistics-120051000620_1.html