Balancing Privacy and Big Data Analytics

By: Claire Lu

Edited By: Dheven Unni and Joni Rosenberg

The production, consumption, and distribution of data generate tremendous social and economic value. As data analytics and digital networks advance and expand, the information accessible to individuals, businesses, and other entities has multiplied. Concurrently, the amount of data generated each day has exceeded the capabilities of traditional data collection methods and thereby engendered the growth of big data analytics. [1]

Big data refers to extremely large datasets rapidly collected from a variety of sources. Following this definition, big data is characterized by the three Vs: volume, velocity, and variety. [2] With the growth of big data, analysts have gained the ability to glean valuable information from large and unwieldy datasets. This has meaningful applications in healthcare and education, and has also led to several major advances in scientific research. [3] For example, Kaiser Permanente was able to use big data to trace 27,000 cardiac arrests to the drug Vioxx, leading to the drug’s removal from the market. Without big data, researchers may not have been able to connect the drug with its side effects. [4]

While big data undoubtedly brings significant benefits to society, the rise of big data analytics creates new, unprecedented issues. For instance, big data raises questions about the ownership of data, and it can be used to exclude marginalized groups from opportunities by advertising benefits such as credit offers only to certain groups. [5] The most prominent issue, however, is preserving privacy despite the pervasiveness of big data. Personal information is scrutinized, and the amount of detail that big data can extrapolate often feels invasive. To compound this issue, the current process of data collection is remarkably opaque, and the lack of a digitally educated public makes it difficult for consumers to weigh in on questions of individual privacy. [6] While the anonymization of data has been the paradigm in research, it remains possible for individuals to be re-identified from anonymized data. [7] As a result, it is important that the law devise comprehensive measures that protect individuals and govern our data-driven society without stifling it.
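The re-identification risk described above can be illustrated with a minimal sketch. It assumes a small, made-up set of "anonymized" records in which names have been removed but a few ordinary attributes remain; the field names, the records, and the `unique_fraction` helper are hypothetical and are not drawn from the cited study. The sketch measures how many records become unique once several attributes are combined, which is the property that makes linking a record back to a person feasible.

```python
from collections import Counter

# Hypothetical "anonymized" records: names removed, ordinary attributes kept.
records = [
    {"zip": "02138", "birth_year": 1985, "sex": "F"},
    {"zip": "02138", "birth_year": 1985, "sex": "F"},
    {"zip": "02139", "birth_year": 1990, "sex": "M"},
    {"zip": "02140", "birth_year": 1972, "sex": "F"},
]


def unique_fraction(rows, keys):
    """Return the share of rows whose combination of `keys` appears exactly once."""
    combos = Counter(tuple(row[k] for k in keys) for row in rows)
    unique = sum(1 for row in rows if combos[tuple(row[k] for k in keys)] == 1)
    return unique / len(rows)


# One attribute alone singles out few records ...
print(unique_fraction(records, ["sex"]))
# ... but combining attributes singles out more of them.
print(unique_fraction(records, ["zip", "birth_year", "sex"]))
```

In this toy data, one record in four is unique by sex alone, while two in four are unique once ZIP code, birth year, and sex are combined. Real datasets carry many more attributes, so the share of unique records can be expected to grow accordingly.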

Current national privacy law is too limited in scope. Federal privacy statutes are sector-specific: the most prominent, the Family Educational Rights and Privacy Act (FERPA) and the Health Insurance Portability and Accountability Act (HIPAA), protect the privacy of students and medical patients respectively. Both standards were created prior to the big data boom and only regulate certain types of data. [8] In addition, the White House’s attempt to address privacy in the proposed Privacy Bill of Rights was criticized for failing to introduce new legislative changes. [9] Thus, it is clear that existing privacy laws must be reexamined. While many call for a radical European Union-style overhaul of privacy laws (an approach that focuses on reducing data collection and regulating its applications), the priorities of the individual must be balanced against those of the collective. We should remain aware of big data’s potential while recognizing its ethical and practical shortcomings.

It is unproductive and unrealistic for individuals to opt out of data collection in the twenty-first century. The public shares a stake in analyzing pandemics, identifying effective medicine, improving the policing system, and many other issues that big data can help address. [10, 11] Thus, society benefits when individuals consent to data collection. At the same time, individual privacy concerns are legitimate and cannot be dismissed. Privacy and progress, however, need not be mutually exclusive. Policymakers should create a model that weighs and evaluates the legitimacy of data processing while better regulating big data collection. Given that the uses of big data are diverse and competing interests are rampant, simple solutions will often fall short. [12] There are still several viable approaches, however, that could allow big data to operate more ethically.

Data collection methods need to provide individuals with more agency and choice. Most third-party data collection is currently all but invisible, and consumers are seldom aware that they are sharing their data. When present, privacy policies posted on websites give individuals only the illusion of security, as they rarely disclose the extent of data collection. [13] More emphasis should be placed on informed consent. The current legal notion of informed consent involves four components: disclosure, competency, decision capacity, and documentation of consent. These ideas are most commonly applied in healthcare, where professionals take careful measures to ensure that their patients are making informed decisions about medical procedures. [14] A similar approach can be adapted and applied to data collection, as consumers should have some degree of control over the collection and distribution of their data. Algorithmic transparency and clear policies communicated to a digitally educated public can help consumers and collectors reach a mutual understanding. [15]

In addition, virtually unlimited storage has led to a troubling increase in how long data is retained. While rights of erasure are codified in the EU, the US lacks comparable methods to enforce privacy rights. This can be attributed to their different approaches to privacy law. In the US, the right to privacy is largely interpreted as “the right to be let alone,” as articulated in Thomas Cooley’s A Treatise on the Law of Torts. In other words, the right to privacy dictates that the government should not invade the privacy of its citizens, but the government is not obligated to intervene and hold companies accountable for invasions of privacy. In contrast, the EU has taken more proactive steps to secure the privacy of all its citizens. [16] Both are viable perspectives, but as the threat to privacy has become more immediate, the US should consider implementing provisions that limit unnecessary accumulation of personal data.
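As a rough illustration of what a provision limiting data accumulation might require of a data holder in practice, the following sketch applies a fixed retention window to stored records. The one-year window, the record layout, and the `purge_expired` function are all assumptions made for the example; they do not reflect any existing statute or any particular company's system.

```python
from datetime import datetime, timedelta, timezone

# Hypothetical retention window; an actual rule would set its own period.
RETENTION = timedelta(days=365)


def purge_expired(records, now, retention=RETENTION):
    """Return only the records collected within the retention window."""
    cutoff = now - retention
    return [r for r in records if r["collected_at"] >= cutoff]


now = datetime(2021, 3, 21, tzinfo=timezone.utc)
stored = [
    {"user": "a", "collected_at": datetime(2019, 1, 5, tzinfo=timezone.utc)},
    {"user": "b", "collected_at": datetime(2020, 11, 2, tzinfo=timezone.utc)},
]

# Only the record collected within the past year survives the purge.
kept = purge_expired(stored, now)
print([r["user"] for r in kept])
```

A rule of this kind places the burden on the collector to delete data by default, rather than on the individual to request erasure.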

The ubiquity of big data presents both opportunities and challenges. While it may be tempting to glorify the advances made possible by big data, equal attention should be paid to its drawbacks. The law must respond more proactively to the changing landscape of technology by holding companies accountable for invasions of privacy. A more aggressive approach to privacy law is necessary to adequately protect consumers and address the complex threat to privacy posed by big data.

NOTES:

  1. “A Deluge of Data Is Giving Rise to a New Economy,” The Economist (The Economist Newspaper), accessed January 4, 2021, https://www.economist.com/special-report/2020/02/20/a-deluge-of-data-is-giving-rise-to-a-new-economy.

  2. “Big Data Analytics,” IBM, accessed January 3, 2021, https://www.ibm.com/analytics/hadoop/big-data-analytics.

  3. Ibid.

  4. Gardiner Harris, Barry Meier, and Andrew Pollack, “Despite Warnings, Drug Giant Took Long Path to Vioxx Recall,” The New York Times (The New York Times, November 14, 2004), https://www.nytimes.com/2004/11/14/business/despite-warnings-drug-giant-took-long-path-to-vioxx-recall.html.

  5. Jonas Lerman, “Big Data and Its Exclusions,” SSRN Electronic Journal, September 3, 2013, https://doi.org/10.2139/ssrn.2293765.

  6. Charith Perera et al., “Big Data Privacy in the Internet of Things Era,” IT Professional 17, no. 3 (2015): 32–39, https://doi.org/10.1109/mitp.2015.34.

  7. Luc Rocher, Julien M. Hendrickx, and Yves-Alexandre De Montjoye, “Estimating the Success of Re-Identifications in Incomplete Datasets Using Generative Models,” Nature Communications 10, no. 1 (2019), https://doi.org/10.1038/s41467-019-10933-3.

  8. Cayce Myers, “Big Data, Privacy, and the Law: How Legal Regulations May Affect PR Research,” Institute for Public Relations, December 3, 2020, https://instituteforpr.org/big-data-privacy-and-the-law-how-legal-regulations-may-affect-pr-research/.

  9. “What Is the Consumer Privacy Bill of Rights and How Has It Evolved?,” Comparitech, November 27, 2018, https://www.comparitech.com/blog/vpn-privacy/consumer-privacy-bill-of-rights/.

  10. Qiong Jia et al., “Big Data Analytics in the Fight against Major Public Health Incidents (Including COVID-19): A Conceptual Framework,” International Journal of Environmental Research and Public Health 17, no. 17 (2020): 6161, https://doi.org/10.3390/ijerph17176161.

  11. IBM, “Big Data Analytics.”

  12. Pompeu Casanovas et al., “Regulation of Big Data: Perspectives on Strategy, Policy, Law and Privacy,” SSRN Electronic Journal, 2017, https://doi.org/10.2139/ssrn.2989689.

  13. Ibid.

  14. “Informed Consent,” Legal Information Institute (Legal Information Institute), accessed March 21, 2021, https://www.law.cornell.edu/wex/informed_consent#.

  15. Claudia E. Haupt, Jack M. Balkin, and Anita L. Allen, “Protecting One's Own Privacy in a Big Data Economy,” Harvard Law Review, December 9, 2016, https://harvardlawreview.org/2016/12/protecting-ones-own-privacy-in-a-big-data-economy/.

  16. Pompeu Casanovas et al., “Regulation of Big Data.”

BIBLIOGRAPHY:

“Big Data Analytics.” IBM. Accessed January 3, 2021. https://www.ibm.com/analytics/hadoop/big-data-analytics. 

Casanovas, Pompeu, Louis De Koker, Danuta Mendelson, and David Watts. “Regulation of Big Data: Perspectives on Strategy, Policy, Law and Privacy.” SSRN Electronic Journal, 2017. https://doi.org/10.2139/ssrn.2989689. 

“A Deluge of Data Is Giving Rise to a New Economy.” The Economist. The Economist Newspaper. Accessed January 3, 2021. https://www.economist.com/special-report/2020/02/20/a-deluge-of-data-is-giving-rise-to-a-new-economy. 

Harris, Gardiner, Barry Meier, and Andrew Pollack. “Despite Warnings, Drug Giant Took Long Path to Vioxx Recall.” The New York Times. The New York Times, November 14, 2004. https://www.nytimes.com/2004/11/14/business/despite-warnings-drug-giant-took-long-path-to-vioxx-recall.html. 

Haupt, Claudia E., Jack M. Balkin, and Anita L. Allen. “Protecting One's Own Privacy in a Big Data Economy.” Harvard Law Review, December 9, 2016. https://harvardlawreview.org/2016/12/protecting-ones-own-privacy-in-a-big-data-economy/. 

“Informed Consent.” Legal Information Institute. Legal Information Institute. Accessed March 21, 2021. https://www.law.cornell.edu/wex/informed_consent#.

Jia, Qiong, Yue Guo, Guanlin Wang, and Stuart J. Barnes. “Big Data Analytics in the Fight against Major Public Health Incidents (Including COVID-19): A Conceptual Framework.” International Journal of Environmental Research and Public Health 17, no. 17 (2020): 6161. https://doi.org/10.3390/ijerph17176161. 

Lerman, Jonas. “Big Data and Its Exclusions.” SSRN Electronic Journal, September 3, 2013. https://doi.org/10.2139/ssrn.2293765. 

Myers, Cayce. “Big Data, Privacy, and the Law: How Legal Regulations May Affect PR Research.” Institute for Public Relations, December 3, 2020. https://instituteforpr.org/big-data-privacy-and-the-law-how-legal-regulations-may-affect-pr-research/. 

Perera, Charith, Rajiv Ranjan, Lizhe Wang, Samee U. Khan, and Albert Y. Zomaya. “Big Data Privacy in the Internet of Things Era.” IT Professional 17, no. 3 (2015): 32–39. https://doi.org/10.1109/mitp.2015.34. 

Rocher, Luc, Julien M. Hendrickx, and Yves-Alexandre De Montjoye. “Estimating the Success of Re-Identifications in Incomplete Datasets Using Generative Models.” Nature Communications 10, no. 1 (2019). https://doi.org/10.1038/s41467-019-10933-3. 

“What Is the Consumer Privacy Bill of Rights and How Has It Evolved?” Comparitech, November 27, 2018. https://www.comparitech.com/blog/vpn-privacy/consumer-privacy-bill-of-rights/.