
Who Owns My Data? An Analysis of Healthcare Data Breach Trends Since COVID-19

Image from World Health Organization


In February 2023, Managed Care of North America (MCNA) Dental, a leading dental care and oral health insurance provider for government-sponsored programs like Medicaid and the Children’s Health Insurance Program (CHIP) in the United States, experienced a significant data breach. The incident began on February 26, when malicious software penetrated the company’s digital systems. Over the following days, LockBit, the world’s most active ransomware group, announced that they had accessed 700GB of sensitive personal information from the breach and subsequently published this data on their website. The breach affected roughly 8.9 million patients and included a wide array of Protected Health Information (PHI), such as names, addresses, telephone numbers, email addresses, birthdates, and Social Security numbers. 

The incident at MCNA is far from a standalone occurrence. Since the beginning of the COVID-19 pandemic, the healthcare sector’s rapid digitization has left it increasingly susceptible to data breaches. A report from Atlas VPN in early 2023 highlights this vulnerability, revealing that 41 million individuals were affected by healthcare data breaches in the first half of the year alone. This figure is not just alarming in its magnitude; it also signifies a shift in the nature of cyber threats within the healthcare industry, particularly in its digital infrastructure.

The pandemic’s role in this shift is critical. By forcing an accelerated transition to digital healthcare services, it has significantly broadened the attack surface accessible to cybercriminals. The transition has exposed critical gaps in data security and privacy protections and led to a concerning change in breach dynamics. While the overall number of data breaches in healthcare might have declined, the scale and impact of each incident, as evidenced by the number of individuals affected, have notably increased. 

This trend raises the pertinent question: Which healthcare entities are most susceptible to breaches, and what specific types of data are the cybercriminals targeting? By analyzing recent patterns of data breaches within the healthcare sector and understanding common problems across the diverse entities impacted, we can gain insights into the broader implications of patient data management in our modern healthcare system. Furthermore, this analysis can lead to the formulation of technical and policy-based solutions to improve data security and reduce breaches in this critical area.


The dataset employed for this analysis was sourced from the US Department of Health and Human Services Office for Civil Rights Breach Portal. The initial Exploratory Data Analysis (EDA) of the dataset reveals crucial details such as breach types, affected entities, and the number of individuals impacted. Business associates, separate healthcare entities that perform tasks on behalf of a Health Insurance Portability and Accountability Act (HIPAA)-covered entity, were the most affected type of entity, with each breach impacting an average of approximately 312,968 individuals. Breaches of health plans, entities that provide or pay for the cost of medical care, such as Medicare and insurance companies, affected an average of about 123,625 individuals. Among breach types, “Hacking/IT Incident” breaches, defined as deliberate cyberattacks on data management systems, affected the most people per breach (157,696 individuals on average), and “Unauthorized Access/Disclosure” breaches affected the second-most (71,469 individuals on average), as demonstrated by the bar graph below.
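The per-group averages quoted above come from a simple group-and-average step. The snippet below is a minimal sketch of that step, not the notebook's actual code: the rows are invented for illustration, and the column names are an assumption about how the Breach Portal export is labeled.

```python
import pandas as pd

# Toy rows for illustration only; these are NOT values from the Breach Portal.
# Column names are assumed to match the portal's CSV export.
breaches = pd.DataFrame({
    "Covered Entity Type": [
        "Business Associate", "Health Plan",
        "Healthcare Provider", "Business Associate",
    ],
    "Type of Breach": [
        "Hacking/IT Incident", "Hacking/IT Incident",
        "Unauthorized Access/Disclosure", "Unauthorized Access/Disclosure",
    ],
    "Individuals Affected": [1000, 3000, 600, 2000],
})


def mean_affected_by(df, column):
    """Average individuals affected per breach, grouped by `column`, largest first."""
    return (
        df.groupby(column)["Individuals Affected"]
        .mean()
        .sort_values(ascending=False)
    )


by_entity = mean_affected_by(breaches, "Covered Entity Type")
by_type = mean_affected_by(breaches, "Type of Breach")
print(by_entity)
print(by_type)
```

Because a handful of very large breaches can pull a mean upward, it may be worth comparing these averages against medians before drawing conclusions.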

Delving deeper, the analysis of Hacking/IT breach locations uncovered that breaches involving network servers were most common, with 586 incidents impacting an average of 180,671 individuals each. Since network servers often act as the nerve center for data storage, a breach there can lead to the extensive exposure of patient data. A staggering 74.8 percent of Hacking/IT incidents were linked to network servers, highlighting the importance of improving server security in healthcare systems. Email-related breaches represent a smaller but significant concern: email is the second-most common breach location in both the Hacking/IT Incident and Unauthorized Access/Disclosure categories. However, these breaches affect about 18,000 people on average, an order of magnitude fewer than network server breaches.
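The location breakdown described above can be sketched as a filter followed by a grouped count, mean, and share. This is an illustrative sketch on invented rows, with column names assumed to match the portal export. It also treats each breach as having a single location, whereas a real export may list several locations in one field and would need splitting first.

```python
import pandas as pd

LOCATION = "Location of Breached Protected Health Information"  # assumed column name

# Toy rows for illustration only; these are NOT values from the Breach Portal.
breaches = pd.DataFrame({
    "Type of Breach": ["Hacking/IT Incident"] * 4 + ["Unauthorized Access/Disclosure"],
    LOCATION: ["Network Server", "Network Server", "Network Server", "Email", "Email"],
    "Individuals Affected": [4000, 2000, 3000, 500, 700],
})

# Keep only the hacking incidents, then summarize by breach location.
hacking = breaches[breaches["Type of Breach"] == "Hacking/IT Incident"]
summary = hacking.groupby(LOCATION)["Individuals Affected"].agg(
    incidents="count",
    mean_affected="mean",
)
# Share of all hacking incidents that each location accounts for, in percent.
summary["share_pct"] = 100 * summary["incidents"] / summary["incidents"].sum()
summary = summary.sort_values("incidents", ascending=False)
print(summary)
```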

Temporal trend analysis underscores a persistent and escalating threat of “Hacking/IT Incident” breaches within healthcare infrastructure. The analysis focused on the period between December 2021 and July 2023 to ensure an adequate number of cases. During this period, the monthly frequency of “Hacking/IT Incident” breaches rose by 167 percent, from 18 incidents in December 2021 to 48 incidents in July 2023. Interestingly, this trend runs counter to the overall decrease in healthcare data breaches during the same period. This suggests that hackers have become more strategic, focusing their efforts on deliberate and precise targeting of valuable data stored on network servers and in other repositories. The upward trend in targeted cyberattacks calls for an urgent reinforcement of security measures across a range of breach locations in the healthcare infrastructure, from network servers to email systems to other repositories of patient information.
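The 167 percent figure follows directly from the two monthly counts given above. The sketch below shows the arithmetic and one way monthly counts could be tallied from submission dates. The helper names and the sample dates are hypothetical, and the same tally could equally be done in pandas.

```python
from collections import Counter
from datetime import date


def monthly_counts(dates):
    """Count incidents per (year, month) pair."""
    return Counter((d.year, d.month) for d in dates)


def percent_change(start, end):
    """Relative change from `start` to `end`, expressed in percent."""
    return 100 * (end - start) / start


# Invented submission dates, used only to show the tallying step.
sample_dates = [date(2021, 12, 3), date(2021, 12, 20), date(2023, 7, 1)]
counts = monthly_counts(sample_dates)
print(counts[(2021, 12)])  # 2

# The article's endpoints: 18 incidents in December 2021, 48 in July 2023.
print(round(percent_change(18, 48)))  # 167
```

A comparison of two endpoint months is sensitive to month-to-month noise, so a rolling average over the full series would give a steadier picture of the trend.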


In the quest to secure patient data, technological solutions are the first line of defense for healthcare organizations. Network segmentation, which involves dividing healthcare networks into smaller segments to limit exposure in case of a breach, stands out as a critical strategy. Research has shown that segregating vulnerable medical devices on dedicated Virtual Local Area Networks (VLANs) is an effective approach that can be adopted across healthcare organizations to secure patient data transmitted between devices. In this context, VLANs serve as virtualized networks that isolate vulnerable medical devices, so that any breach that occurs is contained within a specific segment. Additionally, integrating Intrusion Detection Systems (IDS) with network segmentation strengthens the defense against data breaches. This combination is essential for early threat detection, particularly in mitigating breaches caused by unauthorized access. An IDS plays a crucial role in monitoring and protecting not only patient data but also communications via email, a common point of vulnerability. By consistently monitoring network segments, an IDS acts as a guardian that identifies and flags potential breaches. Collaboration with industry leaders like Cisco, which offers resources and products for implementing network segmentation and IDS, is a valuable step in establishing secure systems. For better enforcement and accountability, politicians and policymakers should advocate for the incorporation of these technical standards into legal frameworks and require companies and organizations to adhere to them. This proactive approach to data security can help protect patient privacy and reduce the risk of data breaches in the healthcare sector.
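The containment idea behind network segmentation can be modeled in a few lines. The sketch below is purely illustrative: the segment names, address ranges, and allowed flow are all hypothetical, and in practice these rules are enforced by switches and firewalls rather than application code.

```python
import ipaddress

# Hypothetical segments; names and address ranges are invented for illustration.
SEGMENTS = {
    "medical_devices": ipaddress.ip_network("10.10.0.0/24"),
    "clinical_records": ipaddress.ip_network("10.20.0.0/24"),
    "guest_wifi": ipaddress.ip_network("10.30.0.0/24"),
}

# Hypothetical policy: the only cross-segment flow permitted (source, destination).
ALLOWED_FLOWS = {("medical_devices", "clinical_records")}


def segment_of(address):
    """Return the name of the segment containing `address`, or None if unknown."""
    ip = ipaddress.ip_address(address)
    for name, network in SEGMENTS.items():
        if ip in network:
            return name
    return None


def is_allowed(src, dst):
    """Allow traffic within a segment or along an explicitly permitted flow."""
    src_seg, dst_seg = segment_of(src), segment_of(dst)
    if src_seg is None or dst_seg is None:
        return False  # default-deny for hosts outside every known segment
    return src_seg == dst_seg or (src_seg, dst_seg) in ALLOWED_FLOWS


print(is_allowed("10.10.0.5", "10.20.0.9"))  # device to records: permitted flow
print(is_allowed("10.30.0.7", "10.20.0.9"))  # guest to records: blocked
```

The default-deny rule is the design choice that matters here: a compromised host on the guest segment cannot reach patient records unless someone has explicitly permitted that flow.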

From a broader social and political perspective, we are compelled to recognize the substantial power imbalance between patients and healthcare organizations. Private data is subject to control and manipulation by healthcare entities, often leaving patients in the dark. As a result, when considering healthcare policy, a patient-centric approach to data collection is key to addressing such power imbalances. Healthcare organizations must be held accountable for adhering to established standards like HIPAA, and they should gather only the information needed to deliver comprehensive patient care. This not only upholds patient privacy but also serves as a proactive measure to minimize the amount of sensitive information that could be exposed in data breaches. Additionally, it is crucial to obtain informed consent from patients for data collection and sharing in order to foster trust between healthcare providers and patients. Furthermore, transparency in healthcare data management, particularly in the context of patient records, is critical. Healthcare providers should take steps to communicate their data collection and sharing policies to patients, for instance, by including a clear outline of the data collected, its purpose, and the security measures in place.

To facilitate this broader shift towards a patient-centric approach in healthcare data management, it is crucial that healthcare organizations, particularly nonprofits operating under the constraints of complex regulatory frameworks, are provided with the necessary resources to establish comprehensive security measures. This transformation requires active engagement from stakeholders across the healthcare data ecosystem. They should invest in robust cybersecurity measures, forge collaborative partnerships to address financial and regulatory challenges, and commit to continuous education to stay ahead of evolving threats. Furthermore, the provision of educational resources plays a pivotal role in bridging knowledge gaps, enabling healthcare entities to understand and implement best practices for data protection. 

By advocating for improved technological standards, fostering greater collaboration among companies, organizations, and government institutions, and leveraging technology, policy, and legal tools, we can empower patients to exert greater control over the management of their healthcare data. Beyond showing respect for patients’ rights, such advocacy represents significant progress toward establishing a transparent healthcare system that is centered around the needs and preferences of patients. Such a transformation not only has the potential to prevent tens of millions of individuals from experiencing health information breaches each year, but also ensures the overall integrity and effectiveness of healthcare data management, particularly in the context of ongoing digitization. Ultimately, it creates a healthcare system where patients’ privacy is not only respected but rigorously protected, contributing to a more secure and patient-focused healthcare environment.


The dataset employed, sourced from the US Department of Health and Human Services Office for Civil Rights Breach Portal, offers an extensive view into breaches of unsecured protected health information affecting 500 or more individuals that were reported in the preceding 24 months (at the time of writing, July 2021 to July 2023), with a total of 896 data points. The analysis was conducted in Python, using pandas for data processing and matplotlib and seaborn for visualization. It consisted of an Exploratory Data Analysis (EDA), an analysis of the distribution of the number of individuals affected by each breach type (visualized in the bar graph), and a temporal analysis of the change in the number of Hacking/IT incidents between November 2021 and July 2023 (visualized in the line graph). For more details on the code, please see this Colab Notebook.