Algorithm-based Taxonomy of Data Breaches: Defining the Impacts of Digital Theft and Understanding Its Threat to United States Government Facilities

Ada PETER and Bryan EMEJOR

Covenant University, Ota, Nigeria

Abstract

In 2020, four key United States federal agencies, ranging from the Department of Homeland Security to the agency that oversees America’s nuclear weapons arsenal, were breached, along with technology and security companies including Microsoft. Weeks after the United States government announced that multiple federal agencies had been targeted, the full scope and consequences of the suspected Russian hack remained unknown. Investigators struggled to determine what information the hackers may have stolen and what they could do with it. This struggle points to the lack of a scientific framework with which governments can swiftly identify the possible scope and consequences of data breaches in government facilities. Hence, while previous studies may have developed some form of data breach or cyber harm taxonomy, this study seeks to train a machine learning algorithm on an existing taxonomy of the prevalence, incidence, and consequences of data breaches in the United States government facilities sector in order to predict the consequences of similar future attacks. The study used available data to capture the prevalence, incidence, and implications of data breaches in the government facilities sector, then used that data to train an algorithm (LSVM) that can provide insight into the possible consequences, response, and spread of new attacks. The scope and data of the study are limited to data breaches that occurred in United States government facilities between 2000 and 2021. The outcome is a machine learning tool that detects and suggests the probable consequences of each type of data breach. The tool will help researchers and practitioners alike consider the full range of consequences that might result from different types of data breaches when developing response tactics. The tool is available on Streamlit:

https://share.streamlit.io/bryanemejor/data_breach_thesis/main/Stream_Bryan.py

Keywords: data, data breach, government-industry, hacking, phishing, ransomware