
Sensitive Information Detection and Reporting (SIDR)

Team Red, CS 410 Spring 2020

Problem Statement

For enterprises of any size, the release of sensitive information can provide bad actors unauthorized access to private digital assets, potentially resulting in tarnished reputations and costing millions of dollars in damage.

Problem Characteristics

  • Large breaches can develop from small mistakes
  • Private or confidential information is sometimes placed in publicly accessible files or web pages
  • Public file/web page directories may be shared with individuals who are no longer active
  • Humans cannot manually search every location that might contain their confidential information

Problem History

  • 60% of the 4,856 organizational data breaches reported in 2019 were the result of human error (Winder, 2019).
  • According to IBM's 2019 Cost of a Data Breach Report, the average cost of a data breach is $3.9 million (IBM, 2019).

Solution Statement

By employing a convenient and actively maintained web crawler or file scanner to scour web content for sensitive data, enterprises will stay informed of potential data leaks and unauthorized access to sensitive information.

Solution Characteristics

  • Prevents data breaches that would have been caused by human error
  • Allows for the correction of small mistakes that could potentially become a large data breach
  • Allows automated searching of every local and public location that could potentially contain confidential information
  • Finds private or confidential information that was mistakenly put on public files or web pages
  • Allows searching of public file/web page directories that are no longer actively maintained
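To make the idea of automated searching concrete, here is a minimal sketch of pattern-based scanning over a block of text. It is an illustration only, not SIDR's implementation: the `PATTERNS` table, the `scan_text` function, and the three example patterns are all assumptions chosen for this sketch, and a real scanner would need far more careful rules to limit false positives.

```python
import re

# Hypothetical patterns for illustration only; SIDR's actual detection
# rules are not described in this document.
PATTERNS = {
    "ssn": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
    "email": re.compile(r"\b[\w.+-]+@[\w-]+(?:\.[\w-]+)+\b"),
    "aws_access_key": re.compile(r"\bAKIA[0-9A-Z]{16}\b"),
}


def scan_text(text):
    """Return (line_number, label, matched_text) for every pattern hit."""
    findings = []
    for line_number, line in enumerate(text.splitlines(), start=1):
        for label, pattern in PATTERNS.items():
            for match in pattern.finditer(line):
                findings.append((line_number, label, match.group()))
    return findings


# Example input standing in for the contents of a crawled page or scanned file.
sample = "Contact: jdoe@example.com\nSSN on file: 123-45-6789\nNothing here"
for finding in scan_text(sample):
    print(finding)
```

The same function could be applied to the body of each page a crawler fetches or each file a scanner opens, with the findings forwarded to whatever reporting channel the enterprise uses.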

Team

The Red Team consists of students at Old Dominion University who oversee the development of Sensitive Information Detection & Reporting (SIDR), software that will help the University's Computer Science Department identify data leaks.

Andrew Paterson

Software Developer & Machine-Learning Specialist

Andrew is a senior in the Computer Science program at Old Dominion University. He will graduate in the Fall of 2020 and plans to become a full-time software developer. Some of Andrew's hobbies include surfing, weight-lifting, and playing soccer. As the Machine-Learning Specialist, he is in charge of designing SIDR's lexical analyzer and custom regular expression generator.

Cameron Allen

Project Lead

Cameron is a senior majoring in Computer Science and minoring in Cybersecurity at Old Dominion University. He has real-world experience developing professional applications in C#, C++, and Python, as well as experience protecting the digital network of one of seventeen U.S. Department of Energy national laboratories. His long-term career dream is to become a web application penetration tester. In his free time he often plays video games, but also enjoys reading, hiking, and coding for personal projects. He is in charge of designing the web crawler component.

Dane Bruce

Web Developer & Front-End Specialist

Dane is a senior in Computer Science through Old Dominion University's Online curriculum. He currently lives in Charleston, SC with his wife and their Italian Greyhound, Nero. In his spare time he enjoys video games, light video game modding, woodworking, the beach, and Charleston's wonderful food. As the Front-End Specialist, he is in charge of designing the UI.

Evan Mulloy

Web Developer & Back-End Specialist

Evan Mulloy is a Computer Science major and Cybersecurity minor who first began learning how to program at the age of 12. A problem-solver at heart, he enjoys making functional programs and websites. C# is his favorite programming language, but he's also familiar with Java, JavaScript, C, C++, PHP, and Python. He has experience as a developer for an MMOG server emulator project. Currently, he is an intern for a management consulting company that specializes in IT Modernization and Cybersecurity. As the Back-End Specialist, he is in charge of using APIs and programming the underlying logic of the file scanner component.

Michael Hewitt

Documentation Specialist

Michael is a senior undergraduate and Computer Science major at Old Dominion University. In addition to being a student, Michael lives in Fairfax, VA, and works as a Cybersecurity Engineer and Advisor in the Washington, D.C. metro area. As a security engineer, Michael is responsible for vulnerability and configuration management, advises his customers on the implementation of new security tools and legal requirements, and writes systems documentation, standard operating procedures, and proposals. When he has free time outside of family, work, and school, he enjoys TV shows, video games, podcasts, books, and audiobooks. As the Documentation Specialist, he is in charge of overseeing which documents are published by the web developers. He also oversees the Machine-Learning aspects of SIDR along with Andrew Paterson.

Kasey Howlett

Database Engineer

Kasey Howlett is a senior Computer Science student at Old Dominion University. She is currently working at Newport News Shipbuilding as a Data Science Co-op. Her biggest interest outside of computer science is baking. Kasey intends to get a job in the field of either Data Science or Data Engineering. As the Database Engineer, she is in charge of overseeing all aspects related to the SIDR databases.

References

Adams, A., & Sasse, M. A. (1999). Users are not the enemy. Communications of the ACM, 42(12), 40-46. https://dl.acm.org/doi/10.1145/322796.322806

Bakker, R. (2018, December 5). Evolving Regular Expression Features for Text Classification with Genetic Programming (Publication No. 10548017) [Master’s thesis, University of Amsterdam]. https://esc.fnwi.uva.nl/thesis/centraal/files/f565297164.pdf

Cisco. (2018). Small and Mighty: How Small and Midmarket Businesses Can Fortify Their Defenses Against Today's Threats. https://www.cisco.com/c/dam/global/hr_hr/solutions/small-business/pdf/small-mighty-threat.pdf

Columbus, L. (2019, February 26). 74% Of Data Breaches Start With Privileged Credential Abuse. Forbes. https://www.forbes.com/sites/louiscolumbus/2019/02/26/74-of-data-breaches-start-with-privileged-credential-abuse

Funke, D. (2019, September 23). Public data breaches have increased over the past decade [Graph]. PolitiFact. https://www.politifact.com/article/2019/sep/23/numbers-how-common-are-data-breaches-and-what-can-/

Gartner. (2018, December 6). Gartner Data Shows 87 Percent of Organizations Have Low BI and Analytics Maturity. https://www.gartner.com/en/newsroom/press-releases/2018-12-06-gartner-data-shows-87-percent-of-organizations-have-low-bi-and-analytics-maturity

Goodin, D. (2013, January 24). PSA: Don’t upload your important passwords to GitHub. Ars Technica. https://arstechnica.com/information-technology/2013/01/psa-dont-upload-your-important-passwords-to-github/

Goodin, D. (2014, January 8). Hackers use Amazon cloud to scrape mass number of LinkedIn member profiles. Ars Technica. https://arstechnica.com/information-technology/2014/01/hackers-use-amazon-cloud-to-scrape-mass-number-of-linkedin-member-profiles/

International Business Machines. (2019). Average cost of data breach [Infographic]. Mybluemix. https://databreachcalculator.mybluemix.net/

Johnson, A. (2020, February 3). Even Public, Visible Data on Your Website Can Benefit Hackers. TechyGeeksHome. https://blog.techygeekshome.info/2020/02/even-public-visible-data-on-your-website-can-benefit-hackers/

Kirby, D. (2018, May 21). Five Types of Sensitive Data Almost All Companies Handle. Kirbside Consulting. https://kirbside.com/blog/five-types-of-sensitive-data-almost-all-companies-handle/

Laney, D. (2018, October 22). Gartner's Enterprise Information Management Maturity Model. Gartner. https://www.gartner.com/document/3236418

Larkey, S. N. (2019). Exploring the Strategies Cybersecurity Specialist Need to Minimize Security Risks in Non-profit Organizations (Publication No. 27540543) [Doctoral dissertation, Colorado Technical University]. ProQuest Dissertations Publishing.

Moorcraft, B. (2019, April 16). Non-profits are a target for data breach. Insurance Business. https://www.insurancebusinessmag.com/us/news/non-profits/nonprofits-are-a-target-for-data-breach-165039.aspx

Rosati, P., Deeney, P., Cummins, M., Van der Werff, L., & Lynn, T. (2019). Social media and stock price reaction to data breach announcements: Evidence from US listed companies. Research in International Business and Finance, 47, 458-469. https://doi.org/10.1016/j.ribaf.2018.09.007

SecurityTrails Team. (2018, November 27). Top 5 Ways to Handle a Data Breach. SecurityTrails. https://securitytrails.com/blog/top-5-ways-handle-data-breach

Steinberg, J. (2018, April 28). 12 Types Of Data That Businesses Need To Protect But Often Do Not. https://josephsteinberg.com/12-types-of-data-that-businesses-need-to-protect-but-often-do-not/

Thomas, F., Li, F., Zand, A., Barrett, J., Ranieri, J., Invernizzi, L., Markov, Y., Comanescu, O., Eranti, V., Moscicki, A., Margolis, D., Paxson, V., & Bursztein, E. (2017). Data breaches, phishing, or malware? Understanding the risks of stolen credentials. CCS '17: Proceedings of the 2017 ACM SIGSAC Conference on Computer and Communications Security, 1421-1434. https://doi.org/10.1145/3133956.3134067

Verizon. (2019). 2019 Data Breach Investigation Report. https://enterprise.verizon.com/resources/reports/dbir

Winder, D. (2019, August 20). Data Breaches Expose 4.1 Billion Records In First Six Months Of 2019. Forbes. https://www.forbes.com/sites/daveywinder/2019/08/20/data-breaches-expose-41-billion-records-in-first-six-months-of-2019/