LSTM-Based Analysis of De-Identification Techniques for Protecting Sensitive Data
DOI:
https://doi.org/10.69513/jnfit.v1.i0.a1
Abstract
This research examines the effectiveness of de-identification techniques in strengthening privacy protection for sensitive data, using Long Short-Term Memory (LSTM) models. The study follows a structured five-step methodology: Dataset Collection, Data Preparation, Feature Extraction, Classification, and Performance Evaluation. It evaluates LSTM performance on datasets from the Resume, Construction, and Medical domains. The primary goal is to assess how well de-identification methods hide certain information, as measured by classification accuracy. Results indicate that the LSTM achieves an accuracy of 97.14% on unmodified data, demonstrating its success in detecting sensitive information. However, after de-identification is applied in the pre-processing phase, using Java to eliminate sensitive keywords, accuracy drops to 78.30%. These findings highlight the effectiveness of de-identification techniques in enhancing data privacy, especially in fields that require strict confidentiality.
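The keyword-elimination step described in the abstract can be illustrated with a minimal Java sketch. This is not the authors' implementation: the class name `KeywordDeidentifier`, its method `deidentify`, and the example keywords are hypothetical, and the abstract does not say how keywords are matched. The sketch assumes case-insensitive, whole-word matching against a fixed keyword list.

```java
import java.util.Comparator;
import java.util.List;
import java.util.regex.Pattern;
import java.util.stream.Collectors;

// Hypothetical helper: strips a fixed list of sensitive keywords from text
// before it is passed to a classifier. Names and matching rules are assumptions.
final class KeywordDeidentifier {
    private final Pattern sensitive;

    KeywordDeidentifier(List<String> keywords) {
        // Longest keywords first, so a multi-word term is removed whole
        // rather than only its shorter prefix. Pattern.quote makes each
        // keyword match literally.
        String alternation = keywords.stream()
                .sorted(Comparator.comparingInt(String::length).reversed())
                .map(Pattern::quote)
                .collect(Collectors.joining("|"));
        // \b restricts matches to whole words; this assumes keywords begin
        // and end with a letter or digit.
        this.sensitive = Pattern.compile(
                "\\b(?:" + alternation + ")\\b",
                Pattern.CASE_INSENSITIVE | Pattern.UNICODE_CASE);
    }

    String deidentify(String text) {
        // Drop every keyword occurrence, then collapse the gaps left behind.
        String stripped = sensitive.matcher(text).replaceAll("");
        return stripped.replaceAll("\\s+", " ").trim();
    }
}
```

For example, with the hypothetical keywords "diabetes" and "insulin", the input "Patient has Diabetes and takes insulin daily." becomes "Patient has and takes daily.", while a longer word such as "insulinoma" is left untouched. Removing the terms a classifier relies on in this way is consistent with the accuracy drop the abstract reports.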
License
Copyright (c) 2024 Al-Noor Journal for Information Technology and Cybersecurity
This work is licensed under a Creative Commons Attribution-NonCommercial 4.0 International License.