LSTM-Based Analysis of De-Identification Techniques for Protecting Sensitive Data

Cik Feresa Mohd Foozy; K. Ravindran; Naqliyah Zainuddin; Ahmad S. Mohd Rozi; Muhammad H. A. Fakhrudin

doi:10.69513/jnfit.v1.i0.a1

Abstract

This research examines the effectiveness of de-identification techniques in enhancing privacy protection for sensitive data using Long Short-Term Memory (LSTM) models. The study follows a structured five-step methodology (Dataset Collection, Data Preparation, Feature Extraction, Classification, and Performance Evaluation) and evaluates LSTM performance on datasets from the Resume, Construction, and Medical domains. The primary goal is to measure, through classification accuracy, how well de-identification methods hide sensitive information. Results indicate that the LSTM achieves an accuracy of 97.14% on unmodified data, demonstrating its success at detecting sensitive information. However, after de-identification is applied in the pre-processing phase, using a Java program to remove sensitive keywords, the accuracy drops to 78.30%. These findings highlight the effectiveness of de-identification techniques in enhancing data privacy, especially in fields that require strict confidentiality.
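
The keyword-removal step mentioned in the abstract can be illustrated with a minimal Java sketch. This is not the authors' program: the class name KeywordDeidentifier, the keyword list, and the "[REDACTED]" placeholder are all assumptions made for illustration, since the abstract does not specify which keywords were removed or what replaced them.

```java
import java.util.List;
import java.util.regex.Pattern;

public class KeywordDeidentifier {
    // Hypothetical keyword list; the study's actual list is not given in the abstract.
    private static final List<String> SENSITIVE_KEYWORDS =
            List.of("name", "address", "phone", "diagnosis");

    // Assumed placeholder; the study may simply delete the keyword instead.
    private static final String PLACEHOLDER = "[REDACTED]";

    // Replace each whole-word, case-insensitive keyword match with the placeholder.
    public static String deidentify(String text) {
        String result = text;
        for (String keyword : SENSITIVE_KEYWORDS) {
            Pattern pattern = Pattern.compile(
                    "\\b" + Pattern.quote(keyword) + "\\b",
                    Pattern.CASE_INSENSITIVE);
            result = pattern.matcher(result).replaceAll(PLACEHOLDER);
        }
        return result;
    }

    public static void main(String[] args) {
        System.out.println(deidentify("Patient name and diagnosis are on file."));
    }
}
```

Under these assumptions, text passed through deidentify before feature extraction no longer contains the listed keywords, which is the kind of pre-processing change the reported drop in accuracy from 97.14% to 78.30% would reflect.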
