TR2024-168

Forget to Flourish: Leveraging Machine-Unlearning on Pretrained Language Models for Privacy Leakage

- Rashid, M.R.U., Liu, J., Koike-Akino, T., Mehnaz, S., Wang, Y., "Forget to Flourish: Leveraging Machine-Unlearning on Pretrained Language Models for Privacy Leakage", Red Teaming GenAI Workshop at Neural Information Processing Systems (NeurIPS), December 2024.

@inproceedings{Rashid2024dec,
author = {Rashid, Md Rafi Ur and Liu, Jing and Koike-Akino, Toshiaki and Mehnaz, Shagufta and Wang, Ye},
title = {{Forget to Flourish: Leveraging Machine-Unlearning on Pretrained Language Models for Privacy Leakage}},
booktitle = {Red Teaming GenAI Workshop at Neural Information Processing Systems (NeurIPS)},
year = 2024,
month = dec,
publisher = {OpenReview},
url = {https://www.merl.com/publications/TR2024-168}
}

Research Areas:

Artificial Intelligence, Machine Learning

Abstract:

Fine-tuning large language models on private data for downstream applications poses significant privacy risks in potentially exposing sensitive information. Several popular community platforms now offer convenient distribution of a large variety of pre-trained models, allowing anyone to publish without rigorous verification. This scenario creates a privacy threat, as pre-trained models can be intentionally crafted to compromise the privacy of fine-tuning datasets. In this study, we introduce a novel poisoning technique that uses model-unlearning as an attack tool. This approach manipulates a pre-trained language model to increase the leakage of private data during the fine-tuning process. Our method enhances both membership inference and data extraction attacks while preserving model utility. Experimental results across different models, datasets, and fine-tuning setups demonstrate that our attacks significantly surpass baseline performance. This work serves as a cautionary note for users who download pre-trained models from unverified sources, highlighting the potential risks involved.

Related Publications

Rashid, M.R.U., Liu, J., Koike-Akino, T., Wang, Y., Mehnaz, S., "Forget to Flourish: Leveraging Machine-Unlearning on Pretrained Language Models for Privacy Leakage", AAAI Conference on Artificial Intelligence, Toby Walsh, Julie Shah, Zico Kolter, Eds., DOI: 10.1609/aaai.v39i19.34218, February 2025, pp. 20139-20147.

@inproceedings{Rashid2025feb,
author = {Rashid, Md Rafi Ur and Liu, Jing and Koike-Akino, Toshiaki and Wang, Ye and Mehnaz, Shagufta},
title = {{Forget to Flourish: Leveraging Machine-Unlearning on Pretrained Language Models for Privacy Leakage}},
booktitle = {Proceedings of the AAAI Conference on Artificial Intelligence},
year = 2025,
editor = {Walsh, Toby and Shah, Julie and Kolter, Zico},
pages = {20139--20147},
month = feb,
publisher = {Association for the Advancement of Artificial Intelligence (AAAI)},
doi = {10.1609/aaai.v39i19.34218},
issn = {2374-3468},
isbn = {978-1-57735-897-8},
url = {https://www.merl.com/publications/TR2025-017}
}

Rashid, M.R.U., Liu, J., Koike-Akino, T., Mehnaz, S., Wang, Y., "Forget to Flourish: Leveraging Machine-Unlearning on Pretrained Language Models for Privacy Leakage", arXiv, August 2024.

@article{Rashid2024aug,
author = {Rashid, Md Rafi Ur and Liu, Jing and Koike-Akino, Toshiaki and Mehnaz, Shagufta and Wang, Ye},
title = {{Forget to Flourish: Leveraging Machine-Unlearning on Pretrained Language Models for Privacy Leakage}},
journal = {arXiv},
year = 2024,
month = aug,
url = {https://arxiv.org/abs/2408.17354}
}

MERL Contacts:

Jing Liu
Toshiaki Koike-Akino
Ye Wang
