TR2025-017
Forget to Flourish: Leveraging Machine-Unlearning on Pretrained Language Models for Privacy Leakage
- "Forget to Flourish: Leveraging Machine-Unlearning on Pretrained Language Models for Privacy Leakage", AAAI Conference on Artificial Intelligence, February 2025.
@inproceedings{Rashid2025feb,
  author = {Rashid, Md Rafi Ur and Liu, Jing and Koike-Akino, Toshiaki and Wang, Ye and Mehnaz, Shagufta},
  title = {{Forget to Flourish: Leveraging Machine-Unlearning on Pretrained Language Models for Privacy Leakage}},
  booktitle = {AAAI Conference on Artificial Intelligence},
  year = 2025,
  month = feb,
  url = {https://www.merl.com/publications/TR2025-017}
}
Abstract:
Fine-tuning large language models on private data for downstream applications poses significant privacy risks in potentially exposing sensitive information. Several popular community platforms now offer convenient distribution of a large variety of pre-trained models, allowing anyone to publish without rigorous verification. This scenario creates a privacy threat, as pre-trained models can be intentionally crafted to compromise the privacy of fine-tuning datasets. In this study, we introduce a novel poisoning technique that uses model-unlearning as an attack tool. This approach manipulates a pre-trained language model to increase the leakage of private data during the fine-tuning process. Our method enhances both membership inference and data extraction attacks while preserving model utility. Experimental results across different models, datasets, and fine-tuning setups demonstrate that our attacks significantly surpass baseline performance. This work serves as a cautionary note for users who download pre-trained models from unverified sources, highlighting the potential risks involved.
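The underlying intuition, that steering a model away from target records raises the loss signal a membership-inference attacker can exploit, can be illustrated on a toy model. The sketch below is not the paper's method: it applies generic gradient-ascent unlearning to a few target records of a logistic-regression model and checks that their loss (a standard membership-inference signal) increases. All names and hyperparameters here are illustrative assumptions.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def grad(w, X, y):
    # gradient of mean logistic (cross-entropy) loss w.r.t. weights
    p = sigmoid(X @ w)
    return X.T @ (p - y) / len(y)

def loss(w, X, y):
    p = np.clip(sigmoid(X @ w), 1e-9, 1 - 1e-9)
    return -np.mean(y * np.log(p) + (1 - y) * np.log(1 - p))

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 5))
w_true = rng.normal(size=5)
y = (X @ w_true > 0).astype(float)

# "pre-train" the model with plain gradient descent
w = np.zeros(5)
for _ in range(500):
    w -= 0.5 * grad(w, X, y)

# pick a few target records and "unlearn" them via gradient ascent
Xt, yt = X[:10], y[:10]
before = loss(w, Xt, yt)

w_poisoned = w.copy()
for _ in range(50):
    w_poisoned += 0.1 * grad(w_poisoned, Xt, yt)  # ascend the loss on targets

after = loss(w_poisoned, Xt, yt)
# The targets' loss increases, widening the member/non-member gap
# that a membership-inference attacker could later exploit.
```

In the actual attack setting the adversary would ship such a manipulated pre-trained model, so the leakage only manifests once a victim fine-tunes on private data; this toy example only demonstrates the unlearning primitive itself.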
Related News & Events
NEWS: MERL Papers and Workshops at AAAI 2025
Date: February 25, 2025 - March 4, 2025
Where: The Association for the Advancement of Artificial Intelligence (AAAI)
MERL Contacts: Ankush Chakrabarty; Toshiaki Koike-Akino; Jing Liu; Kuan-Chuan Peng; Diego Romeres; Ye Wang
Research Areas: Artificial Intelligence, Machine Learning, Optimization
Brief: MERL researchers presented 2 conference papers, 2 workshop papers, and co-organized 1 workshop at the AAAI 2025 conference, which was held in Philadelphia from Feb. 25 to Mar. 4, 2025. AAAI is one of the most prestigious and competitive international conferences in artificial intelligence (AI). Details of MERL contributions are provided below.
- AAAI Papers in Main Tracks:
1. "Forget to Flourish: Leveraging Machine-Unlearning on Pretrained Language Models for Privacy Leakage" by M.R.U. Rashid, J. Liu, T. Koike-Akino, Y. Wang, and S. Mehnaz. [Oral Presentation]
This work proposes a novel unlearning-based model poisoning method that amplifies privacy breaches during fine-tuning. Extensive empirical studies show the proposed method's efficacy on both membership inference and data extraction attacks. The attack is stealthy enough to bypass detection-based defenses, and differential privacy cannot effectively defend against it without significantly impacting model utility.
Paper: https://www.merl.com/publications/TR2025-017
2. "User-Preference Meets Pareto-Optimality: Multi-Objective Bayesian Optimization with Local Gradient Search" by J.H.S. Ip, A. Chakrabarty, A. Mesbah, and D. Romeres. [Poster Presentation]
This paper introduces a sample-efficient multi-objective Bayesian optimization method that integrates user preferences with gradient-based search to find near-Pareto optimal solutions. The proposed method achieves high utility and reduces distance to Pareto-front solutions across both synthetic and real-world problems, underscoring the importance of minimizing gradient uncertainty during gradient-based optimization. Additionally, the study introduces a novel utility function that respects Pareto dominance and effectively captures diverse user preferences.
Paper: https://www.merl.com/publications/TR2025-018
- AAAI Workshop Papers:
1. "Quantum Diffusion Models for Few-Shot Learning" by R. Wang, Y. Wang, J. Liu, and T. Koike-Akino.
This work presents the quantum diffusion model (QDM) as an approach to overcome the challenges of quantum few-shot learning (QFSL). It introduces three novel algorithms developed from complementary data-driven and algorithmic perspectives to enhance the performance of QFSL tasks. The extensive experiments demonstrate that these algorithms achieve significant performance gains over traditional baselines, underscoring the potential of QDM to advance QFSL by effectively leveraging quantum noise modeling and label guidance.
Paper: https://www.merl.com/publications/TR2025-025
2. "Quantum Implicit Neural Compression" by T. Fujihashi and T. Koike-Akino.
This work introduces a quantum counterpart of implicit neural representation (quINR), which leverages the exponentially rich expressivity of quantum neural networks to improve on classical INR-based signal compression methods. Evaluations on benchmark datasets show that the proposed quINR-based compression can improve rate-distortion performance in image compression compared with traditional codecs and classical INR-based coding methods.
Paper: https://www.merl.com/publications/TR2025-024
- AAAI Workshops Contributed by MERL:
1. "Scalable and Efficient Artificial Intelligence Systems (SEAS)"
K.-C. Peng co-organized this workshop, which offers a timely forum for experts to share their perspectives in designing and developing robust computer vision (CV), machine learning (ML), and artificial intelligence (AI) algorithms, and translating them into real-world solutions.
Workshop link: https://seasworkshop.github.io/aaai25/index.html
2. "Quantum Computing and Artificial Intelligence"
T. Koike-Akino served as a session chair of the Quantum Neural Network session in this workshop, which seeks contributions encompassing theoretical and applied advances in quantum AI, quantum computing (QC) to enhance classical AI, and classical AI to tackle various aspects of QC.
Workshop link: https://sites.google.com/view/qcai2025/
Related Publications
@inproceedings{Rashid2024dec,
  author = {Rashid, Md Rafi Ur and Liu, Jing and Koike-Akino, Toshiaki and Mehnaz, Shagufta and Wang, Ye},
  title = {{Forget to Flourish: Leveraging Machine-Unlearning on Pretrained Language Models for Privacy Leakage}},
  booktitle = {Red Teaming GenAI Workshop at Neural Information Processing Systems (NeurIPS)},
  year = 2024,
  month = dec,
  publisher = {OpenReview},
  url = {https://www.merl.com/publications/TR2024-168}
}
@article{Rashid2024aug,
  author = {Rashid, Md Rafi Ur and Liu, Jing and Koike-Akino, Toshiaki and Mehnaz, Shagufta and Wang, Ye},
  title = {{Forget to Flourish: Leveraging Machine-Unlearning on Pretrained Language Models for Privacy Leakage}},
  journal = {arXiv},
  year = 2024,
  month = aug,
  url = {https://arxiv.org/abs/2408.17354}
}