
# MINT

> A unified evaluation suite for membership inference attacks and machine-generated text detection.

## Quick Start

Build the environment (Python >= 3.9):

```shell
$ git clone https://github.com/ryuryukke/mint.git
$ cd mint
$ python -m venv env
$ source env/bin/activate
$ pip install -r requirements.txt
```

Set your Hugging Face cache directory:

```shell
$ export HF_HOME=/path/to/huggingface_cache
```

Run evaluation on all methods for MIA:

```shell
$ python run.py --task mia --domain arxiv --methods all --model_name pythia-160m
```

Run evaluation on all methods for detection:

```shell
$ python run.py --task detection --domain wiki --methods all --model_name llama-chat
```

See `scripts/*.sh` for more details on the available options.

## MINT Supports

We currently cover 4 common baselines, 7 state-of-the-art MIAs, and 5 state-of-the-art machine-text detectors. Please open an issue to request support for additional methods.

| Method | Category | Description | Identifier |
| --- | --- | --- | --- |
| Loss | Baselines | the likelihood of a target sample | `loss` |
| Entropy | Baselines | the expected likelihood of a target sample | `entropy` |
| Rank | Baselines | the average rank of the predicted token at each step | `rank` |
| LogRank | Baselines | the average log-rank of the predicted token at each step | `logrank` |
| Reference | MIA | the difference in the target loss between the model and another reference model | `ref` |
| Zlib | MIA | the ratio of the target loss to the zlib compression score of the target | `zlib` |
| Neighborhood | MIA | the difference between the target loss and the average loss over its perturbed samples | `neighborhood` |
| Min-K% | MIA | the average log-likelihood of the $k$% of tokens with the lowest probabilities | `min_k` |
| Min-K%++ | MIA | a standardized version of Min-K% over the model's vocabulary | `min_k_plus` |
| ReCaLL | MIA | the relative log-likelihood between a target sample and a set of non-member examples | `recall` |
| DC-PDD | MIA | the cross-entropy between the token likelihoods under the model and the Laplace-smoothed unigram token frequency distribution under a reference corpus | `dc_pdd` |
| DetectGPT | Detection | the difference between the target loss and the average loss over its perturbed samples | `detectgpt` |
| Fast-DetectGPT | Detection | an efficient version of DetectGPT via a fast-sampling technique and score normalization | `fastdetectgpt` |
| Binoculars | Detection | the ratio of the target perplexity to the cross-entropy of the target sample under a reference model | `binoculars` |
| DetectLLM | Detection | a variant of DetectGPT that uses LogRank as the core quantity instead | `detectllm` |
| Lastde++ | Detection | a multi-scale diversity entropy measuring the local fluctuations in likelihood across a target text sequence | `lastde_doubleplus` |
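To make the scores concrete, here is a minimal, illustrative sketch of the Min-K% statistic (not the repository's implementation): given per-token log-probabilities of a sample under the target model, it averages the lowest-probability $k$ fraction of tokens.

```python
def min_k_score(token_log_probs, k=0.2):
    """Min-K%: average log-likelihood over the k fraction of tokens
    with the lowest probabilities (i.e., the lowest log-probs)."""
    lp = sorted(token_log_probs)   # ascending: lowest log-probs first
    n = max(1, int(len(lp) * k))   # number of tokens to keep
    return sum(lp[:n]) / n
```

A higher (less negative) score suggests the model assigns unusually high probability even to its worst-scored tokens, which is evidence of membership.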

## Datasets

We employ the MIMIR benchmark for MIAs and the RAID benchmark for detection.

| Benchmark | Models | Domains |
| --- | --- | --- |
| MIMIR | Pythia-160M, 1.4B, 2.8B, 6.7B, 12B | Wikipedia (knowledge), Pile CC (general web), PubMed Central and ArXiv (academic), HackerNews (dialogue), GitHub and DM Mathematics (technical) |
| RAID | GPT-2-XL, MPT-30B-Chat, LLaMA-2-70B-Chat, ChatGPT, and GPT-4 | Wikipedia and News (knowledge), Abstracts (academic), Recipes (instructions), Reddit (dialogue), Poetry (creative), Books (narrative), Reviews (opinions) |

## Running on a custom dataset

You can add a custom dataset by adding a new branch to `load_evaluation_data()` in `run.py`.
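As a rough illustration, a new branch might call a loader like the one below. This is a hypothetical sketch, not repository code: the JSONL path, the `text`/`label` field names, and the assumption that evaluation data is a pair of parallel text/label lists are all ours, so check `load_evaluation_data()` for the actual expected format.

```python
import json

def load_custom_jsonl(path):
    """Hypothetical loader: one JSON object per line with 'text' and
    'label' fields (label 1 = member / machine-generated)."""
    texts, labels = [], []
    with open(path) as f:
        for line in f:
            record = json.loads(line)
            texts.append(record["text"])
            labels.append(int(record["label"]))
    return texts, labels

# Inside load_evaluation_data() in run.py, a new branch could then
# dispatch on a custom domain name, e.g.:
#     elif domain == "my_custom_domain":
#         texts, labels = load_custom_jsonl("data/my_custom_domain.jsonl")
```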

## Running a custom attack or detector

You can add a custom attack or detector by creating a new directory under `methods/` and registering it in `src/method.py`, following the shared interface defined there.
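The authoritative interface is the one in `src/method.py`; the sketch below only conveys the general shape. The `Method` base class, the `score` signature, and the toy detector are all hypothetical, assuming a method maps a batch of texts to one scalar score each (higher = more likely member / machine-generated).

```python
# Hypothetical sketch of the shared format; see src/method.py for the
# real base class and registration mechanism.
class Method:
    name = "base"

    def score(self, texts):
        """Return one scalar score per input text."""
        raise NotImplementedError


class AverageTokenLength(Method):
    """Toy custom detector: scores a text by its mean whitespace-token
    length. A real method would query the target model instead."""
    name = "avg_token_length"

    def score(self, texts):
        return [
            sum(len(tok) for tok in t.split()) / max(1, len(t.split()))
            for t in texts
        ]
```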

## Citation

If you find our code or ideas useful in your research, please cite our work:

```bibtex
@misc{koike2025machinetextdetectorsmembership,
      title={Machine Text Detectors are Membership Inference Attacks},
      author={Ryuto Koike and Liam Dugan and Masahiro Kaneko and Chris Callison-Burch and Naoaki Okazaki},
      year={2025},
      eprint={2510.19492},
      archivePrefix={arXiv},
      primaryClass={cs.CL},
      url={https://arxiv.org/abs/2510.19492},
}
```

## Acknowledgements

This research is supported in part by the Office of the Director of National Intelligence (ODNI), Intelligence Advanced Research Projects Activity (IARPA), via the HIATUS Program contract #2022-22072200005. The views and conclusions contained herein are those of the authors and should not be interpreted as necessarily representing the official policies, either expressed or implied, of ODNI, IARPA, or the U.S. Government. The U.S. Government is authorized to reproduce and distribute reprints for governmental purposes notwithstanding any copyright annotation therein. These research results were also obtained from commissioned research (No. 22501) by the National Institute of Information and Communications Technology (NICT), Japan. In addition, this work was supported by JST SPRING, Japan Grant Number JPMJSP2106.