
# MINT

> A unified evaluation suite for membership inference attacks and machine-generated text detection.

## Quick Start

Build the environment (Python >= 3.9):

```shell
$ git clone https://github.com/ryuryukke/mint.git
$ cd mint
$ python -m venv env
$ source env/bin/activate
$ pip install -r requirements.txt
```

Set your Hugging Face cache directory:

```shell
$ export HF_HOME=/path/to/huggingface_cache
```

Run evaluation on all methods for MIA:

```shell
$ python run.py --task mia --domain arxiv --methods all --model_name pythia-160m
```

Run evaluation on all methods for detection:

```shell
$ python run.py --task detection --domain wiki --methods all --model_name llama-chat
```

See `scripts/*.sh` for more details on the available options.

## MINT Supports

We currently cover 4 common baselines, 7 state-of-the-art MIAs, and 5 state-of-the-art machine-text detectors. Please open an issue to request support for additional methods.

| Method | Category | Description | Identifier |
| --- | --- | --- | --- |
| Loss | Baselines | the likelihood of a target sample | `loss` |
| Entropy | Baselines | the expected likelihood of a target sample | `entropy` |
| Rank | Baselines | the average rank of the predicted token at each step | `rank` |
| LogRank | Baselines | the average log-rank of the predicted token at each step | `logrank` |
| Reference | MIA | the difference in the target loss between the model and another reference model | `ref` |
| Zlib | MIA | the ratio of the target loss to the zlib compression score of the target | `zlib` |
| Neighborhood | MIA | the difference between the target loss and the average loss over its perturbed samples | `neighborhood` |
| Min-K% | MIA | the average log-likelihood of the $k$% of tokens with the lowest probabilities | `min_k` |
| Min-K%++ | MIA | a standardized version of Min-K% over the model's vocabulary | `min_k_plus` |
| ReCaLL | MIA | the relative log-likelihood between a target sample and a set of non-member examples | `recall` |
| DC-PDD | MIA | the cross-entropy between the token likelihoods under the model and the Laplace-smoothed unigram token frequency distribution under a reference corpus | `dc_pdd` |
| DetectGPT | Detection | the difference between the target loss and the average loss over its perturbed samples | `detectgpt` |
| Fast-DetectGPT | Detection | an efficient version of DetectGPT via a fast-sampling technique and score normalization | `fastdetectgpt` |
| Binoculars | Detection | the ratio of the target perplexity to the cross-entropy of the target sample under a reference model | `binoculars` |
| DetectLLM | Detection | a variant of DetectGPT that uses LogRank as the core quantity instead | `detectllm` |
| Lastde++ | Detection | a multi-scale diversity entropy measuring the local fluctuations in likelihood across a target text sequence | `lastde_doubleplus` |
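To make the scores concrete, here is a minimal, illustrative sketch of the Min-K% statistic (not the repository's implementation): given per-token log-probabilities of a sample under the target model, it averages the lowest-probability $k$ fraction of tokens.

```python
def min_k_score(token_log_probs, k=0.2):
    """Min-K%: average log-likelihood over the k fraction of tokens
    with the lowest probabilities (i.e., the lowest log-probs)."""
    lp = sorted(token_log_probs)   # ascending: lowest log-probs first
    n = max(1, int(len(lp) * k))   # number of tokens to keep
    return sum(lp[:n]) / n
```

A higher (less negative) score suggests the model assigns unusually high probability even to its worst-scored tokens, which is evidence of membership.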

## Datasets

We employ the MIMIR benchmark for MIAs and the RAID benchmark for detection.

| Benchmark | Models | Domains |
| --- | --- | --- |
| MIMIR | Pythia-160M, 1.4B, 2.8B, 6.7B, 12B | Wikipedia (knowledge), Pile CC (general web), PubMed Central and ArXiv (academic), HackerNews (dialogue), GitHub and DM Mathematics (technical) |
| RAID | GPT-2-XL, MPT-30B-Chat, LLaMA-2-70B-Chat, ChatGPT, and GPT-4 | Wikipedia and News (knowledge), Abstracts (academic), Recipes (instructions), Reddit (dialogue), Poetry (creative), Books (narrative), Reviews (opinions) |

## Running on a custom dataset

You can add a custom dataset by adding a new branch to `load_evaluation_data()` in `run.py`.
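As a rough illustration, a new branch might call a loader like the one below. This is a hypothetical sketch, not repository code: the JSONL path, the `text`/`label` field names, and the assumption that evaluation data is a pair of parallel text/label lists are all ours, so check `load_evaluation_data()` for the actual expected format.

```python
import json

def load_custom_jsonl(path):
    """Hypothetical loader: one JSON object per line with 'text' and
    'label' fields (label 1 = member / machine-generated)."""
    texts, labels = [], []
    with open(path) as f:
        for line in f:
            record = json.loads(line)
            texts.append(record["text"])
            labels.append(int(record["label"]))
    return texts, labels

# Inside load_evaluation_data() in run.py, a new branch could then
# dispatch on a custom domain name, e.g.:
#     elif domain == "my_custom_domain":
#         texts, labels = load_custom_jsonl("data/my_custom_domain.jsonl")
```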

## Running a custom attack or detector

You can add a custom attack or detector by creating a new directory under `methods/` and registering it in `src/method.py`, following the shared interface defined there.
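The authoritative interface is the one in `src/method.py`; the sketch below only conveys the general shape. The `Method` base class, the `score` signature, and the toy detector are all hypothetical, assuming a method maps a batch of texts to one scalar score each (higher = more likely member / machine-generated).

```python
# Hypothetical sketch of the shared format; see src/method.py for the
# real base class and registration mechanism.
class Method:
    name = "base"

    def score(self, texts):
        """Return one scalar score per input text."""
        raise NotImplementedError


class AverageTokenLength(Method):
    """Toy custom detector: scores a text by its mean whitespace-token
    length. A real method would query the target model instead."""
    name = "avg_token_length"

    def score(self, texts):
        return [
            sum(len(tok) for tok in t.split()) / max(1, len(t.split()))
            for t in texts
        ]
```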

## Citation

If you find our code or ideas useful in your research, please cite our work:

```bibtex
@misc{koike2025machinetextdetectorsmembership,
      title={Machine Text Detectors are Membership Inference Attacks},
      author={Ryuto Koike and Liam Dugan and Masahiro Kaneko and Chris Callison-Burch and Naoaki Okazaki},
      year={2025},
      eprint={2510.19492},
      archivePrefix={arXiv},
      primaryClass={cs.CL},
      url={https://arxiv.org/abs/2510.19492},
}
```

## Acknowledgements

This research is supported in part by the Office of the Director of National Intelligence (ODNI), Intelligence Advanced Research Projects Activity (IARPA), via the HIATUS Program contract #2022-22072200005. The views and conclusions contained herein are those of the authors and should not be interpreted as necessarily representing the official policies, either expressed or implied, of ODNI, IARPA, or the U.S. Government. The U.S. Government is authorized to reproduce and distribute reprints for governmental purposes notwithstanding any copyright annotation therein. These research results were also obtained from commissioned research (No. 22501) by the National Institute of Information and Communications Technology (NICT), Japan. In addition, this work was supported by JST SPRING, Japan Grant Number JPMJSP2106.