LLM Attribution: Challenges and Insights Across Model Stochasticity

Introduction

Large Language Models (LLMs) have ushered in a new era in artificial intelligence, enabling sophisticated natural language processing tasks and generating human-like text with unprecedented fluency. However, as AI-generated content proliferates, so does the challenge of attributing text to a specific model, a critical concern for authorship verification, intellectual property rights, and the mitigation of misinformation.

In our accompanying paper, LLM Attribution, we investigate various approaches for discerning textual characteristics that may indicate a specific LLM's signature. Through careful experimentation and analysis, we seek to understand the nuances that distinguish different models and the potential for reliable attribution.

This repository provides the complete codebase used in our research, including data processing scripts, model implementations, and evaluation tools. Our work aims to shed light on the complexities of LLM attribution and to offer a foundation for further research in this critical area.

File Guide

| Category | Description | Files |
| --- | --- | --- |
| Dataset Generation | ChatGPT data generation | ChatGPT_Data_Generation.py, ChatGPT_Data_Generation.ipynb |
| Dataset Generation | LLAMA data generation | llama_data_gen.ipynb |
| Dataset Generation | Mistral data generation | mistral_data_generation.ipynb |
| Data Preprocessing | Generate numerical features from text with embeddings | generate_word_embeddings.ipynb |
| Data Preprocessing | Generate numerical features by extracting stylometric features | stylometry_vector_gen.ipynb |
| Supervised Learning | Logistic Regression classification approach | Logistic_regression_final.ipynb |
| Supervised Learning | Random Forest classification approach | Clean_Random_Forest_model.ipynb |
| Supervised Learning | Supervised Contrastive Learning (SCL) model generation | SCL.py |
| Supervised Learning | Data loader generation and data splitting | data_loader_gen.py |
| Supervised Learning | Analysis of SCL models | SCL_model_analysis.ipynb |
| Unsupervised Learning | t-SNE analysis tools and code | misinfo_tsne.py, misinfo_tsne.ipynb |
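
The Data Preprocessing entries above convert raw responses into numerical features: generate_word_embeddings.ipynb produces embedding vectors and stylometry_vector_gen.ipynb extracts stylometric statistics. The sketch below illustrates the general idea under assumed details (bert-base-uncased as the encoder and a small, hypothetical set of stylometric features); it is not the notebooks' exact code.

    # Minimal sketch: a BERT CLS embedding plus a few illustrative
    # stylometric features per response. The concrete features used in
    # the notebooks may differ.
    import numpy as np
    import torch
    from transformers import BertModel, BertTokenizer

    tokenizer = BertTokenizer.from_pretrained("bert-base-uncased")
    bert = BertModel.from_pretrained("bert-base-uncased").eval()

    def featurize(text: str) -> np.ndarray:
        # CLS embedding: the first token of the last hidden layer.
        inputs = tokenizer(text, truncation=True, max_length=512,
                           return_tensors="pt")
        with torch.no_grad():
            cls = bert(**inputs).last_hidden_state[0, 0].numpy()
        # Hypothetical stylometric features: average word length,
        # type-token ratio, and a crude sentence count.
        words = text.split()
        stylo = np.array([
            np.mean([len(w) for w in words]) if words else 0.0,
            len(set(words)) / len(words) if words else 0.0,
            text.count(".") + text.count("!") + text.count("?"),
        ])
        return np.concatenate([cls, stylo])  # 768 embedding dims + 3 stylometric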

Data Generation

ChatGPT Data Generation

To generate data using the ChatGPT model, use the ChatGPT_Data_Generation.ipynb notebook. This notebook handles batching of prompts and manages interactions with the OpenAI API.

Steps:

  1. Setup:

    • Install required dependencies:
      pip install openai pandas requests
    • Set your OpenAI API key in a secure location, such as a creds.py file:
      OPENAI_KEY = 'your-api-key'
  2. Prepare Prompts:

    • Ensure prompts.csv is located in the data/ directory. This CSV file should contain the following columns:
      • prompt_for_generation: The actual text prompt.
      • hash: A unique identifier for each prompt.
      • type: The type of prompt (e.g., "rewrite", "paraphrase", "open_ended").
  3. Configure Parameters:

    • Adjust settings in the configuration section of the script or notebook:
      prompts_csv_path = "data/prompts.csv"
      domain = "paraphrase"  # Choose from "rewrite", "paraphrase", or "open_ended"
      model_temp = 0.7       # Temperature settings: 0.0, 0.7, 1.4
      batch_size = 50        # Number of prompts per batch
  4. Run the Script or Notebook:

    • Execute the Python script directly:
      python ChatGPT_Data_Generation.py
    • Alternatively, open ChatGPT_Data_Generation.ipynb in Jupyter and run all cells sequentially.
  5. Batch Processing Details:

    • The prompts are processed in batches, with each batch sent asynchronously to the API; this helps manage rate limits and reduces the risk of timeouts. A minimal sketch of this loop appears after the notes below.
    • The script retries failed requests, ensuring that all prompts are processed even when temporary network issues occur.
  6. Retrieve and Process Results:

    • The generated responses are saved in JSONL format and then converted to a CSV file for analysis. The output files include:
      • chatgpt_responses.jsonl: Raw API responses.
      • chatgpt_dataset.csv: Processed data ready for analysis, including columns for hash, temperature, prompt_type, and response_text.
  7. Repeat for Different Settings:

    • To generate data across various conditions, adjust the domain and model_temp parameters and rerun the script or notebook.
    • This iterative approach allows for comprehensive data collection across different LLM configurations.

Notes:

  • Batches carry their own additional token limit, which needs to be monitored.
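
For orientation, here is a minimal sketch of the loop described in steps 3 through 6, assuming the openai>=1.0 Python client and a hypothetical model name; the actual script additionally writes raw responses to chatgpt_responses.jsonl and dispatches batches asynchronously.

    # Minimal, synchronous sketch of the batched generation loop; the model
    # name and retry policy here are illustrative assumptions.
    import time
    import pandas as pd
    from openai import OpenAI
    from creds import OPENAI_KEY  # API key stored as described in step 1

    client = OpenAI(api_key=OPENAI_KEY)
    prompts = pd.read_csv("data/prompts.csv")
    prompts = prompts[prompts["type"] == "paraphrase"]  # the chosen domain
    model_temp, batch_size = 0.7, 50

    rows = []
    for start in range(0, len(prompts), batch_size):
        for _, row in prompts.iloc[start:start + batch_size].iterrows():
            for attempt in range(3):  # retry transient failures with backoff
                try:
                    resp = client.chat.completions.create(
                        model="gpt-4o",  # hypothetical model choice
                        temperature=model_temp,
                        messages=[{"role": "user",
                                   "content": row["prompt_for_generation"]}],
                    )
                    rows.append({"hash": row["hash"],
                                 "temperature": model_temp,
                                 "prompt_type": row["type"],
                                 "response_text": resp.choices[0].message.content})
                    break
                except Exception:
                    time.sleep(2 ** attempt)

    pd.DataFrame(rows).to_csv("chatgpt_dataset.csv", index=False)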

LLAMA Data Generation

To generate data using the LLAMA model, use the llama_data_gen.ipynb notebook. This notebook handles batching of prompts and manages interactions with the LLAMA API.

Steps:

  1. Setup Dependencies:

    • Install required packages:
      pip install llamaapi pandas
    • Configure your API key securely:
      api_key = 'your-api-key'
  2. Prepare Input Data:

    • Place prompts.csv in the data/ directory with the following columns:
      • prompt_for_generation: The input prompt text.
      • hash: A unique identifier for each prompt.
      • type: One of "rewrite", "paraphrase", or "open_ended".
  3. Configure Parameters:

    • Set parameters in the notebook’s configuration cell:
      prompts_csv_path = "data/prompts.csv"
      model = 'llama3.1-70b'
      temps = [0.0, 0.7, 1.4]  # Temperature settings
      top_p = 0.9              # Top-p sampling parameter
      batch_size = 50          # Number of prompts per batch
  4. Run the Notebook:

    • Open llama_data_gen.ipynb and execute all cells sequentially.
    • The notebook processes prompts in batches, iterating through each temperature setting and sending asynchronous requests to the LLAMA API.
    • The batching mechanism includes retry logic for failed API calls, ensuring robust data generation; a minimal sketch of this loop appears after the notes below.
  5. Output Files:

    • The responses are collected in a DataFrame and saved to llama_data.csv after each generation batch. The output file includes:
      • hash: Unique identifier for the prompt.
      • temp: Temperature setting used.
      • model: Model name (e.g., llama3.1-70b).
      • response_text: Generated text or error message.
      • datetime: Timestamp of the generation.
  6. Repeat for Different Configurations:

    • To explore different models or parameters, adjust the model and temps variables and rerun the notebook.
    • This iterative process allows for comprehensive data collection under various conditions, aiding in diverse analysis scenarios.

Notes:

  • Monitor API rate limits when using large batch sizes to avoid throttling.
  • Logs are generated during execution to track progress and provide insights for troubleshooting any issues encountered during data generation.
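
A compact sketch of the per-prompt, per-temperature loop described above, assuming the llamaapi client's run() method and an OpenAI-style response payload; the notebook's own batching, async handling, and retry logic are more elaborate.

    # Illustrative sketch only; error handling and response parsing are
    # assumptions about the llamaapi payload format.
    import datetime
    import pandas as pd
    from llamaapi import LlamaAPI

    llama = LlamaAPI("your-api-key")
    prompts = pd.read_csv("data/prompts.csv")

    rows = []
    for temp in [0.0, 0.7, 1.4]:
        for _, row in prompts.iterrows():
            request = {
                "model": "llama3.1-70b",
                "temperature": temp,
                "top_p": 0.9,
                "messages": [{"role": "user",
                              "content": row["prompt_for_generation"]}],
            }
            try:
                text = llama.run(request).json()["choices"][0]["message"]["content"]
            except Exception as exc:
                text = f"ERROR: {exc}"  # recorded so failed prompts can be retried
            rows.append({"hash": row["hash"], "temp": temp,
                         "model": "llama3.1-70b", "response_text": text,
                         "datetime": datetime.datetime.now().isoformat()})
        # Save after each temperature pass, as the notebook does per batch.
        pd.DataFrame(rows).to_csv("llama_data.csv", index=False)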

Mistral Data Generation

To generate data using the Mistral model, use the mistral_data_generation.ipynb notebook. This notebook processes prompts and manages interactions with the Mistral AI API.

Steps:

  1. Setup Dependencies:

    • Install the required packages:
      pip install mistralai pandas
    • Set your Mistral AI API key securely:
      api_key = 'your-api-key'
  2. Prepare Input Data:

    • Ensure prompts.csv is located in the data/ directory. It should contain the following columns:
      • prompt_for_generation: The input prompt text.
      • hash: A unique identifier for each prompt.
      • type: One of "rewrite", "paraphrase", or "open_ended".
  3. Configure Parameters:

    • Edit the configuration in the notebook:
      api_key = 'your-api-key'        # Mistral API key
      model = 'mistral-large-2407'    # Model version
      temperatures = [0.0, 0.7, 1.4]  # Temperature settings
      batch_size = 50                 # Number of prompts per batch
      dataset_name = "mistral_dataset_raw.csv"
  4. Run the Notebook:

    • Open mistral_data_generation.ipynb and execute all cells sequentially.
    • The notebook processes prompts in batches, iterating over each temperature setting. It sends asynchronous API requests and automatically retries failed calls; a compact sketch appears after the notes below.
  5. Retrieve and Process Results:

    • The responses are stored in mistral_dataset_raw.csv after each generation batch. The processed output is saved to mistral_ai_dataset.csv, containing:
      • hash: Unique identifier for each prompt.
      • temperature: Temperature used for generation.
      • model: Name of the model (e.g., mistral-large-2407).
      • response_text: The generated text response.
      • datetime: Timestamp of the generation.
    • Post-processing steps include removing duplicate entries and cleaning the data:
      clean_df = raw_df.drop_duplicates(subset=['temperature', 'hash'])
      clean_df.to_csv("mistral_ai_dataset.csv", index=False)
  6. Repeat for Different Settings:

    • To generate data using different models or temperature configurations, update the model and temperatures variables in the notebook and rerun it.
    • This allows you to collect diverse datasets under various conditions for comprehensive analysis.

Additional Notes:

  • Be mindful of API rate limits, especially when using large batch sizes.
  • Logs are created during execution to monitor the status of each batch and to help with troubleshooting any issues.
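
The following sketch ties steps 4 and 5 together, assuming the mistralai v1 client (client.chat.complete); the notebook's batching, async handling, retries, and logging are omitted here.

    # Illustrative generation-plus-cleanup flow; client API and model name
    # per the configuration above, everything else simplified.
    import datetime
    import pandas as pd
    from mistralai import Mistral

    client = Mistral(api_key="your-api-key")
    prompts = pd.read_csv("data/prompts.csv")

    rows = []
    for temp in [0.0, 0.7, 1.4]:
        for _, row in prompts.iterrows():
            resp = client.chat.complete(
                model="mistral-large-2407",
                temperature=temp,
                messages=[{"role": "user",
                           "content": row["prompt_for_generation"]}],
            )
            rows.append({"hash": row["hash"], "temperature": temp,
                         "model": "mistral-large-2407",
                         "response_text": resp.choices[0].message.content,
                         "datetime": datetime.datetime.now().isoformat()})
        pd.DataFrame(rows).to_csv("mistral_dataset_raw.csv", index=False)

    # Post-processing, as in step 5: keep one response per (temperature, hash).
    raw_df = pd.read_csv("mistral_dataset_raw.csv")
    clean_df = raw_df.drop_duplicates(subset=["temperature", "hash"])
    clean_df.to_csv("mistral_ai_dataset.csv", index=False)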

Summary of Results and Research Findings

  1. Impact of Model Stochasticity:

    • The temperature setting, which controls the randomness of LLM output, significantly affects the accuracy of authorship attribution. Higher temperatures (e.g., 1.4) introduce more variability in the generated text, making it harder to correctly identify the source model.
    • Our analysis showed a decline in classifier performance as the temperature increased, indicating the difficulty of distinguishing LLMs based on more creative outputs.
  2. Supervised Contrastive Learning (SCL) Evaluation:

    • Despite the use of Supervised Contrastive Learning, the results did not show a marked improvement over traditional classifiers such as Random Forest and Logistic Regression.
    • This parity suggests that simpler models remain competitive when the input features already include strong stylometric and embedding-based characteristics. A reference implementation of the contrastive objective is sketched after this list.
  3. Unsupervised Learning and Clustering:

    • We applied dimensionality reduction techniques (e.g., t-SNE) to explore the separability of LLM-generated texts. However, the embeddings did not exhibit meaningful clustering across different LLMs, especially as model sophistication increased.
    • This lack of clear clustering suggests that advanced LLMs may converge in their stylistic outputs, complicating attribution tasks.
  4. Dataset and Feature Engineering Insights:

    • The dataset used included responses generated at different temperature settings across multiple LLMs. Short responses and incoherent outputs at high temperatures posed challenges for training and evaluation.
    • We combined CLS embeddings from BERT with stylometric features to enhance the input representation. Ablation studies revealed that stylometric features alone did not significantly boost performance, indicating potential redundancy when combined with strong embedding features.
  5. Comparison with Previous Work:

    • Unlike prior studies, our findings did not demonstrate clear separability between LLMs using embedding-based approaches. This may be attributed to the use of more advanced, state-of-the-art models in our dataset, which appear to exhibit convergent behavior.
    • The results indicate that distinguishing between more sophisticated LLMs is increasingly challenging, highlighting the need for novel methods that go beyond traditional embedding and stylometric analysis.
  6. Ethical Considerations:

    • While our methods aim to aid in detecting and attributing AI-generated text, they could also be misused to evade detection or obscure the source of generated content. Researchers and practitioners should exercise caution when deploying these models, ensuring that proper verification steps are taken.
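
For reference, the supervised contrastive objective behind the SCL experiments takes the form introduced by Khosla et al. (2020). The sketch below is a generic PyTorch implementation of that loss, not necessarily the exact code in SCL.py.

    # Generic supervised contrastive (SupCon) loss; a reference sketch only.
    import torch
    import torch.nn.functional as F

    def supcon_loss(features: torch.Tensor, labels: torch.Tensor,
                    temperature: float = 0.07) -> torch.Tensor:
        # features: (N, D) embeddings; labels: (N,) source-model ids.
        feats = F.normalize(features, dim=1)      # cosine similarities
        sim = feats @ feats.T / temperature       # (N, N) similarity logits
        n = sim.size(0)
        not_self = ~torch.eye(n, dtype=torch.bool, device=sim.device)
        pos = (labels.unsqueeze(0) == labels.unsqueeze(1)) & not_self
        # Log-softmax over all other samples for each anchor.
        sim = sim.masked_fill(~not_self, float("-inf"))
        log_prob = sim - torch.logsumexp(sim, dim=1, keepdim=True)
        # Mean log-probability of positives, averaged over anchors that
        # have at least one positive in the batch.
        pos_log_prob = log_prob.masked_fill(~pos, 0.0).sum(1)
        pos_count = pos.sum(1)
        valid = pos_count > 0
        return -(pos_log_prob[valid] / pos_count[valid]).mean()

Intuitively, this objective pulls embeddings of texts from the same source model together while pushing texts from different models apart; the finding above is that this added structure did not yield clearly better attribution than the simpler classifiers.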

About

Research on robustness of Supervised Contrastive Learning in classifying LLMs from generated text
