This section includes scripts for generating an AI-generated dataset, merging the AI-generated and human-generated datasets, and launching an evaluation interface for comparing human-generated and AI-generated responses, in order to determine whether the AI performs as well as humans.
- Python 3.x
- Gradio library
extract.py extracts the questions and their ground truths from the human-generated dataset (a parsing sketch is given after the format descriptions below):
python extract.py -i <input_file.json> -o <output_file.json>
Example:
python extract.py -i human_dataset.json -o question_groundTruth_dataset.json
noise_pipeline.py generates the AI-generated answers (the "noise") from the question/ground-truth dataset:
python noise_pipeline.py -i <input_file.json> -o <output_file.json>
Example:
python noise_pipeline.py -i question_groundTruth_dataset.json -o noise.json
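The internals of noise_pipeline.py are not documented here. As a rough, hypothetical sketch only (the generate_ai_answer placeholder and the ai_answer output field are assumptions, not the script's actual implementation), a pipeline of this kind could read the question/ground-truth pairs and attach a model-generated answer to each entry:

```python
import argparse
import json

def generate_ai_answer(question, ground_truth):
    # Stand-in for the real model call; returns a placeholder so the sketch runs end to end.
    return f"[AI-generated answer placeholder for: {question}]"

def main():
    parser = argparse.ArgumentParser(description="Sketch of an AI answer (noise) generation step.")
    parser.add_argument("-i", "--input", required=True, help="question/ground-truth JSON file")
    parser.add_argument("-o", "--output", required=True, help="output JSON file with AI answers added")
    args = parser.parse_args()

    with open(args.input, encoding="utf-8") as f:
        entries = json.load(f)

    # Attach an AI-generated answer to every question/ground-truth pair.
    for entry in entries:
        entry["ai_answer"] = generate_ai_answer(entry["question"], entry["ground_truth"])

    with open(args.output, "w", encoding="utf-8") as f:
        json.dump(entries, f, indent=2, ensure_ascii=False)

if __name__ == "__main__":
    main()
```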
merge_datasets.py merges the human-generated and AI-generated datasets into a single evaluation dataset (see the sketch after the merged dataset format below):
python merge_datasets.py <human_dataset.json> <noise.json> <merged_dataset.json>
Example:
python merge_datasets.py human_dataset.json noise.json evaluation_dataset.json
evaluation_interface.py launches the Gradio evaluation interface on a merged dataset:
python evaluation_interface.py <evaluation_dataset_filename.json>
Example:
python evaluation_interface.py evaluation_dataset.json
Input File Format for extract.py
The input text file should contain questions, their corresponding ground truths, and answers in the following format:
question: What are the main causes of climate change?
ground_truth: Climate change is primarily caused by human activities...
a0: The primary driver of climate change is human activity...
a1: The primary driver of climate change is human activity...
...
Output File Format for extract.py
The output JSON file will have the following structure:
[
{
"question": "What are the main causes of climate change?",
"ground_truth": "Climate change is primarily caused by human activities..."
},
{
"question": "How does photosynthesis work in plants?",
"ground_truth": "Photosynthesis is the process by which plants convert light energy..."
}
]
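As an illustration of the two formats above, a minimal parsing sketch (not the actual contents of extract.py) could look like this, hard-coding the example file names from the usage section:

```python
import json

def parse_questions(path):
    """Collect 'question:' / 'ground_truth:' lines into a list of dicts."""
    entries, current = [], {}
    with open(path, encoding="utf-8") as f:
        for line in f:
            line = line.strip()
            if line.startswith("question:"):
                if current:
                    entries.append(current)
                current = {"question": line[len("question:"):].strip()}
            elif line.startswith("ground_truth:"):
                current["ground_truth"] = line[len("ground_truth:"):].strip()
            # a0:, a1:, ... answer lines are skipped: only question and ground truth are kept
    if current:
        entries.append(current)
    return entries

if __name__ == "__main__":
    entries = parse_questions("human_dataset.json")
    with open("question_groundTruth_dataset.json", "w", encoding="utf-8") as out:
        json.dump(entries, out, indent=2, ensure_ascii=False)
```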
Merged Dataset Format
The merged dataset will have the following structure:
{
"questions":[
{
"id":1,
"question":"What was the Castlereagh–Canning duel?",
"ground_truth":"The Castlereagh–Canning duel was a pistol duel...",
"answers":{
"A0":{
"human":"The Castlereagh–Canning duel, fought on September 21, 1809...",
"ai":"Climate change is predominantly attributed to human actions..."
},
...
}
},
...
]
}
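For illustration only, a merge step producing this structure could look like the sketch below. The shapes of the two input files are assumptions (lists of entries carrying parallel human and AI answer lists), since only the merged output format is documented here:

```python
import json
import sys

def merge(human_path, noise_path, output_path):
    """Combine human and AI answers into the merged evaluation structure shown above."""
    with open(human_path, encoding="utf-8") as f:
        human = json.load(f)   # assumed: list of {"question", "ground_truth", "answers": [...]}
    with open(noise_path, encoding="utf-8") as f:
        noise = json.load(f)   # assumed: list of {"question", "answers": [...]} with AI answers

    merged = {"questions": []}
    for idx, (h, n) in enumerate(zip(human, noise), start=1):
        # Pair each human answer with the AI answer in the same slot (A0, A1, ...).
        answers = {
            f"A{i}": {"human": h_ans, "ai": ai_ans}
            for i, (h_ans, ai_ans) in enumerate(zip(h["answers"], n["answers"]))
        }
        merged["questions"].append({
            "id": idx,
            "question": h["question"],
            "ground_truth": h["ground_truth"],
            "answers": answers,
        })

    with open(output_path, "w", encoding="utf-8") as f:
        json.dump(merged, f, indent=2, ensure_ascii=False)

if __name__ == "__main__":
    merge(sys.argv[1], sys.argv[2], sys.argv[3])
```

Called as in the usage example, this would read human_dataset.json and noise.json and write evaluation_dataset.json.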
You can test the evaluation interface directly with the dataset available in this repository using the following command:
python evaluation_interface.py evaluation_dataset.json
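The actual interface is implemented in evaluation_interface.py; the snippet below is only a rough sketch of how a Gradio interface over the merged dataset might be wired up, assuming a simple layout that displays the A0 human/AI answer pair and asks the evaluator which answer is better (the field names follow the merged dataset format above):

```python
import json
import sys

import gradio as gr

def build_interface(dataset_path):
    with open(dataset_path, encoding="utf-8") as f:
        questions = json.load(f)["questions"]

    def show(index):
        # Return the question, ground truth, and the A0 human/AI answers for the selected entry.
        q = questions[int(index)]
        a0 = q["answers"]["A0"]
        return q["question"], q["ground_truth"], a0["human"], a0["ai"]

    with gr.Blocks() as demo:
        idx = gr.Slider(0, len(questions) - 1, step=1, value=0, label="Question index")
        question = gr.Textbox(label="Question")
        ground_truth = gr.Textbox(label="Ground truth")
        human_answer = gr.Textbox(label="Human answer (A0)")
        ai_answer = gr.Textbox(label="AI answer (A0)")
        gr.Radio(["human", "ai", "tie"], label="Which answer is better?")  # recording the verdict is omitted in this sketch
        idx.change(show, inputs=idx, outputs=[question, ground_truth, human_answer, ai_answer])
    return demo

if __name__ == "__main__":
    build_interface(sys.argv[1]).launch()
```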