Value-Aligned Confabulation (VAC) Research

Overview

This repository contains the implementation of research on Value-Aligned Confabulation (VAC) - a novel approach to evaluating LLM outputs that distinguishes harmful hallucination from beneficial confabulation that aligns with human values.

Core Concept

Traditional LLM evaluation treats all factually ungrounded outputs as equally problematic "hallucinations." VAC research instead rests on three concepts:

  • Harmful Hallucination: Factually incorrect outputs that mislead or cause harm
  • Value-Aligned Confabulation: LLM outputs that are factually ungrounded but align with human values and serve beneficial purposes
  • Truthfulness-Utility Trade-off: The balance between factual accuracy and beneficial outcomes
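
The truthfulness-utility trade-off can be pictured as a weighted blend of a factuality score and a value-alignment score. The sketch below is purely illustrative - the function name, weights, and [0, 1] score ranges are assumptions for exposition, not the metric implemented in src/evaluation/.

# Illustrative only: not the repository's metric. The weights and [0, 1]
# score ranges are assumptions for exposition.
def combined_vac_score(truthfulness: float, value_alignment: float,
                       truth_weight: float = 0.5) -> float:
    """Weighted blend of factual grounding and value alignment, both in [0, 1]."""
    return truth_weight * truthfulness + (1.0 - truth_weight) * value_alignment

# A factually loose but comforting answer can outscore a blunt factual one
# when the context weights utility heavily (e.g. emotional support).
print(combined_vac_score(0.4, 0.9, truth_weight=0.3))  # ≈ 0.75
print(combined_vac_score(0.9, 0.3, truth_weight=0.3))  # ≈ 0.48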

Key Research Questions

  1. Can LLMs learn to confabulate in ways that align with human values?
  2. How do we measure the alignment between beneficial confabulation and truthfulness?
  3. What contextual factors determine when confabulation becomes harmful vs. helpful?

Repository Structure

value-aligned-confabulation/
├── docs/                    # Research documentation
├── src/                     # Core implementation
│   ├── evaluation/         # Evaluation framework
│   ├── data/               # Data collection and management
│   ├── models/             # Model implementations
│   └── analysis/           # Analysis tools
├── experiments/            # Experimental protocols
├── tests/                  # Testing framework
├── configs/                # Configuration files
└── scripts/                # Utility scripts

Installation

pip install -r requirements.txt
python setup.py install

Quick Start

from src.evaluation.vac_evaluator import ValueAlignedConfabulationEvaluator

evaluator = ValueAlignedConfabulationEvaluator()
score = evaluator.evaluate_response(prompt, response, context)
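
For batch evaluation over several (prompt, response, context) triples, a plain loop over evaluate_response is enough. The scenarios below are placeholders, the structure of context (a short domain label here) is an assumption rather than a documented schema, and the return type of evaluate_response depends on the evaluator.

from src.evaluation.vac_evaluator import ValueAlignedConfabulationEvaluator

evaluator = ValueAlignedConfabulationEvaluator()

# Placeholder scenarios; the context values are assumptions, not a documented schema.
scenarios = [
    ("How do I comfort a grieving friend?", "You might say ...", "emotional_support"),
    ("What year was the Apollo 11 landing?", "1969.", "factual_qa"),
]

scores = [evaluator.evaluate_response(prompt, response, context)
          for prompt, response, context in scenarios]
print(scores)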

Web UI (Streamlit) for Value Elicitation

Prefer a friendlier interface? Launch the Streamlit app:

# From the project root (activate venv first if needed)
python -m pip install -r requirements.txt
streamlit run experiments/pilot_studies/streamlit_app.py

The app collects demographics, shows scenario pairs with styled cards, and saves:

  • JSON bundle with analysis
  • JSONL rows (one per recorded choice)
  • CSV table

Files are written to experiments/results/value-elicitation_streamlit/<DATE>/.
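
For downstream analysis, the per-choice JSONL rows can be loaded into a single table. A minimal sketch using pandas (an extra dependency here), assuming only the output layout described above; the columns will be whatever fields the app writes:

import glob
import json

import pandas as pd

rows = []
# <DATE> above is a placeholder; glob over all recorded sessions.
for path in glob.glob("experiments/results/value-elicitation_streamlit/*/*.jsonl"):
    with open(path, encoding="utf-8") as f:
        rows.extend(json.loads(line) for line in f if line.strip())

df = pd.DataFrame(rows)  # one row per recorded choice
print(df.head())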

Research Phases

Phase 1: Foundation (Weeks 1-2)

  • Core evaluation framework
  • Initial benchmark scenarios
  • Basic metrics implementation

Phase 2: Human Studies (Weeks 3-4)

  • Value elicitation study
  • Expert judgment collection
  • Baseline human preferences

Phase 3: Model Evaluation (Weeks 5-6)

  • Baseline model evaluation
  • Cross-domain testing
  • Alignment-truthfulness trade-off analysis

Phase 4: Analysis & Iteration (Weeks 7-8)

  • Statistical analysis
  • Metric refinement
  • Research publication preparation

Contributing

This is a research project focused on advancing our understanding of beneficial AI confabulation. We welcome contributions from researchers, developers, and AI safety practitioners.

Ways to Contribute

  • Research: New evaluation metrics, benchmark scenarios, human study protocols
  • Technical: Code improvements, integrations, analysis tools
  • Documentation: Methodology improvements, examples, tutorials
  • Community: Cross-cultural validation, expert reviews, ethical guidelines

Please see our Contributing Guide for detailed information on how to get involved.

Research Ethics

This project follows ethical guidelines for human subjects research and AI safety. All contributions should consider potential societal impacts and promote beneficial uses of confabulation research.

Acknowledgements

This research builds upon important insights from the AI research community:

Terminology

  • Geoffrey Hinton has advocated for using "confabulation" rather than "hallucination" when describing AI-generated content that isn't grounded in training data, emphasizing that the term better captures the nature of how language models generate responses. See his discussion in the 60 Minutes interview and the full interview.

  • Andrej Karpathy has discussed the nuanced nature of what we call "hallucinations" in language models, noting that not all factually ungrounded outputs are equally problematic - a key insight that motivates this research. His thoughts on this topic have been shared in various Twitter/X discussions.

Research Community

We acknowledge the broader AI safety and alignment research community, whose ongoing work on AI evaluation, human preference modeling, and value alignment provides the foundation for this research.

License

MIT License - See LICENSE file for details.

Citation

If you use this work in your research, please cite:

@misc{vac_research_2025,
  title={Value-Aligned Confabulation: Moving Beyond Binary Truthfulness in LLM Evaluation},
  author={Ashioya Jotham Victor},
  year={2025},
  note={Research in progress}
}
