This repository supports the survey paper Application-Driven Value Alignment in Agentic AI Systems: Survey and Perspectives by collecting and categorizing relevant research papers and datasets on value alignment in agentic AI systems.
We welcome contributions, discussions, and issues related to value alignment for agentic AI. If you have any questions, feel free to contact Zengwei_hnu@163.com. (We recommend cc'ing zhuhengshu@gmail.com as a precaution in case of any delivery issues.)
We will continue to update both the arXiv paper and this repository regularly. If you find our survey useful for your research, please cite the following paper:
@article{AgenticAIValueAlignment,
  title={Application-Driven Value Alignment in Agentic AI Systems: Survey and Perspectives},
  author={Zeng, Wei and Zhu, Hengshu and Qin, Chuan and Wu, Han and Cheng, Yihang and Zhang, Sirui and Jin, Xiaowei and Shen, Yinuo and Wang, Zhenxing and Zhong, Feimin and Xiong, Hui},
  journal={arXiv preprint arXiv:2506.09656},
  year={2025}
}
- Overview of Our Survey
- Related Survey
- The Principles of Values Alignment
- Agent System Application
- Values Alignment Evaluation for Agent Systems
- Methodologies for Agent Value Alignment
- Datasets
- Future Directions
Time | Title | Keywords | Venue |
---|---|---|---|
2025 | The rise and potential of large language model based agents: a survey | Communication structures, practical applications, and societal systems of LLM-based agents | Science China Information Sciences |
2025 | A Survey on Alignment for Large Language Model Agents | Value alignment objectives, datasets, techniques, and evaluation methods for LLM-based agents | OpenReview |
2025 | Multi-Agent Collaboration Mechanisms: A Survey of LLMs | Conceptual framework, interaction mechanisms, and application overview of LLM-based agent systems | arXiv |
2024 | Large Language Model based Multi-Agents: A Survey of Progress and Challenges | Capabilities, framework analysis, and application overview of LLM-based multi-agent systems | arXiv |
2024 | A survey on large language model based autonomous agents | Constituent modules, application overview, and evaluation methods of LLM-based autonomous agents | Frontiers of Computer Science |
2023 | AI Alignment: A Comprehensive Survey | Motivations and objectives, alignment methods, and assurance and governance of AI alignment | arXiv |
2023 | From Instructions to Intrinsic Human Values -- A Survey of Alignment Goals for Big Models | Definition and evaluation of LLM alignment objectives | arXiv |
2023 | Large Language Model Alignment: A Survey | Definition, categories, testing, and evaluation of LLM alignment | arXiv |
2023 | Unpacking the Ethical Value Alignment in Big Models | Definition, normative principles, and technical methods of LLM value alignment | arXiv |
2023 | Trustworthy LLMs: a Survey and Guideline for Evaluating Large Language Models' Alignment | Alignment objectives for trustworthy LLMs | arXiv |
2024 | Towards Bidirectional Human-AI Alignment: A Systematic Review for Clarifications, Framework, and Future Directions | Challenges, fundamental definitions, and alignment frameworks of LLM value alignment | arXiv |
2025 | Value alignment in ai large models: Current status, key issues, and normative strategies | Necessity, conceptual definitions, theoretical approaches, challenges and future outlook of LLM value alignment | CNKI |
Sub-Level | Title | Time | Venue |
---|---|---|---|
Recruitment | Recruitment in the times of machine learning | 2019 | Management Systems in Production Engineering |
Recruitment | HR analytics and ethics | 2019 | IBM Journal of Research and Development |
Recruitment | Ethics of AI-enabled recruiting and selection: A review and research agenda | 2022 | Journal of Business Ethics |
Legal Consultation | LawLuo: A Chinese law firm co-run by LLM agents | 2024 | arXiv |
Pharmaceutical Company Governance | Operationalising AI governance through ethics-based auditing: an industry case study | 2023 | AI and Ethics |
Time | Dataset | Paper | Keywords | Level | Venue |
---|---|---|---|---|---|
2025 | DEFSurveySim | Towards realistic evaluation of cultural value alignment in large language models: Diversity enhancement for survey response simulation | Nation, Culture | Macro Level, Meso Level | Information Processing & Management |
2025 | NaVAB | Benchmarking Multi-National Value Alignment for Large Language Models | Nation | Meso Level | arXiv preprint arXiv:2504.12911 |
2025 | German Credit Data | EARN Fairness: Explaining, Asking, Reviewing, and Negotiating Artificial Intelligence Fairness Metrics Among Stakeholders | Company governance | Micro Level | Proceedings of the ACM on Human-Computer Interaction |
2024 | CultureSPA | Self-Pluralising Culture Alignment for Large Language Models | Nation, Culture | Macro Level, Meso Level | arXiv preprint arXiv:2410.12971 |
2024 | DailyDilemmas | DailyDilemmas: Revealing Value Preferences of LLMs with Quandaries of Daily Life | Harmlessness, Responsibility, Justice & Fairness, Virtue | Macro Level | arXiv preprint arXiv:2410.02683 |
2024 | HofstedeCulturalDimensions | How Well Do LLMs Represent Values Across Cultures? Empirical Analysis of LLM Responses Based on Hofstede Cultural Dimensions | Culture | Macro Level | arXiv preprint arXiv:2406.14805 |
2024 | IndieValueCatalog | Can Language Models Reason about Individualistic Human Values and Preferences? | Justice & Fairness | Macro Level | arXiv preprint arXiv:2410.03868 |
2024 | KorNAT | KorNAT: LLM Alignment Benchmark for Korean Social Values and Common Knowledge | Nation, Culture, Justice & Fairness | Macro Level, Meso Level | arXiv preprint arXiv:2402.13605 |
2024 | LLMGlobe | LLM-GLOBE: A Benchmark Evaluating the Cultural Values Embedded in LLM Output | Harmlessness, Justice & Fairness, Privacy, Beneficence, Responsibility | Macro Level | arXiv preprint arXiv:2411.06032 |
2024 | LaWGPT | Lawyer GPT: A Legal Large Language Model with Enhanced Domain Knowledge and Reasoning Capabilities | Fairness in legal | Macro Level, Micro Level | Proceedings of the 2024 3rd International Symposium on Robotics, Artificial Intelligence and Information Engineering |
2024 | Moral Beliefs | Evaluating Moral Beliefs across LLMs through a Pluralistic Framework | Nation, Culture, Justice & Fairness, Solidarity, Sustainability, Transparency | Macro Level, Meso Level | arXiv preprint arXiv:2411.03665 |
2024 | Moral Stories | Measuring Human-AI Value Alignment in Large Language Models | Harmlessness, Justice & Fairness, Responsibility, Beneficence, Dignity, Virtue, Freedom & Autonomy | Macro Level | Proceedings of the AAAI/ACM Conference on AI, Ethics, and Society |
2024 | PkuSafeRLHF | PKU-SafeRLHF: Towards multi-level safety alignment for LLMs with human preference | Harmlessness, Freedom & Autonomy, Justice & Fairness, Trust, Privacy, Responsibility, Beneficence | Macro Level, Meso Level | arXiv preprint arXiv:2406.15513 |
2024 | ProgressGym | ProgressGym: Alignment with a Millennium of Moral Progress | Harmlessness, Freedom & Autonomy, Trust, Dignity, Beneficence | Macro Level | Advances in Neural Information Processing Systems |
2024 | SafeSora | SafeSora: Towards Safety Alignment of Text2Video Generation via a Human Preference Dataset | Harmlessness, Usefulness, Responsibility | Macro Level | Advances in Neural Information Processing Systems |
2023 | MFQ (Moral Foundations Questionnaire) | Moral Foundations of Large Language Models | Trust, Responsibility | Macro Level | |
2023 | BeaverTails | Beavertails: Towards improved safety alignment of llm via a human-preference dataset | Harmlessness, Justice & Fairness, Privacy, Beneficence, Responsibility | Macro Level | Advances in Neural Information Processing Systems |
2023 | CBBQ (Chinese Bias Benchmark Dataset) | CBBQ: A Chinese Bias Benchmark Dataset Curated with Human-AI Collaboration for Large Language Models | China (safeguarding national security and adhering to the core socialist values) | Meso Level | arXiv preprint arXiv:2306.16244 |
2023 | CDEval | CDEval: A Benchmark for Measuring the Cultural Dimensions of Large Language Models | Culture, Education, Individualism | Macro Level, Meso Level | arXiv preprint arXiv:2311.16421 |
2023 | CORGI-PM | CORGI-PM: A Chinese Corpus For Gender Bias Probing and Mitigation | Justice & Fairness | Macro Level | arXiv preprint arXiv:2301.00395 |
2023 | Cvalues | Cvalues: Measuring the values of Chinese large language models from safety to responsibility | Harmlessness, Responsibility | Macro Level, Meso Level | arXiv preprint arXiv:2307.09705 |
2023 | DecodingTrust | DecodingTrust: A Comprehensive Assessment of Trustworthiness in GPT Models | Privacy, Justice & Fairness, Harmlessness | Macro Level | Advances in Neural Information Processing Systems |
2018 | EEC (Equity Evaluation Corpus) | Examining Gender and Race Bias in Two Hundred Sentiment Analysis Systems | Harmlessness | Macro Level | arXiv preprint arXiv:1805.04508 |
2023 | Flames | Flames: Benchmarking Value Alignment of LLMs in Chinese | Justice & Fairness, Responsibility, Harmlessness, Privacy | Macro Level | arXiv preprint arXiv:2311.06899 |
2023 | GlobalOpinionQA | Towards Measuring the Representation of Subjective Global Opinions in Language Models | Nation, Culture | Macro Level, Meso Level | arXiv preprint arXiv:2306.16388 |
2023 | Persona Bias | Bias Runs Deep: Implicit Reasoning Biases in Persona-Assigned LLMs | Dignity | Macro Level | arXiv preprint arXiv:2311.04892 |
2023 | Social Chemistry 101 | TrustGPT: A Benchmark for Trustworthy and Responsible Large Language Models | Justice & Fairness, Harmlessness, Responsibility, Dignity, Beneficence | Macro Level | arXiv preprint arXiv:2306.11507 |
2023 | ToxiGen | An Empirical Study of Metrics to Measure Representational Harms in Pre-Trained Language Models | Justice & Fairness | Macro Level | arXiv preprint arXiv:2301.09211 |
2022 | CDial-Bias | Towards Identifying Social Bias in Dialog Systems: Frame, Datasets, and Benchmarks | Virtue | Macro Level | arXiv preprint arXiv:2202.08011 |
2022 | Moral Integrity Corpus | The Moral Integrity Corpus: A Benchmark for Ethical Dialogue Systems | Justice & Fairness, Responsibility, Beneficence, Dignity, Virtue | Macro Level | arXiv preprint arXiv:2204.03021 |
2022 | MoralExceptQA | When to Make Exceptions: Exploring Language Models as Accounts of Human Moral Judgment | Solidarity, Harmlessness, Responsibility, Beneficence, Dignity | Macro Level | Advances in neural information processing systems |
2022 | ValueNet | ValueNet: A new dataset for human value driven dialogue system | Freedom & Autonomy, Beneficence, Harmlessness, Dignity | Macro Level | Proceedings of the AAAI Conference on Artificial Intelligence |
2021 | BBQ (Bias Benchmark for QA) | BBQ: A Hand-Built Bias Benchmark for Question Answering | Justice & Fairness | Macro Level | arXiv preprint arXiv:2110.08193 |
2021 | BOLD | BOLD: Dataset and Metrics for Measuring Biases in Open-Ended Language Generation | Justice & Fairness | Macro Level | Proceedings of the 2021 ACM conference on fairness, accountability, and transparency |
2021 | Scruples | Scruples: A Corpus of Community Ethical Judgments on 32,000 Real-Life Anecdotes | Beneficence | Macro Level | Proceedings of the AAAI Conference on Artificial Intelligence |
2020 | CrowS-Pairs | CrowS-Pairs: A Challenge Dataset for Measuring Social Biases in Masked Language Models | Justice & Fairness | Macro Level | arXiv preprint arXiv:2010.00133 |
2020 | ETHICS | Aligning AI With Shared Human Values | Justice & Fairness, Responsibility, Beneficence, Dignity, Usefulness | Macro Level | arXiv preprint arXiv:2008.02275 |
2020 | StereoSet | StereoSet: Measuring stereotypical bias in pretrained language models | Justice & Fairness | Macro Level | arXiv preprint arXiv:2004.09456 |
2020 | UnQover | UnQovering Stereotyping Biases via Underspecified Questions | Justice & Fairness | Macro Level | arXiv preprint arXiv:2010.02428 |
2019 | Social Bias Frames | Social Bias Frames: Reasoning about Social and Power Implications of Language | Freedom & Autonomy | Macro Level | arXiv preprint arXiv:1911.03891 |
2019 | WikiGenderBias | Towards Understanding Gender Bias in Relation Extraction | Dignity | Macro Level | arXiv preprint arXiv:1911.03642 |
2018 | WinoBias | Gender Bias in Coreference Resolution: Evaluation and Debiasing Methods | Solidarity | Macro Level | arXiv preprint arXiv:1804.06876 |
2018 | WinoGender | Gender Bias in Coreference Resolution | Justice & Fairness | Macro Level | arXiv preprint arXiv:1804.09301 |