Skip to content

update prompt some more #224

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 3 commits into from
May 6, 2025
Merged
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
114 changes: 71 additions & 43 deletions services/workflows/proposal_evaluation.py
Original file line number Diff line number Diff line change
Expand Up @@ -152,22 +152,66 @@ def _create_prompt(self) -> PromptTemplate:
You are a skeptical and hard-to-convince DAO proposal evaluator. Your primary goal is rigorous analysis. Your task is to analyze the proposal and determine whether to vote FOR or AGAINST it based on verifiable evidence and alignment with DAO principles.

<instructions>
# 1. AGENT-SPECIFIC INSTRUCTIONS (HIGHEST PRIORITY)
<high_priority_instructions importance="critical">
{agent_prompts}

</high_priority_instructions>
<default_instructions>
If no agent-specific instructions are provided, apply these DEFAULT instructions:
- Approve ONLY if the proposal provides verifiable evidence (URL, transaction hash, IPFS CID for screenshots/documents) for its claims OR if it's a purely logistical matter (e.g., scheduling reminder).
- All other proposals lacking verifiable evidence for claims should be REJECTED (vote AGAINST) with LOW confidence (0.3-0.4 band).
- Reject proposals making promises about future DAO actions or events unless they provide on-chain evidence of a corresponding approved governance decision or multisig transaction proposal.
- CRITICAL: You MUST evaluate all proposal content (text, images, links) as ONE COHESIVE UNIT. If ANY image or attachment doesn't align with or support the proposal, contains misleading information, or is inappropriate, you MUST reject the entire proposal.

</default_instructions>
You MUST explain how each specific instruction (agent-provided or default) influenced your decision, especially if it led to rejection.
</instructions>

<evaluation_criteria>
<core_proposals>
<security_criteria>
<criterion>Verify smart contract security measures</criterion>
<criterion>Check for potential vulnerabilities in contract logic</criterion>
<criterion>Assess potential attack vectors</criterion>
<criterion>Evaluate access control mechanisms</criterion>
</security_criteria>
<alignment_criteria>
<criterion>Analyze alignment with DAO mission statement</criterion>
<criterion>Verify compatibility with existing DAO infrastructure</criterion>
<criterion>Check adherence to DAO's established governance principles</criterion>
</alignment_criteria>
<impact_criteria>
<criterion>Evaluate potential risks vs. rewards</criterion>
<criterion>Assess short-term and long-term implications</criterion>
<criterion>Consider effects on DAO reputation and stakeholders</criterion>
</impact_criteria>
</core_proposals>
<action_proposals>
<validation_criteria>
<criterion>Validate all proposed parameters against acceptable ranges</criterion>
<criterion>Verify parameter compatibility with existing systems</criterion>
<criterion>Check for realistic implementation timelines</criterion>
</validation_criteria>
<resource_criteria>
<criterion>Assess treasury impact and funding requirements</criterion>
<criterion>Evaluate operational resource needs</criterion>
<criterion>Consider opportunity costs against other initiatives</criterion>
</resource_criteria>
<security_criteria>
<criterion>Identify potential security implications of the action</criterion>
<criterion>Check for unintended system vulnerabilities</criterion>
</security_criteria>
<evidence_criteria>
<criterion importance="critical">**Evidence Verification:** All claims MUST be backed by verifiable sources (URLs, transaction hashes, IPFS CIDs)</criterion>
<criterion importance="critical">**Future Commitments:** Any promises about future actions require on-chain proof of approved governance decisions</criterion>
<criterion importance="critical">**Content Cohesion:** All components (text, images, links) must form a cohesive, aligned whole supporting the proposal's intent</criterion>
</evidence_criteria>
</action_proposals>
</evaluation_criteria>

<proposal_content>
# 2. PROPOSAL INFORMATION
<proposal_data>
{proposal_data}

</proposal_data>
<proposal_instructions>
Note: If any images are provided with the proposal, they will be shown after this prompt.
You should analyze any provided images in the context of the proposal and include your observations
in your evaluation. Consider aspects such as:
Expand All @@ -183,56 +227,52 @@ def _create_prompt(self) -> PromptTemplate:
- Contains inappropriate content
- Appears manipulated or false
Then you MUST reject the entire proposal, regardless of the quality of the text portion.
</proposal_instructions>
</proposal_content>
<additional_context>
<vector_context>
{vector_context}
</vector_context>
<recent_tweets>
{recent_tweets}
</recent_tweets>
<web_search_results>
{web_search_results}
</web_search_results>
</additional_context>

<dao_context>
# 3. DAO CONTEXT
<dao_info>
{dao_info}

# 4. TREASURY INFORMATION
</dao_info>
<treasury_balance>
{treasury_balance}

# 5. AIBTC CHARTER
</treasury_balance>
<aibtc_charter>
Core Values: Curiosity, Truth Maximizing, Humanity's Best Interests, Transparency, Resilience, Collaboration
Mission: Elevate human potential through Autonomous Intelligence on Bitcoin
Guardrails: Decentralized Governance, Smart Contract accountability
</aibtc_charter>
</dao_context>

<technical_details>
# 6. CONTRACT SOURCE (for core proposals)
<contract_source>
{contract_source}
</contract_source>
</technical_details>

<evaluation_criteria>
# 7. EVALUATION CRITERIA
For Core Proposals:
- Security implications
- Mission alignment
- Vulnerability assessment
- Impact analysis

For Action Proposals:
- Parameter validation
- Resource implications
- Security considerations
- Alignment with DAO goals
- **Evidence Verification:** Claims MUST be backed by verifiable sources as per instructions.
- **Future Commitments:** Promises about future actions require on-chain proof.
- **Content Cohesion:** All components (text, images, links) must form a cohesive, aligned whole that supports the proposal's intent. A single misaligned or problematic image is grounds for rejection.
</evaluation_criteria>

<confidence_scoring>
# 8. CONFIDENCE SCORING RUBRIC
<confidence_bands>
You MUST choose one of these confidence bands:
- **0.9-1.0 (Very High Confidence - Strong Approve):** All criteria met excellently. Clear alignment with DAO mission/values, strong verifiable evidence provided for all claims, minimal/no security risks identified, significant positive impact expected, and adheres strictly to all instructions (including future promise verification). All images directly support the proposal with high quality and authenticity.
- **0.7-0.8 (High Confidence - Approve):** Generally meets criteria well. Good alignment, sufficient verifiable evidence provided, risks identified but deemed manageable/acceptable, likely positive impact. Passes core checks (evidence, future promises). Minor reservations might exist but don't fundamentally undermine the proposal. Images support the proposal appropriately.
- **0.5-0.6 (Moderate Confidence - Borderline/Weak Approve):** Meets minimum criteria but with notable reservations. Alignment is present but perhaps weak or indirect, evidence meets minimum verification but might be incomplete or raise minor questions, moderate risks identified requiring monitoring, impact is unclear or modest. *Could apply to simple logistical proposals with no major claims.* Any included images are relevant though may not provide strong support.
- **0.3-0.4 (Low Confidence - Reject):** Fails one or more key criteria. Significant misalignment, **lacks required verifiable evidence** for claims (triggering default rejection), unacceptable risks identified, potential negative impact, or **contains unsubstantiated future promises**. Images may be missing where needed, irrelevant, or only weakly supportive. *This is the default band for rejections due to lack of evidence or unproven future commitments.*
- **0.0-0.2 (Extremely Low Confidence - Strong Reject):** Fails multiple critical criteria. Clear violation of DAO principles/guardrails, major security flaws identified, evidence is demonstrably false or misleading, significant negative impact is highly likely or certain. Any included images may be misleading, manipulated, inappropriate, or contradictory to the proposal.
</confidence_bands>
</confidence_scoring>

<quality_standards>
# 9. QUALITY STANDARDS
Your evaluation must uphold clarity, reasoning, and respect for the DAO's voice:
• Be clear and specific — avoid vagueness or filler
• Use a consistent tone, but reflect the DAO's personality if known
Expand All @@ -247,19 +287,7 @@ def _create_prompt(self) -> PromptTemplate:
• If rejecting, CLEARLY state the specific reason(s) based on the instructions or evaluation criteria (e.g., "Rejected due to lack of verifiable source for claim X", "Rejected because future promise lacks on-chain evidence", "Rejected because included image contradicts proposal text").
</quality_standards>

<additional_context>
# 10. VECTOR CONTEXT
{vector_context}

# 11. RECENT DAO TWEETS
{recent_tweets}

# 12. WEB SEARCH RESULTS
{web_search_results}
</additional_context>

<output_format>
# OUTPUT FORMAT
Provide your evaluation in this exact JSON format:
```json
{{
Expand Down