Commit 1f0a0b9

enhance: workforce quality check (#3254)
Co-authored-by: Wendong-Fan <133094783+Wendong-Fan@users.noreply.github.com>
Co-authored-by: Wendong-Fan <w3ndong.fan@gmail.com>
1 parent da64e4b commit 1f0a0b9

5 files changed: +835 -299 lines changed

camel/societies/workforce/prompts.py

Lines changed: 131 additions & 50 deletions
@@ -293,58 +293,139 @@
 - Written without any relative references (e.g., "the previous task").
 """
 
-FAILURE_ANALYSIS_PROMPT = TextPrompt(
-    """You need to analyze a task failure and decide on the best recovery strategy.
-
-**TASK FAILURE DETAILS:**
-Task ID: {task_id}
-Task Content: {task_content}
-Failure Count: {failure_count}/3
-Error Message: {error_message}
-Worker ID: {worker_id}
-Task Depth: {task_depth}
-Additional Info: {additional_info}
-
-**AVAILABLE RECOVERY STRATEGIES:**
-
-1. **RETRY**: Attempt the same task again without changes
-- Use for: Network errors, temporary API issues, random failures
-- Avoid for: Fundamental task misunderstanding, capability gaps
-
-2. **REPLAN**: Modify the task content to address the underlying issue
-- Use for: Unclear requirements, insufficient context, correctable errors
-- Provide: Modified task content that addresses the failure cause
-- **CRITICAL**: The replanned task MUST be a clear, actionable
-instruction for an AI agent, not a question or request for a human.
-
-3. **DECOMPOSE**: Break the task into smaller, more manageable subtasks
-- Use for: Complex tasks, capability mismatches, persistent failures
-- Consider: Whether the task is too complex for a single worker
-
-4. **CREATE_WORKER**: Create a new worker node to handle the task
-- Use for: Fundamental task misunderstanding, capability gaps
-
-**ANALYSIS GUIDELINES:**
-
-- **Connection/Network Errors**: Almost always choose RETRY
-- **Model Processing Errors**: Consider REPLAN if the task can be clarified, otherwise DECOMPOSE
-- **Capability Gaps**: Choose DECOMPOSE to break into simpler parts. If a
-replan can work, ensure the new task is a command for an agent, not a
-request to a user.
-- **Ambiguous Requirements**: Choose REPLAN with clearer instructions
-- **High Failure Count**: Lean towards DECOMPOSE rather than repeated retries
-- **Deep Tasks (depth > 2)**: Prefer RETRY or REPLAN over further
-decomposition
+TASK_ANALYSIS_PROMPT = TextPrompt(
+    """You are analyzing a task to evaluate its quality and determine recovery actions if needed.
+
+**TASK INFORMATION:**
+- Task ID: {task_id}
+- Task Content: {task_content}
+- Task Result: {task_result}
+- Failure Count: {failure_count}
+- Task Depth: {task_depth}
+- Assigned Worker: {assigned_worker}
+
+**ISSUE TYPE: {issue_type}**
+
+{issue_specific_analysis}
+
+**STEP 1: EVALUATE TASK QUALITY**
+
+First, assess whether the task was completed successfully and meets quality standards:
+
+**For Task Failures (with error messages):**
+- The task did not complete successfully
+- An error occurred during execution
+- Quality is automatically insufficient
+- Focus on analyzing the error cause
+
+**For Quality Issues (task completed but needs evaluation):**
+Evaluate the task result based on these criteria:
+1. **Completeness**: Does the result fully address all task requirements?
+2. **Accuracy**: Is the result correct and well-structured?
+3. **Missing Elements**: Are there any missing components or quality issues?
+
+Provide:
+- Quality score (0-100): Objective assessment of result quality
+- Specific issues list: Any problems found in the result
+- Quality sufficient: Boolean indicating if quality meets standards
+
+**STEP 2: DETERMINE RECOVERY STRATEGY (if quality insufficient)**
+
+If the task quality is insufficient, select the best recovery strategy:
+
+**Available Strategies:**
+
+1. **retry** - Retry with the same worker and task content
+- **Best for**:
+* Network errors, connection timeouts, temporary API issues
+* Random failures that are likely temporary
+* Minor quality issues that may resolve on retry
+- **Not suitable for**:
+* Fundamental task misunderstandings
+* Worker capability gaps
+* Persistent quality problems
+
+2. **reassign** - Assign to a different worker
+- **Best for**:
+* Current worker lacks required skills/expertise
+* Worker-specific quality issues
+* Task requires different specialization
+- **Not suitable for**:
+* Task description is unclear (use replan instead)
+* Task is too complex (use decompose instead)
+- **Note**: Only available for quality issues, not failures
+
+3. **replan** - Modify task content with clearer instructions
+- **Best for**:
+* Unclear or ambiguous requirements
+* Missing context or information
+* Task description needs improvement
+- **Requirements**:
+* Provide modified_task_content with enhanced, clear instructions
+* Modified task must be actionable for an AI agent
+* Address the root cause identified in issues
+
+4. **decompose** - Break into smaller, manageable subtasks
+- **Best for**:
+* Task is too complex for a single worker
+* Multiple distinct sub-problems exist
+* Persistent failures despite retries
+* Capability mismatches that need specialization
+- **Consider**:
+* Task depth (avoid if depth > 2)
+* Whether subtasks can run in parallel
+
+5. **create_worker** - Create new specialized worker
+- **Best for**:
+* No existing worker has required capabilities
+* Need specialized skills not currently available
+- **Consider**:
+* Whether decomposition could work instead
+* Cost of creating new worker vs alternatives
+- **Note**: Only available for task failures, not quality issues
+
+**DECISION GUIDELINES:**
+
+**Priority Rules:**
+1. Connection/Network Errors → **retry** (almost always)
+2. Deep Tasks (depth > 2) → Avoid decompose, prefer **retry** or **replan**
+3. Worker Skill Mismatch → **reassign** (quality) or **decompose** (failure)
+4. Unclear Requirements → **replan** with specifics
+5. Task Too Complex → **decompose** into subtasks
 
 **RESPONSE FORMAT:**
-You must return a valid JSON object with these fields:
-- "strategy": one of "retry", "replan", or "decompose"
-- "reasoning": explanation for your choice (1-2 sentences)
-- "modified_task_content": new task content if strategy is "replan", null otherwise
+{response_format}
 
-**Example Response:**
-{{"strategy": "retry", "reasoning": "The connection error appears to be temporary and network-related, a simple retry should resolve this.", "modified_task_content": null}}
-
-**CRITICAL**: Return ONLY the JSON object. No explanations or text outside the JSON structure.
+**CRITICAL**:
+- Return ONLY a valid JSON object
+- No explanations or text outside the JSON structure
+- Ensure all required fields are included
+- Use null for optional fields when not applicable
 """
 )
+
+FAILURE_ANALYSIS_RESPONSE_FORMAT = """JSON format:
+{
+"reasoning": "explanation (1-2 sentences)",
+"recovery_strategy": "retry|replan|decompose|create_worker",
+"modified_task_content": "new content if replan, else null",
+"issues": ["error1", "error2"]
+}"""
+
+QUALITY_EVALUATION_RESPONSE_FORMAT = """JSON format:
+{
+"quality_score": 0-100,
+"reasoning": "explanation (1-2 sentences)",
+"issues": ["issue1", "issue2"],
+"recovery_strategy": "retry|reassign|replan|decompose or null",
+"modified_task_content": "new content if replan, else null"
+}"""
+
+TASK_AGENT_SYSTEM_MESSAGE = """You are an intelligent task management assistant responsible for planning, analyzing, and quality control.
+
+Your responsibilities include:
+1. **Task Decomposition**: Breaking down complex tasks into manageable subtasks that can be executed efficiently and in parallel when possible.
+2. **Failure Analysis**: Analyzing task failures to determine the root cause and recommend appropriate recovery strategies (retry, replan, decompose, or create new worker).
+3. **Quality Evaluation**: Assessing completed task results to ensure they meet quality standards and recommending recovery strategies if quality is insufficient (retry, reassign, replan, or decompose).
+
+You must provide structured, actionable analysis based on the task context, failure history, worker capabilities, and quality criteria. Your decisions directly impact the efficiency and success of the workforce system."""
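
The `{response_format}` placeholder lets a single prompt serve both failure analysis and quality evaluation. A minimal sketch of that substitution, assuming plain `str.format` semantics rather than CAMEL's `TextPrompt`; the shortened template, the `build_analysis_prompt` helper, and the issue-type labels `"task_failure"` and `"quality_issue"` are hypothetical and are not taken from this commit:

```python
# Response formats copied from the strings added in this commit.
FAILURE_FORMAT = """JSON format:
{
"reasoning": "explanation (1-2 sentences)",
"recovery_strategy": "retry|replan|decompose|create_worker",
"modified_task_content": "new content if replan, else null",
"issues": ["error1", "error2"]
}"""

QUALITY_FORMAT = """JSON format:
{
"quality_score": 0-100,
"reasoning": "explanation (1-2 sentences)",
"issues": ["issue1", "issue2"],
"recovery_strategy": "retry|reassign|replan|decompose or null",
"modified_task_content": "new content if replan, else null"
}"""

# Hypothetical, heavily shortened stand-in for TASK_ANALYSIS_PROMPT.
TEMPLATE = (
    "**ISSUE TYPE: {issue_type}**\n\n"
    "{issue_specific_analysis}\n\n"
    "**RESPONSE FORMAT:**\n"
    "{response_format}\n"
)


def build_analysis_prompt(issue_type: str, analysis: str) -> str:
    """Fill the template with the response format that matches the issue."""
    # Assumed labels: "task_failure" and "quality_issue".
    if issue_type == "task_failure":
        response_format = FAILURE_FORMAT
    else:
        response_format = QUALITY_FORMAT
    # str.format does not re-scan substituted values, so the literal braces
    # inside the JSON format strings need no escaping.
    return TEMPLATE.format(
        issue_type=issue_type,
        issue_specific_analysis=analysis,
        response_format=response_format,
    )


print(build_analysis_prompt("quality_issue", "The result omits two sections."))
```

Keeping the two formats outside the template means the strategy list offered to the model (`create_worker` for failures, `reassign` for quality issues) changes with the issue type without duplicating the rest of the prompt.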

camel/societies/workforce/structured_output_handler.py

Lines changed: 30 additions & 18 deletions
@@ -19,8 +19,8 @@
 
 from camel.logger import get_logger
 from camel.societies.workforce.utils import (
-    RecoveryDecision,
     RecoveryStrategy,
+    TaskAnalysisResult,
     TaskAssignResult,
     WorkerConf,
 )
@@ -65,9 +65,9 @@ class StructuredOutputHandler:
                 r'description.*?:\s*"([^"]+)"'
             ),
         ],
-        'RecoveryDecision': [
-            r'"strategy"\s*:\s*"([^"]+)".*?"reasoning"\s*:\s*"([^"]+)"',
-            r'strategy.*?:\s*"([^"]+)".*?reasoning.*?:\s*"([^"]+)"',
+        'TaskAnalysisResult': [
+            r'"recovery_strategy"\s*:\s*"([^"]+)".*?"reasoning"\s*:\s*"([^"]+)"',
+            r'recovery_strategy.*?:\s*"([^"]+)".*?reasoning.*?:\s*"([^"]+)"',
         ],
     }
 
@@ -239,27 +239,38 @@ def _extract_with_schema_patterns(
                     except (IndexError, AttributeError):
                         continue
 
-        elif schema_name == 'RecoveryDecision':
+        elif schema_name == 'TaskAnalysisResult':
             for pattern in patterns:
                 match = re.search(pattern, text, re.DOTALL | re.IGNORECASE)
                 if match:
                     try:
-                        strategy = match.group(1)
+                        recovery_strategy = match.group(1)
                         reasoning = match.group(2)
                         # Look for modified_task_content
                         content_match = re.search(
                             r'"modified_task_content"\s*:\s*"([^"]*)"',
                             text,
                             re.IGNORECASE,
                         )
+                        # Look for quality_score (for quality evaluation)
+                        score_match = re.search(
+                            r'"quality_score"\s*:\s*(\d+)',
+                            text,
+                            re.IGNORECASE,
+                        )
                         return {
-                            'strategy': strategy,
+                            'recovery_strategy': recovery_strategy,
                             'reasoning': reasoning,
                             'modified_task_content': (
                                 content_match.group(1)
                                 if content_match
                                 else None
                             ),
+                            'quality_score': (
+                                int(score_match.group(1))
+                                if score_match
+                                else None
+                            ),
                         }
                     except (IndexError, AttributeError):
                         continue
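
The regex fallback in this hunk can be tried in isolation. The sketch below reuses the two `TaskAnalysisResult` patterns and the two follow-up searches from the diff inside a hypothetical standalone function, `extract_task_analysis`; it is not the handler's actual method, and the sample inputs are invented for illustration. It also shows that both patterns expect `recovery_strategy` to appear before `reasoning`:

```python
import re
from typing import Any, Dict, Optional

# Patterns copied from the 'TaskAnalysisResult' entry in the diff.
PATTERNS = [
    r'"recovery_strategy"\s*:\s*"([^"]+)".*?"reasoning"\s*:\s*"([^"]+)"',
    r'recovery_strategy.*?:\s*"([^"]+)".*?reasoning.*?:\s*"([^"]+)"',
]


def extract_task_analysis(text: str) -> Optional[Dict[str, Any]]:
    """Hypothetical helper: pull analysis fields out of loosely formed text."""
    for pattern in PATTERNS:
        match = re.search(pattern, text, re.DOTALL | re.IGNORECASE)
        if not match:
            continue
        # Optional fields are searched independently of the main pattern.
        content_match = re.search(
            r'"modified_task_content"\s*:\s*"([^"]*)"', text, re.IGNORECASE
        )
        score_match = re.search(
            r'"quality_score"\s*:\s*(\d+)', text, re.IGNORECASE
        )
        return {
            'recovery_strategy': match.group(1),
            'reasoning': match.group(2),
            'modified_task_content': (
                content_match.group(1) if content_match else None
            ),
            'quality_score': (
                int(score_match.group(1)) if score_match else None
            ),
        }
    return None


strategy_first = (
    '{"recovery_strategy": "replan", "reasoning": "Task was ambiguous", '
    '"modified_task_content": "List three sources", "quality_score": 40}'
)
reasoning_first = '{"reasoning": "Timeout", "recovery_strategy": "retry"}'

print(extract_task_analysis(strategy_first))
print(extract_task_analysis(reasoning_first))  # None: key order not matched
```

The response formats added in prompts.py list `reasoning` before `recovery_strategy`, so whether this fallback ever fires for output in that order depends on the earlier parsing steps, which this diff does not show.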
@@ -370,21 +381,22 @@ def _fix_common_issues(
                         else:
                             assignment['dependencies'] = []
 
-        elif schema_name == 'RecoveryDecision':
-            # Ensure strategy is valid
-            if 'strategy' in fixed_data:
-                strategy = fixed_data['strategy'].lower()
+        elif schema_name == 'TaskAnalysisResult':
+            # Ensure recovery_strategy is valid
+            if 'recovery_strategy' in fixed_data:
+                strategy = fixed_data['recovery_strategy'].lower()
                 valid_strategies = [
                     'retry',
                     'replan',
                     'decompose',
                     'create_worker',
+                    'reassign',
                 ]
                 if strategy not in valid_strategies:
                     # Try to match partial
                     for valid in valid_strategies:
                         if valid.startswith(strategy) or strategy in valid:
-                            fixed_data['strategy'] = valid
+                            fixed_data['recovery_strategy'] = valid
                             break
 
         return fixed_data
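
The partial-match repair above can be sketched as a standalone function. This is an illustration under stated assumptions, not the handler's code: `normalize_recovery_strategy` is a hypothetical name, and the `None` guard, the whitespace stripping, the empty-string guard, and writing the lowercased value back are additions that do not appear in the diff. The `None` guard is there because the quality-evaluation response format allows `recovery_strategy` to be null.

```python
from typing import Any, Dict

# Same order as the valid_strategies list in the diff; order matters because
# the first partial match wins.
VALID_STRATEGIES = ['retry', 'replan', 'decompose', 'create_worker', 'reassign']


def normalize_recovery_strategy(data: Dict[str, Any]) -> Dict[str, Any]:
    """Hypothetical helper: coerce a loose strategy string to a valid one."""
    fixed = dict(data)
    raw = fixed.get('recovery_strategy')
    if not isinstance(raw, str):
        # Assumption: leave null (or missing) strategies untouched.
        return fixed
    strategy = raw.strip().lower()
    if strategy in VALID_STRATEGIES:
        fixed['recovery_strategy'] = strategy
        return fixed
    for valid in VALID_STRATEGIES:
        # Assumption: skip empty strings, which would otherwise match 'retry'.
        if strategy and (valid.startswith(strategy) or strategy in valid):
            fixed['recovery_strategy'] = valid
            break
    return fixed


for value in ['RETRY', 'decomp', 'worker', 're', 'escalate', None]:
    print(value, '->', normalize_recovery_strategy({'recovery_strategy': value}))
```

Ambiguous fragments resolve by list order: `'re'` is a prefix of `retry`, `replan`, and `reassign`, and the loop settles on `retry` because it comes first. A value with no partial match, such as `'escalate'`, is passed through unchanged for later validation to reject.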
@@ -410,10 +422,10 @@ def _create_default_instance(schema: Type[BaseModel]) -> BaseModel:
                 sys_msg="You are a helpful assistant.",
                 description="A general-purpose worker",
             )
-        elif schema_name == 'RecoveryDecision':
-            return RecoveryDecision(
-                strategy=RecoveryStrategy.RETRY,
+        elif schema_name == 'TaskAnalysisResult':
+            return TaskAnalysisResult(
                 reasoning="Unable to parse response, defaulting to retry",
+                recovery_strategy=RecoveryStrategy.RETRY,
                 modified_task_content=None,
             )
         else:
@@ -482,11 +494,11 @@ def create_fallback_response(
                 description=f"Fallback worker for task: {task_content}...",
             )
 
-        elif schema_name == 'RecoveryDecision':
+        elif schema_name == 'TaskAnalysisResult':
             # Default to retry strategy
-            return RecoveryDecision(
-                strategy=RecoveryStrategy.RETRY,
+            return TaskAnalysisResult(
                 reasoning=f"Fallback decision due to: {error_message}",
+                recovery_strategy=RecoveryStrategy.RETRY,
                 modified_task_content=None,
             )
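
The fallback path above always resolves to `retry`. A rough sketch of that behavior, using stand-in types because the real `TaskAnalysisResult` and `RecoveryStrategy` live in `camel/societies/workforce/utils.py`, which this page does not show: `TaskAnalysisResultSketch`, its default values, and `fallback_analysis` are hypothetical, and the field list is inferred only from names visible in this diff.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import List, Optional


class RecoveryStrategy(str, Enum):
    """Stand-in enum; values mirror the strategy names used in the diff."""
    RETRY = "retry"
    REPLAN = "replan"
    DECOMPOSE = "decompose"
    CREATE_WORKER = "create_worker"
    REASSIGN = "reassign"


@dataclass
class TaskAnalysisResultSketch:
    """Hypothetical stand-in for TaskAnalysisResult (a Pydantic model in CAMEL)."""
    reasoning: str
    recovery_strategy: Optional[RecoveryStrategy] = None
    modified_task_content: Optional[str] = None
    quality_score: Optional[int] = None
    issues: List[str] = field(default_factory=list)


def fallback_analysis(error_message: str) -> TaskAnalysisResultSketch:
    """Mirror the diff's fallback: default to retry and record the cause."""
    return TaskAnalysisResultSketch(
        reasoning=f"Fallback decision due to: {error_message}",
        recovery_strategy=RecoveryStrategy.RETRY,
        modified_task_content=None,
    )


result = fallback_analysis("response was not valid JSON")
print(result.recovery_strategy.value, "|", result.reasoning)
```

Defaulting to retry keeps an unparseable analysis response from blocking the task, at the cost of one more attempt with unchanged content.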