Add additional example tasks (Titanic, Iris, Wine Quality, House Prices, Credit Default) to validate pipeline #1

@WalkingDevFlag

Summary

Add several small, public datasets as tasks to ensure the multi-agent pipeline works across regression and classification scenarios (binary + multiclass) and different feature types.

Proposed Tasks

  1. california-housing-prices (already present) – baseline regression
  2. house-prices-advanced (Kaggle House Prices) – richer regression with categorical encoding needs
  3. titanic-survival – binary classification
  4. iris-classification – small multiclass classification
  5. wine-quality (red) – regression with mixed numeric distributions
  6. credit-default (UCI / Kaggle variant) – binary classification with class imbalance

(Optionally later: adult-income, mnist-tabular (flattened), forest-covertype.)

Directory Layout (example)

machine_learning_engineering/tasks/
  titanic/
    train.csv
    test.csv
    task_description.txt
  iris/
    train.csv
    test.csv
    task_description.txt
  ...
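
A quick way to confirm that every task folder follows this layout is a small check like the one below. This is only a sketch: `find_incomplete_tasks` and `REQUIRED_FILES` are hypothetical names, not part of the existing pipeline, and the required file list is taken from the layout above.

```python
from pathlib import Path

# Files every task folder is expected to contain, per the layout above.
REQUIRED_FILES = ("train.csv", "test.csv", "task_description.txt")


def find_incomplete_tasks(tasks_root):
    """Return {task_name: [missing file names]} for incomplete task folders."""
    incomplete = {}
    for task_dir in sorted(Path(tasks_root).iterdir()):
        if not task_dir.is_dir():
            continue  # ignore stray files such as a data README
        absent = [name for name in REQUIRED_FILES if not (task_dir / name).is_file()]
        if absent:
            incomplete[task_dir.name] = absent
    return incomplete
```

Calling `find_incomplete_tasks("machine_learning_engineering/tasks")` should return an empty dict once all tasks are in place.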

Example task_description.txt (regression)

task_name: house-prices-advanced
target: SalePrice
id_column: Id
metric: rmse

Example task_description.txt (binary classification)

task_name: titanic
target: Survived
id_column: PassengerId
metric: f1
problem_type: classification
num_classes: 2

Example task_description.txt (multiclass)

task_name: iris
target: species
id_column: id
metric: accuracy
problem_type: classification
num_classes: 3

(If the original dataset has no id column, synthesize one.)
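
For reference, the `key: value` format above can be read with a few lines of standard-library Python, and a missing id column can be synthesized the same way. This is a sketch, not the pipeline's actual loader: `parse_task_description` and `add_id_column` are hypothetical helpers, and treating a missing `problem_type` as regression is an assumption based on the house-prices example omitting that field.

```python
import csv
import io


def parse_task_description(text):
    """Parse 'key: value' lines into a dict; blank lines are skipped."""
    config = {}
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        key, sep, value = line.partition(":")
        if not sep:
            raise ValueError(f"expected 'key: value', got {line!r}")
        config[key.strip()] = value.strip()
    # Assumption: no problem_type means regression (see the regression example).
    config.setdefault("problem_type", "regression")
    if "num_classes" in config:
        config["num_classes"] = int(config["num_classes"])
    return config


def add_id_column(csv_text, id_column="id"):
    """Prepend a 1-based sequential id column if the CSV does not have one."""
    rows = list(csv.reader(io.StringIO(csv_text)))
    header, body = rows[0], rows[1:]
    if id_column in header:
        return csv_text  # nothing to do
    out = io.StringIO()
    writer = csv.writer(out, lineterminator="\n")
    writer.writerow([id_column] + header)
    for i, row in enumerate(body, start=1):
        writer.writerow([i] + row)
    return out.getvalue()
```

If ids are synthesized, train.csv and test.csv should probably use non-overlapping ranges; the sketch above does not handle that.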

Acceptance Criteria

  • Each task runs end-to-end with run_pipeline.py without code modifications.
  • Generated workspace folders contain predictions without errors.
  • Metric calculation does not crash for either problem type (classification and regression are both handled).
  • README gains a short “Available Example Tasks” section.
  • Optional: lightweight smoke test added (pytest marker) for at least 2 tasks.
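
To illustrate the metrics criterion, one possible shape is a small dispatcher that maps each metric name from task_description.txt to a function and rejects a metric that does not match the declared problem type. Everything here (`METRICS`, `evaluate`, the pure-Python metric functions) is a hypothetical sketch, not the pipeline's existing code; the real implementation may well use scikit-learn instead.

```python
import math


def rmse(y_true, y_pred):
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))


def accuracy(y_true, y_pred):
    return sum(1 for t, p in zip(y_true, y_pred) if t == p) / len(y_true)


def f1(y_true, y_pred, positive=1):
    """Binary F1 for the given positive label; 0.0 when there are no true positives."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# metric name -> (problem type it applies to, function)
METRICS = {
    "rmse": ("regression", rmse),
    "accuracy": ("classification", accuracy),
    "f1": ("classification", f1),
}


def evaluate(metric, problem_type, y_true, y_pred):
    """Compute a metric, failing with a clear message instead of crashing later."""
    if metric not in METRICS:
        raise ValueError(f"unknown metric: {metric!r}")
    expected_type, fn = METRICS[metric]
    if expected_type != problem_type:
        raise ValueError(f"metric {metric!r} is for {expected_type}, not {problem_type}")
    if not y_true or len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must be non-empty and the same length")
    return fn(y_true, y_pred)
```

With this shape, a misconfigured task (for example `metric: f1` on a regression task) fails early with a readable error rather than partway through a run.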

Implementation Notes

  • Normalize column names (snake_case) if needed.
  • Ensure train.csv contains the target column and test.csv omits it.
  • Keep CSVs small (<200KB each) to avoid repo bloat.
  • Add a data README citing original sources + licenses.
  • If a task requires a different metric (e.g., log RMSE), document it as a future extension.
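
The first two notes could be checked mechanically while preparing each dataset. The sketch below is illustrative only: `to_snake_case` and `check_split` are hypothetical helpers, and the exact snake_case rules (split on lower-to-upper boundaries, collapse non-alphanumerics to underscores) are an assumption about what "normalize" should mean here.

```python
import re


def to_snake_case(name):
    """Convert names like 'SalePrice' or 'fixed acidity' to snake_case."""
    name = name.strip()
    # Insert an underscore at lower/digit -> upper boundaries: SalePrice -> Sale_Price
    name = re.sub(r"([a-z0-9])([A-Z])", r"\1_\2", name)
    # Collapse runs of spaces, dots, dashes, etc. into a single underscore
    name = re.sub(r"[^0-9A-Za-z]+", "_", name)
    return name.strip("_").lower()


def check_split(train_columns, test_columns, target, id_column):
    """Return a list of problems with the train/test headers (empty if fine)."""
    problems = []
    if target not in train_columns:
        problems.append(f"train is missing target column {target!r}")
    if target in test_columns:
        problems.append(f"test must not contain target column {target!r}")
    for split, columns in (("train", train_columns), ("test", test_columns)):
        if id_column not in columns:
            problems.append(f"{split} is missing id column {id_column!r}")
    return problems
```

Note that if column names are normalized, the `target` and `id_column` values in task_description.txt need the same treatment (for example `sale_price` rather than `SalePrice`).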

Checklist

  • Datasets vetted for license
  • task_description.txt for each
  • Updated README
  • Added smoke tests
  • Verified pipeline logs clean

Metadata


Labels: documentation, enhancement, good first issue, help wanted
