
Study the patterns of retraining the pipeline by looking at the experiment log #5

@luiscruz

Description


Could computational costs be reduced by using static analysis tools to detect particular frequent defects/issues?

Motivational example:

  • When executing the pipeline, there is often a mismatch of data shapes. This is only detected at run time, frequently after considerable execution time.
    • Do the computational cost savings justify the time costs/overheads involved in setting up the static analysis tools?
  • Can we map experiment logs to particular defects in the project?
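To make the motivational example concrete, here is a minimal sketch of the kind of cheap, up-front shape check such a tool could perform. Everything here is hypothetical: `check_shapes`, its spec format (symbolic dimension names like `"n"` plus fixed integers), and the example arrays are illustrative assumptions, not part of the project.

```python
import numpy as np

def check_shapes(spec, **arrays):
    """Validate array shapes against a spec before running an expensive
    pipeline step. Symbolic dimensions (e.g. 'n') must bind consistently
    across arrays; integer dimensions must match exactly.
    Hypothetical helper for illustration only."""
    bound = {}  # symbolic dimension name -> concrete size
    for name, expected in spec.items():
        arr = arrays[name]
        if arr.ndim != len(expected):
            raise ValueError(
                f"{name}: expected {len(expected)} dims, got {arr.ndim}")
        for dim, actual in zip(expected, arr.shape):
            if isinstance(dim, int):
                if actual != dim:
                    raise ValueError(
                        f"{name}: expected size {dim}, got {actual}")
            elif bound.setdefault(dim, actual) != actual:
                raise ValueError(
                    f"{name}: dim '{dim}' bound to {bound[dim]}, got {actual}")
    return True

# A train/label row-count mismatch is caught before training starts,
# instead of surfacing deep inside a long-running fit.
X = np.zeros((100, 20))
y = np.zeros(90)  # defect: 90 labels for 100 rows
try:
    check_shapes({"X": ("n", 20), "y": ("n",)}, X=X, y=y)
except ValueError as e:
    print("shape defect caught early:", e)
```

The check costs microseconds, so even a modest rate of caught defects could offset the setup overhead the question above asks about.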

Background / State of the Art:

  • Some environments (e.g. AWS) have similar infrastructure already setup for code-analysis to detect potential issues

Use of experiment logs to measure the reproducibility of data science experiments. (If multiple runs of an experiment are detected in the log, did they all result in the same outcome?)
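The parenthetical check above can be sketched mechanically: group logged runs by experiment and flag experiments whose repeated runs disagree. The `(experiment_id, outcome)` pair format is an assumed minimal log schema for illustration; a real experiment log would need its own parser.

```python
from collections import defaultdict

def reproducibility_report(log_entries):
    """Group log entries by experiment id and report whether repeated
    runs produced the same outcome. `log_entries` is an iterable of
    (experiment_id, outcome) pairs -- an assumed minimal log format.
    Experiments with a single run are skipped: reproducibility cannot
    be assessed from one observation."""
    runs = defaultdict(list)
    for exp_id, outcome in log_entries:
        runs[exp_id].append(outcome)
    return {
        exp_id: {"runs": len(outcomes),
                 "reproducible": len(set(outcomes)) == 1}
        for exp_id, outcomes in runs.items()
        if len(outcomes) > 1
    }

# exp-1 reproduces exactly; exp-2 drifts between runs.
log = [("exp-1", 0.91), ("exp-1", 0.91), ("exp-2", 0.74), ("exp-2", 0.78)]
print(reproducibility_report(log))
```

Exact equality of outcomes is the strictest possible criterion; metrics such as accuracy would likely need a tolerance threshold instead.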
