MLOps Code Repository Review #78

fmind opened this issue Apr 14, 2025 · 0 comments

This repository demonstrates a well-structured MLOps project, exhibiting characteristics across multiple maturity levels. It leverages modern Python tooling and MLOps practices, making it a solid foundation for building and deploying machine learning applications.

General Summary:

The repository showcases a mature MLOps project, evident from its comprehensive tooling, CI/CD workflows, documentation, and adherence to software engineering principles. The use of cruft for project templating, uv for package management, and MLflow for experiment tracking and model registry highlights a commitment to reproducibility, automation, and collaboration. The inclusion of notebooks for data processing and model explanation further enhances the project's usability and educational value.

Guidelines for Improvements:

While the repository demonstrates a high level of MLOps maturity, there are areas where further improvements can be made to reach GA (General Availability) level:

  • Enforced Test Coverage:

    • Issue: The CI workflow does not explicitly enforce a minimum test coverage percentage.
    • Fix: Modify the check-coverage task in tasks/check.just and the check.yml workflow so the build fails if coverage falls below a defined threshold (e.g., 80%), for instance via pytest-cov's --cov-fail-under option. This ensures that all new code is adequately tested.
  • Deterministic Builds:

    • Issue: While uv and constraints are used, no lock file (uv.lock) is committed to guarantee deterministic builds.
    • Fix: Generate and commit a uv.lock file to the repository (e.g., with uv lock). Update the build process (e.g., in the justfile or CI workflow) to install from the lock file (e.g., uv sync --frozen), ensuring that the exact same dependency versions are used across all environments.
  • Formal Release Management:

    • Issue: While a CHANGELOG.md exists and Git tags are likely used, the CI/CD workflow doesn't fully automate the release process, including generating release notes.
    • Fix: Enhance the publish.yml workflow to automatically create GitHub releases with release notes based on the CHANGELOG.md content when a new tag is pushed. This can be achieved using tools like semantic-release or custom scripts that parse the changelog and generate the release notes.
  • Comprehensive Documentation:

    • Issue: While API documentation is generated, the README lacks badges for key metrics like test coverage and code quality.
    • Fix: Add badges to the README.md file to display the build status, test coverage percentage, linting status (e.g., Ruff), and other relevant indicators. This provides a quick overview of the project's health and maturity.
  • Monitoring/Evaluation Artifacts:

    • Issue: The code does not include explicit jobs or scripts for model evaluation using tools like mlflow.evaluate or Evidently to generate evaluation reports.
    • Fix: Implement model evaluation jobs or scripts that use tools like mlflow.evaluate or Evidently to compute relevant metrics and generate evaluation reports. These reports should be saved as artifacts in MLflow for tracking and analysis (see the evaluation sketch after this list).
  • Lineage Tracking:

    • Issue: The code does not demonstrate the use of lineage tracking features like mlflow.log_input with MLflow Datasets.
    • Fix: Incorporate lineage tracking into the data processing and model training jobs. Use mlflow.log_input with MLflow Datasets to track the data sources and transformations used at each step of the pipeline (see the lineage sketch after this list).
  • Explainability Artifacts:

    • Issue: The code does not include jobs or scripts to generate model explanations (e.g., using SHAP) and save these as artifacts.
    • Fix: Add jobs or scripts that generate model explanations with tools like SHAP and save them as artifacts in MLflow. This allows for better understanding and debugging of model behavior (see the explainability sketch after this list).
  • Infrastructure Metrics Logging:

    • Issue: The code does not utilize system metrics logging (e.g., mlflow.start_run(log_system_metrics=True)).
    • Fix: Enable system metrics logging in relevant code sections (e.g., model training jobs) by using mlflow.start_run(log_system_metrics=True). This provides insight into the infrastructure resources used during model training and evaluation (see the system metrics sketch after this list).
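
Example Sketches:

The sketches below illustrate the MLflow-related fixes above. They are minimal examples under stated assumptions, not drop-in implementations for this repository; file paths, run names, model URIs, and column names are hypothetical placeholders.

For the evaluation reports, a job could call mlflow.evaluate on a held-out dataset and let the default evaluator log metrics, plots, and tables as run artifacts:

```python
import mlflow
import pandas as pd

# Hypothetical held-out dataset with feature columns and a "target" column.
eval_data = pd.read_parquet("data/eval.parquet")

with mlflow.start_run(run_name="evaluation"):
    # Evaluate a registered model on the held-out data; the default evaluator
    # computes standard metrics and logs plots/tables as run artifacts.
    results = mlflow.evaluate(
        model="models:/example-model/latest",  # hypothetical model URI
        data=eval_data,
        targets="target",
        model_type="regressor",  # or "classifier", depending on the task
        evaluators=["default"],
    )
    print(results.metrics)
```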
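
For lineage tracking, mlflow.data.from_pandas and mlflow.log_input can record the dataset consumed by a run; the file path and target column are assumptions for illustration:

```python
import mlflow
import mlflow.data
import pandas as pd

# Hypothetical training data loaded from a versioned file.
train_df = pd.read_parquet("data/train.parquet")

with mlflow.start_run(run_name="training"):
    # Wrap the DataFrame in an MLflow Dataset so its source, schema,
    # and digest are captured alongside the run.
    dataset = mlflow.data.from_pandas(
        train_df,
        source="data/train.parquet",
        name="training-data",
        targets="target",
    )
    # Record the dataset as an input of this run for lineage tracking.
    mlflow.log_input(dataset, context="training")
```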
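
For explainability artifacts, SHAP values and a summary plot can be logged as run artifacts; the toy scikit-learn model below stands in for the project's trained model:

```python
import matplotlib.pyplot as plt
import mlflow
import shap
from sklearn.datasets import load_diabetes
from sklearn.ensemble import RandomForestRegressor

# Toy stand-in for the project's trained model and feature data.
X, y = load_diabetes(return_X_y=True, as_frame=True)
model = RandomForestRegressor(n_estimators=50, random_state=0).fit(X, y)
sample = X.sample(200, random_state=0)

with mlflow.start_run(run_name="explanation"):
    # Compute SHAP values for a representative sample of the features.
    explainer = shap.TreeExplainer(model)
    shap_values = explainer.shap_values(sample)
    # Render the SHAP summary plot and log it as an MLflow artifact.
    shap.summary_plot(shap_values, sample, show=False)
    plt.savefig("shap_summary.png", bbox_inches="tight")
    mlflow.log_artifact("shap_summary.png")
```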
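
For infrastructure metrics, a run can opt into system metrics logging; this assumes psutil is available, which MLflow uses to sample resource usage in the background:

```python
import mlflow

# Opt this run into system metrics logging; MLflow periodically samples CPU,
# memory, disk, and network usage and records them as run metrics.
with mlflow.start_run(run_name="training", log_system_metrics=True):
    ...  # model training code for the job goes here
```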