data-science-task

#==============================================================================

@author: Dr. Leila Yousefi

#==============================================================================

Contents

Repository and Git Setup

Methods and Model - summary of the quantitative methods used for modelling.

Inputs

Outputs

Higher-level Process flow Diagrams - process flow diagrams of the main inputs and outputs of the Model.

Aim

Objectives

Background Knowledge

Control Assumptions and Sensitivity Analysis

Data Sources

Model Calculation - details of model calculation

Feature Engineering and Data Preparation

Implementing the Demand Forecasting for LPA Model - details of model scripts

Repository and Git Setup

Repository layout

project-name/
├── data/                      # raw & processed data (git-ignored)
│   ├── raw/                   # original, untouched data
│   └── processed/             # cleaned or transformed data outputs
├── notebooks/                 # exploratory and analysis notebooks
│   └── data-analysis.ipynb    # data analysis notebook
├── src/                       # Python modules and scripts
│   ├── __init__.py            # marks this folder as a package
│   ├── data_processing.py     # reusable data-loading and cleaning functions
│   └── data_quality.py        # Data Quality Checks class
├── tests/                     # automated unit tests
│   └── test_data_quality.py   # pytest tests for Data Quality
├── .github/
│   └── workflows/
│       └── ci.yml
├── .gitignore                 # excludes data/, environment folders, caches, and any large files from Git
├── README.md                  # project overview and setup instructions
└── requirements.txt           # pinned (frozen) Python dependency versions, for consistent environments across machines

Implementation

git clone git@github.com:your-org/project-name.git
cd project-name
python3 -m venv .venv
source .venv/bin/activate
pip install -r requirements.txt

Run tests locally

# Discover & run all tests in tests/
pytest --maxfail=1 --disable-warnings -q

Or, from a JupyterLab notebook cell:

!pytest -q
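For orientation, the tests that pytest discovers in tests/ could look like the sketch below. This README does not show the contents of src/data_quality.py, so the `DataQualityChecker` class, its `missing_counts` method, and the sample rows are hypothetical stand-ins. In the repository the class would be imported from src/ rather than defined next to the tests.

```python
"""Sketch of tests/test_data_quality.py (all names here are hypothetical)."""


class DataQualityChecker:
    """Hypothetical stand-in for the Data Quality Checks class in src/data_quality.py."""

    def __init__(self, rows):
        # rows: list of dicts, one dict per record
        self.rows = rows

    def missing_counts(self):
        """Return {column: number of rows where the value is None or ''}."""
        counts = {}
        for row in self.rows:
            for column, value in row.items():
                counts.setdefault(column, 0)
                if value is None or value == "":
                    counts[column] += 1
        return counts


# pytest collects any function whose name starts with "test_".
def test_missing_counts_flags_empty_values():
    rows = [{"id": 1, "age": 34}, {"id": 2, "age": None}, {"id": 3, "age": ""}]
    assert DataQualityChecker(rows).missing_counts() == {"id": 0, "age": 2}


def test_missing_counts_on_empty_input():
    assert DataQualityChecker([]).missing_counts() == {}
```

Plain `assert` statements are enough here: pytest rewrites them so that a failure reports both sides of the comparison.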

Automate via CI (GitHub Actions)

Create a file .github/workflows/ci.yml:
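The workflow file itself is not reproduced in this README. A minimal sketch of what it could contain follows. The Python version, the runner image, the triggers, and the action versions are assumptions rather than values taken from the repository, and the sketch assumes pytest is listed in requirements.txt.

```yaml
# .github/workflows/ci.yml (illustrative sketch; versions and triggers are assumptions)
name: CI

on:
  push:
  pull_request:
    branches: [main]

jobs:
  test:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: actions/setup-python@v5
        with:
          python-version: "3.11"
      - name: Install dependencies
        run: pip install -r requirements.txt
      - name: Run tests
        run: pytest --maxfail=1 --disable-warnings -q
```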

# Discover & run all tests in tests/
pytest --maxfail=1 --disable-warnings -q

-q gives you a concise report, and --maxfail=1 stops the run at the first failing test.

On success you'll see a one-line summary such as 10 passed in 0.50s.

On failure you'll get the full assertion traceback for the failing test.

Workflow

data: place raw files in data/raw/.
data: place cleaned / transformed data files in data/processed/.
notebooks: work in notebooks/data_analysis.ipynb.
modularise: once stable, move functions into src/.
version: commit early & often:
review: open a PR against main when ready.

git add .
git commit -m "EDA: initial missing-value summary"
git push origin main
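As an illustration of the "modularise" step, a notebook cell that reads from data/raw/ and writes to data/processed/ might be moved into src/data_processing.py as a function like the one sketched here. The function name `clean_csv` and its cleaning rules (trim whitespace, drop blank rows) are hypothetical; the README does not describe the actual functions in that module.

```python
"""Sketch of a reusable function for src/data_processing.py (hypothetical)."""
import csv
from pathlib import Path


def clean_csv(raw_path, processed_path):
    """Copy a CSV from data/raw/ to data/processed/, trimming whitespace and
    dropping rows whose fields are all empty. Returns the number of rows written."""
    raw_path, processed_path = Path(raw_path), Path(processed_path)
    # Create data/processed/ (or any target folder) if it does not exist yet.
    processed_path.parent.mkdir(parents=True, exist_ok=True)

    with raw_path.open(newline="") as src:
        reader = csv.DictReader(src)
        fieldnames = reader.fieldnames or []
        cleaned = []
        for row in reader:
            # Missing fields come back as None, so fall back to "".
            trimmed = {key: (row.get(key) or "").strip() for key in fieldnames}
            if any(trimmed.values()):  # skip rows that are entirely blank
                cleaned.append(trimmed)

    with processed_path.open("w", newline="") as dst:
        writer = csv.DictWriter(dst, fieldnames=fieldnames)
        writer.writeheader()
        writer.writerows(cleaned)
    return len(cleaned)
```

Once a function like this lives in src/, the notebook only needs an import and a call, and tests/ can exercise the same code path that the analysis uses.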

Version Control Workflow
- Branching: Create a feature branch, e.g. feature/LY-analysis-01.
- Commits: After each major step—EDA, modeling, evaluation—commit with a clear message:
```
git add notebooks/01_analysis_template.ipynb src/data_processing.py
git commit -m "EDA: added missing-value visualization"
```
- Pull Request: When complete, open a PR against main and tag reviewers.
- Scope: commit code, tests, and requirements.txt changes together.
- CI: add GitHub Actions workflow to enforce tests on every push.
- Verify on GitHub: Navigate to the Actions tab of your repository on GitHub. You should see the “CI” workflow queued or running.

Sharing with the Panel
- Push your branch to GitHub: git push origin feature/analysis-X.
- At discussion time, share the GitHub link to your notebook so the panel can view the commit history, code annotations, and outputs live.

If data/ was committed before it was listed in .gitignore, add it to .gitignore and then stop tracking it:

# Create the file if it doesn't exist
touch .gitignore
# Remove data/ from the index (but leave the files on disk)
git rm -r --cached data/
# Commit the change
git commit -m "Remove data folder from tracking per .gitignore"
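
The commands above only take effect as intended if .gitignore actually lists data/. A small helper can add any missing patterns before running git rm --cached; the version sketched here is an assumption about how one might do it, and both the function name `ensure_gitignore` and the default pattern list are hypothetical.

```python
"""Sketch: make sure .gitignore lists the patterns the repository layout expects."""
from pathlib import Path

# Assumed patterns (data, virtual environment, caches); adjust to the project.
DEFAULT_PATTERNS = ("data/", ".venv/", "__pycache__/", ".ipynb_checkpoints/")


def ensure_gitignore(repo_root=".", patterns=DEFAULT_PATTERNS):
    """Append any missing patterns to .gitignore and return the ones added."""
    path = Path(repo_root) / ".gitignore"
    existing = path.read_text().splitlines() if path.exists() else []
    present = {line.strip() for line in existing}
    missing = [pattern for pattern in patterns if pattern not in present]
    if missing:
        # Keep existing entries untouched and add the new ones at the end.
        path.write_text("\n".join(existing + missing) + "\n")
    return missing
```

The function is idempotent: running it a second time finds nothing missing and leaves the file unchanged.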

python src/analyzers.py

Methods and Model

Aim:

Objective: 1.
