This project contains a working streamlit web app to visualize and work with geo data from the public dataset provided by the RIVM of chemical concentrations measured using water samples.

FutureFacts/politie-watervervuilers

Water Polluters (Politie watervervuilers) Project

Description

This project contains a working Streamlit web app for visualizing and working with geo data from the dataset provided by the RIVM of chemical concentrations measured in water samples. The project also contains several notebooks that were used, among other things, to explore the dataset, to experiment with factorization machines for filling gaps in the sparse dataset, to explore the hydrological structure of the Dutch waters in order to connect the different sampling locations, and to perform feature engineering on the dataset.
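For readers unfamiliar with factorization machines, the sketch below shows how a second-order factorization machine scores a single feature vector. It is a generic illustration and not the code from the notebooks: the function name `fm_predict`, the parameter layout, and the idea of one-hot encoding a (location, substance) pair are all assumptions made for this example.

```python
def fm_predict(x, w0, w, v):
    """Score one feature vector with a second-order factorization machine.

    x  : list of feature values (hypothetically a one-hot encoded
         location plus a one-hot encoded substance)
    w0 : global bias
    w  : list of per-feature linear weights, same length as x
    v  : list of per-feature latent vectors, each of length k
    """
    n = len(x)
    k = len(v[0]) if v else 0

    # Linear part: w0 + sum_i w_i * x_i
    linear = w0 + sum(w[i] * x[i] for i in range(n))

    # Pairwise part, using the standard O(k * n) identity:
    # sum_{i<j} <v_i, v_j> x_i x_j
    #   = 0.5 * sum_f [ (sum_i v_if x_i)^2 - sum_i (v_if x_i)^2 ]
    pairwise = 0.0
    for f in range(k):
        total = sum(v[i][f] * x[i] for i in range(n))
        total_of_squares = sum((v[i][f] * x[i]) ** 2 for i in range(n))
        pairwise += 0.5 * (total * total - total_of_squares)

    return linear + pairwise
```

Because the pairwise weights are built from latent vectors, the model can score a (location, substance) pair that never occurs together in the training data, which is what makes this family of models a candidate for filling gaps.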

The project is set to be public, as requested by the client: Nationale Politie.

General info

This project contains basic Python project features:

  • structure of the python project
    • readme and changelog (README.md, CHANGELOG.md)
    • package namespacing (e.g. ff)
  • running multiple checks
    • code style
    • type checking with mypy
    • formatting and linting with ruff

Installation

  1. Setup

    Follow the instructions in the explanation folder regarding the set-up of a Python project on your local machine.

  2. Data

    Currently, you need to download the dataset into the data folder yourself. The data is stored in Azure Blob Storage and is accessed through a SAS (shared access signature), which you can request from one of the repo admins. Use the Azure Storage Explorer to view and download the data with the SAS.

    Download the data directory from the ff-politie-pollution blob.

    Make sure to move the downloaded data and file structure as-is to the data folder.
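After copying the files, a quick sanity check can confirm that the data folder exists and is not empty. The helper below is a minimal sketch and not part of the repository: the name `list_data_files` is hypothetical, and it assumes the folder is called `data` and sits in the directory you run the code from.

```python
from pathlib import Path


def list_data_files(data_dir="data"):
    """Return the relative paths of all files under data_dir, sorted."""
    root = Path(data_dir)
    if not root.is_dir():
        raise FileNotFoundError(f"Data folder not found: {root}")
    # Walk the folder recursively and keep files only, not directories.
    return sorted(
        path.relative_to(root).as_posix()
        for path in root.rglob("*")
        if path.is_file()
    )
```

Calling `list_data_files()` from the repository root and comparing the result with the listing shown in the Azure Storage Explorer is one way to verify that the file structure was moved as-is.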

Usage

To use this repository, perform the following steps:

  • Clone the repository
  • Start developing

Development guide

Requirements

  • A supported Python version installed (see pyproject.toml). See HERE.
  • The poetry dependency manager. Install it as described HERE.

Setup

  • Make sure you have poetry installed
  • Make sure you work with pyenv and that python --version reports the correct Python version
  • Make sure you work in a Python virtual environment
    • On macOS and Linux, poetry can create a virtual environment for you automatically.
    • On Windows, create one manually with venv: python -m venv .venv
  • poetry install
  • poetry run pre-commit install
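The first few setup items can be checked with a short script before running `poetry install`. This is only a sketch under assumptions: the helper names `missing_tools` and `in_virtualenv` are hypothetical, and the check only tests whether the executables are on your PATH, not whether their versions match pyproject.toml.

```python
import shutil
import sys


def missing_tools(tools=("poetry", "pyenv"), which=shutil.which):
    """Return the tools from `tools` that cannot be found on PATH."""
    return [tool for tool in tools if which(tool) is None]


def in_virtualenv():
    """Return True when the running interpreter belongs to a virtual environment."""
    # Inside a venv, sys.prefix points at the environment while
    # sys.base_prefix still points at the base installation.
    return sys.prefix != sys.base_prefix
```

For example, `print(missing_tools(), in_virtualenv())` should ideally print an empty list and `True` once the environment is set up.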

make/run script?

A Makefile might not work for you on Windows, so we've added scripts/make.ps1 as a Windows alternative. It should work almost the same way.
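If you switch between operating systems, a small wrapper can pick the right entry point for you. The sketch below is not part of the repository: `make_command` is a hypothetical helper, and it assumes that scripts/make.ps1 accepts the target name as its first argument, which you should verify against the script itself.

```python
import platform


def make_command(target, system=None):
    """Build the command line that runs a make target on the current platform."""
    system = system or platform.system()
    if system == "Windows":
        # Assumption: scripts/make.ps1 takes the target name as its first argument.
        return ["powershell", "-File", "scripts/make.ps1", target]
    return ["make", target]
```

The resulting list can be passed to `subprocess.run(make_command("check"), check=True)` from the repository root.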

Checks

To run all checks (tests, linting, etc), run:

make check

Tests

To run the tests, use the following command:

make pytest

Type checking

make type-check

Style and formatting

ruff can automatically format code, so you don't have to worry about the nitty-gritty of code style. To check style and formatting, run:

make check_ruff

To fix issues in the repository you can format everything with ruff via this command:

make format

Contributing

To contribute, create a pull request. All checks must pass before the pull request can be merged; check the GitHub Actions workflows to confirm.

Please update the CHANGELOG.md and the version in pyproject.toml accordingly.

Tags

Streamlit, Data Science, Web App, Geo Data, Water pollution, Outlier detection.
