The Preprocessor project is a flexible and modular application for preprocessing time series data. It supports dynamic loading of plugins for various preprocessing tasks such as normalization, unbiasing, trimming, and feature selection. Each plugin can save and load its parameters for consistent preprocessing across different datasets.
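The idea of saving and reloading plugin parameters can be illustrated with a minimal sketch. The class name `NormalizerPlugin` and the methods `fit`, `process`, `save_params`, and `load_params` are hypothetical names chosen for this illustration; they are not taken from the project's actual plugin interface. Min-max scaling and the JSON file format are likewise assumptions.

```python
import json
import os
import tempfile


class NormalizerPlugin:
    """Hypothetical min-max normalizer; NOT the project's actual plugin API."""

    def __init__(self):
        self.params = {}

    def fit(self, values):
        # Learn the scaling parameters from a reference dataset.
        self.params = {"min": min(values), "max": max(values)}

    def process(self, values):
        lo, hi = self.params["min"], self.params["max"]
        span = (hi - lo) or 1.0  # avoid division by zero on a constant series
        return [(v - lo) / span for v in values]

    def save_params(self, path):
        with open(path, "w") as f:
            json.dump(self.params, f)

    def load_params(self, path):
        with open(path) as f:
            self.params = json.load(f)


# Fit on one dataset and persist the learned parameters.
fitted = NormalizerPlugin()
fitted.fit([10.0, 20.0, 30.0])

with tempfile.TemporaryDirectory() as tmp:
    params_path = os.path.join(tmp, "normalizer_params.json")
    fitted.save_params(params_path)

    # A fresh instance reloads the parameters instead of refitting.
    reused = NormalizerPlugin()
    reused.load_params(params_path)

# The second dataset is scaled with the first dataset's min and max.
scaled = reused.process([15.0, 30.0, 45.0])
print(scaled)
```

Because the second dataset is scaled with the parameters learned from the first, values outside the original range map outside [0, 1], which is what keeps the preprocessing consistent across datasets.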
To install and set up the Preprocessor application, follow these steps:
- Clone the Repository:
      git clone https://github.com/harveybc/preprocessor.git
      cd preprocessor
- Create and Activate a Virtual Environment:
  - Using venv (Python 3.3+):
        python -m venv env
        source env/bin/activate  # On Windows use `env\Scripts\activate`
  - Using conda:
        conda create --name preprocessor_env python=3.9
        conda activate preprocessor_env
- Install Dependencies:
      pip install --upgrade pip
      pip install -r requirements.txt
- Build the Package:
      python -m build
- Install the Package:
      pip install .
- Run the Preprocessor:
  - On Windows, run the following command to verify the installation (it uses all default values; use preprocessor.bat --help for a complete description of the command line arguments):
        preprocessor.bat
  - On Linux, run:
        sh preprocessor.sh
- Run Tests (Optional, requires external repo): passing the remote tests requires an instance of harveybc/data-logger.
  - On Windows, run the following commands to run the tests:
        set_env.bat
        pytest
  - On Linux, run:
        sh ./set_env.sh
        pytest
- Generate Documentation (Optional):
  - Run the following command to generate code documentation in HTML format in the docs directory:
        pdoc --html -o docs app
Running the Preprocessor with the --help parameter should display the help message for the application, confirming that the installation was successful.
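If the installation misbehaves, one thing worth checking is whether the virtual environment from the setup steps is actually active. The sketch below is not part of the project. The function name `active_environment` is hypothetical, and it relies only on standard interpreter behavior: inside a `venv`, `sys.prefix` differs from `sys.base_prefix`, and `conda activate` sets the `CONDA_DEFAULT_ENV` environment variable.

```python
import os
import sys


def active_environment(prefix, base_prefix, environ):
    """Classify the environment as 'venv', 'conda:<name>', or 'none'."""
    # Inside a venv, sys.prefix points at the environment directory while
    # sys.base_prefix still points at the interpreter it was created from.
    if prefix != base_prefix:
        return "venv"
    # `conda activate` exports the name of the active environment.
    conda_env = environ.get("CONDA_DEFAULT_ENV")
    if conda_env:
        return "conda:" + conda_env
    return "none"


current = active_environment(sys.prefix, sys.base_prefix, os.environ)
print(current)
```

A result of `none` suggests that `pip install -r requirements.txt` would install into the system interpreter rather than into the project's environment.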
You can use the Preprocessor application from the command line with various plugins. Below are some examples:
preprocessor.bat
Use the -h or --help parameter to get help on the parameters.
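For scripted use, the help command can also be issued from Python. This is a minimal sketch rather than anything shipped with the project: `help_command` and `show_help` are hypothetical helper names, and the Linux branch assumes that `preprocessor.sh` forwards its arguments to the application in the same way `preprocessor.bat` does.

```python
import subprocess
import sys


def help_command(platform=sys.platform):
    """Return the argument list that asks the launcher script for its help text."""
    if platform.startswith("win"):
        return ["preprocessor.bat", "--help"]
    # Assumption: the shell launcher passes --help through to the application.
    return ["sh", "preprocessor.sh", "--help"]


def show_help():
    """Run the launcher and return (exit code, captured help text).

    Call this from the repository root, where the launcher scripts live.
    """
    result = subprocess.run(help_command(), capture_output=True, text=True)
    return result.returncode, result.stdout


print(help_command("win32"))
print(help_command("linux"))
```

Keeping the command construction separate from the call to `subprocess.run` makes the platform switch easy to check without the launcher scripts being present.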
preprocessor/
│
├── app/                                     # Main application package
│   ├── __init__.py                          # Initializes the Python package
│   ├── main.py                              # Entry point for the application
│   ├── config.py                            # Configuration settings for the app
│   ├── cli.py                               # Command line interface handling
│   ├── data_handler.py                      # Module to handle data loading
│   ├── default_plugin.py                    # Default plugin (normalizer)
│   └── plugins/                             # Plugins directory
│       ├── __init__.py                      # Makes plugins a Python package
│       ├── plugin_unbiaser.py
│       ├── plugin_trimmer.py
│       ├── plugin_feature_selector_pre.py
│       └── plugin_feature_selector_post.py
│
├── tests/                                   # Test modules for your application
│   ├── __init__.py                          # Initializes the Python package for tests
│   ├── test_preprocessor.py                 # Tests for preprocessor functionality
│   ├── datasets/                            # Test datasets directory
│   └── configs/                             # Test configurations directory
│
├── setup.py                                 # Setup file for the package installation
├── README.md                                # Project description and instructions
├── requirements.txt                         # External packages needed
└── .gitignore                               # Specifies intentionally untracked files to ignore
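The layout above keeps each plugin in its own module under `app/plugins/`, which lends itself to loading plugins by file at run time. The following is a minimal sketch of that idea using the standard `importlib` machinery. It is not the project's actual loader: the `load_plugin` helper, the `Plugin` class name, and the `process` method are all hypothetical, and the stand-in plugin is written to a temporary directory so the example runs on its own.

```python
import importlib.util
import os
import tempfile


def load_plugin(path, class_name="Plugin"):
    """Import the module at `path` and return an instance of `class_name`."""
    module_name = os.path.splitext(os.path.basename(path))[0]
    spec = importlib.util.spec_from_file_location(module_name, path)
    module = importlib.util.module_from_spec(spec)
    spec.loader.exec_module(module)  # runs the plugin file's top-level code
    return getattr(module, class_name)()


# Stand-in plugin source; the real plugin_trimmer.py is not reproduced here.
PLUGIN_SOURCE = '''
class Plugin:
    """Hypothetical trimmer: drops the first n rows."""

    def process(self, rows, n=1):
        return rows[n:]
'''

with tempfile.TemporaryDirectory() as plugins_dir:
    plugin_path = os.path.join(plugins_dir, "plugin_trimmer.py")
    with open(plugin_path, "w") as f:
        f.write(PLUGIN_SOURCE)

    plugin = load_plugin(plugin_path)
    trimmed = plugin.process([1, 2, 3, 4], n=2)

print(trimmed)
```

Loading by file path means a new plugin only needs to be dropped into the plugins directory; nothing in the core application has to import it by name.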