This repo was built to achieve the following goals:
- Clean the client data
- Upload that data to the Yelo SaaS platform
- At the time this repo was created, the Yelo SaaS platform did not support bulk data upload via its API.
- Uploading the data as a CSV file is not an option because the dataset is too large (2M records) and the CSV format doesn't support all the fields.
- There are only individual endpoints, so uploading the data customer by customer, one request at a time, takes too long.
- Use an async and concurrent algorithm for speed
- Use a semaphore to stay under the API rate limit (300 requests per second)
- Only the environment variables below need to be set (API credentials and input/output file locations)

| Name | Description |
|---|---|
| `YELO_API_KEY` | API key |
| `YELO_API_BASE_URL` | Base URL |
| `RAW_DATA_DIR` | Directory where the raw data is stored |
| `RAW_DATA_FILE_NAME` | Name of the file with the raw data |
| `CLEAN_DATA_DIR` | Directory where the clean data is stored |
| `CLEAN_DATA_FILE_NAME` | Name of the file with the clean data |
| `RESULTS_DIR` | Directory where the results of the upload attempt are stored |
| `RESULTS_FILE_NAME` | Name of the file with the results of the upload attempt |
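
The async, semaphore-limited upload described above can be sketched roughly as follows. This is a minimal illustration and not the repo's actual code: `upload_customer`, `upload_with_limit`, `upload_all`, `clean_data_path`, and the `MAX_CONCURRENT_REQUESTS` value are all hypothetical, and the HTTP call is replaced by a short sleep so the snippet runs with the standard library only. Note that a semaphore caps the number of requests in flight, not the number of requests per second, so the cap has to be tuned against the observed request latency to stay under the 300 hits per sec limit.

```python
import asyncio
import os

# Hypothetical cap on in-flight requests (assumption, not taken from the repo).
MAX_CONCURRENT_REQUESTS = 100


async def upload_customer(customer: dict) -> dict:
    """Hypothetical stand-in for one request to a single-customer endpoint."""
    await asyncio.sleep(0.01)  # simulates network latency
    return {"id": customer["id"], "status": "ok"}


async def upload_with_limit(customer: dict, semaphore: asyncio.Semaphore) -> dict:
    # At most MAX_CONCURRENT_REQUESTS coroutines are inside this block at once.
    async with semaphore:
        try:
            return await upload_customer(customer)
        except Exception as exc:
            # Record the failure instead of aborting the whole batch.
            return {"id": customer["id"], "status": "error", "detail": str(exc)}


async def upload_all(customers: list[dict]) -> list[dict]:
    semaphore = asyncio.Semaphore(MAX_CONCURRENT_REQUESTS)
    tasks = [upload_with_limit(c, semaphore) for c in customers]
    # gather returns results in the same order as the input list.
    return await asyncio.gather(*tasks)


def clean_data_path() -> str:
    # Built from the env vars in the table; the fallbacks are placeholders.
    return os.path.join(
        os.environ.get("CLEAN_DATA_DIR", "."),
        os.environ.get("CLEAN_DATA_FILE_NAME", "clean.csv"),
    )


if __name__ == "__main__":
    customers = [{"id": i} for i in range(500)]  # dummy records
    results = asyncio.run(upload_all(customers))
    print(len(results), "uploads attempted")
```

In a real run, the customer records would be read from the file at `clean_data_path()` and the per-customer results written to the location given by `RESULTS_DIR` and `RESULTS_FILE_NAME`.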