This project aims to enhance data quality, automate workflows, and improve accessibility for multiple projects, including Suplari Implementation, P Card, and Travel Analytics. The primary objective is to develop a robust, automated matching framework for AIC files that ensures consistent data across platforms and enables accurate reporting and analysis.
- Registering and Getting Access to Datasets in EDP - LinkedIn Corporate Wiki
- Registering and Getting Access with EDP CLI
- EDP Metadata on Visual Studio
- Requesting Access to EDP Tables
- Connecting Power BI to EDP
- Oracle to Hadoop Filer based Data Push
- Near Real-Time Data for Operational Analytics/Reporting
- Finance Data Platform - LinkedIn Corporate Wiki
- Master Data Exports
- Travel BCD - Finance Data Platform
- Revenue Analytics - Ada Ma
- Requesting SGP-ENG Group Memberships
- Access to Oracle EBS Database - LinkedIn JIRA
- EBS Instances - Information Services & Technology
- `docs/` - Documentation files including data dictionary, architecture, and analysis notes.
- `config/` - Configuration files for database connections, environment settings, and mapping schemas.
- `sql/` - SQL scripts categorized into raw data extraction, transformations, aggregations, and matching logic.
- `etl/` - ETL pipeline code with modular functions for data normalization, matching, enrichment, and validation.
- `scripts/` - Utility scripts for validating matches, checking data quality, and generating reports.
- `data_samples/` - Sample data files (sanitized) for testing ETL processes, including raw and processed data.
The following tables and columns are key to the Matching - AIC File project:
| Table Name | Key Columns |
|---|---|
| GL_DAILY_RATES | Conversion Date, From Currency, To Currency, Conversion Rate |
| HR_OPERATING_UNITS | Organization ID, Name, Location |
| HZ_CUST_ACCOUNTS | Account Number, Customer ID, Customer Name |
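To illustrate how GL_DAILY_RATES is typically used downstream, here is a minimal sketch of a daily-rate currency conversion. The in-memory dictionary and sample rates are stand-ins for illustration only; the real values would come from the Oracle EBS extract, and the actual lookup logic lives in the repository's ETL code.

```python
from datetime import date

# Hypothetical stand-in for GL_DAILY_RATES rows:
# (conversion_date, from_currency, to_currency) -> conversion_rate.
daily_rates = {
    (date(2024, 1, 2), "EUR", "USD"): 1.10,
    (date(2024, 1, 2), "GBP", "USD"): 1.27,
}

def convert(amount, from_ccy, to_ccy, on_date):
    """Convert an amount using the daily rate for the given date.

    Raises KeyError if no rate is loaded for that date and pair.
    """
    if from_ccy == to_ccy:
        return amount
    rate = daily_rates[(on_date, from_ccy, to_ccy)]
    return amount * rate

print(round(convert(100.0, "EUR", "USD", date(2024, 1, 2)), 2))  # 110.0
```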
- Clone the repository.
- Install required packages by running `pip install -r requirements.txt`.
- Set up database connections by configuring `db_connections.yaml` in the `config/` folder.
- Prepare sample data in the `data_samples/` folder for initial testing.
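As a rough guide to the connection-setup step, a `db_connections.yaml` might look like the sketch below. Every key name and value here is hypothetical; check the repository's actual config schema before using it, and keep credentials in environment variables rather than in the file.

```yaml
# Hypothetical layout for config/db_connections.yaml -- the keys the
# ETL code actually reads may differ.
oracle_ebs:
  host: ebs-db.example.com
  port: 1521
  service_name: EBSPROD
  user: ${EBS_USER}          # resolved from the environment, never committed
  password: ${EBS_PASSWORD}  # resolved from the environment, never committed
```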
The main ETL pipeline is located in `etl/main_etl_pipeline.py`. You can run the entire ETL process or execute individual modules as needed:

```
python etl/main_etl_pipeline.py
```
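The pipeline's modular stages (normalize, match, enrich, validate) can be sketched as a simple chain of functions. The function names and record shapes below are illustrative, not the repository's actual API; they only show how exact account-number matching and a match-rate check might compose.

```python
# Illustrative chain of the ETL stages; not the real main_etl_pipeline.py.

def normalize(records):
    # Trim whitespace and upper-case account numbers for stable joins.
    return [{**r, "account_number": r["account_number"].strip().upper()}
            for r in records]

def match(records, accounts):
    # Attach customer IDs by exact account-number lookup (HZ_CUST_ACCOUNTS-style).
    by_account = {a["account_number"]: a["customer_id"] for a in accounts}
    for r in records:
        r["customer_id"] = by_account.get(r["account_number"])
    return records

def validate(records):
    # A record counts as "matched" once it carries a customer ID.
    matched = sum(1 for r in records if r["customer_id"] is not None)
    return matched, len(records)

raw = [{"account_number": " acc-001 "}, {"account_number": "ACC-999"}]
accounts = [{"account_number": "ACC-001", "customer_id": 42}]
matched, total = validate(match(normalize(raw), accounts))
print(f"matched {matched} of {total}")  # matched 1 of 2
```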
- `validate_matches.py` - Checks matching accuracy and provides metrics on match rates.
- `check_data_quality.py` - Runs data quality checks to identify inconsistencies.
- `generate_reports.py` - Generates reports on the matching process and progress.
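To give a feel for the kind of checks `check_data_quality.py` performs, here is a minimal sketch that flags nulls in key columns and duplicate account numbers. The rule set and report format are assumptions; the script's actual checks are defined in the repository.

```python
from collections import Counter

def quality_report(rows, key_columns):
    # Count nulls/empties per key column and collect duplicated account numbers.
    nulls = {c: sum(1 for r in rows if r.get(c) in (None, ""))
             for c in key_columns}
    counts = Counter(r.get("account_number") for r in rows)
    duplicates = [k for k, n in counts.items() if k and n > 1]
    return {"null_counts": nulls, "duplicate_accounts": duplicates}

rows = [
    {"account_number": "ACC-001", "customer_name": "Acme"},
    {"account_number": "ACC-001", "customer_name": ""},
    {"account_number": None, "customer_name": "Globex"},
]
report = quality_report(rows, ["account_number", "customer_name"])
print(report["null_counts"])         # {'account_number': 1, 'customer_name': 1}
print(report["duplicate_accounts"])  # ['ACC-001']
```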
- `data_dictionary.md` - Detailed documentation of tables, fields, and relationships.
- `architecture.md` - High-level overview of data architecture and system integration.
- `troubleshooting.md` - Common issues and solutions for data and ETL challenges.
For contributions, please adhere to the following guidelines:
- Use branches for specific features or modules, and provide descriptive names (e.g., `feature/fuzzy_matching`).
- Commit regularly with clear messages describing changes made.
- Document any new functions or scripts added to the repository.
For any questions or further assistance, please reach out to the project lead, Scott Morgan.