Commit ed78847: Update README.md
1 parent: a3ba805

1 file changed: README.md (26 additions, 18 deletions)
@@ -12,7 +12,7 @@ The pipeline follows a multi-layer data architecture:
 1. **Raw Layer** - Contains raw CO2 data loaded directly from source files
 2. **Harmonized Layer** - Standardized data with consistent formatting and data quality checks
 3. **Analytics Layer** - Derived tables with aggregations, metrics, and enriched attributes for analysis
-
+4. **External Layer** - Stores all pipeline stages and implements external access integration and network policies for outbound calls
 ### Key Components:
 
 - **Raw Data Ingestion** - Loads CO2 data from S3 into the raw layer
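The layer-to-layer flow added above can be sketched in plain Python. This is illustrative only: the real pipeline works on Snowpark tables and streams, and the record shapes and function names here are hypothetical.

```python
# Illustrative raw -> harmonized -> analytics flow; not the repository's
# Snowpark code. Record fields ("year", "co2_ppm") are assumptions.

def harmonize(raw_rows):
    """Standardize raw CO2 readings: normalize types, drop malformed rows."""
    harmonized = []
    for row in raw_rows:
        try:
            harmonized.append({
                "year": int(row["year"]),
                "co2_ppm": float(row["co2_ppm"]),
            })
        except (KeyError, ValueError):
            continue  # data-quality check: skip rows that fail conversion
    return harmonized

def analytics(harmonized_rows):
    """Derive a per-year average CO2 metric for the analytics layer."""
    by_year = {}
    for row in harmonized_rows:
        by_year.setdefault(row["year"], []).append(row["co2_ppm"])
    return {year: sum(vals) / len(vals) for year, vals in by_year.items()}

raw = [
    {"year": "2023", "co2_ppm": "419.3"},
    {"year": "2023", "co2_ppm": "420.1"},
    {"year": "2023", "co2_ppm": "bad"},  # dropped by the harmonized layer
]
print(analytics(harmonize(raw)))
```

In the actual pipeline each function would correspond to a transformation between Snowflake schemas rather than in-memory dictionaries.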
@@ -28,6 +28,7 @@ The pipeline follows a multi-layer data architecture:
 - **Snowpark** - Snowflake's Python API for data processing
 - **GitHub Actions** - CI/CD pipeline
 - **AWS S3** - Data storage for source files
+- **AWS Lambda** - Lambda function behind API Gateway for routing network API calls
 - **pytest** - Testing framework
 
 ## Setup and Installation
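The Lambda-plus-API-Gateway component added above could take roughly the following shape. This is a hypothetical sketch, not the project's actual function: it assumes the standard API Gateway proxy-integration event (`path`, `httpMethod`) and stubs out the external data call.

```python
# Hypothetical AWS Lambda handler routing API Gateway requests.
# The route "/co2/latest" and the response body are illustrative assumptions.
import json

def lambda_handler(event, context):
    """Route an incoming API Gateway proxy event to a handler."""
    path = event.get("path", "/")
    if path == "/co2/latest":
        # The real function would make the outbound network call here,
        # subject to the external access integration and network policies.
        body = {"co2_ppm": 420.0, "source": "stub"}
        return {"statusCode": 200, "body": json.dumps(body)}
    return {"statusCode": 404, "body": json.dumps({"error": "not found"})}

# Local smoke test with a fake event:
resp = lambda_handler({"path": "/co2/latest", "httpMethod": "GET"}, None)
print(resp["statusCode"])
```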
@@ -44,28 +45,33 @@ The pipeline follows a multi-layer data architecture:
 1. Clone the repository:
 
 ```bash
-git clone <repository-url>
+git clone https://github.com/BigDataTeam5/Incremental_DataPipleine_using_Snowflake.git
 cd Incremental_DataPipleine_using_Snowflake
 ```
 
-2. Create and activate a virtual environment:
-
-```bash
-python -m venv venv
-source venv/bin/activate  # On Windows: venv\Scripts\activate
-```
+2. Create and activate a virtual environment using Poetry:
+### Windows Installation
+```powershell
+# Using PowerShell
+(Invoke-WebRequest -Uri https://install.python-poetry.org -UseBasicParsing).Content | python -
+```
+After installing, verify that dependencies resolve:
+```bash
+cd Incremental_DataPipleine_using_Snowflake
+poetry show
+```
+Run any Python script through Poetry:
+```bash
+poetry run python <your_script>.py
+```
 
-3. Install dependencies:
 
-```bash
-pip install -r requirements.txt
-```
 
 4. Set up RSA key pair authentication:
 
 ```bash
 mkdir -p ~/.snowflake/keys
-python scripts/rsa_key_pair_authentication/generate_snowflake_keys.py
+poetry run python scripts/rsa_key_pair_authentication/generate_snowflake_keys.py
 ```
 
 5. Configure Snowflake connection by creating `~/.snowflake/connections.toml`:
@@ -74,6 +80,7 @@ python scripts/rsa_key_pair_authentication/generate_snowflake_keys.py
 [dev]
 account = "your-account"
 user = "your-username"
+password = "your-password"
 private_key_path = "~/.snowflake/keys/rsa_key.p8"
 warehouse = "CO2_WH_DEV"
 role = "CO2_ROLE_DEV"
@@ -84,6 +91,7 @@ client_request_mfa_token = false
 [prod]
 account = "your-account"
 user = "your-username"
+password = "your-password"
 private_key_path = "~/.snowflake/keys/rsa_key.p8"
 warehouse = "CO2_WH_PROD"
 role = "CO2_ROLE_PROD"
@@ -122,7 +130,7 @@ ALTER USER YourUsername SET RSA_PUBLIC_KEY='<public-key-string>';
 2. Create required Snowflake resources:
 
 ```bash
-python scripts/deployment_files/snowflake_deployer.py sql --profile dev --file scripts/setup_dev.sql
+poetry run python scripts/deployment_files/snowflake_deployer.py sql --profile dev --file scripts/setup_dev.sql
 ```
 
 ## Project Structure
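The `sql` subcommand above runs a setup script against Snowflake. A naive sketch of the core idea, splitting a script into statements to execute one by one, is below; the actual logic lives in `scripts/deployment_files/snowflake_deployer.py` and may differ (this split assumes no semicolons inside string literals).

```python
# Illustrative statement splitter for a setup .sql file; hypothetical, not the
# deployer's real parser. Assumes statements are separated by bare semicolons.

def split_sql_statements(sql_text: str) -> list[str]:
    """Split a SQL script into individual statements, dropping empties."""
    return [s.strip() for s in sql_text.split(";") if s.strip()]

script = """
CREATE WAREHOUSE IF NOT EXISTS CO2_WH_DEV;
CREATE DATABASE IF NOT EXISTS CO2_DB_DEV;
"""
for stmt in split_sql_statements(script):
    print(stmt)  # each statement would be sent to the Snowflake session
```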
@@ -157,13 +165,13 @@ Incremental_DataPipleine_using_Snowflake/
 ### Loading Raw Data
 
 ```bash
-python scripts/raw\ data\ loading\ and\ stream\ creation/raw_co2_data.py
+poetry run python scripts/raw\ data\ loading\ and\ stream\ creation/raw_co2_data.py
 ```
 
 ### Creating Streams for Change Data Capture
 
 ```bash
-python scripts/raw\ data\ loading\ and\ stream\ creation/02_create_rawco2data_stream.py
+poetry run python scripts/raw\ data\ loading\ and\ stream\ creation/02_create_rawco2data_stream.py
 ```
 
 ### Running Tests
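Conceptually, the stream-creation script issues Snowflake DDL that attaches a stream to the raw table so downstream layers only process changed rows. The sketch below builds that DDL string; the stream and table names are hypothetical, not necessarily those used by `02_create_rawco2data_stream.py`.

```python
# Illustrative builder for Snowflake CREATE STREAM DDL (change data capture).
# Names RAW_CO2_STREAM / RAW_CO2_DATA are assumptions for the example.

def create_stream_ddl(stream: str, table: str, append_only: bool = True) -> str:
    """Build DDL for a stream that tracks changes on a table."""
    ddl = f"CREATE OR REPLACE STREAM {stream} ON TABLE {table}"
    if append_only:
        ddl += " APPEND_ONLY = TRUE"  # track inserts only, typical for raw loads
    return ddl

print(create_stream_ddl("RAW_CO2_STREAM", "RAW_CO2_DATA"))
# CREATE OR REPLACE STREAM RAW_CO2_STREAM ON TABLE RAW_CO2_DATA APPEND_ONLY = TRUE
```

In the real script this string would be executed through the Snowpark session configured from `connections.toml`.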
@@ -176,7 +184,7 @@ pytest tests/
 
 Deploy all components:
 ```bash
-python scripts/deployment_files/snowflake_deployer.py deploy-all --profile dev --path udfs_and_spoc --check-changes
+poetry run python scripts/deployment_files/snowflake_deployer.py deploy-all --profile dev --path udfs_and_spoc --check-changes
 ```
 
 Deploy a specific component:
@@ -218,7 +226,7 @@ Common issues and solutions:
 
 - **Authentication Errors**: Verify key permissions and format
 - **Deployment Failures**: Check function signatures and parameter counts
-- **Connection Issues**: Run `python scripts/deployment_files/check_connections_file.py` to validate your connections.toml
+- **Connection Issues**: Run `poetry run python scripts/deployment_files/check_connections_file.py` to validate your connections.toml
 
 ## Contributing
