⚡️ CDIA Nexus PUC-SP: Innovation Hub for Smart Water and Energy Solutions ⚡️ Smart City Laguna IoT, Fortaleza, Brazil ⚡️
Project for monitoring, forecasting, and optimizing energy consumption in a smart home, using IoT and AI. Developed in the context of Smart City Laguna – CDIA PUC-SP.
In collaboration with Planet Smart City, PUC-SP - Data Science & AI, UN Sustainable Development Goals (SDGs), Starlink and Proptech Brazil
- Projects and deliverables may be made publicly available whenever possible.
- The course prioritizes hands-on practice with real data in consulting scenarios.
- All activities comply with the academic and ethical guidelines of PUC-SP.
- Confidential information related to this repository is kept in private repositories.
Explore the Simulator and support the Smart Cities AI project
⇩💦
Developed by the CDIA group at PUC-SP, this extension project aims to optimize smart resource management systems in Smart City Laguna — combining technology, sustainability, and community innovation to empower underserved regions.
With a strong foundation in interdisciplinary collaboration and international cooperation, this initiative bridges data science with real-world applications to foster resilient, inclusive, and intelligent cities.
Founded in 2015 by Giovanni Savio and Susanna Marchionni, Planet Smart City leads the global movement for affordable, smart, and sustainable housing. Their projects combine:
- Advanced urban design
- Integrated technology
- Community-building initiatives
The CDIA Nexus is an initiative by the Data Science and Artificial Intelligence Group at PUC-SP, dedicated to developing applied AI and IoT solutions for smart water and energy management.
This innovation hub integrates applied research, university outreach, and social impact, focusing on transforming communities through purposeful technology.
The solutions developed are applied in real contexts, such as Smart City Laguna (Fortaleza, Brazil), through projects in partnership with organizations like Planet Smart City, UN-Habitat, Starlink, among others.
The project aims to combine sustainability, digital inclusion, and social innovation, promoting cities that are more resilient, efficient, and people-centered.
💥 From Code to Insight: Data Analysis and Decision Support:
The goal is to develop a data science and AI-based solution that monitors, forecasts, and optimizes electricity consumption in a smart home (Smart City Laguna). The project simulates sensor data per room and uses machine learning to anticipate consumption patterns and propose saving actions.
☞ Tap here to access the dataset!
A simulated dataset was used, containing daily records with the following variables:
- Date: Day of measurement
- KW/H: Total energy consumption in kWh
- Room1, Room2, LivingRoom, Kitchen, Pool: Number of sensor activations per room
- SolarGeneration: Energy generated by solar panels (simulated)
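A dataset with these columns can be simulated in a few lines. The sketch below is only an illustration: the distributions, the base load, and the load per activation are assumptions, not the values behind the project's spreadsheet, and `df_sim` is a hypothetical name.

```python
import numpy as np
import pandas as pd

# Hypothetical generator: all numeric parameters below are illustrative assumptions
rng = np.random.default_rng(42)
n_days = 30
rooms = ["Room1", "Room2", "LivingRoom", "Kitchen", "Pool"]

df_sim = pd.DataFrame({"Date": pd.date_range("2025-01-01", periods=n_days, freq="D")})
for room in rooms:
    # Daily sensor activations per room, drawn from a Poisson distribution
    df_sim[room] = rng.poisson(lam=9, size=n_days)

# Total consumption: an assumed base load plus an assumed load per activation, with noise
activations = df_sim[rooms].sum(axis=1)
df_sim["KW/H"] = (100 + 23 * activations + rng.normal(0, 50, n_days)).round(1)

# Simulated solar generation in kWh
df_sim["SolarGeneration"] = rng.uniform(50, 300, n_days).round(1)

print(df_sim.head())
```

Fixing the random seed keeps the simulated records reproducible between runs.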
“How can we predict daily energy consumption based on room-specific behavior and, from that, propose automated measures for energy savings and efficiency?”
- Reading the spreadsheet using pandas and validating formats
- Converting the Date column to datetime format
- Creating the OrdinalDay variable for modeling
- Calculating average consumption per activation per room
- Simulating solar generation and projecting future consumption
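The preprocessing steps listed above could look roughly like the following. This is a minimal sketch on a tiny made-up frame; `df_demo`, `KWH_per_activation`, and `NetConsumption` are hypothetical names, and the per-activation average is computed over all rooms together as a simplification.

```python
import pandas as pd

# Tiny illustrative frame; the values are made up for the example
df_demo = pd.DataFrame({
    "Date": ["2025-01-01", "2025-01-02", "2025-01-03"],
    "KW/H": [1000.0, 1200.0, 900.0],
    "Room1": [8, 10, 6],
    "Kitchen": [12, 14, 9],
    "SolarGeneration": [200.0, 250.0, 150.0],
})

# Convert the Date column to datetime and derive an ordinal day for modeling
df_demo["Date"] = pd.to_datetime(df_demo["Date"], format="%Y-%m-%d")
df_demo["OrdinalDay"] = df_demo["Date"].apply(lambda d: d.toordinal())

# Average consumption per activation (all rooms pooled, a simplification)
rooms = ["Room1", "Kitchen"]
total_activations = df_demo[rooms].sum(axis=1)
df_demo["KWH_per_activation"] = df_demo["KW/H"] / total_activations

# Net consumption after subtracting simulated solar generation
df_demo["NetConsumption"] = df_demo["KW/H"] - df_demo["SolarGeneration"]

print(df_demo)
```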
A Linear Regression model was trained to estimate consumption (KW/H) from the daily activations of each room. It also supports next-day prediction.
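One way to obtain a next-day figure from such a regression is to feed it an estimate of tomorrow's activations. The sketch below runs on synthetic data and assumes that tomorrow's activations can be approximated by the mean of the last seven days; that rule, the synthetic coefficients, and the name `model_demo` are illustrative assumptions rather than the project's method.

```python
import numpy as np
from sklearn.linear_model import LinearRegression

# Synthetic history (illustrative): 60 days of activations for four rooms
rng = np.random.default_rng(0)
rooms = ["Room1", "Room2", "LivingRoom", "Kitchen"]
X_hist = rng.integers(3, 18, size=(60, len(rooms)))
kwh = 100 + X_hist @ np.array([23.0, 24.0, 28.0, 21.0]) + rng.normal(0, 40, size=60)

# Fit consumption as a linear function of per-room activations
model_demo = LinearRegression().fit(X_hist, kwh)

# Assumption: tomorrow's activations are close to the mean of the last 7 days
expected_activations = X_hist[-7:].mean(axis=0).reshape(1, -1)
next_day_kwh = model_demo.predict(expected_activations)[0]

print(f"Forecast for the next day: {next_day_kwh:.1f} kWh")
```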
- Time series plots with matplotlib/seaborn
- Ranking of rooms with highest consumption
- Activation patterns by cluster
- Interactive dashboard using Streamlit for real-time monitoring (optional)

Automatic generation of PDF reports with relevant data, charts, and forecasts.
- The regression model showed good ability to predict consumption based on room activity
- Living Room and Kitchen were identified as the highest impact areas
- The Pool, although rarely activated, had high average consumption per activation — indicating waste; it was removed from the model, since the Laguna project focuses on social housing and does not include pools
- Solar generation can significantly offset peak-time consumption if properly managed
- Automate power shut-offs in high-usage areas like the living room and kitchen to achieve immediate savings
- Schedule pool usage to mitigate unnecessary consumption peaks
- Leverage solar generation to balance appliance usage during peak production hours
- Implement alerts when daily consumption targets are exceeded
- Streamlit app for real-time sensor monitoring
- PDF report with consumption metrics and recommendations
- Notebook containing full data pipeline, predictive model, and visual analyses
- Real-time dashboard displaying room-based sensor data
- Daily energy consumption forecasting via Linear Regression
- Simulated sensors per room (Room1, Room2, Living Room, Kitchen)
- Daily consumption target with alert system
- Auto-refresh system using streamlit_autorefresh
- Usage pattern clustering using KMeans + PCA
- PDF report export
- Comparison with simulated solar generation
- Python
- Pandas and NumPy – data processing and analysis
- Scikit-learn – linear regression and KMeans
- Matplotlib, Seaborn, and Plotly – visualizations
- Streamlit – interactive dashboard
- FPDF – PDF report generation
- Pillow – dashboard image rendering
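The daily consumption target with alert system listed among the features can be reduced to a small comparison function. The sketch below is a plain-Python illustration; `check_daily_target` and the 1200 kWh target are hypothetical, since the project does not state the threshold it uses.

```python
from typing import Optional


def check_daily_target(consumption_kwh: float, target_kwh: float) -> Optional[str]:
    """Return an alert message when consumption exceeds the target, else None."""
    if consumption_kwh <= target_kwh:
        return None
    excess = consumption_kwh - target_kwh
    pct = 100 * excess / target_kwh
    return f"Daily target exceeded by {excess:.1f} kWh ({pct:.1f}%)"


# Hypothetical target value, chosen only for the example
DAILY_TARGET_KWH = 1200.0
for reading in (1150.0, 1320.0):
    alert = check_daily_target(reading, DAILY_TARGET_KWH)
    print(alert or f"{reading} kWh is within the target")
```

In a dashboard, the returned message could be shown as a warning banner whenever it is not `None`.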
laguna_city_digital/
├── app.py # Main Streamlit app
├── consumo_model.pkl # Trained prediction model
├── cluster_model.pkl # Trained KMeans model
├── dados/
│ └── Consumo_de_Energia_Analise.xlsx # Simulated room data
├── relatorios/
│ └── relatorio_consumo_YYYY-MM-DD.pdf
├── imagens/
│ ├── grafico_pca.png
│ ├── heatmap_cluster.png
│ └── grafico_regressao.png
└── README.md
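The `.pkl` files in the layout above hold trained models. Assuming they are ordinary pickle dumps (the repository does not say which serializer was used), saving and reloading a model might look like this sketch, which uses a toy regression and a temporary directory instead of the real files.

```python
import pickle
import tempfile
from pathlib import Path

import numpy as np
from sklearn.linear_model import LinearRegression

# Toy model standing in for the trained consumption regression
X_demo = np.array([[1.0], [2.0], [3.0], [4.0]])
y_demo = np.array([10.0, 20.0, 30.0, 40.0])
trained = LinearRegression().fit(X_demo, y_demo)

# Assumption: plain pickle serialization; a temp directory avoids touching real files
with tempfile.TemporaryDirectory() as tmp:
    model_path = Path(tmp) / "consumo_model.pkl"
    with open(model_path, "wb") as fh:
        pickle.dump(trained, fh)
    with open(model_path, "rb") as fh:
        restored = pickle.load(fh)

# The restored model predicts exactly like the original one
print(restored.predict(np.array([[5.0]])))
```

Pickle files should only be loaded from trusted sources, since unpickling can execute arbitrary code.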
- PCA Clustering: Distribution of usage patterns by energy profile
- Activation Heatmap: Usage percentage by room
- Actual vs Predicted Chart: Evaluation of prediction accuracy

- R²: 0.70
- MSE: 11,528.06 (RMSE ≈ 107.37 kWh)
- Most influential room: Living Room (largest regression coefficient, about 28.21 kWh per activation)
- Support for urban energy sustainability
- Scalable foundation for full Smart City deployment
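As a sanity check on the evaluation figures, the squared error can be converted to the same unit as consumption. The sketch below takes the MSE printed by the notebook (11528.06) and the mean daily consumption from the descriptive statistics (about 1188.32 kWh); using that mean as the reference scale is an assumption made for illustration.

```python
import math

mse = 11528.06         # Mean squared error printed by the notebook, in kWh squared
rmse = math.sqrt(mse)  # Root mean squared error, in kWh
mean_kwh = 1188.32     # Mean daily consumption from the descriptive statistics

print(f"RMSE: {rmse:.2f} kWh")
print(f"RMSE relative to the mean: {100 * rmse / mean_kwh:.1f}%")
```

Expressed this way, the typical prediction error is roughly 107 kWh per day, which is easier to compare with daily consumption than the squared figure.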
📌 This analysis was developed using data science practices applied to residential energy consumption, aiming to support decision-making for the end user.
CDIA Nexus is the final academic and social extension project of the PUC-SP Data Science and Artificial Intelligence Group, focused on applying IoT and AI for smart water and energy systems in Smart City Laguna, a pioneering urban development in Fortaleza, Brazil.
This initiative was developed in partnership with Planet Smart City, UN-Habitat, and Starlink, aligned with the UN Sustainable Development Goals (SDGs) and committed to social innovation, digital inclusion, and environmental intelligence.
- An integrated Water & Energy Monitoring Dashboard
- Predictive analytics with AI models
- Community engagement through data-driven strategies
- Deployment insights using Starlink connectivity and Planet infrastructure
🌟 Key contributor: Stefano Buono, physicist and entrepreneur, former CERN researcher and founder of AAA (sold to Novartis), now President of LIFTT and CEO of Newcleo (clean nuclear innovation).
➢ Visit Planet Smart City - Official 🇮🇹
➢ Visit Planet Smart City - Brazil 🇧🇷
➢ Visit Planet Smart City - India 🇮🇳
Located in São Gonçalo do Amarante, Ceará, in the Fortaleza metropolitan region, Smart City Laguna is Planet's flagship smart city in Brazil, featuring over 60 smart solutions, including:
- Public Wi-Fi and IoT backbone
- Sustainable urban mobility and lighting
- Rainwater drainage with permeable pavements
- Cultural, educational, and governance programs
The Community Manager is a trained professional dedicated to:
- Mobilizing participatory governance
- Promoting workshops, education, and engagement
- Nurturing social cohesion and long-term stewardship
Special thanks to Pedro Braida Neto, CEO of Proptech Brazil, for leading with empathy, respect, and integrity. The way you show up for others truly makes all the difference.
We extend our heartfelt gratitude to the organizations and individuals who made the implementation of the CDIA PUC-SP possible. Special thanks to:
Organization | Contribution |
---|---|
United Nations (UN) | Funding for the acquisition of solar panels |
PUC-SP (CDIA) | IoT and AI design and implementation |
UN-Habitat | Technical support and ethical frameworks |
Starlink | Satellite internet infrastructure |
Planet Smart City | Urban development and on-site support |
Proptech Brazil | Local deployment and strategic backing |
- Local leaders and community members for their trust and ongoing collaboration.
- The multidisciplinary technical team for their dedication to innovative and sustainable solutions.
- Everyone who contributed, directly or indirectly, to bringing this vision to life.
Together, these partners embody an integrated approach to achieving the Sustainable Development Goals, particularly in emerging regions. 💙🌎
The Water & Energy Module designed by CDIA focuses on the use of IoT and AI for resource optimization. Key features:
- Smart sensors for consumption monitoring
- AI-driven dashboards with predictive alerts
- Visualizations for community awareness
- Scalable resource management models
🧑🏼🚀 Team Members:
Name | Role |
---|---|
Andson Ribeiro | Github - Contact |
Fabiana 🧬 Campanari | Github - Contact Hub |
Leonardo X Fernandes | Github - Contact |
Pedro Vyctor Almeida | Github - Contact |
💙 All members contributed collaboratively across technical and creative areas. Fabiana 🧬 Campanari also led the project’s identity and visual language.
- Installation of IoT sensors for water and energy monitoring
- Development of predictive dashboards and alert systems
- Co-creation of a data visualization interface for residents
- Deployment of solar energy pilots supported by UN-Habitat
- Real-time analytics for resource planning and sustainability
The team gained hands-on experience in:
- Design thinking + participatory methodologies
- Field research in urban infrastructure
- Machine learning and data modeling
- Prototyping and system integration
- Delivering solutions that reflect real-world community needs
- drone_view_laguna_2025.jpg – Aerial view of the city
- team_workshop_on_site.jpeg – Field activities with residents
- solar_panels_community.jpeg – UN-backed solar deployment
- iot_dashboard_mockup.png – Dashboard design preview

- CDIA_Final_Pitch.pdf – Core insights and results
- UN_SolarInvestment_Laguna.pptx – Stakeholder presentation
- IoT_Architecture_Prototype.pptx – Sensor and data flow architecture
We extend our sincere thanks to Pedro from Proptech, whose guidance and expertise were instrumental throughout this project. His support and vision were essential pillars of our development journey.
☞ tap here to get the dataset
A simulated dataset was used, containing daily records with the following variables:
- Data: Measurement day
- KW/H: Total energy consumption in kWh
- Quarto1, Quarto2, Sala, Cozinha, Piscina: Number of sensor activations in each room
- Geração Solar: Energy generated by solar panels (simulated)

A Linear Regression model was trained to estimate consumption (KW/H) based on total activations per room. Next-day prediction was also implemented.

Visualizations:
- Time series charts with matplotlib/seaborn.
- Ranking of highest-consuming rooms.
- Representations of activations by cluster.
- Interactive Streamlit dashboard for real-time visualization (optional).
Cell 1 - Import libraries
import locale
import numpy as np
import pandas as pd
import seaborn as sns
import matplotlib.pyplot as plt
from sklearn.decomposition import PCA
from sklearn.cluster import KMeans
from sklearn.linear_model import LinearRegression
from sklearn.preprocessing import StandardScaler
from sklearn.model_selection import train_test_split
from sklearn.metrics import mean_absolute_error, r2_score, mean_squared_error
Cell 2 - Data loading
# Change the path according to your environment; this relative path follows the repository layout
file_path = "dados/Consumo_de_Energia_Analise.xlsx"
xls = pd.ExcelFile(file_path)
sheet_names = xls.sheet_names
print(sheet_names)
df = xls.parse('Sheet1')
print(df.head())
df.info()
Cell 3 - Date preprocessing
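# Map Portuguese month abbreviations to month numbers
meses_pt = {
    'jan': '01', 'fev': '02', 'mar': '03', 'abr': '04',
    'mai': '05', 'jun': '06', 'jul': '07', 'ago': '08',
    'set': '09', 'out': '10', 'nov': '11', 'dez': '12',
}
df['Data'] = df['Data'].astype(str)
df['Data'] = df['Data'].str.lower().replace(meses_pt, regex=True)
# The year 2025 is appended before parsing as day/month/year
df['Data'] = pd.to_datetime(df['Data'] + '/2025', format='%d/%m/%Y')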
Cell 4 - Descriptive statistics and correlation
summary = df.describe()
correlation = df.corr(numeric_only=True)
print(summary)
print(correlation)
Cell 5: PLOT 1 - Variable distributions
fig, axes = plt.subplots(2, 3, figsize=(15, 10))
axes = axes.flatten()
cols = df.columns[1:7]
for i, col in enumerate(cols):
    sns.histplot(df[col], kde=True, ax=axes[i], bins=10)
    axes[i].set_title(f'Distribution - {col}')
    axes[i].set_xlabel(col)
plt.tight_layout()
plt.suptitle("Variable Distributions", fontsize=16, y=1.02)
plt.show()
( Data Total Consumption (kWh) Bedroom 1 Bedroom 2 \
count 211 211.000000 211.000000 211.000000
mean 2025-04-16 00:00:00 1188.317536 9.687204 9.549763
min 2025-01-01 00:00:00 644.000000 2.000000 0.000000
25% 2025-02-22 12:00:00 1057.000000 8.000000 7.000000
50% 2025-04-16 00:00:00 1176.000000 9.000000 9.000000
75% 2025-06-07 12:00:00 1324.000000 12.000000 11.500000
max 2025-07-30 00:00:00 1667.000000 21.000000 21.000000
std NaN 197.439318 3.176817 3.073874
Living Room Kitchen Pool
count 211.000000 211.000000 211.000000
mean 9.445498 9.322275 9.383886
min 2.000000 3.000000 1.000000
25% 7.000000 7.000000 7.000000
50% 9.000000 9.000000 9.000000
75% 12.000000 11.000000 11.000000
max 17.000000 18.000000 19.000000
std 3.033247 2.969757 3.436433 ,
Total Consumption (kWh) Bedroom 1 Bedroom 2 \
Total Consumption (kWh) 1.000000 0.521439 0.418033
Bedroom 1 0.521439 1.000000 0.060606
Bedroom 2 0.418033 0.060606 1.000000
Living Room 0.548475 0.169207 0.038469
Kitchen 0.409667 0.071809 0.049356
...
Bedroom 1 0.169207 0.071809 0.064704
Bedroom 2 0.038469 0.049356 0.012383
Living Room 1.000000 0.136760 0.042904
Kitchen 0.136760 1.000000 -0.083571
Pool 0.042904 -0.083571 1.000000 )
Cell 6: PLOT 2 - Total consumption over time
plt.figure(figsize=(14, 6))
plt.plot(df['Data'], df['KW/H'], label='Total Consumption (KW/H)', color='blue', linewidth=2)
plt.title('Total Energy Consumption Over Time')
plt.xlabel('Date')
plt.ylabel('KW/H')
plt.grid(True)
plt.legend()
plt.tight_layout()
plt.show()
Cell 7: Weekly grouping and PLOT 3 - Weekly activations per room
df['Semana'] = df['Data'].dt.to_period('W').apply(lambda r: r.start_time)
df_semana = df.groupby('Semana')[['Quarto1', 'Quarto2', 'Sala', 'Cozinha', 'Piscina']].sum()
df_semana.plot(figsize=(12, 6), marker='o')
plt.title('Weekly Activations per Room')
plt.ylabel('Number of Activations')
plt.xlabel('Week')
plt.xticks(rotation=45)
plt.grid(True)
plt.tight_layout()
plt.show()
Cell 8: PLOT 4 - Correlation between activations and consumption
correlations = df[['KW/H', 'Quarto1', 'Quarto2', 'Sala', 'Cozinha', 'Piscina']].corr()['KW/H'][1:]
plt.figure(figsize=(10, 5))
sns.barplot(x=correlations.index, y=correlations.values, palette='Oranges_r')
plt.title('Correlation between Activations and Energy Consumption (kWh)')
plt.ylabel('Correlation')
plt.xlabel('Room')
plt.tight_layout()
plt.show()
Cell 9: Predictive modeling - Linear Regression and Evaluation
X = df[['Quarto1', 'Quarto2', 'Sala', 'Cozinha']]
y = df['KW/H']
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)
model = LinearRegression()
model.fit(X_train, y_train)
y_pred = model.predict(X_test)
mse = mean_squared_error(y_test, y_pred)
r2 = r2_score(y_test, y_pred)
print("Mean Squared Error (MSE):", round(mse, 2))
print("R² Score:", round(r2, 2))
Mean Squared Error (MSE): 11528.06
Coefficient of Determination (R²): 0.7
Cell 10: PLOT 5 - Actual vs Predicted Consumption
plt.figure(figsize=(10, 5))
plt.scatter(y_test, y_pred, alpha=0.7)
plt.plot([y.min(), y.max()], [y.min(), y.max()], 'r--')
plt.xlabel("Actual Consumption (kWh)")
plt.ylabel("Predicted Consumption (kWh)")
plt.title("Actual vs Predicted Consumption")
plt.grid(True)
plt.tight_layout()
plt.show()
Contribution of each room to the prediction (coefficients):
Sala 28.214005
Quarto2 24.141777
Quarto1 23.279552
Cozinha 20.993133
dtype: float64
Cell 11 - Model coefficients
coefficients = pd.Series(model.coef_, index=X.columns)
print("\nContribution of each room to the prediction (coefficients):")
print(coefficients.sort_values(ascending=False))
Shows the weight of each room in the consumption prediction.
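Because every feature is measured in the same unit (activations per day), the coefficients can be compared directly and, if desired, expressed as relative shares. The sketch below normalizes the coefficients printed by the notebook; presenting them as percentages of their sum is an illustrative choice, not a step taken in the original analysis, and `coefficients_demo` is a hypothetical name.

```python
import pandas as pd

# Coefficients as printed by the notebook (kWh per additional activation)
coefficients_demo = pd.Series({
    "Sala": 28.214005,
    "Quarto2": 24.141777,
    "Quarto1": 23.279552,
    "Cozinha": 20.993133,
})

# Relative share of each coefficient in the sum of all coefficients
shares_pct = (100 * coefficients_demo / coefficients_demo.sum()).round(2)
print(shares_pct)
```

Note that a raw coefficient such as 28.21 is a slope in kWh per activation, not a percentage.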
Cell 12 - Calculate activation percentages per room
# The pool is excluded from the model, so totals and shares use the four modeled rooms only
df['Total_activations'] = df[['Quarto1', 'Quarto2', 'Sala', 'Cozinha']].sum(axis=1)
for room in ['Quarto1', 'Quarto2', 'Sala', 'Cozinha']:
    df[f'{room}_pct'] = df[room] / df['Total_activations']
Cell 13: PLOT 6 - Elbow Method for KMeans
scaler = StandardScaler()
X_scaled = scaler.fit_transform(X)
inertia = []
for k in range(1, 10):
    km = KMeans(n_clusters=k, random_state=42)
    km.fit(X_scaled)
    inertia.append(km.inertia_)
plt.figure(figsize=(8,5))
plt.plot(range(1, 10), inertia, marker='o')
plt.title('Elbow Method')
plt.xlabel('Number of clusters')
plt.ylabel('Inertia')
plt.grid(True)
plt.show()
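The elbow plot is read by eye, so a numeric criterion can serve as a cross-check. The sketch below computes the silhouette score for several values of k on synthetic data; the silhouette score is a swapped-in complement to the elbow method, not part of the original notebook, and the synthetic centers are assumptions.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.metrics import silhouette_score
from sklearn.preprocessing import StandardScaler

# Synthetic activations with three loose groups (illustrative only)
rng = np.random.default_rng(42)
centers = np.array([[5, 5, 5, 5], [10, 10, 10, 10], [15, 15, 15, 15]])
X_demo = np.vstack([rng.normal(c, 1.0, size=(50, 4)) for c in centers])
X_demo_scaled = StandardScaler().fit_transform(X_demo)

# Silhouette score per candidate k (higher means better-separated clusters)
scores = {}
for k in range(2, 7):
    labels = KMeans(n_clusters=k, n_init=10, random_state=42).fit_predict(X_demo_scaled)
    scores[k] = silhouette_score(X_demo_scaled, labels)

best_k = max(scores, key=scores.get)
print(scores)
print("k with the highest silhouette score:", best_k)
```

On the real data, the same loop could be run on `X_scaled` and compared with the elbow location.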
Cell 14: KMeans and PLOT 7 - Pairplot of clusters
# Apply KMeans with the chosen number of clusters
kmeans = KMeans(n_clusters=3, random_state=42)
df['Cluster'] = kmeans.fit_predict(X_scaled)
sns.pairplot(df, hue='Cluster', vars=['Quarto1', 'Quarto2', 'Sala', 'Cozinha'], palette='tab10')
plt.suptitle("Usage Patterns Grouped by Cluster", y=1.02)
plt.show()
Cell 15 - Average profile per cluster and naming
# Calculate average profile per cluster (with activations and percentages)
col_pcts = [f'{c}_pct' for c in ['Quarto1', 'Quarto2', 'Sala', 'Cozinha']]
perfil_clusters = df.groupby('Cluster')[['Quarto1', 'Quarto2', 'Sala', 'Cozinha', 'KW/H'] + col_pcts].mean()
# Function to name the cluster profile considering consumption and activation percentage
def name_cluster(row):
    mean_kw = df['KW/H'].mean()
    if row['KW/H'] < mean_kw * 0.75:
        total_consumption = '🔵 Low Consumption'
    elif row['KW/H'] > mean_kw * 1.25:
        total_consumption = '🔴 High Consumption'
    else:
        total_consumption = '🟡 Balanced Consumption'
    high = []
    for room in ['Quarto1', 'Quarto2', 'Sala', 'Cozinha']:
        mean_pct = df[f'{room}_pct'].mean()
        if row[f'{room}_pct'] > mean_pct * 1.2:
            high.append(room)
    if total_consumption == '🔵 Low Consumption':
        return total_consumption
    if total_consumption == '🟡 Balanced Consumption':
        if len(high) == 0:
            return total_consumption
        else:
            return f"🟠 High Consumption in {', '.join(high)}"
    if total_consumption == '🔴 High Consumption':
        if len(high) == 0:
            return total_consumption
        else:
            return f"🔴 High Consumption (In {', '.join(high)})"
perfil_clusters['Profile'] = perfil_clusters.apply(name_cluster, axis=1)
Cell 16 - Recommendations dictionary and display by cluster
# Function to map profile to recommendation dictionary key
def map_profile_to_key(profile):
    if profile == '🔵 Low Consumption':
        return profile
    if profile == '🟡 Balanced Consumption':
        return profile
    if profile.startswith('🟠 High Consumption'):
        return '🟠 High Consumption'
    if profile.startswith('🔴 High Consumption'):
        if 'In' in profile:
            idx = profile.index('In') + 3
            # Drop the closing parenthesis so a single-room profile still matches
            text = profile[idx:].rstrip(')')
            main = text.split(',')[0].strip()
            if main in ['Sala']:
                return '🔴 High Consumption (Sala/Cozinha)'
            elif main == 'Cozinha':
                return '🔴 High Consumption (Cozinha)'
            else:
                return '🔴 High Consumption'
        else:
            return '🔴 High Consumption'
    return profile
# Dictionary with recommendations per profile
recommendations = {
    '🔵 Low Consumption': [
        "✅ Maintain current good practices.",
        "🎁 Offer rewards or discounts (gamification).",
        "🔋 Encourage use of solar energy / microgeneration."
    ],
    '🟡 Balanced Consumption': [
        "🔌 Automate device shutdown at fixed times.",
        "🕵️ Install presence sensors in bedrooms and living room.",
        "📊 Send weekly comparative usage reports."
    ],
    '🟠 High Consumption': [
        "🛏️ Automate lights and electronics in high-consumption rooms.",
        "🕵️ Install specific presence sensors for the rooms.",
        "📊 Monitor usage to identify unnecessary peaks."
    ],
    '🔴 High Consumption (Sala/Cozinha)': [
        "💧 Schedule kitchen pump operation outside peak hours.",
        "💡 Encourage conscious use of lighting and electronics.",
        "🧠 Suggest automation and white tariff subscription."
    ],
    '🔴 High Consumption (Cozinha)': [
        "🍳 Check kitchen appliances for excessive consumption.",
        "⏰ Control oven and refrigerator usage times.",
        "💡 Encourage efficient lighting use."
    ]
}
# Display profiles and recommendations
for cluster_id, row in perfil_clusters.iterrows():
    print(f"\n=== Cluster {cluster_id} - {row['Profile']} ===")
    print("📊 Average consumption profile (activations and kWh):")
    print(row[['Quarto1', 'Quarto2', 'Sala', 'Cozinha', 'KW/H']])
    print("\n📈 Average percentage of activations per room (%):")
    print((row[col_pcts] * 100).round(2))
    print("\n💡 Recommendations:")
    key = map_profile_to_key(row['Profile'])
    if key in recommendations:
        for rec in recommendations[key]:
            print("-", rec)
    else:
        print("- No specific recommendations for this profile.")
=== Cluster 0 - 🟡 Balanced Consumption ===
📊 Average consumption profile (activations and kWh):
Quarto1 7.747126
Quarto2 8.034483
Sala 7.908046
Cozinha 8.011494
KW/H 1047.402299
Name: 0, dtype: object
📈 Average percentage of activations per room (%):
Quarto1_pct 24.481434
Quarto2_pct 25.376404
Sala_pct 24.970192
Cozinha_pct 25.17197
Name: 0, dtype: object
💡 Recommendations:
- 🔌 Automate turning off equipment at fixed times.
- 🕵️ Install presence sensors in bedrooms and living room.
- 📊 Send weekly comparative usage reports.
=== Cluster 1 - 🟡 Balanced Consumption ===
📊 Average consumption profile (activations and kWh):
Quarto1 9.830508
...
💡 Recommendations:
- 🔌 Automate turning off equipment at fixed times.
- 🕵️ Install presence sensors in bedrooms and living room.
- 📊 Send weekly comparative usage reports.
Cell 17: PLOT 8 - Boxplot of consumption by cluster
plt.figure(figsize=(7,5))
sns.boxplot(x='Cluster', y='KW/H', data=df)
plt.title('Consumption Distribution (KW/H) by Cluster')
plt.show()
Cell 18: PLOT 9 - Heatmap of percentages by cluster
heatmap_data = perfil_clusters[col_pcts] * 100
plt.figure(figsize=(8, 5))
sns.heatmap(heatmap_data, annot=True, cmap='YlGnBu', fmt=".2f")
plt.title('Percentage of Activations per Room (%)')
plt.xlabel('Rooms')
plt.ylabel('Cluster')
plt.show()
Cell 19: PLOT 10 - Radar chart of rooms by cluster
categories = ['Quarto1', 'Quarto2', 'Sala', 'Cozinha']
angles = np.linspace(0, 2 * np.pi, len(categories), endpoint=False).tolist()
angles += angles[:1]
plt.figure(figsize=(10, 8))
for i, row in perfil_clusters.iterrows():
    values = [row[cat] for cat in categories]
    values += values[:1]
    plt.polar(angles, values, label=f'Cluster {i}')
plt.xticks(angles[:-1], categories)
plt.title('Radar of Rooms by Cluster')
plt.legend()
plt.show()
Cell 20: PLOT 11 - Cluster visualization with PCA
PCA is applied here mainly for illustration. With only four features it is not essential, but projecting the scaled activations onto two principal components lets the three KMeans clusters be drawn on a single 2D scatter plot.
PCA becomes more useful on datasets with many columns, where it reduces dimensionality and makes cluster structure easier to see.
pca = PCA(n_components=2)
X_pca = pca.fit_transform(X_scaled)
df_plot = pd.DataFrame(X_pca, columns=['Component 1', 'Component 2'])
df_plot['Cluster'] = df['Cluster']
plt.figure(figsize=(8,6))
for cluster in df_plot['Cluster'].unique():
    plt.scatter(
        df_plot[df_plot['Cluster'] == cluster]['Component 1'],
        df_plot[df_plot['Cluster'] == cluster]['Component 2'],
        label=f'Cluster {cluster}'
    )
plt.title('Cluster Visualization with PCA')
plt.xlabel('Component 1')
plt.ylabel('Component 2')
plt.legend()
plt.grid(True)
plt.show()
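When reading a two-component PCA plot, it helps to know how much of the original variance the two axes retain. The sketch below reports that figure on synthetic activations; the data and the names `X_demo` and `pca_demo` are stand-ins, so the printed ratios say nothing about the project's real dataset.

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in for the four room-activation columns
rng = np.random.default_rng(42)
X_demo = rng.poisson(lam=9, size=(200, 4)).astype(float)
X_demo_scaled = StandardScaler().fit_transform(X_demo)

pca_demo = PCA(n_components=2)
pca_demo.fit(X_demo_scaled)

# Fraction of total variance captured by each component and by both together
ratios = pca_demo.explained_variance_ratio_
print("Per component:", ratios.round(3))
print("Cumulative:", ratios.sum().round(3))
```

A low cumulative ratio would mean the 2D scatter hides much of the structure, so cluster overlap in the plot should be interpreted with care.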
- Variable Distribution: Shows how activations and consumption are distributed.
- Temporal Evolution: Allows identification of consumption trends over the days.
- Weekly Activations: Helps visualize patterns by room.
- Correlation: Shows the strength of the relationship between activations and consumption.
- Actual vs Predicted Consumption: Evaluates the quality of the predictive model.
- Clustering: Identifies groups with similar behavior for personalized recommendations.
Profile | Main Recommendations |
---|---|
🔵 Low Consumption | Maintain good practices, encourage solar energy, rewards/gamification |
🟡 Balanced Consumption | Automate shutdowns, install presence sensors, comparative reports |
🟠 High Consumption | Automate lights/electronics, specific presence sensors, monitor peaks |
🔴 High Consumption (Living Room/Kitchen) | Schedule pump outside peak, conscious lighting use, suggest automation and time-of-use tariff |
🔴 High Consumption (Kitchen) | Check equipment, control usage times, encourage lighting efficiency |
🧭 Conclusion:
The project enables identifying consumption patterns, forecasting future use, and recommending actions for greater energy efficiency, customizing recommendations according to each residence’s usage profile.
Note:
Adjust the Excel file path (file_path) according to your environment.
This analysis was prepared based on data science practices applied to residential energy consumption contexts and aims to facilitate decision-making for the end client.
- 👩🏻🚀 Fabiana Campanari - Shoot me an email
- 🧑🏼🚀 PedroVyctor - Hit me up by email
- 👨🏽🚀 Andson Ribeiro - Slide into my inbox
🛸๋ My Contacts Hub
────────────── ⊹🔭๋ ──────────────
Copyright 2024 Mindful-AI-Assistants. Code released under the MIT license.