- Background
- Data
- Models
- Timeline
- Repo Structure
- Logistics
- Resources
- Contact
The goal of this project is to develop an AI-powered question-answering system that automatically analyzes Climate Action Plans (CAPs) and other climate adaptation and mitigation documentation. The system will be capable of extracting key data about climate vulnerabilities, planned mitigation measures, and socio-economic and geographic context, providing well-sourced, accurate responses to user queries.
Climate change poses an urgent challenge for cities worldwide, prompting the creation of comprehensive Climate Action Plans (CAPs) to mitigate impacts and adapt to evolving conditions. These plans detail strategies for reducing emissions, addressing vulnerabilities, and protecting populations from climate risks, but their length and complexity make it difficult for city planners, researchers, and policymakers to efficiently extract and compare key information across regions.
This project addresses that challenge by developing an AI-powered question-answering system that automates the extraction of critical information from CAPs. Using Natural Language Processing (NLP) and Machine Learning (ML) techniques, the system analyzes thousands of pages of climate documentation and provides accurate, well-sourced responses to climate-related inquiries, with LangChain facilitating the organization and structuring of extracted data for more efficient analysis.
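As a rough, hedged illustration of the retrieval-based approach described above, the sketch below wires a single CAP PDF into a LangChain retrieval QA chain. The file path, model name, and chain setup are placeholder assumptions for the example, not the project's actual configuration, and an `OPENAI_API_KEY` must be set in the environment.

```python
# Minimal retrieval-QA sketch (assumes langchain, langchain-openai,
# langchain-community, faiss-cpu, and pypdf are installed).
from langchain_community.document_loaders import PyPDFLoader
from langchain_text_splitters import RecursiveCharacterTextSplitter
from langchain_community.vectorstores import FAISS
from langchain_openai import OpenAIEmbeddings, ChatOpenAI
from langchain.chains import RetrievalQA

# Load and chunk one Climate Action Plan (the path is a placeholder).
pages = PyPDFLoader("CAPS/example_city_cap.pdf").load()
splitter = RecursiveCharacterTextSplitter(chunk_size=1000, chunk_overlap=100)
chunks = splitter.split_documents(pages)

# Embed the chunks and build a searchable vector store.
vectorstore = FAISS.from_documents(chunks, OpenAIEmbeddings())

# Answer a question, returning source chunks so answers stay well-sourced.
qa = RetrievalQA.from_chain_type(
    llm=ChatOpenAI(model="gpt-4o-mini"),
    retriever=vectorstore.as_retriever(),
    return_source_documents=True,
)
result = qa.invoke({"query": "What flood-related vulnerabilities does the plan identify?"})
print(result["result"])
```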
Climate Action Plans can be found in the CAPS folder. External data sources are housed on Box.
Fall 2024 (September through December 2024, initially)
This repository contains code for two main components:
- Data ingestion and processing portal
- Climate Action Plan Tracker
The Climate Action Plan Tracker is hosted on Streamlit Cloud as well as HuggingFace Spaces. It may also be run locally. The data ingestion and processing portal is designed to be run locally only.
The repository also contains batch scripts. Run these scripts when no data (Climate Action Plan summaries, vector stores, and dataset) has been generated yet.
Users can run the tools with the following commands:
- `streamlit run data_ingestion_app.py` to run the data ingestion and processing portal
- `streamlit run app.py` to run the Climate Action Plan Tracker
/data_ingestion_helpers
contains the helper functions used in the data ingestion process. Each run of the data ingestion process does the following (a hypothetical sketch of the sequence follows this list):
- Saves the new Climate Action Plan to the CAPS folder
- Collects the city's metadata (City, State, County, and City Center Coordinates) and updates the city_county_mapping.csv file
- Generates a summary of the Climate Action Plan and stores it in the CAPS_Summaries folder
- Creates the vector stores of the Climate Action Plan used in the QA tool (Individual, Summary, and Combined vector stores)
- Queries an LLM to update the climate action plans dataset in climate_actions_plans.csv
- Updates the CAPS plans list in caps_plans.csv
- Re-runs the maps_data.py script to update the data powering the maps part of the tool
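The sketch below shows, purely hypothetically, how these steps could be sequenced in code. The function name and step implementations are invented for illustration and are not the repository's actual API; the LLM-backed steps are stubbed out as comments.

```python
# Hypothetical orchestration of the ingestion steps listed above.
import csv
import shutil
from pathlib import Path

def ingest_cap(pdf_path: str, city: str, state: str, county: str) -> None:
    # 1. Save the new plan into the CAPS folder.
    caps = Path("CAPS")
    caps.mkdir(exist_ok=True)
    shutil.copy(pdf_path, caps / Path(pdf_path).name)

    # 2. Append the city's metadata to city_county_mapping.csv.
    with open("city_county_mapping.csv", "a", newline="") as f:
        csv.writer(f).writerow([city, state, county])

    # 3-5. Summary generation, vector-store creation, and the LLM-built
    # dataset row would happen here (see the batch scripts described below).

    # 6. Rebuild the plans list from the CAPS folder contents.
    with open("caps_plans.csv", "w", newline="") as f:
        writer = csv.writer(f)
        writer.writerow(["plan_file"])
        for plan in sorted(caps.glob("*.pdf")):
            writer.writerow([plan.name])

    # 7. maps_data.py would be re-run here to refresh the maps data.
```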
batch_summary_generation.py
generates summaries for all CAPs in the CAPS folder and saves them in the CAPS_Summaries folder
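A hedged sketch of what such a batch job could look like, assuming LangChain's map-reduce summarization chain and an OpenAI chat model (both are assumptions for the example, not the script's confirmed internals):

```python
# Illustrative batch summarization over the CAPS folder.
# Assumes langchain, langchain-openai, and pypdf; OPENAI_API_KEY must be set.
from pathlib import Path
from langchain_community.document_loaders import PyPDFLoader
from langchain.chains.summarize import load_summarize_chain
from langchain_openai import ChatOpenAI

chain = load_summarize_chain(ChatOpenAI(model="gpt-4o-mini"), chain_type="map_reduce")
out_dir = Path("CAPS_Summaries")
out_dir.mkdir(exist_ok=True)
for pdf in Path("CAPS").glob("*.pdf"):
    pages = PyPDFLoader(str(pdf)).load()
    summary = chain.invoke({"input_documents": pages})["output_text"]
    (out_dir / f"{pdf.stem}_summary.txt").write_text(summary)
```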
caps_directory_reader.py
reads in the CAPS plans in the CAPS folder and saves the data to a csv file called caps_plans.csv
census_county_data.py
reads in the census data and saves the data to a csv file called us_counties.csv which is used by the data ingestion tool
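For illustration only, a minimal pandas version of that kind of step; the input file name and column names are placeholders, not the script's actual inputs:

```python
# Hypothetical sketch: trim a raw census extract down to the county table.
import pandas as pd

raw = pd.read_csv("census_counties_raw.csv")           # placeholder input file
counties = raw[["STNAME", "CTYNAME", "POPESTIMATE"]]   # assumed column names
counties.to_csv("us_counties.csv", index=False)
```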
create_vector_stores.py
creates the vector stores of the Climate Action Plan used in the QA tool (Individual, Summary, and Combined vector stores)
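A hedged sketch of how the three store types could be built; the vector-store backend (FAISS), embedding model, and output paths are assumptions made for the example:

```python
# Illustrative sketch: individual, summary, and combined vector stores.
# Assumes langchain-community, langchain-openai, faiss-cpu, and pypdf.
from pathlib import Path
from langchain_community.document_loaders import PyPDFLoader, TextLoader
from langchain_text_splitters import RecursiveCharacterTextSplitter
from langchain_community.vectorstores import FAISS
from langchain_openai import OpenAIEmbeddings

splitter = RecursiveCharacterTextSplitter(chunk_size=1000, chunk_overlap=100)
embeddings = OpenAIEmbeddings()

plan_chunks, summary_chunks = [], []
for pdf in Path("CAPS").glob("*.pdf"):
    chunks = splitter.split_documents(PyPDFLoader(str(pdf)).load())
    plan_chunks.extend(chunks)
    # Individual store: one vector store per plan (path is a placeholder).
    FAISS.from_documents(chunks, embeddings).save_local(f"vectorstores/{pdf.stem}")
for txt in Path("CAPS_Summaries").glob("*.txt"):
    summary_chunks.extend(splitter.split_documents(TextLoader(str(txt)).load()))

# Summary store over all summaries; combined store over plans plus summaries.
FAISS.from_documents(summary_chunks, embeddings).save_local("vectorstores/summary")
FAISS.from_documents(plan_chunks + summary_chunks, embeddings).save_local("vectorstores/combined")
```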
dataset_generation.py
queries an LLM to create the climate action plans dataset in climate_actions_plans.csv
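For illustration, one hedged way such a script might query an LLM per plan summary and accumulate rows; the prompt, model, and CSV columns are invented for the example:

```python
# Hypothetical sketch: one LLM call per plan summary, results written to CSV.
# Assumes langchain-openai; OPENAI_API_KEY must be set.
import csv
from pathlib import Path
from langchain_openai import ChatOpenAI

llm = ChatOpenAI(model="gpt-4o-mini")
with open("climate_actions_plans.csv", "w", newline="") as f:
    writer = csv.writer(f)
    writer.writerow(["plan", "key_vulnerabilities"])  # invented columns
    for summary in Path("CAPS_Summaries").glob("*.txt"):
        prompt = (
            "From this Climate Action Plan summary, list the key climate "
            "vulnerabilities as a short comma-separated phrase:\n\n"
            + summary.read_text()
        )
        writer.writerow([summary.stem, llm.invoke(prompt).content])
```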
In most cases, these batch process files will not need to be run.
/maps_helpers
contains the helper functions used in the maps tool, along with the data that powers it
To run the tool, run `streamlit run app.py` in a terminal. Please ensure that all necessary packages listed in the requirements.txt file have been installed; they can be installed using pip: `pip install -r requirements.txt`
The Prompts folder contains all the system prompt templates used in the tool. These can be edited to change the behavior of the tools.
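For example, a template could be loaded and filled in like this; the file name and template variables are placeholders, since the actual templates live in the Prompts folder:

```python
# Hedged example of loading a system prompt template from the Prompts folder.
from langchain.prompts import PromptTemplate

template_text = open("Prompts/qa_system_prompt.txt").read()  # placeholder file name
prompt = PromptTemplate.from_template(template_text)          # assumes {context}/{question} variables
print(prompt.format(context="<retrieved CAP excerpts>", question="<user question>"))
```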
Sprint planning: Every Monday, 10:00-10:30am, on Zoom.
Backlog grooming: N/A / as needed.
Sprint retrospective: Every Friday, 1:30-2:00pm, on Zoom.
Demos: Every Friday at 3pm on Zoom as well as in person at the DSI.
Data location: Climate Policy Data
Slack channel: climate-policy on the Data Science TIP Slack organization. Please check your email for an invite.
The following resources can help readers get up to speed on the project:
- LangChain: Please see LangChain Tutorials
- Python usage: Whirlwind Tour of Python, Jake VanderPlas (Book, Notebooks)
- Data science packages in Python: Python Data Science Handbook, Jake VanderPlas
- HuggingFace: Website, Course/Training, Inference using pipelines, Fine tuning models
- fast.ai: Course, Quick start
- h2o: Resources, documentation, and API links
- nbdev: Overview, Tutorial
- Git tutorials: Simple Guide, Learn Git Branching
- ACCRE how-to guides: DSI How-tos
Project Lead: Umang Chaudhry, Senior Data Scientist, Vanderbilt Data Science Institute
PI: Dr. JB Ruhl, David Daniels Allen Distinguished Chair in Law, Vanderbilt University Law School
Project Manager: Isabella Urquia
Team Members: Ethan Thorpe, Mariah Caballero, Harmony Wang, Xuanxuan Chen, Aparna Lakshmi