EvoEquation: An AlphaEvolve-inspired project using evolutionary symbolic regression to automatically discover mathematical laws governing global carbon intensity from real-world energy data.

jeffasante/EvoEquation

EvoEquation: Discovering Mathematical Laws in Energy Data

Inspired by the power of evolutionary search to find non-obvious solutions (as in systems such as AlphaEvolve), EvoEquation asks: what if we don't assume the form of the solution, and instead let evolution discover it for complex, real-world problems?

This project explores the application of a miniature evolutionary algorithm for symbolic regression to uncover unknown mathematical relationships within complex global energy data. Instead of pre-supposing a model structure (e.g., linear, polynomial), EvoEquation attempts to evolve mathematical expressions that best describe the observed phenomena.

The Challenge: Understanding Global Carbon Intensity

One of the most pressing global challenges is understanding and mitigating carbon emissions. The carbon intensity of electricity generation (carbon_intensity_elec) is a key metric, influenced by a multitude of interacting factors like economic activity (GDP), population dynamics, and the share of different energy sources (coal, renewables, nuclear, gas).

Traditional modeling often requires making assumptions about how these factors relate to carbon intensity. EvoEquation takes a different path.

The Approach: Evolutionary Symbolic Regression

EvoEquation uses an evolutionary algorithm where:

  • Individuals are mathematical expressions (trees of operators, variables, and constants).
  • Fitness is determined by how well an expression predicts carbon_intensity_elec from features like gdp, population, coal_share_energy, renewables_share_energy, etc., using real-world data (from Our World in Data).
  • Evolution proceeds through generations, using selection, crossover, and mutation to refine expressions, penalizing complexity to favor more interpretable solutions.
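
The three ingredients above (expression trees, a fitness score with a complexity penalty, and selection/crossover/mutation) can be sketched in a few dozen lines. This is a minimal illustration, not the code in energy_discovery.py: the nested-tuple encoding, the reduced operator set, the RMSE error measure, the penalty weight, the size cap, and the tournament size are all assumptions chosen for brevity.

```python
import math
import random

# Assumed, reduced primitive set; the project's real operator set is larger.
BINARY = {"+": lambda a, b: a + b, "-": lambda a, b: a - b, "*": lambda a, b: a * b}
UNARY = {"sqrt": lambda a: math.sqrt(abs(a)), "neg": lambda a: -a}
LEAVES = ("var", "const")

def random_tree(variables, depth=3):
    """Grow a random expression as nested tuples, e.g. ("+", ("var", "x"), ("const", 1.0))."""
    if depth == 0 or random.random() < 0.3:
        if random.random() < 0.7:
            return ("var", random.choice(variables))
        return ("const", round(random.uniform(-2, 2), 2))
    if random.random() < 0.3:
        return (random.choice(list(UNARY)), random_tree(variables, depth - 1))
    return (random.choice(list(BINARY)),
            random_tree(variables, depth - 1), random_tree(variables, depth - 1))

def evaluate(tree, row):
    """Evaluate a tree on one data row (a dict mapping variable name to value)."""
    op = tree[0]
    if op == "var":
        return row[tree[1]]
    if op == "const":
        return tree[1]
    if op in UNARY:
        return UNARY[op](evaluate(tree[1], row))
    return BINARY[op](evaluate(tree[1], row), evaluate(tree[2], row))

def size(tree):
    """Node count, used as the complexity measure."""
    return 1 if tree[0] in LEAVES else 1 + sum(size(c) for c in tree[1:])

def fitness(tree, rows, targets, penalty=0.01):
    """RMSE plus a complexity penalty; lower is better."""
    try:
        errors = [(evaluate(tree, r) - t) ** 2 for r, t in zip(rows, targets)]
        rmse = math.sqrt(sum(errors) / len(errors))
    except (OverflowError, ValueError):
        return float("inf")
    return float("inf") if math.isnan(rmse) else rmse + penalty * size(tree)

def subtrees(tree, path=()):
    """Yield (path, subtree) for every node; a path is a tuple of child indices."""
    yield path, tree
    if tree[0] not in LEAVES:
        for i, child in enumerate(tree[1:], start=1):
            yield from subtrees(child, path + (i,))

def replace_at(tree, path, new):
    """Return a copy of tree with the node at path swapped for new."""
    if not path:
        return new
    i = path[0]
    return tree[:i] + (replace_at(tree[i], path[1:], new),) + tree[i + 1:]

def crossover(a, b):
    """Graft a random subtree of b onto a random position in a."""
    path, _ = random.choice(list(subtrees(a)))
    _, donor = random.choice(list(subtrees(b)))
    return replace_at(a, path, donor)

def mutate(tree, variables):
    """Replace a random subtree with a freshly grown one."""
    path, _ = random.choice(list(subtrees(tree)))
    return replace_at(tree, path, random_tree(variables, depth=2))

def evolve(rows, targets, variables, pop_size=60, generations=30, max_size=40):
    """Tournament selection, crossover, occasional mutation, and elitism."""
    score = lambda t: fitness(t, rows, targets)
    pick = lambda pop: min(random.sample(pop, 3), key=score)
    pop = [random_tree(variables) for _ in range(pop_size)]
    for _ in range(generations):
        children = [min(pop, key=score)]  # keep the current best unchanged
        while len(children) < pop_size:
            child = crossover(pick(pop), pick(pop))
            if random.random() < 0.2:
                child = mutate(child, variables)
            if size(child) <= max_size:  # discard bloated offspring
                children.append(child)
        pop = children
    return min(pop, key=score)
```

Here rows is a list of dicts keyed by feature name and targets is the matching list of carbon_intensity_elec values; evolve returns the best tree found. Adding the penalty to the error term is one simple way to favor shorter expressions; the project may weigh complexity differently.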

A Glimpse of Discovery: Towards a Mathematical Law

Using real data from 20 countries over 22 years (2000-2021), EvoEquation has yielded intriguing results. In its most promising runs, the algorithm found equations in which coal share drives carbon intensity through exponential terms.

Example Discovery Session

(base) ➜ genetic-symbolic-regression python energy_discovery.py              
EvoEquation: Real Energy Data Discovery
============================================================
Loading energy data from owid-energy-data.csv
   Loaded 21812 rows, 130 columns
   Filtered to 20 countries: 2314 rows
   Filtered to years 2000-2021: 440 rows
   Removed 0 rows with missing target values (carbon_intensity_elec)
   Final dataset after numeric conversion and final NA drop: 440 rows

Dataset Analysis:
Target variable: carbon_intensity_elec
Feature variables: gdp, population, renewables_share_energy, coal_share_energy, 
                  nuclear_share_energy, gas_share_energy, energy_per_capita
Number of unique countries: 20
Year range: 2000 - 2021

Target Variable (carbon_intensity_elec):
   Range: 52.15 - 813.48
   Mean: 464.95
   Std: 207.65

Correlations with carbon_intensity_elec:
   renewables_share_energy: -0.551
   coal_share_energy: 0.710
   nuclear_share_energy: -0.448
   
Starting evolutionary discovery...
Training on 352 samples, testing on 88 samples

Generation 0: Best Fitness = 2.98
  Variables: energy_per_capita
  Equation: sqrt(energy_per_capita)

Generation 10: Best Fitness = 2.23
  Variables: renewables_share_energy, energy_per_capita, nuclear_share_energy
  Equation: (energy_per_capita ^ cos(exp(gdp_intensity(...))))

Generation 40: Best Fitness = 1.99
  Variables: energy_per_capita
  Equation: (energy_per_capita ^ saturation_curve(0.25))

============================================================
DISCOVERY RESULTS
============================================================
Discovered Equation for carbon_intensity_elec:
   (energy_per_capita ^ saturation_curve(0.25))

Variables Utilized: energy_per_capita
Training Fitness: 1.99 | Test Fitness: 2.72
Generalization Ratio: 1.37 (Good generalization)
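
The split sizes and the generalization ratio in the session above can be reproduced with simple arithmetic. The sketch below assumes a 20% hold-out and that the ratio is test fitness divided by training fitness; both assumptions match the printed numbers but are not confirmed by the script itself. The threshold the script uses to label a ratio "Good generalization" is not shown, so none is assumed here.

```python
def generalization_ratio(train_fitness, test_fitness):
    """Ratio of test to training error; values near 1.0 mean similar error on unseen data."""
    return test_fitness / train_fitness

# Figures taken from the session output above.
n_rows = 440
n_test = n_rows * 20 // 100      # assumed 20% hold-out
n_train = n_rows - n_test
print(n_train, n_test)           # 352 88

ratio = generalization_ratio(train_fitness=1.99, test_fitness=2.72)
print(f"{ratio:.2f}")            # 1.37
```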

Advanced Discovery Example

One of the most promising equations discovered revealed a strong, non-linear dependence on coal share, alongside contributions from population and per-capita energy consumption:

carbon_intensity = ((coal_share_energy + sqrt(exp(coal_share_energy))) + 
                   (coal_share_energy + sqrt(exp(energy_per_capita))) + 
                   (coal_share_energy + sqrt(exp(population))))
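
To experiment with this expression directly, it can be written as a plain Python function. This is a sketch under one important assumption: the inputs are taken to be scaled to a small range before evaluation, because math.exp overflows for arguments above roughly 709 and raw population counts are far larger. The scaling the project applies, and whether the output is in the target's original units, are not stated here.

```python
import math

def discovered_equation(coal_share_energy, energy_per_capita, population):
    """Evaluate the discovered expression term by term.

    Assumes every input has already been scaled to a small range (for
    example 0 to 1). The scaling used by the project is not documented here.
    """
    return ((coal_share_energy + math.sqrt(math.exp(coal_share_energy)))
            + (coal_share_energy + math.sqrt(math.exp(energy_per_capita)))
            + (coal_share_energy + math.sqrt(math.exp(population))))

# Made-up scaled inputs, for illustration only.
low_coal = discovered_equation(0.1, 0.5, 0.5)
high_coal = discovered_equation(0.9, 0.5, 0.5)
print(high_coal > low_coal)  # True: the expression increases with coal share
```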

Discovery Results

Key Discoveries:

  • Coal share appears 4x in the equation - suggesting it dominates the fitted relationship
  • Exponential terms (sqrt(exp(variable)) = exp(variable/2)) capture non-linear compounding effects
  • Under this equation, coal reduction shows exponential benefits - a potentially significant insight for climate policy
  • Explains 29.3% of variance across diverse countries (US, China, Germany, Nigeria, Ghana, etc.)
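
Because the square root of an exponential is an exponential of half the argument, the equation above collapses into a shorter form. The symbols c, E, and P are shorthand introduced here (not in the source) for coal_share_energy, energy_per_capita, and population:

```latex
\begin{aligned}
\text{carbon\_intensity}
  &= \left(c + \sqrt{e^{c}}\right) + \left(c + \sqrt{e^{E}}\right) + \left(c + \sqrt{e^{P}}\right) \\
  &= 3c + e^{c/2} + e^{E/2} + e^{P/2}, \\
\frac{\partial\,\text{carbon\_intensity}}{\partial c}
  &= 3 + \tfrac{1}{2}\, e^{c/2} > 0 .
\end{aligned}
```

In this form the four coal occurrences reduce to a linear term 3c plus one exponential term, so the predicted sensitivity to coal share grows with coal share itself.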

This demonstrates the potential of evolutionary approaches to automatically discover candidate mathematical laws governing complex systems like global carbon intensity, directly from data.

The AlphaEvolve Connection

Like AlphaEvolve discovered new algorithms by evolving code, EvoEquation discovers new scientific relationships by evolving mathematical expressions:

| System | Evolves | Breakthrough |
|---|---|---|
| AlphaEvolve | Algorithms | 48-multiplication matrix algorithm |
| EvoEquation | Equations | Coal-dominance carbon intensity law |

Both use the same core principle: evolutionary search through solution spaces to find patterns humans haven't explicitly programmed.

Results That Matter

Performance:

  • Low training error (fitness: 0.67), indicating a close fit to the training data
  • Moderate positive correlation (0.559) between predicted and actual values on test data
  • Explains 29.3% of variance (R²=0.293) on unseen test data
  • Reasonable generalization across diverse countries (US, China, Germany, Nigeria, Ghana, etc.), though most of the variance remains unexplained
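
The correlation and R² figures above measure different things: correlation rewards predictions that follow the trend, while R² also penalizes offset and scale errors. The sketch below uses the textbook definitions; whether energy_discovery.py computes them exactly this way is an assumption. With these definitions R² can never exceed the squared correlation, which is consistent with the reported values (0.559² ≈ 0.312, slightly above 0.293).

```python
import math

def pearson_r(actual, predicted):
    """Pearson correlation between actual and predicted values."""
    n = len(actual)
    mean_a, mean_p = sum(actual) / n, sum(predicted) / n
    cov = sum((a - mean_a) * (p - mean_p) for a, p in zip(actual, predicted))
    var_a = sum((a - mean_a) ** 2 for a in actual)
    var_p = sum((p - mean_p) ** 2 for p in predicted)
    return cov / math.sqrt(var_a * var_p)

def r_squared(actual, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_a = sum(actual) / len(actual)
    ss_res = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    ss_tot = sum((a - mean_a) ** 2 for a in actual)
    return 1 - ss_res / ss_tot

# Made-up values: predictions track the trend perfectly but sit 1.0 too high.
actual = [1.0, 2.0, 3.0, 4.0]
predicted = [2.0, 3.0, 4.0, 5.0]
print(f"r = {pearson_r(actual, predicted):.3f}")    # r = 1.000
print(f"R2 = {r_squared(actual, predicted):.3f}")   # R2 = 0.200
```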

Policy Implications:

  • Quantifies exponential benefits of coal reduction for carbon intensity
  • Reveals population-energy scaling laws across development stages
  • Offers quantitative insights that could contribute to shaping significant climate policy decisions

How to Run

  1. Download the energy dataset:

    wget https://github.com/owid/energy-data/raw/master/owid-energy-data.csv

  2. Install dependencies:

    pip install pandas numpy matplotlib seaborn

  3. Run the discovery:

    python energy_discovery.py
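
Before launching a run, it can help to confirm that the downloaded CSV contains the columns the session output refers to. The following is a minimal sketch using only the standard library; the helper name missing_columns is hypothetical, and the country and year column names are assumed from the filtering steps in the log rather than confirmed by the script.

```python
import csv

# Target and feature names as printed in the example session;
# "country" and "year" are assumed from the log's filtering steps.
EXPECTED_COLUMNS = [
    "country", "year", "carbon_intensity_elec", "gdp", "population",
    "renewables_share_energy", "coal_share_energy", "nuclear_share_energy",
    "gas_share_energy", "energy_per_capita",
]

def missing_columns(csv_path, expected=EXPECTED_COLUMNS):
    """Return the expected column names that are absent from the CSV header."""
    with open(csv_path, newline="", encoding="utf-8") as handle:
        header = next(csv.reader(handle))
    return [name for name in expected if name not in header]
```

Calling missing_columns("owid-energy-data.csv") should return an empty list if the download matches the column names above; any names it returns point to a dataset version with a different schema.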

Files

  • energy_discovery.py - Main evolutionary algorithm for energy data
  • owid-energy-data.csv - Global energy dataset (Our World in Data)
  • results/evolution_results.png - Visualization of algorithm performance

The Bigger Picture

This suggests that evolutionary algorithms can surface candidate scientific relationships in complex real-world data. The same principles that AlphaEvolve applies to algorithm design could help uncover mathematical laws governing climate, economics, biology, and beyond.

The question isn't whether evolutionary discovery works - it's what other scientific breakthroughs are waiting to be evolved.

I'm just learning, don't come for me LOL
