Commit 6323633

added database project
1 parent 9fc6daa commit 6323633

1 file changed

+60
-13
lines changed

src/pages/gsoc_ideas.mdx

Lines changed: 60 additions & 13 deletions
@@ -4,13 +4,13 @@ title: 'GSoC 2025 - PEcAn Project Ideas'
44

55
# [GSoC - PEcAn Project Ideas](#background)
66

7-
Ecosystem science has many components, so does PEcAn! Some of those components where you can contribute. Below is a list of potential ideas. Feel free to contact any of the mentors in slack, or feel free to ask questions in our #gsoc-2025 channel in slack.
7+
PEcAn is an open-source ecosystem modeling framework integrating data, models, and uncertainty quantification. Below is a list of potential ideas where contributors can help improve and expand PEcAn. Come find us on Slack to discuss. If you have questions or would like to propose your own idea, contact @kooper in Slack or join our `#gsoc-2025` channel.
88

99
---
1010

1111
## [Project Ideas](#ideas)
1212

13-
Following is a list of project ideas, use this list to contact the appropriate mentors on slack. Feel free to propose your own ideas as well, in this case contact @kooper in Slack so he can put you in contact with the best mentors.
13+
Below is a list of project ideas. Feel free to contact the listed mentors on Slack to discuss further or contact @kooper with new ideas and he can help connect you with mentors.
1414

1515
---
1616

@@ -21,9 +21,9 @@ This project would extend PEcAn's existing uncertainty partitioning routines, wh
2121

2222
**Expected outcomes:**
2323

24-
A successful project would complete at subset of the following tasks:
24+
A successful project would complete a subset of the following tasks:
2525

26-
* Reliable, automated Sensitivity analyss and uncertainty partitioning
26+
* Reliable, automated Sobol sensitivity analysis and uncertainty partitioning across multiple model inputs.
2727
* Applications to test case(s) in natural and / or managed ecosystems.
2828

2929
**Prerequisites:**
@@ -45,9 +45,9 @@ Medium
4545

4646
---
4747

48-
#### [Parallelization of runs](#hpc)
48+
#### [Parallelization of Model Runs on HPC](#hpc)
4949

50-
This project would extend PEcAn's existing run mechanisms to be able to run on an HPC using apptainer. For uncertaintity analysis, PEcAn will run 1000s of runs of the same model with small permutations. This is a perfect use for an HPC run. The goal is to not submit 1000s of jobs, but have a single job with multiple nodes that will run all of the ensembles efficiently. Running can be orchistrated using RabbitMQ but other methods are encouraged as well. The end goal should be for the PEcAn system to be launched, and run the full workflow on the HPC from start to finish leveraging as many nodes as given during the submission.
50+
This project would extend PEcAn's existing run mechanisms to run on a High Performance Computing (HPC) cluster using [Apptainer](https://apptainer.org). For uncertainty analysis, PEcAn runs the same model thousands of times with small permutations, which is a natural fit for HPC. The goal is not to submit thousands of separate jobs, but to have a single multi-node job that runs all of the ensemble members efficiently. Runs can be orchestrated using RabbitMQ, but other methods are also encouraged. The end goal is for the PEcAn system to be launched and to run the full workflow on the HPC from start to finish, leveraging as many nodes as it is given at submission.
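
As a rough illustration of the "one job, many ensemble members" idea, the sketch below dispatches all members from a single process through a bounded worker pool instead of submitting one scheduler job per member. It is only a minimal single-node sketch under stated assumptions: `run_member` and `run_ensemble` are hypothetical names, the "model" is a placeholder calculation, and a real implementation would launch the model (for example inside an Apptainer container) and spread work across nodes with RabbitMQ or another mechanism.

```python
from concurrent.futures import ThreadPoolExecutor
import random


def run_member(member_id: int, base_param: float = 1.0) -> dict:
    """Hypothetical stand-in for one ensemble member's model run."""
    # Seed per member so each perturbation is reproducible.
    rng = random.Random(member_id)
    # Perturb the base parameter by up to +/- 5% (an arbitrary choice).
    param = base_param * (1.0 + rng.uniform(-0.05, 0.05))
    # Placeholder "model": a real workflow would launch the model here.
    return {"member": member_id, "param": param, "output": param ** 2}


def run_ensemble(n_members: int, n_workers: int) -> list:
    """Run all members inside one job, at most n_workers at a time."""
    with ThreadPoolExecutor(max_workers=n_workers) as pool:
        # map() returns results in member order, whatever the finish order.
        return list(pool.map(run_member, range(n_members)))


if __name__ == "__main__":
    results = run_ensemble(n_members=1000, n_workers=8)
    print(len(results), "ensemble members completed")
```

The worker count plays the role of the resources granted to the job: the same 1000 members run whether 8 or 800 slots are available, and only the wall-clock time changes.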
5151

5252
**Expected outcomes:**
5353

@@ -58,8 +58,8 @@ A successful project would complete at subset of the following tasks:
5858

5959
**Prerequisites:**
6060

61-
- Required: R (existing workflow and prototype is in R), docker
62-
- Helpful: familiarity with HPC and apptain
61+
- Required: R (existing workflow and prototype is in R), Docker
62+
- Helpful: Familiarity with HPC and Apptainer
6363

6464
**Contact person:**
6565

@@ -74,23 +74,36 @@ Flexible to work as either a Medium (175hr) or Large (350 hr)
7474
Medium
7575

7676
---
77-
#### [Database Improvements](#db)
77+
#### [Database and Data Improvements](#db)
78+
79+
PEcAn relies on the BETYdb database to store trait and yield data as well as model provenance information. This project aims to separate trait data from provenance tracking and to ensure that PEcAn is able to run without the Postgres server currently required to run BETYdb. The goal is to make the workflows easier to use and the data more accessible.
80+
81+
82+
**Potential Directions**
83+
84+
- **Minimal BETYdb Database:** Create a simplified version of BETYdb for demonstrations and integration tests.
85+
- **Non-Database Setup:** Enable workflows that do not require PostgreSQL or a web front-end.
7886

79-
**Chris TODO**
80-
- decouple traits from provenance
81-
- make betydb.org data available through R package
87+
**Expected outcomes:**
88+
89+
A successful project would complete a subset of the following tasks:
90+
- A lightweight, distributable demo Postgres database.
91+
- A workflow that runs independently of a Postgres database, enabling easier local testing and deployment.
8292

8393

8494

8595
**Contact person:**
96+
8697
Chris Black (@infotroph)
8798

8899
**Duration:**
89-
Flexible to work as either a Medium (175hr) or Large (350 hr)
100+
101+
Suitable for a Medium (175hr) or Large (350 hr) project.
90102

91103
**Difficulty:**
92104
Medium, Large
93105

106+
94107
---
95108

96109
#### [Development of Notebook-based PEcAn Workflows](#notebook)
@@ -123,6 +136,40 @@ Medium
123136

124137
# This comment section for ideas that may be potentially viable in future (with revision)
125138

139+
140+
#### BETYdb R data package
141+
142+
BETYdb's web front end is built on a version of Ruby on Rails that is functional but no longer supported. A key feature of BETYdb is that the data is open and accessible.
143+
144+
Building an R data package would make the Trait and Yield data currently in BETYdb more accessible to users beyond the PEcAn community.
145+
146+
**Expected outcomes:**
147+
148+
A successful project would complete a subset of the following tasks:
149+
150+
- An R package containing the data currently hosted in BETYdb.
151+
- Documentation and examples of use.
152+
- Updates to BETYdb documentation.
153+
154+
**Prerequisites:**
155+
156+
- Required: R
157+
- Helpful: R package development; familiarity with relational databases and SQL.
158+
159+
**Contact person:**
160+
161+
David LeBauer (@dlebauer)
162+
163+
**Duration:**
164+
165+
Medium (175hr) to Large (350hr) depending on scope of proposal.
166+
167+
**Difficulty:**
168+
169+
Medium
170+
171+
---
172+
126173
#### [Optimize PEcAn for freestanding use of single packages [R package development]](#freestanding)
127174

128175
PEcAn was designed as a system of independent modules, each implemented as its own R package that was intended to be usable either standalone or as part of the full PEcAn system. Subsequent development focused on the most common cross-module workflows has led to tighter coupling between modules than was originally intended, so that in practice many of the modules are now challenging to use, test, or develop without a full understanding of their interdependencies. Further, some packages expect inputs and outputs in data structures that are only generated by other PEcAn packages but might be more easily provided in standard well-known formats. We seek proposals to re-loosen these couplings by revisiting the design and interface of PEcAn packages through one or more of:
