Ecosystem science has many components, and so does PEcAn! You can contribute to several of those components. Below is a list of potential ideas. Feel free to contact any of the mentors on Slack, or ask questions in our #gsoc-2021 channel on Slack.
79
+
Ecosystem science has many components, and so does PEcAn! You can contribute to several of those components. Below is a list of potential ideas. Feel free to contact any of the mentors on Slack, or ask questions in our #gsoc-2023 channel on Slack.
PEcAn is implemented as a set of R packages, but the user must currently download and install all the packages as a single unit. The short-term goal of this project is to fix warnings in the build process, refactor to remove unnecessary dependencies, and potentially split modules. The medium-term goal is to increase the reliability of PEcAn’s integration tests, and thus this year’s package development will prioritize the packages that are most associated with overall workflow bottlenecks (e.g., PEcAn.data.atmosphere, which is focused on downloading and processing meteorological data). The longer-term goal is to make PEcAn packages available on CRAN (the primary R package archive), which will make them not only easier to install but also easier to find and easier to use as standalone modules.
97
93
<p> </p>
98
94
99
95
<dl>
100
96
<dt>Expected outcome:</dt>
101
-
<dd>PEcAn packages available in CRAN.</dd>
97
+
<dd>PEcAn packages pass checks and integration tests without warnings. Packages are made available on CRAN.</dd>
102
98
<dt>Prerequisites:</dt>
103
-
<dd>R and comfort with the key steps required to release a package on CRAN; experience with R packages
104
-
helpful, but most of the process is covered in chapters on R package releases in the book
105
-
‘rOpenSci packages’ and the book ‘R packages’ by Hadley Wickham</dd>
99
+
<dd>R; experience with R packages is helpful, but most of the process is covered in chapters on R package releases in the book ‘rOpenSci packages’ and the book ‘R packages’ by Hadley Wickham</dd>
106
100
<dt>Contact person:</dt>
107
-
<dd>Chris Black, @infotroph</dd>
101
+
<dd>Chris Black, @infotroph; Mike @Dietze</dd>
108
102
<dt>Duration:</dt>
109
103
<dd>Size: 175 hours for proposals that focus on dependency removal, 350 hours for proposals that split modules.</dd>
110
104
<dt>Difficulty:</dt>
111
-
<dd>Easy.</dd>
105
+
<dd>Easy. We anticipate that multiple people can work on this project, since different individuals can focus on different PEcAn R packages.</dd>
112
106
</dl>
113
107
114
108
<hr/>
115
109
116
-
<h4><a name="pecan.ma">Submit PEcAn.MA to CRAN [Data Science]</h4>
117
-
118
-
The PEcAn meta analysis package currently queries plant trait data stored in the BETYdb Postgres database
119
-
and uses meta-analysis to estimate parameters for ecosystem models. It also stores information about the
120
-
meta-analysis in the database. This project would decouple the PEcAn.MA package from the BETYdb database
121
-
in order to make it more modular and portable. It would replace dependency on the database with text files
122
-
as inputs and outputs. These text files could optionally be read from and inserted back into the database.
123
-
124
-
<p> </p>
125
-
126
-
<dl>
127
-
<dt>Expected outcome:</dt>
128
-
<dd>The PEcAn.MA package submitted to CRAN, without dependency on a running database. Additional functions
129
-
in the PEcAn.DB package responsible for generating and reading text files from and into the database.
130
-
<dt>Prerequisites:</dt>
131
-
<dd>R and SQL, plus package development as described in the PEcAn packages on CRAN project.
132
-
<dt>Contact person:</dt>
133
-
<dd>David @dlebauer, Kristina</dd>
134
-
<dt>Duration:</dt>
135
-
<dd>Large (350hr)</dd>
136
-
<dt>Difficulty:</dt>
137
-
<dd>Hard some knowledge of how the meta analysis package works is needed for this</dd>
One of the goals of PEcAn is to be able to run different ecological models (which require a range of data inputs)
145
-
and compare the model outputs with actual measurements (a.k.a. data constraints). The goal of this project is twofold,
146
-
depending on the specific interests of the GSOC student.
114
+
One of the goals of PEcAn is to be able to run different ecological models (which require a range of data inputs) and compare the model outputs with actual measurements (a.k.a. data constraints). The goal of this project is twofold, depending on the specific interests of the GSOC student.
147
115
<ol>
148
-
<li>The current PEcAn input processing occurs mostly within the primary runtime workflow, but numerous PEcAn
149
-
applications would benefit from the ability to update near real-time data asynchronously with model execution,
150
-
handling different data streams in parallel. As part of this we’d also like to make it easier to use PEcAn
151
-
input processing modules as stand alone tools.</li>
116
+
<li>The current PEcAn input processing occurs mostly within the primary runtime workflow, but numerous PEcAn applications would benefit from the ability to update near real-time data asynchronously with model execution, handling different data streams in parallel. As part of this, we’d also like to make it easier to use PEcAn input processing modules as standalone tools. This subproject also leverages a joint effort with the Red Hat Collaboratory.</li>
152
117
<li>Increase the number of input products supported. Students may focus on one or more of the following:
153
118
<ol type="a">
154
-
<li>add the ECMWF Open Data as an meteorological drivers</li>
155
-
<li>create a common pipeline for ingesting agricultural management data using ICASA standards and json file
156
-
formats (see https://github.com/PecanProject/pecan/issues/2518)</li>
157
-
<li>Extend our existing support for ingesting data from the National Ecological Observatory Network (NEON)
158
-
and Ameriflux as both data inputs and constraints.</li>
119
+
<li>Add the NMME (seasonal weather forecast) as a meteorological driver</li>
120
+
<li>Add remote sensing data streams: NASA GEDI (lidar), solar induced fluorescence (e.g., NASA OCO-2, OCO-3), thermal (e.g., NASA ECOSTRESS)
121
+
</li>
122
+
<li>Extend our existing support for ingesting data from the National Ecological Observatory Network (NEON) to soil moisture and soil respiration data products. This will involve integrating into PEcAn both the NEONSoils code (https://github.com/jmzobitz/NEONSoils) and internal code from the Dietze lab on soil moisture gap-filling and downscaling.</li>
159
123
</ol>
160
124
</li>
161
125
</ol>
162
-
126
+
We anticipate that multiple people can work on this project, since it has separate parts that can each be done by an individual.
163
127
<p> </p>
164
128
165
129
<dl>
166
130
<dt>Prerequisites:</dt>
167
131
<dd>R.</dd>
168
132
<dt>Contact person:</dt>
169
-
<dd>@Alexis Helgeson (1, 2c), @HenriKajasilta (2a,b), Istem Fer @istfer (2a,b), David LeBauer @dlebauer (2b).</dd>
133
+
<dd>@Alexis Helgeson, @Ankur Desai, Istem Fer @istfer</dd>
170
134
<dt>Duration:</dt>
171
-
<dd>1 data update [size: large (350hr), 2.a ECMWF [size: small (175 hr), difficulty: easy], 2.b Management standards [size: large (350 hr), difficulty: medium] 2.c Neon [size: small (175hr)]</dd>
135
+
<dd>1. data workflow update [size: large (350hr)]; 2. Individual data packages: [size: small (175 hr) for one, large for 2-3 data packages]</dd>
<dd>1. data update [difficulty: hard]; 2. Individual data packages: 2.1 easy, 2.2 easy, 2.3 medium</dd>
174
138
</dl>
175
139
176
140
<hr/>
177
141
178
-
<h4><a name="api">Extend API / Distributed file sharing [Computer Science]</h4>
179
-
180
-
Last year we have started to build an API for PEcAn. This was a enormous success, and the scientists loved this approach. We would like to expand on this API and have more functionality available through the API.
142
+
<h4><a name="gha">GitHub Actions</h4>
181
143
144
+
Currently, GitHub Actions checks whether newer versions of the installed packages are available. We need to reduce the number of these checks, since GitHub limits them. Additionally, we run a simple test of SIPNET; it would be great if that test could use the full Docker stack to test a full run.
182
145
<p> </p>
183
146
184
-
<dl>
185
-
<dt>Expected outcome:</dt>
186
-
<dd>More functions available through the API, especially options to query the database./dd>
187
-
<dt>Prerequisites:</dt>
188
-
<dd>Knowledge of R and Rest</dd>
189
-
<dt>Contact person:</dt>
190
-
<dd>Rob Kooper @kooper</dd>
191
-
<dt>Duration:</dt>
192
-
<dd>Depending on the number of API calls added this can be both small (175hr) and large (350hr) project</dd>
There is a helm chart that will load the PEcAn in kubernetes. This would expand on this helm chart to add autoscaling,
202
-
as well as taking the PEcAn executor container and splitting it up in smaller pieces.
203
-
147
+
In the past year we created a dashboard that shows how tests are performing. It would be great to have an automated job that runs the tests using the develop stack and writes the test results back into a file in a special branch. As part of this task, the dashboard will need to be updated to fetch the data from this branch.
204
148
<p> </p>
205
149
206
150
<dl>
207
151
<dt>Expected outcome:</dt>
208
-
<dd>A helm chart that will install PEcAn in kubernetes and scales the models up and down as needed.</dd>
152
+
<dd>New GitHub Actions that take less time to run and can perform larger tests.</dd>
209
153
<dt>Prerequisites:</dt>
210
-
<dd>R, Docker, and kubernetes.</dd>
154
+
<dd>GitHub Actions, Docker</dd>
211
155
<dt>Contact person:</dt>
212
-
<dd>Rob Kooper, @kooper</dd>
156
+
<dd>Rob Kooper @kooper</dd>
213
157
<dt>Duration:</dt>
214
-
<dd>Small (175hr), adding more features can grow this to large (350hr)</dd>
158
+
<dd>Flexible: can work as either a Small (175 hr) or Large (350 hr) project</dd>
215
159
<dt>Difficulty:</dt>
216
-
<dd>Easy</dd>
160
+
<dd>Medium, Large if running and updating the integration testing dashboard</dd>
The ability to partition the contributions of different model parameters to a model’s predictive uncertainty has long been a core feature of PEcAn. This task extends the current uncertainty analysis to include model drivers (e.g. meteorology), initial conditions, and process error using a Sobol-based approach. Note that uncertainties in these inputs have already been worked out and implemented in PEcAn; the focus is on implementing and running the Sobol analysis. A secondary goal is to research the file formats and data structures used by other ensemble-based packages and tools so as to make PEcAn more interoperable.
223
-
224
-
<p> </p>
225
-
226
-
<dl>
227
-
<dt>Expected outcome:</dt>
228
-
<dd>Primary - New Sobol functions within the PEcAn.uncertainty module. Outputs from those functions for provided inputs.
229
-
Secondary - Summary report on proposed ensemble data structures and file formats. Implementation of proposal if time permits.</dd>
230
-
<dt>Prerequisites:</dt>
231
-
<dd>R required, experience in statistics preferred</dd>
232
-
<dt>Contact person:</dt>
233
-
<dd>Mike @Dietze, @Alexis Helgeson</dd>
234
-
<dt>Duration:</dt>
235
-
<dd>Small (175 hr) for primary alone, Large (350) for both primary and secondary goals.</dd>
236
-
<dt>Difficulty:</dt>
237
-
<dd>Understanding current system - Medium; Implementing new components once you understand that system - Easy.</dd>
Last year’s GSOC students developed an integration testing framework for PEcAn and a web-based <a href="http://141.142.220.191/statusboard/">"status board"</a>
245
-
where we can see what models, inputs, etc are currently working and which are down. The goals this year are:
246
-
<ol>
247
-
<li>to extend the set of integration tests to a wider suite of models and inputs</li>
248
-
<li>to analyze the status board to identify key bottlenecks and failure points</li>
249
-
<li>to refactor those failure points to increase overall workflow reliability.</li>
250
-
</ol>
168
+
This project focuses primarily on the interactive visualization of outputs from our carbon cycle forecast and data assimilation system. It builds on a previously developed site-level R Shiny dashboard that is no longer functional, and aims to extend it to a much larger number of sites. We also hope to integrate functionality from one of our other dashboards (which visualizes spatial interactions) and advances made by external collaborators. If time permits, we’d also like to resurrect our automated email alert system.
251
169
252
170
<p> </p>
253
171
254
172
<dl>
255
173
<dt>Expected outcome:</dt>
256
-
<dd>Larger number of integration tests and higher percentage of successful tests (>75% as a Small project, >90% as a Large project)</dd>
174
+
<dd>The aims here are:
175
+
<ol>
176
+
<li>Resurrect a previously developed R Shiny dashboard for our carbon cycle forecast system (pecan/shiny/ForecastingDashboard), potentially integrating work done by the Ecological Forecasting Initiative on their dashboard (https://github.com/eco4cast/neon4cast-dashboard) and FMI’s Field Observatory (https://www.fieldobservatory.org/en/home/)
177
+
</li>
178
+
<li>Merge in the functionality from our data assimilation dashboard (pecan/shiny/SDAdashboard)
179
+
</li>
180
+
<li>Resurrect the automated email alert system that sent a subset of visualizations, and links to the full app, to users for the sites they are interested in.</li>
181
+
</ol></dd>
257
182
<dt>Prerequisites:</dt>
258
-
<dd>R, Github Actions</dd>
183
+
<dd>R, R Shiny, data visualization</dd>
259
184
<dt>Contact person:</dt>
260
-
<dd>Mike @Dietze, Chris Black @infotroph</dd>
185
+
<dd>Mike @Dietze, @HenriKajasilta</dd>
261
186
<dt>Duration:</dt>
262
187
<dd>Flexible: can work as either a Small (175 hr) or Large (350 hr) project</dd>