-
Notifications
You must be signed in to change notification settings - Fork 1
Feat: pems_data
package
#187
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
Coverage reportClick to see where and how coverage changed
This report was generated by python-coverage-comment-action |
a2ae975
to
c341d86
Compare
e259419
to
3ed130e
Compare
774bb4e
to
8a3ce21
Compare
03937a7
to
46be947
Compare
7e174b4
to
9be9a63
Compare
46be947
to
69f95c7
Compare
69f95c7
to
b6191ea
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This organization is really nice @thekaveman!
There are maybe just a few things that need tweaking since I am testing these changes (started by running bin/build.sh
) and I'm seeing this so far:
web
built ✅dev
built ✅streamlit
⚠️ ERROR [streamlit app_container 9/9] RUN pip install $(find /wheels -name pems_streamlit*.whl)
ERROR: Could not install packages due to an OSError: [Errno 2] No such file or directory: '/caltrans/app/pems_data'
- I think the reason is because
streamlit
'spyproject
lists"pems_data @ file:./pems_data"
as a dependency but we have not copiedpems_data
into theapp_container
stage. The fix would be to addCOPY pems_data pems_data
to thestreamlit
Dockerfile - Adding the above gets us over the error and
pems_data
starts getting installed, but then I see this errorProcessing ./pems_data (from pems-streamlit==2024.12.2.dev183+gb6191ea.d20250725)
Getting requirements to build wheel: finished with status 'error'
LookupError: setuptools-scm was unable to detect version for /caltrans/app.
- The fix is to also copy the
.git
directory to theapp_container
stage so that setuptools-scm can see it in the container. This fixes this but then we get this errorcreating src/pems_data.egg-info
error: could not create 'src/pems_data.egg-info': Permission denied
- I'm testing/troubleshooting this point now but wanted to let you know what I'm seeing so far.
@lalver1 yep you are totally right. My bad, I was testing within the devcontainer and didn't run the Streamlit container directly. Working on a fix... |
@lalver1 Your diagnosis was correct:
And then of course you saw the issues with trying to copy the extra files in... The issue happens because for the devcontainer, we want the This commit makes a few changes. It sort of feels like a "hack", but honestly I wanted to try and avoid having to use a tool like
|
@lalver1 I'm considering whether we maybe want to go the other way...
Maybe that feels cleaner? And also, I'm worried about rewriting a file that is part of the source in terms of calculating the version info, since that looks at |
1f9e7d9
to
5246949
Compare
@lalver1 I did go the other way, turns out we don't need to rewrite anything. The direct package dependency reference works in both! |
5246949
to
bf7fa41
Compare
Nice job @thekaveman! 🙌 It's so nice we didn't need to overwrite anything in the Dockerfile. |
- get a fully-formed bucket URL - get a list of object prefixes matching a pattern - read a parquet file into a DataFrame
- metadata and imputed aggregate 5min URLs - get district metadata - get station imputed aggregate 5min data
using a file: dependency on pems_data doesn't work during Docker build time since we 1) don't copy the pems_data source for the install stage, and 2) we don't copy the .git directory either -- nor do we want to copy either of these instead, use a normal package dependency during Docker build and build and download both local packages and all dependencies into a 'wheelhouse' for install update CI to install all 3 packages for tests
bf7fa41
to
ba09852
Compare
376d13f
to
eaa79ce
Compare
create an abstract base class interface for a data source, which represents the "how" of fetching data refactor S3Bucket to S3DataSource
the concept of a Service here is for the "what" of accessing specific data for business logic refactor the StationsBucket to a StationsService StationsService needs an IDataSource to read from (the "how")
eaa79ce
to
db02e6e
Compare
Closes #179
What this PR does
Follows the same general approach and outline from #181 and #184.
New
pems_data
packageCreate a new Python package with its own
pyproject.toml
listing dependencies and other metadata. The code is organized into thesrc
structure similar topems_web
in #181 andpems_streamlit
in #184.Refactor
pems_streamlit
data access#166 originally introduced loading of some key data from S3. This PR refactors that access from out of
pems_streamlit
and intopems_data
in two initial modules:pems_data.s3
contains general S3 access classes for PeMSpems_data.stations
contains station-specific S3 access classes for PeMSpems_data
is introduced as a dependency ofpems_streamlit
, which then uses the refactored classes in the same was as before.Devcontainer
Add
pems_data
to the devcontainer Dockerfile so it is installed (as editable), with its dependencies.Tests
Ensures coverage is reported for the new
pems_data
package. No changes needed to CI sincepems_data
is installed as a dependency ofpems_streamlit
as normal (we don't need editable installs in CI).