-
Notifications
You must be signed in to change notification settings - Fork 57
fix OpenAPI Context #524
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: master
Are you sure you want to change the base?
fix OpenAPI Context #524
Conversation
for more information, see https://pre-commit.ci
for more information, see https://pre-commit.ci
for more information, see https://pre-commit.ci
📝 WalkthroughWalkthroughThe changes update the codebase to use a new Bohrium SDK module structure, replacing previous imports and usage patterns. Credential handling is now explicit, with required keys extracted from configuration or environment variables. Custom utility functions for zipping and unzipping files are introduced, and new dependencies are added to the project configuration. Changes
Sequence Diagram(s)sequenceDiagram
participant User
participant OpenAPIContext
participant Bohrium
participant Tiefblue
User->>OpenAPIContext: Initialize with local_root, remote_profile
OpenAPIContext->>OpenAPIContext: Extract credentials from remote_profile/env
OpenAPIContext->>Bohrium: Initialize client with credentials
OpenAPIContext->>Tiefblue: Initialize storage with client
User->>OpenAPIContext: upload_job(job)
OpenAPIContext->>OpenAPIContext: zip_file_list()
OpenAPIContext->>Tiefblue: Upload zipped job files
User->>OpenAPIContext: download(submission)
OpenAPIContext->>Tiefblue: Download result archive
OpenAPIContext->>OpenAPIContext: unzip_file()
Estimated code review effort
📜 Recent review detailsConfiguration used: CodeRabbit UI 📒 Files selected for processing (1)
🚧 Files skipped from review as they are similar to previous changes (1)
✨ Finishing Touches🧪 Generate unit tests
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. 🪧 TipsChatThere are 3 ways to chat with CodeRabbit:
SupportNeed help? Create a ticket on our support page for assistance with any issues or questions. Note: Be mindful of the bot's finite context window. It's strongly recommended to break down tasks such as reading entire modules into smaller chunks. For a focused discussion, use review comments to chat about specific files and their changes, instead of using the PR comments. CodeRabbit Commands (Invoked using PR comments)
Other keywords and placeholders
CodeRabbit Configuration File (
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 6
🧹 Nitpick comments (1)
dpdispatcher/contexts/openapi_context.py (1)
76-101
: Consider refactoring credential retrieval logic to reduce duplication.The credential retrieval and validation logic is duplicated between
openapi.py
andopenapi_context.py
.Consider creating a utility function to handle credential retrieval:
# In a shared utility module def get_bohrium_credentials(remote_profile): """Retrieve Bohrium credentials from profile or environment.""" access_key = ( remote_profile.get("access_key", None) or os.getenv("BOHRIUM_ACCESS_KEY", None) or os.getenv("ACCESS_KEY", None) ) project_id = ( remote_profile.get("project_id", None) or os.getenv("BOHRIUM_PROJECT_ID", None) or os.getenv("PROJECT_ID", None) ) app_key = ( remote_profile.get("app_key", None) or os.getenv("BOHRIUM_APP_KEY", None) or os.getenv("APP_KEY", None) ) if access_key is None: raise ValueError( "remote_profile must contain 'access_key' or set environment variable 'BOHRIUM_ACCESS_KEY'" ) if project_id is None: raise ValueError( "remote_profile must contain 'project_id' or set environment variable 'BOHRIUM_PROJECT_ID'" ) return access_key, project_id, app_key
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (3)
dpdispatcher/contexts/openapi_context.py
(5 hunks)dpdispatcher/machines/openapi.py
(4 hunks)pyproject.toml
(1 hunks)
🧰 Additional context used
🧬 Code Graph Analysis (1)
dpdispatcher/machines/openapi.py (5)
dpdispatcher/utils/utils.py (1)
customized_script_header_template
(202-207)dpdispatcher/machines/dp_cloud_server.py (1)
Bohrium
(19-299)dpdispatcher/submission.py (1)
Job
(725-939)dpdispatcher/contexts/openapi_context.py (1)
unzip_file
(29-32)dpdispatcher/utils/dpcloudserver/zip_file.py (1)
unzip_file
(80-83)
🔇 Additional comments (1)
dpdispatcher/machines/openapi.py (1)
44-70
: Well-implemented credential handling with proper validation.The explicit credential retrieval with multiple fallback sources and validation is a good security practice. The error messages are clear and helpful.
DP_CLOUD_SERVER_HOME_DIR = os.path.join( | ||
os.path.expanduser("~"), ".dpdispatcher/", "dp_cloud_server/" | ||
) | ||
os.makedirs(DP_CLOUD_SERVER_HOME_DIR, exist_ok=True) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
🛠️ Refactor suggestion
Avoid creating directories at module import time.
Creating directories during module import can cause issues in restricted environments. Consider creating the directory lazily when first needed.
Move the directory creation to methods that actually use it, or create a helper function:
def ensure_home_dir_exists():
os.makedirs(DP_CLOUD_SERVER_HOME_DIR, exist_ok=True)
Then call this function in methods like write_home_file
and read_home_file
.
🤖 Prompt for AI Agents
In dpdispatcher/contexts/openapi_context.py around lines 23 to 26, the directory
is created at module import time, which can cause issues in restricted
environments. Remove the os.makedirs call from the module level and instead
create a helper function that ensures the directory exists. Call this helper
function inside methods that use the directory, such as write_home_file and
read_home_file, to create the directory lazily when first needed.
def unzip_file(zip_file, out_dir="./"): | ||
obj = ZipFile(zip_file, "r") | ||
for item in obj.namelist(): | ||
obj.extract(item, out_dir) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
🛠️ Refactor suggestion
Eliminate code duplication by creating a shared utility module.
The unzip_file
function is duplicated in both openapi.py
and openapi_context.py
. This violates the DRY principle.
Create a shared utility module (e.g., dpdispatcher/utils/zip_utils.py
) and move both unzip_file
and zip_file_list
functions there:
# dpdispatcher/utils/zip_utils.py
from zipfile import ZipFile
import glob
import os
from dpdispatcher.dlog import dlog
def unzip_file(zip_file, out_dir="./"):
"""Extract zip file to specified directory."""
try:
with ZipFile(zip_file, "r") as obj:
for item in obj.namelist():
obj.extract(item, out_dir)
except Exception as e:
dlog.error(f"Failed to unzip file {zip_file}: {str(e)}")
raise
def zip_file_list(root_path, zip_filename, file_list=[]):
"""Create zip file from list of files."""
# ... existing implementation with error handling ...
Then import and use these functions in both files.
🤖 Prompt for AI Agents
In dpdispatcher/contexts/openapi_context.py around lines 29 to 32, the
unzip_file function duplicates code found in openapi.py, violating the DRY
principle. To fix this, create a new shared utility module at
dpdispatcher/utils/zip_utils.py, move both unzip_file and zip_file_list
functions there with proper error handling and logging, then update both
openapi_context.py and openapi.py to import and use these functions from the new
utility module.
def zip_file_list(root_path, zip_filename, file_list=[]): | ||
out_zip_file = os.path.join(root_path, zip_filename) | ||
# print('debug: file_list', file_list) | ||
zip_obj = ZipFile(out_zip_file, "w") | ||
for f in file_list: | ||
matched_files = os.path.join(root_path, f) | ||
for ii in glob.glob(matched_files): | ||
# print('debug: matched_files:ii', ii) | ||
if os.path.isdir(ii): | ||
arcname = os.path.relpath(ii, start=root_path) | ||
zip_obj.write(ii, arcname) | ||
for root, dirs, files in os.walk(ii): | ||
for file in files: | ||
filename = os.path.join(root, file) | ||
arcname = os.path.relpath(filename, start=root_path) | ||
# print('debug: filename:arcname:root_path', filename, arcname, root_path) | ||
zip_obj.write(filename, arcname) | ||
else: | ||
arcname = os.path.relpath(ii, start=root_path) | ||
zip_obj.write(ii, arcname) | ||
zip_obj.close() | ||
return out_zip_file |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fix mutable default argument and add error handling.
The function has several issues that need addressing:
- Mutable default argument can cause unexpected behavior
- Debug print statements should be removed
- Missing error handling
- Missing context manager for ZipFile
-def zip_file_list(root_path, zip_filename, file_list=[]):
+def zip_file_list(root_path, zip_filename, file_list=None):
+ if file_list is None:
+ file_list = []
out_zip_file = os.path.join(root_path, zip_filename)
- # print('debug: file_list', file_list)
- zip_obj = ZipFile(out_zip_file, "w")
- for f in file_list:
- matched_files = os.path.join(root_path, f)
- for ii in glob.glob(matched_files):
- # print('debug: matched_files:ii', ii)
- if os.path.isdir(ii):
- arcname = os.path.relpath(ii, start=root_path)
- zip_obj.write(ii, arcname)
- for root, dirs, files in os.walk(ii):
- for file in files:
- filename = os.path.join(root, file)
- arcname = os.path.relpath(filename, start=root_path)
- # print('debug: filename:arcname:root_path', filename, arcname, root_path)
- zip_obj.write(filename, arcname)
- else:
- arcname = os.path.relpath(ii, start=root_path)
- zip_obj.write(ii, arcname)
- zip_obj.close()
- return out_zip_file
+ try:
+ with ZipFile(out_zip_file, "w") as zip_obj:
+ for f in file_list:
+ matched_files = os.path.join(root_path, f)
+ for ii in glob.glob(matched_files):
+ if os.path.isdir(ii):
+ arcname = os.path.relpath(ii, start=root_path)
+ zip_obj.write(ii, arcname)
+ for root, dirs, files in os.walk(ii):
+ for file in files:
+ filename = os.path.join(root, file)
+ arcname = os.path.relpath(filename, start=root_path)
+ zip_obj.write(filename, arcname)
+ else:
+ arcname = os.path.relpath(ii, start=root_path)
+ zip_obj.write(ii, arcname)
+ return out_zip_file
+ except Exception as e:
+ dlog.error(f"Failed to create zip file {out_zip_file}: {str(e)}")
+ raise
📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
def zip_file_list(root_path, zip_filename, file_list=[]): | |
out_zip_file = os.path.join(root_path, zip_filename) | |
# print('debug: file_list', file_list) | |
zip_obj = ZipFile(out_zip_file, "w") | |
for f in file_list: | |
matched_files = os.path.join(root_path, f) | |
for ii in glob.glob(matched_files): | |
# print('debug: matched_files:ii', ii) | |
if os.path.isdir(ii): | |
arcname = os.path.relpath(ii, start=root_path) | |
zip_obj.write(ii, arcname) | |
for root, dirs, files in os.walk(ii): | |
for file in files: | |
filename = os.path.join(root, file) | |
arcname = os.path.relpath(filename, start=root_path) | |
# print('debug: filename:arcname:root_path', filename, arcname, root_path) | |
zip_obj.write(filename, arcname) | |
else: | |
arcname = os.path.relpath(ii, start=root_path) | |
zip_obj.write(ii, arcname) | |
zip_obj.close() | |
return out_zip_file | |
def zip_file_list(root_path, zip_filename, file_list=None): | |
if file_list is None: | |
file_list = [] | |
out_zip_file = os.path.join(root_path, zip_filename) | |
try: | |
with ZipFile(out_zip_file, "w") as zip_obj: | |
for f in file_list: | |
matched_files = os.path.join(root_path, f) | |
for ii in glob.glob(matched_files): | |
if os.path.isdir(ii): | |
arcname = os.path.relpath(ii, start=root_path) | |
zip_obj.write(ii, arcname) | |
for root, dirs, files in os.walk(ii): | |
for file in files: | |
filename = os.path.join(root, file) | |
arcname = os.path.relpath(filename, start=root_path) | |
zip_obj.write(filename, arcname) | |
else: | |
arcname = os.path.relpath(ii, start=root_path) | |
zip_obj.write(ii, arcname) | |
return out_zip_file | |
except Exception as e: | |
dlog.error(f"Failed to create zip file {out_zip_file}: {e}") | |
raise |
🤖 Prompt for AI Agents
In dpdispatcher/contexts/openapi_context.py lines 35 to 56, fix the mutable
default argument by changing file_list's default to None and initializing it
inside the function. Remove all debug print statements. Add error handling
around the file zipping process to catch and log exceptions. Use a context
manager (with statement) for the ZipFile object to ensure it is properly closed
even if errors occur.
def unzip_file(zip_file, out_dir="./"): | ||
obj = ZipFile(zip_file, "r") | ||
for item in obj.namelist(): | ||
obj.extract(item, out_dir) | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
🛠️ Refactor suggestion
Add error handling for zip file operations.
The unzip_file
function should handle potential errors such as corrupted zip files or extraction failures.
def unzip_file(zip_file, out_dir="./"):
- obj = ZipFile(zip_file, "r")
- for item in obj.namelist():
- obj.extract(item, out_dir)
+ try:
+ with ZipFile(zip_file, "r") as obj:
+ for item in obj.namelist():
+ obj.extract(item, out_dir)
+ except Exception as e:
+ dlog.error(f"Failed to unzip file {zip_file}: {str(e)}")
+ raise
📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
def unzip_file(zip_file, out_dir="./"): | |
obj = ZipFile(zip_file, "r") | |
for item in obj.namelist(): | |
obj.extract(item, out_dir) | |
def unzip_file(zip_file, out_dir="./"): | |
try: | |
with ZipFile(zip_file, "r") as obj: | |
for item in obj.namelist(): | |
obj.extract(item, out_dir) | |
except Exception as e: | |
dlog.error(f"Failed to unzip file {zip_file}: {str(e)}") | |
raise |
🤖 Prompt for AI Agents
In dpdispatcher/machines/openapi.py around lines 25 to 29, the unzip_file
function lacks error handling for zip file operations. Add try-except blocks
around the ZipFile opening and extraction code to catch exceptions like
BadZipFile or extraction errors. Log or raise appropriate errors to handle
corrupted zip files or extraction failures gracefully.
pyproject.toml
Outdated
'httpx', | ||
'distro', | ||
'pyhumps', | ||
'bohrium-sdk' |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
🛠️ Refactor suggestion
Remove redundant bohrium-sdk from optional dependencies.
Since bohrium-sdk
is now a required dependency (line 35), it should be removed from the optional dependencies to avoid redundancy.
-cloudserver = ["oss2", "tqdm", "bohrium-sdk"]
-bohrium = ["oss2", "tqdm", "bohrium-sdk"]
+cloudserver = ["oss2", "tqdm"]
+bohrium = ["oss2", "tqdm"]
Also applies to: 64-65
🤖 Prompt for AI Agents
In pyproject.toml around lines 32 to 35 and also lines 64 to 65, remove
'bohrium-sdk' from the optional dependencies section because it is already
listed as a required dependency on line 35. This will eliminate redundancy and
keep the dependency declarations clear.
🛠️ Refactor suggestion
Consider adding version constraints for new dependencies.
The newly added dependencies lack version constraints, which could lead to compatibility issues in the future. Consider pinning versions or specifying minimum versions for stability.
- 'httpx',
- 'distro',
- 'pyhumps',
- 'bohrium-sdk'
+ 'httpx>=0.24.0',
+ 'distro>=1.8.0',
+ 'pyhumps>=3.8.0',
+ 'bohrium-sdk>=0.1.0'
📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
'httpx', | |
'distro', | |
'pyhumps', | |
'bohrium-sdk' | |
'httpx>=0.24.0', | |
'distro>=1.8.0', | |
'pyhumps>=3.8.0', | |
'bohrium-sdk>=0.1.0' |
🤖 Prompt for AI Agents
In pyproject.toml around lines 32 to 35, the newly added dependencies 'httpx',
'distro', 'pyhumps', and 'bohrium-sdk' do not have version constraints
specified. To fix this, update the dependency entries to include either exact
versions or minimum version requirements to ensure compatibility and stability.
Check the latest stable versions of these packages and add appropriate version
specifiers such as '>=x.y.z' or '==x.y.z' for each dependency.
Codecov ReportAttention: Patch coverage is
Additional details and impacted files@@ Coverage Diff @@
## master #524 +/- ##
==========================================
- Coverage 47.73% 47.43% -0.30%
==========================================
Files 39 39
Lines 3863 3904 +41
==========================================
+ Hits 1844 1852 +8
- Misses 2019 2052 +33 ☔ View full report in Codecov by Sentry. 🚀 New features to boost your workflow:
|
'httpx', | ||
'distro', | ||
'pyhumps' |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
These should be claimed in bohrium-sdk. I don't think they should be claimed in dpdispatcher
Summary by CodeRabbit
New Features
Bug Fixes
Chores
httpx
,distro
, andpyhumps
.