refactor: Update Ruff configuration and refactor codebase #68
…re specific PEP-8 naming conventions
…ove logging verbosity, add type annotations, extract-functions to create context overhead
Actionable comments posted: 3
🧹 Outside diff range and nitpick comments (6)
src/readii/feature_extraction.py (6)
109-109: Use 'is None' for None comparison
In Python, it is recommended to use `is None` when comparing a variable to `None`, for clarity and correctness.
Apply this diff to update the comparison:
-if pyradiomicsParamFilePath == None:
+if pyradiomicsParamFilePath is None:
🧰 Tools
🪛 Ruff (0.8.0)
109-109: Comparison to `None` should be `cond is None`. Replace with `cond is None` (E711)
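To illustrate why E711 prefers identity over equality, here is a minimal sketch. The `AlwaysEqual` class is hypothetical and not part of readii; it only shows that `==` can be overridden by an object while `is` cannot.

```python
class AlwaysEqual:
    """Hypothetical class whose __eq__ claims equality with everything."""

    def __eq__(self, other: object) -> bool:
        return True


value = AlwaysEqual()

# `==` dispatches to AlwaysEqual.__eq__, so the comparison is misleading.
equality_result = value == None  # noqa: E711
# `is` compares object identity and cannot be overridden.
identity_result = value is None
```

With a plain string or path argument both forms behave the same, so the change is mainly about intent and consistency.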
203-203: Use 'is None' for None comparison
Consistently use `is None` when comparing to `None` to adhere to Python best practices.
Apply this diff to update the comparison:
-if pyradiomicsParamFilePath == None:
+if pyradiomicsParamFilePath is None:
🧰 Tools
🪛 Ruff (0.8.0)
203-203: Comparison to `None` should be `cond is None`. Replace with `cond is None` (E711)
234-234: Rename unused loop variable `segCount` to `_`
The loop variable `segCount` is not used within the loop body. Renaming it to `_` signals that it is intentionally unused.
Apply this diff to rename the unused variable:
-for segCount, segSeriesID in enumerate(segSeriesIDList):
+for _, segSeriesID in enumerate(segSeriesIDList):
🧰 Tools
🪛 Ruff (0.8.0)
234-234: Loop control variable `segCount` not used within loop body. Rename unused `segCount` to `_segCount` (B007)
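If the index is never needed, an alternative to renaming it is to drop `enumerate` altogether. The sketch below assumes the loop only consumes the series IDs; the ID values are made up for illustration.

```python
seg_series_ids = ["seg-001", "seg-002", "seg-003"]  # hypothetical series IDs

visited = []
# Iterating the list directly avoids creating an index that is never read.
for seg_series_id in seg_series_ids:
    visited.append(seg_series_id)
```

This also satisfies B007, since there is no unused loop control variable left.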
238-240: Simplify nested `if` statements
Combine the nested `if` statements into a single condition to improve code readability.
Apply this diff to simplify the conditions:
-if len(segSeriesInfo) > 1:
-    if not segSeriesInfo.duplicated(subset=["series_CT"], keep=False).all():
+if len(segSeriesInfo) > 1 and not segSeriesInfo.duplicated(subset=["series_CT"], keep=False).all():
🧰 Tools
🪛 Ruff (0.8.0)
238-240: Use a single `if` statement instead of nested `if` statements (SIM102)
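The combined condition can be sketched without pandas. In this illustration a plain list of CT series IDs stands in for the `series_CT` column, and `every_row_is_duplicated` is a hypothetical helper that mimics `duplicated(keep=False).all()`.

```python
from collections import Counter


def every_row_is_duplicated(ct_series: list[str]) -> bool:
    """True when each value occurs more than once (stand-in for duplicated(keep=False).all())."""
    counts = Counter(ct_series)
    return all(counts[value] > 1 for value in ct_series)


def check_matches(ct_series: list[str]) -> None:
    # Single condition: `and` short-circuits, so the second test only runs
    # when there is more than one row, exactly like the nested version.
    if len(ct_series) > 1 and not every_row_is_duplicated(ct_series):
        raise RuntimeError("Unexpected duplication of segmentation and CT matches.")
```

Rows that all share one CT series (a CT with subseries) pass the check, while rows pointing at different CT series raise.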
275-315: Enhance exception handling for dimension mismatches
While handling image dimension mismatches, consider providing more informative error messages and ensuring that all exceptions are appropriately caught and logged. This helps in debugging and maintains robustness.
Consider updating the exception handling as follows:
- Catch specific exceptions where possible.
- Provide detailed logging messages that include context about the images involved.
- Ensure that exceptions are re-raised after logging if they should propagate.
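The three points above can be sketched as follows. This is an illustration under stated assumptions, not readii's code: `DimensionMismatchError`, `check_same_size`, and `process_pair` are hypothetical names, and image sizes are represented as plain tuples.

```python
import logging

logger = logging.getLogger(__name__)


class DimensionMismatchError(ValueError):
    """Hypothetical error for CT/segmentation size disagreements."""


def check_same_size(ct_size: tuple, roi_size: tuple, patient_id: str) -> None:
    # Include both sizes and the patient ID so the message is useful on its own.
    if ct_size != roi_size:
        raise DimensionMismatchError(
            f"CT size {ct_size} != segmentation size {roi_size} for patient {patient_id}"
        )


def process_pair(ct_size: tuple, roi_size: tuple, patient_id: str) -> None:
    try:
        check_same_size(ct_size, roi_size, patient_id)
    except DimensionMismatchError:
        # Catch the specific exception, log with traceback, then propagate.
        logger.exception("Dimension check failed for %s", patient_id)
        raise
```

A bare `raise` inside the `except` block re-raises the active exception with its original traceback.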
137-142: Improve error message clarity when loading Pyradiomics parameters
The error message when failing to initialize the feature extractor could be more informative by including the path and the original exception.
Apply this diff to enhance the error handling:
 try:
     logger.info("Setting up Pyradiomics feature extractor...")
     featureExtractor = featureextractor.RadiomicsFeatureExtractor(pyradiomicsParamFilePath)
 except OSError as e:
     logger.exception(
-        f"Supplied pyradiomics parameter file {pyradiomicsParamFilePath} does not exist or is not at that location: {e}"
+        f"Failed to initialize RadiomicsFeatureExtractor with parameter file '{pyradiomicsParamFilePath}': {e}"
     )
     raise e
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
⛔ Files ignored due to path filters (1)
pixi.lock is excluded by !**/*.lock
📒 Files selected for processing (2)
config/ruff.toml (2 hunks)
src/readii/feature_extraction.py (1 hunks)
🧰 Additional context used
📓 Learnings (1)
config/ruff.toml (1)
Learnt from: jjjermiah
PR: bhklab/readii#42
File: config/ruff.toml:19-21
Timestamp: 2024-11-12T13:18:03.441Z
Learning: In the 'readii' Python project, the team is incrementally adding files to the Ruff linter configuration in 'config/ruff.toml' to make small progress, rather than including all source files at once.
🪛 Ruff (0.8.0)
src/readii/feature_extraction.py
109-109: Comparison to `None` should be `cond is None`. Replace with `cond is None` (E711)
203-203: Comparison to `None` should be `cond is None`. Replace with `cond is None` (E711)
234-234: Loop control variable `segCount` not used within loop body. Rename unused `segCount` to `_segCount` (B007)
238-240: Use a single `if` statement instead of nested `if` statements (SIM102)
🔇 Additional comments (5)
src/readii/feature_extraction.py (2)
274-281: Verify dimension checks to prevent index errors
Ensure that when accessing `roiImage.GetSize()[3]`, the index is within bounds to prevent a potential `IndexError` if `GetDimension()` returns a value less than 4.
Review the condition and adjust it if necessary to safely access the size tuple.
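If the check is written as `roiImage.GetDimension() > 3 and roiImage.GetSize()[3] == 1` (the form quoted in a later suggestion in this review), short-circuit evaluation already guards the index. The sketch below demonstrates this with `FakeImage`, a hypothetical stand-in that only mimics the `GetDimension()` and `GetSize()` accessors.

```python
class FakeImage:
    """Hypothetical stand-in exposing the two accessors used by the check."""

    def __init__(self, size: tuple) -> None:
        self._size = tuple(size)

    def GetDimension(self) -> int:
        return len(self._size)

    def GetSize(self) -> tuple:
        return self._size


def has_trailing_singleton_axis(image: FakeImage) -> bool:
    # `and` evaluates the right-hand side only when the left-hand side is true,
    # so GetSize()[3] is never read for images with three or fewer dimensions.
    return image.GetDimension() > 3 and image.GetSize()[3] == 1
```

The guard would be lost if the two operands were swapped or evaluated in separate statements.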
371-375: Ensure alignment between series IDs and features collected
When filtering out `None` values from `features`, ensure that the alignment with `ctSeriesIDList` remains correct to prevent mismatches between series IDs and their corresponding features.
Review the list comprehensions and the way `features` and `ctSeriesIDList` are zipped together to ensure data integrity.
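One way to keep the two lists aligned is to pair them first and filter afterwards. This sketch assumes `features` holds exactly one entry per series ID in the same order; the IDs and feature values are made up.

```python
ct_series_ids = ["ct-1", "ct-2", "ct-3"]  # hypothetical series IDs
features = [{"firstorder_Mean": 1.0}, None, {"firstorder_Mean": 3.0}]

# Pair each ID with its result before dropping failures, so an ID can never
# end up next to another series' features.
kept = [(sid, feat) for sid, feat in zip(ct_series_ids, features) if feat is not None]
kept_ids = [sid for sid, _ in kept]
kept_features = [feat for _, feat in kept]
```

Filtering `features` on its own and zipping afterwards would silently shift every ID after the first `None`.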
config/ruff.toml (3)
12-17: Consistency with incremental linting approach
Noting from previous learnings that the team is incrementally adding files to the Ruff configuration, the inclusion of `src/readii/feature_extraction.py` aligns with this approach. This method allows for manageable progress in code quality improvements.
100-101: Verify the necessity of ignoring `COM812` and `D206`
The rules `COM812` (missing trailing comma) and `D206` (docstring indentation) are being ignored. Ensure that this aligns with the project's coding standards and that disabling these checks won't negatively impact code quality.
Consider whether these rules should remain ignored or if the codebase should be updated to comply with them.
125-127: Ensure consistent formatting settings
The `indent-style` is set to `"tab"`, which might be unconventional in Python, where spaces are typically used for indentation. This could lead to inconsistencies if the rest of the codebase uses spaces.
Confirm whether tabs are the standard for this project. If spaces are preferred, update the setting:
indent-style = "tab"
+# Consider changing to:
+indent-style = "space"
except RuntimeError as e:
    logger.error(str(e))

# Extract radiomic features from this CT/segmentation pair
Handle exceptions after logging errors
After logging the error message, the exception should be raised or appropriately handled to prevent unexpected behavior.
Consider raising the exception or handling it to ensure the program flow is as intended.
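If skipping the affected pair is the intended behavior, the handling can be made explicit with `continue`, so execution does not fall through to feature extraction. The function and inputs below are hypothetical and only sketch that control flow.

```python
import logging

logger = logging.getLogger(__name__)


def extract_all(pairs: list[tuple[str, bool]]) -> dict[str, str]:
    """Hypothetical loop: each pair is (pair_id, sizes_match)."""
    results = {}
    for pair_id, sizes_match in pairs:
        try:
            if not sizes_match:
                raise RuntimeError(f"Size mismatch for {pair_id}")
        except RuntimeError:
            logger.exception("Skipping %s", pair_id)
            continue  # do not fall through to extraction with bad inputs
        # Placeholder for the real feature extraction step.
        results[pair_id] = "features"
    return results
```

If a failure should stop the whole run instead, replacing `continue` with a bare `raise` propagates the error after logging.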
include = [
    "src/readii/loaders.py",
    "src/readii/feature_extraction.py"
]
Inconsistent inclusion and exclusion of `src/readii/loaders.py`
The file `src/readii/loaders.py` is listed in both the `include` and `extend-exclude` sections. This may result in the file being excluded from linting even though it's explicitly included.
Consider removing `src/readii/loaders.py` from the `extend-exclude` section to ensure it is linted as intended.
Also applies to: 13-13
Actionable comments posted: 1
🧹 Outside diff range and nitpick comments (4)
src/readii/feature_extraction.py (4)
29-55: Consider enhancing error handling and input validation
The string parsing logic could be more robust to handle edge cases.
Consider these improvements:
 def generateNegativeControl(
     ctImage: sitk.Image,
     negativeControl: str,
     alignedROIImage: sitk.Image,
     randomSeed: Optional[int],
 ) -> sitk.Image:
+    if not negativeControl:
+        raise ValueError("negativeControl parameter cannot be empty")
+
     if "non_roi" in negativeControl:
         negativeControlType = negativeControl.rsplit("_", 2)[0]
         negativeControlRegion = "non_roi"
     else:
         negativeControlComponents = negativeControl.rsplit("_", 1)
+        if len(negativeControlComponents) != 2:
+            raise ValueError(
+                f"Invalid negative control format: {negativeControl}. "
+                "Expected format: <type>_<region> or <type>_non_roi"
+            )
         negativeControlType = negativeControlComponents[0]
         negativeControlRegion = negativeControlComponents[1]
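The parsing step can be isolated and exercised on its own. The helper below is a sketch, not readii's implementation: `parse_negative_control` is a hypothetical name, and the example strings are illustrative inputs in the `<type>_<region>` form described above.

```python
def parse_negative_control(negative_control: str) -> tuple[str, str]:
    """Split '<type>_<region>' or '<type>_non_roi' into (type, region)."""
    if not negative_control:
        raise ValueError("negativeControl parameter cannot be empty")

    suffix = "_non_roi"
    if negative_control.endswith(suffix):
        control_type = negative_control[: -len(suffix)]
        region = "non_roi"
    else:
        # rpartition splits on the last underscore; sep is empty if none exists.
        control_type, sep, region = negative_control.rpartition("_")
        if not sep:
            region = ""

    if not control_type or not region:
        raise ValueError(
            f"Invalid negative control format: {negative_control}. "
            "Expected format: <type>_<region> or <type>_non_roi"
        )
    return control_type, region
```

Keeping the parsing in a small pure function makes the edge cases testable without loading any images.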
234-234: Remove unused loop variable
The loop control variable `segCount` is not used within the loop body.
Replace with:
-    for segCount, segSeriesID in enumerate(segSeriesIDList):
+    for _, segSeriesID in enumerate(segSeriesIDList):
🧰 Tools
🪛 Ruff (0.8.0)
234-234: Loop control variable `segCount` not used within loop body. Rename unused `segCount` to `_segCount` (B007)
238-244: Simplify nested if statements
The nested if statements can be combined for better readability.
Consider this improvement:
-if len(segSeriesInfo) > 1:
-    # Check that if there are multiple rows that it's not due to a CT with subseries (this is fine, the whole series is loaded)
-    if not segSeriesInfo.duplicated(subset=["series_CT"], keep=False).all():
-        raise RuntimeError(
-            "Some kind of duplication of segmentation and CT matches not being caught. Check seg_and_ct_dicom_list in radiogenomic_output."
-        )
+if len(segSeriesInfo) > 1 and not segSeriesInfo.duplicated(subset=["series_CT"], keep=False).all():
+    raise RuntimeError(
+        "Some kind of duplication of segmentation and CT matches not being caught. Check seg_and_ct_dicom_list in radiogenomic_output."
+    )
🧰 Tools
🪛 Ruff (0.8.0)
238-240: Use a single `if` statement instead of nested `if` statements (SIM102)
275-315: Improve error handling and logging
The error handling in the dimension mismatch section could be more informative and structured.
Consider these improvements:
- Add more context to error messages
- Use structured logging
- Consolidate the dimension checks
 try:
     # Check if segmentation just has an extra axis with a size of 1 and remove it
     if roiImage.GetDimension() > 3 and roiImage.GetSize()[3] == 1:
+        logger.debug("Removing extra dimension from ROI image")
         roiImage = flattenImage(roiImage)
     # Check that image and segmentation mask have the same dimensions
     if ctImage.GetSize() != roiImage.GetSize():
+        logger.debug(
+            "Dimension mismatch",
+            ct_size=ctImage.GetSize(),
+            roi_size=roiImage.GetSize()
+        )
         # Checking if number of segmentation slices is less than CT
         if ctImage.GetSize()[2] > roiImage.GetSize()[2]:
             logger.warning(
-                f"Slice number mismatch between CT and segmentation for {patID}."
-                f"ctImage.GetSize(): {ctImage.GetSize()}"
-                f"roiImage.GetSize(): {roiImage.GetSize()}"
-                "Padding segmentation to match."
+                "Slice number mismatch between CT and segmentation",
+                patient_id=patID,
+                ct_size=ctImage.GetSize(),
+                roi_size=roiImage.GetSize(),
+                action="padding segmentation"
             )
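Whether the keyword-argument style in this suggestion works depends on the logger in use, which this review does not state. The sketch below assumes a plain standard library `logging.Logger`, which rejects arbitrary keyword arguments, and shows a %-style alternative that carries the same context. `ListHandler`, the logger name, and the sample values are all hypothetical.

```python
import logging


class ListHandler(logging.Handler):
    """Collects formatted messages so the example can inspect them."""

    def __init__(self) -> None:
        super().__init__()
        self.messages: list[str] = []

    def emit(self, record: logging.LogRecord) -> None:
        self.messages.append(record.getMessage())


logger = logging.getLogger("readii_logging_example")  # hypothetical logger name
logger.setLevel(logging.DEBUG)
logger.propagate = False
handler = ListHandler()
logger.addHandler(handler)

pat_id = "patient-001"  # hypothetical values
ct_size, roi_size = (512, 512, 120), (512, 512, 118)

# The standard library Logger does not accept arbitrary keyword arguments.
try:
    logger.warning("Slice number mismatch", ct_size=ct_size, roi_size=roi_size)
except TypeError:
    kwargs_rejected = True
else:
    kwargs_rejected = False

# Lazy %-style arguments work with any standard library logger.
logger.warning(
    "Slice number mismatch between CT and segmentation for %s: "
    "CT size %s, segmentation size %s. Padding segmentation to match.",
    pat_id,
    ct_size,
    roi_size,
)
```

A structured logging library that accepts key-value pairs would make the suggested form valid; with the standard library, context can also be attached through the `extra=` parameter.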
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
📒 Files selected for processing (1)
src/readii/feature_extraction.py (1 hunks)
🧰 Additional context used
🪛 Ruff (0.8.0)
src/readii/feature_extraction.py
105-105: Comparison to `None` should be `cond is None`. Replace with `cond is None` (E711)
203-203: Comparison to `None` should be `cond is None`. Replace with `cond is None` (E711)
234-234: Loop control variable `segCount` not used within loop body. Rename unused `segCount` to `_segCount` (B007)
238-240: Use a single `if` statement instead of nested `if` statements (SIM102)
🔇 Additional comments (1)
src/readii/feature_extraction.py (1)
58-73: 🛠️ Refactor suggestion
Enhance optional parameter handling and logging
The function should handle optional parameters more explicitly and provide more detailed logging.
Consider these improvements:
 def cropImageAndMask(
     ctImage: sitk.Image,
     alignedROIImage: sitk.Image,
     segBoundingBox: tuple,
     negativeControl: Optional[str],
     randomSeed: Optional[int],
 ) -> tuple[sitk.Image, sitk.Image]:
+    logger.debug(
+        "Cropping image and mask",
+        negative_control=negativeControl,
+        random_seed=randomSeed
+    )
+
     if negativeControl:
+        if not isinstance(negativeControl, str):
+            raise TypeError("negativeControl must be a string when provided")
         logger.info(f"Generating {negativeControl} negative control for CT.")
         ctImage = generateNegativeControl(ctImage, negativeControl, alignedROIImage, randomSeed)
+    try:
         croppedCT, croppedROI = imageoperations.cropToTumorMask(
             ctImage, alignedROIImage, segBoundingBox
         )
+    except Exception as e:
+        logger.error("Failed to crop image and mask", error=str(e))
+        raise
     return croppedCT, croppedROI
Actionable comments posted: 1
🧹 Outside diff range and nitpick comments (4)
.github/workflows/ci-cd.yml (3)
Line range hint 15-17: Consider adding Windows back to the test matrix
The comment indicates Windows was removed "for now". Consider:
- Adding a TODO comment explaining why Windows was removed
- Creating an issue to track re-enabling Windows support
- Document any known Windows-specific issues
Would you like me to help create an issue to track Windows support re-enablement?
31-32: Consider locking dependencies in CI
The `locked: false` setting might lead to inconsistent builds due to different dependency versions being used across different environments. Consider:
- Setting `locked: true` for reproducible builds
- Using a lockfile to ensure consistent dependencies
Line range hint 34-37: Consider adding test result artifacts
While coverage reports are being uploaded, consider also uploading test results for better debugging of failures, especially for matrix builds.
Add the following step after running tests:
 - name: Run pytest
   run: |
     pixi run -e ${{ matrix.env }} test
+
+- name: Upload test results
+  if: always() # Upload even if tests fail
+  uses: actions/upload-artifact@v4
+  with:
+    name: test-results-${{ matrix.os }}-${{ matrix.env }}
+    path: |
+      test-results/
+      pytest-report.xml
config/ruff.toml (1)
130-152: Clean up commented configuration and preserve naming convention note
The commented-out old configuration adds unnecessary noise to the file. However, the note about camelCase naming conventions is important.
Suggested changes:
- Remove the old configuration comments
- Move the camelCase note to the top of the file or near relevant rules
-# [lint] # commented for now, will iterate on this later
-# select = [
-#     "E",
-#     "F",
-#     "W", # flake8
-#     "C", # mccabe
-#     "I", # isort
-#     "D", # pydocstyle
-#     # "N", # pep8-naming
-#     "ANN", # flake8-annotations
-#     "BLE", # flake8-blind-except
-#     "B", # flake8-bugbear
-#     "A", # flake8-builtins
-#     # "G", # flake8-logging-format
-#     "ERA", # eradicate
-#     "RUF", # Ruff-specific rules
-#     "TCH", # flake8-type-checking
-# ]
-# ignore = ["ANN101"]
-
-# Readii uses a lot of camelcase so ignoring pep-8 conventions.
-# https://docs.astral.sh/ruff/rules/#pep8-naming-n
-# "N",
+# Note: Readii uses camelCase naming conventions, so we ignore PEP-8 naming rules ("N")
+# See: https://docs.astral.sh/ruff/rules/#pep8-naming-n
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
📒 Files selected for processing (2)
.github/workflows/ci-cd.yml (5 hunks)
config/ruff.toml (2 hunks)
🧰 Additional context used
📓 Learnings (1)
config/ruff.toml (1)
Learnt from: jjjermiah
PR: bhklab/readii#42
File: config/ruff.toml:19-21
Timestamp: 2024-11-12T13:18:03.441Z
Learning: In the 'readii' Python project, the team is incrementally adding files to the Ruff linter configuration in 'config/ruff.toml' to make small progress, rather than including all source files at once.
🔇 Additional comments (3)
.github/workflows/ci-cd.yml (1)
29-29: Verify compatibility with Pixi v0.38.0 across all environments
The Pixi version has been updated from v0.33.0 to v0.38.0 across all jobs. While keeping versions consistent is good practice, this is a significant version jump that warrants verification.
Also applies to: 87-87, 142-142, 187-187, 218-218
✅ Verification successful
Pixi version update appears safe and brings valuable improvements
The update from v0.33.0 to v0.38.0 is safe to proceed with as:
- Both versions support all major platforms (Windows, Linux, macOS) and architectures (x86_64, aarch64)
- v0.38.0 primarily adds new features (PyPI index specification, PEP735 support) and fixes bugs
- No breaking changes are introduced between v0.33.0 and v0.38.0
- The only breaking changes were in v0.33.0 itself, affecting `pixi global` commands, which are not used in this workflow
🏁 Scripts executed
The following scripts were executed for the analysis:
Script:
#!/bin/bash
# Description: Check Pixi version compatibility and changelog
# Verify if there are any breaking changes between v0.33.0 and v0.38.0
# Check the releases and their dates
gh release view v0.38.0 --repo prefix-dev/pixi
gh release view v0.33.0 --repo prefix-dev/pixi
# Search for any known issues with v0.38.0
gh issue list --repo prefix-dev/pixi --search "v0.38.0 in:title,body"
Length of output: 6641
config/ruff.toml (2)
1-15: Resolve file inclusion/exclusion conflict and consider broader inclusion strategy
The current configuration has potential issues:
- `src/readii/loaders.py` appears in both `include` and `extend-exclude` (via its parent directory), which may lead to unexpected behavior
- The limited inclusion of only two specific files differs from the project's goal of incremental linting adoption
Consider this alternative approach:
include = [
- "src/readii/loaders.py",
- "src/readii/feature_extraction.py"
+ "src/readii/*.py" # Include all Python files in readii
]
extend-exclude = [
"docs/*",
"tests/*",
".pixi/",
- "src/readii/image_processing.py",
- "src/readii/metadata.py",
- "src/readii/negative_controls.py",
- "src/readii/pipeline.py",
+ # Temporarily exclude files not ready for linting
+ "src/readii/image_processing.py", # TODO: Enable when ready
+ "src/readii/pipeline.py", # TODO: Enable when ready
]
127-127: 🛠️ Refactor suggestion
Review restrictive docstring code line length
The `docstring-code-line-length = 20` setting seems unusually restrictive and might cause unnecessary line breaks in code examples within docstrings. Consider increasing this value to match or be closer to the main `line-length = 100` setting.
Let's verify if this causes issues in existing docstrings:
[\s\S]?[\s\S]*?"""' | while read -r file; do echo "=== $file ===" rg -U '"""[\s\S]*?
[\s\S]?```[\s\S]*?"""' "$file"
done
Codecov Report
Attention: Patch coverage is
Additional details and impacted files
@@            Coverage Diff             @@
##             main      #68      +/-   ##
==========================================
+ Coverage   59.14%   60.40%   +1.26%
==========================================
  Files          10       10
  Lines         536      543       +7
==========================================
+ Hits          317      328      +11
+ Misses        219      215       -4
☔ View full report in Codecov by Sentry.
Enhance the Ruff configuration to include additional files and ignore specific PEP-8 naming conventions. Refactor the codebase by replacing `os.path` with `pathlib`, improving logging verbosity, adding type annotations, and organizing imports.
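As a rough illustration of the `os.path` to `pathlib` change described here, the sketch below builds the same path both ways. The directory and file names are hypothetical and not taken from the readii repository.

```python
import os.path
from pathlib import Path

param_dir = "data/params"  # hypothetical directory
param_file = "default_pyradiomics.yaml"  # hypothetical file name

# os.path style: paths are plain strings joined by helper functions.
os_style = os.path.join(param_dir, param_file)

# pathlib style: the `/` operator joins segments and the result carries
# convenience attributes such as .name and .suffix.
path_style = Path(param_dir) / param_file
```

Both forms refer to the same location; the `Path` object additionally offers methods such as `exists()` and `read_text()` without further imports.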