
feat: added wrapper annotation to workflow catalog; closes #42 #56


Merged
m-jahn merged 4 commits into snakemake:main from the wrappers branch on Jul 23, 2025

Conversation

m-jahn
Member

@m-jahn m-jahn commented Jul 18, 2025

  • added one more field, wrappers, to the individual workflow page
  • wrappers are now searched for and parsed from the Snakefile and .smk files
  • wrappers and topics now show up on the page only if there are entries, avoiding empty page space
  • closes "new feature: add package references for workflows automatically" (#42) by referencing wrappers instead of full tool citations (which will not be implemented in the short term)

Summary by CodeRabbit

  • New Features
    • Added display of wrapper badges with links to documentation on workflow pages.
    • Workflow metadata now includes information about wrappers used in each workflow.

Contributor

coderabbitai bot commented Jul 18, 2025

📝 Walkthrough

The changes introduce functionality to extract and display "wrapper" metadata from Snakemake workflow files. A new function parses wrapper usage, and this information is collected, stored, and rendered in workflow catalog pages. Modifications span data extraction, storage in repository objects, and conditional rendering in documentation templates.

Changes

| File(s) | Change Summary |
|---|---|
| scripts/common.py | Added get_wrappers function to extract wrapper metadata from Snakemake files. |
| scripts/generate-catalog.py | Integrated get_wrappers to collect wrappers from workflow files; updated Repo class and usage. |
| source/build_wf_pages.py | Added inclusion of wrappers metadata in workflow data for page rendering. |
| source/_templates/workflow_page.md | Updated template to conditionally render "Wrappers" badges and improved "Topics" badge display. |

Sequence Diagram(s)

sequenceDiagram
    participant CatalogGen as generate-catalog.py
    participant Common as common.py
    participant RepoObj as Repo class
    participant BuildPages as build_wf_pages.py
    participant Template as workflow_page.md

    CatalogGen->>Common: get_wrappers(smkfile_content)
    Common-->>CatalogGen: {wrapper_name: {name, version, url}, ...}
    CatalogGen->>RepoObj: Repo(..., wrappers=wrappers_dict)
    BuildPages->>RepoObj: Access wrappers via repo.get("wrappers", {})
    BuildPages->>Template: Render page with wrappers metadata
    Template-->>User: Display wrappers as badges (if present)

Estimated code review effort

2 (~20 minutes)

Suggested reviewers

  • johanneskoester
  • cmeesters

📜 Recent review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 1e2248a and 47256ac.

📒 Files selected for processing (1)
  • scripts/common.py (2 hunks)
🚧 Files skipped from review as they are similar to previous changes (1)
  • scripts/common.py
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
  • GitHub Check: generate-catalog

Contributor

@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 1

🧹 Nitpick comments (2)
scripts/common.py (1)

88-105: Consider improving regex pattern and error handling.

The function implementation looks good overall, but there are a few potential improvements:

  1. The regex pattern r"\n\s*wrapper:\s*(.+)\n" requires newlines before and after the wrapper declaration, which might miss wrappers at the beginning or end of files.
  2. The hard-coded URL pattern assumes a specific documentation structure.

Consider applying this improvement to handle edge cases:

 def get_wrappers(smkfile):
+    # Ensure the content has leading/trailing newlines for consistent regex matching
+    smkfile = f"\n{smkfile}\n"
     wrappers = {}
     for match in re.finditer(r"\n\s*wrapper:\s*(.+)\n", smkfile):

This ensures wrappers at the start or end of files are properly detected.
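
For illustration, here is a minimal, self-contained sketch of how the padded regex could feed the `{wrapper_name: {name, version, url}}` mapping shown in the sequence diagram. The function name `get_wrappers_sketch`, the `docs_base` default, and the URL layout are assumptions made for this example, not the PR's actual code. The only thing taken from Snakemake itself is that a wrapper spec has the form `<version>/<path>`.

```python
import re

# Assumed documentation base URL; the real catalog may build links differently.
DOCS_BASE = "https://snakemake-wrappers.readthedocs.io/en/stable/wrappers"


def get_wrappers_sketch(smkfile: str, docs_base: str = DOCS_BASE) -> dict:
    """Extract wrapper specs such as "v3.3.6/bio/samtools/sort" from file content."""
    # Pad with newlines so a wrapper on the first or last line still matches.
    smkfile = f"\n{smkfile}\n"
    wrappers = {}
    for match in re.finditer(r"\n\s*wrapper:\s*(.+)\n", smkfile):
        # Drop surrounding whitespace and quotes from the captured spec.
        spec = match.group(1).strip().strip("\"'")
        # A spec is "<version>/<path>"; skip anything without a path part.
        version, _, name = spec.partition("/")
        if not name:
            continue
        wrappers[name] = {
            "name": name,
            "version": version,
            "url": f"{docs_base}/{name}.html",  # hypothetical URL layout
        }
    return wrappers
```

Because `\s*` also matches newlines, the same pattern covers both `wrapper: "..."` on one line and the more common layout with the spec on the following indented line.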

scripts/generate-catalog.py (1)

295-307: Good implementation with minor improvement suggestions.

The wrapper extraction logic is well-implemented with proper error handling and merging of results from multiple files.

Consider making the exception handling more specific:

-            except Exception as e:
+            except (IOError, OSError, UnicodeDecodeError) as e:

This catches the most likely file reading errors while avoiding overly broad exception handling.
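
A sketch of what the narrower handler could look like when merging results from several files. The helper name `collect_wrappers` and its `parse` callback are hypothetical, and the PR may well read file content from a different source than the local filesystem. Note that in Python 3 `IOError` is an alias of `OSError`, so listing `OSError` and `UnicodeDecodeError` is enough.

```python
from pathlib import Path


def collect_wrappers(paths, parse):
    """Read each file, parse it, and merge the per-file dicts into one."""
    merged = {}
    for path in paths:
        try:
            content = Path(path).read_text(encoding="utf-8")
        except (OSError, UnicodeDecodeError) as e:
            # Unreadable or non-UTF-8 files are skipped, not fatal.
            print(f"skipping {path}: {e}")
            continue
        # Later files overwrite earlier entries that share a wrapper name.
        merged.update(parse(content))
    return merged
```

Any other exception (for example a bug inside `parse`) now propagates instead of being silently swallowed, which is the point of narrowing the handler.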

📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 3fc1662 and 8b37648.

📒 Files selected for processing (4)
  • scripts/common.py (2 hunks)
  • scripts/generate-catalog.py (6 hunks)
  • source/_templates/workflow_page.md (1 hunks)
  • source/build_wf_pages.py (1 hunks)
🧰 Additional context used
📓 Path-based instructions (1)
**/*.py

Instructions used from:

Sources:
⚙️ CodeRabbit Configuration File

🧠 Learnings (3)
📓 Common learnings
Learnt from: m-jahn
PR: snakemake/snakemake-workflow-catalog#50
File: topics.json:0-0
Timestamp: 2025-06-10T12:10:20.483Z
Learning: In the snakemake-workflow-catalog project, m-jahn prefers to allow cross-topic keyword mappings in topics.json, where keywords can appear in multiple topic categories. This design decision enables workflows to be discoverable under all relevant topic areas, improving user experience even though it means some workflows will be categorized under multiple topics.
source/build_wf_pages.py (1)
Learnt from: m-jahn
PR: snakemake/snakemake-workflow-catalog#33
File: source/build_wf_tables.py:51-51
Timestamp: 2025-03-14T15:04:47.868Z
Learning: In the build_wf_tables.py file, the repository list is expected to never be empty during normal operation, and an empty list would indicate an upstream problem with repository data collection that should be addressed separately.
scripts/generate-catalog.py (1)
Learnt from: m-jahn
PR: snakemake/snakemake-workflow-catalog#33
File: source/build_wf_tables.py:51-51
Timestamp: 2025-03-14T15:04:47.868Z
Learning: In the build_wf_tables.py file, the repository list is expected to never be empty during normal operation, and an empty list would indicate an upstream problem with repository data collection that should be addressed separately.
🧬 Code Graph Analysis (1)
scripts/generate-catalog.py (1)
scripts/common.py (1)
  • get_wrappers (88-105)
🪛 Ruff (0.12.2)
scripts/generate-catalog.py

200-200: Function definition does not bind loop variable tarball_url

(B023)
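
For context on B023: a function defined inside a loop looks up the loop variable when it is called, not when it is defined. The sketch below is a generic illustration with made-up function names, not the code at line 200. If the flagged function is only called within the same iteration, the warning is harmless.

```python
def make_getters_late(urls):
    """Closures share the loop variable, so all of them see its final value."""
    getters = []
    for tarball_url in urls:
        getters.append(lambda: tarball_url)
    return getters


def make_getters_bound(urls):
    """A default argument captures the current value at definition time."""
    getters = []
    for tarball_url in urls:
        getters.append(lambda tarball_url=tarball_url: tarball_url)
    return getters
```

Called after the loop has finished, the first variant returns the last URL for every getter, while the second returns each URL in turn.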

🔇 Additional comments (7)
scripts/common.py (1)

7-7: LGTM: Import added for regex functionality.

The re module import is correctly added to support the new get_wrappers function.

source/build_wf_pages.py (1)

75-75: LGTM: Wrappers data integration follows existing patterns.

The addition of wrappers data to the workflow page generation is implemented correctly, using defensive programming with .get() and an empty dictionary default.
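
A small sketch of the defensive pattern described here, assuming repository records are plain dicts loaded from stored catalog data. The function name `page_context` and the `full_name` and `topics` keys are illustrative assumptions. Only `repo.get("wrappers", {})` comes from the walkthrough.

```python
def page_context(repo: dict) -> dict:
    """Build the data handed to the page template for one workflow."""
    return {
        "full_name": repo["full_name"],
        "topics": repo.get("topics", []),
        # Records written before this PR have no "wrappers" key at all.
        "wrappers": repo.get("wrappers", {}),
    }
```

With the empty-dict default, older catalog entries render without a wrappers section instead of raising a `KeyError`.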

source/_templates/workflow_page.md (2)

43-48: LGTM: Topics section properly conditionally rendered.

The conditional rendering for topics badges is implemented correctly, preventing empty sections when no topics exist.


51-56: LGTM: Wrappers section well-implemented with linked badges.

The wrappers section is properly implemented with:

  • Conditional rendering to avoid empty sections
  • Linked badges using bdg-link-secondary with documentation URLs
  • Proper access to nested dictionary structure for wrapper metadata
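
The same rendering logic, sketched in plain Python rather than in the template language. The function name and the "Wrappers" label are assumptions. The role syntax follows the sphinx-design `bdg-link-secondary` badge named above, written in MyST form.

```python
def render_wrapper_badges(wrappers: dict) -> str:
    """Return MyST markup for linked wrapper badges, or "" if there are none."""
    # Conditional rendering: no entries means no section at all.
    if not wrappers:
        return ""
    badges = [
        f"{{bdg-link-secondary}}`{w['name']} <{w['url']}>`"
        for w in wrappers.values()
    ]
    return "**Wrappers**\n\n" + " ".join(badges) + "\n"
```
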
scripts/generate-catalog.py (3)

20-20: LGTM: Import added for wrapper extraction.

The import of get_wrappers from the common module is correct and necessary for the new functionality.


59-59: LGTM: Repo class properly extended with wrappers support.

The Repo class correctly adds the wrappers parameter to the constructor and stores it as an instance attribute, following the existing pattern.

Also applies to: 71-71


319-319: LGTM: Wrappers data properly passed to Repo constructor.

The wrappers data is correctly passed to the Repo constructor, completing the integration of wrapper functionality.
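
Taken together, the three comments on this file describe a pattern like the following sketch. `RepoSketch` and its other fields are hypothetical stand-ins. The actual `Repo` class has more parameters that are not shown in this conversation.

```python
import json


class RepoSketch:
    """Stand-in for the catalog's Repo class, reduced to a few fields."""

    def __init__(self, full_name, topics=None, wrappers=None):
        self.full_name = full_name
        self.topics = topics or []
        # New in this PR: wrapper metadata stored as an instance attribute.
        self.wrappers = wrappers or {}


# Wrapper metadata as produced by the extraction step (values are examples).
wrappers = {"bio/samtools/sort": {"name": "bio/samtools/sort", "version": "v3.3.6"}}
repo = RepoSketch("org/workflow", wrappers=wrappers)

# Instance attributes serialize directly, so the page builder can read them back.
serialized = json.dumps(vars(repo))
```
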

@m-jahn
Member Author

m-jahn commented Jul 22, 2025

example workflow page with new links to wrappers in the center.
image

Member

@dlaehnemann dlaehnemann left a comment


This is a really cool idea, great to have those linkouts in the snakemake workflow catalog. I have two parsing / check questions.

One more comment: the URLs will not match older wrapper versions, but we probably shouldn't try to fix that here. It is better addressed in the wrapper catalog docs...

@m-jahn
Member Author

m-jahn commented Jul 23, 2025

Hi David @dlaehnemann, good comments! I will check.

Member

@dlaehnemann dlaehnemann left a comment


Thanks for addressing the comments. Looks good to go for me.

@m-jahn m-jahn merged commit 0b8b420 into snakemake:main Jul 23, 2025
3 checks passed
@m-jahn m-jahn deleted the wrappers branch July 23, 2025 14:36
Successfully merging this pull request may close these issues.

new feature: add package references for workflows automatically
2 participants