[WD-22501] feat: include and parse markdown files in project trees on content system #210
Pull Request Overview
This PR adds support for including and parsing Markdown (.md) files in the content system’s project tree and updates developer tooling.
- Extend `parse_tree.py` to recognize `index.md`, extract metadata blocks from Markdown, and surface file extensions.
- Introduce a `MARKDOWN_TEMPLATES` whitelist and wrapper-template parsing in `is_valid_page`.
- Add `lint-python` and `format-python` scripts to `package.json`.
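As a rough illustration of the second bullet, the sketch below reads a `wrapper_template` value from a Markdown file's front matter and checks it against a whitelist. The front-matter layout, the whitelist entry, and the helper names `get_wrapper_template` and `uses_allowed_wrapper` are assumptions made for illustration, not the code in `webapp/parse_tree.py`.

```python
import re

# Hypothetical whitelist entry; the real MARKDOWN_TEMPLATES values are not shown here.
MARKDOWN_TEMPLATES = ["legal/_base_markdown.html"]

# Matches a line such as: wrapper_template: "legal/_base_markdown.html"
WRAPPER_RE = re.compile(r"""^wrapper_template:\s*["']?([^"'\s]+)["']?\s*$""")


def get_wrapper_template(text):
    """Return the wrapper_template value from a leading '---' block, or None."""
    lines = text.splitlines()
    if not lines or lines[0].strip() != "---":
        return None
    for line in lines[1:]:
        if line.strip() == "---":  # end of the front-matter block
            break
        match = WRAPPER_RE.match(line.strip())
        if match:
            return match.group(1)
    return None


def uses_allowed_wrapper(text):
    """True when the Markdown text declares a whitelisted wrapper template."""
    return get_wrapper_template(text) in MARKDOWN_TEMPLATES
```

A file with no front matter, or one whose wrapper is not in the list, would be rejected under this sketch.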
Reviewed Changes
Copilot reviewed 2 out of 2 changed files in this pull request and generated no comments.
| File | Description |
|---|---|
| webapp/parse_tree.py | Detect .md indexes, parse metadata from Markdown files, track extension in node objects |
| package.json | Add Python linting/formatting commands and a trailing comma for dev |
Comments suppressed due to low confidence (3)
webapp/parse_tree.py:12
- There are no tests covering the new Markdown parsing behavior (e.g., `get_tags_rolling_buffer` on `.md` files and `is_valid_page` wrapper logic). Consider adding unit tests for those scenarios.
MARKDOWN_TEMPLATES = [
webapp/parse_tree.py:253
- The original `is_valid_page` check for `path.is_file()` and `is_partial(path)` was removed, allowing directories and partial templates to be treated as pages. Re-add those checks before filtering templates to ensure only real page files are accepted.
if is_template(path):
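The guard described in this comment could look roughly like the following. The `is_partial` rule used here (a leading underscore in the file name) and the simplified `is_valid_page` signature are assumptions for illustration; the project's own helpers may differ.

```python
from pathlib import Path


def is_partial(path):
    # Assumption: partial templates are named with a leading underscore.
    return path.name.startswith("_")


def is_valid_page(path):
    """Reject directories and partials before any template checks run."""
    if not path.is_file() or is_partial(path):
        return False
    # Template-specific checks (extends / wrapper template) would follow here.
    return path.suffix in {".html", ".md"}
```

Putting the cheap filesystem checks first means the later template parsing only ever sees real, non-partial files.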
webapp/parse_tree.py:223
- In the Markdown branch of `get_tags_rolling_buffer`, the file pointer isn't reset before each variant search. You should call `f.seek(0)` before reading lines for every variant to ensure all tags are detected.
for line in f:
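The effect described in this comment can be reproduced in isolation: once a file object has been iterated to the end, a second loop yields nothing unless the position is reset. The sketch below uses a hypothetical `find_variants` helper, not the actual `get_tags_rolling_buffer`.

```python
import io


def find_variants(f, variants, reset=True):
    """Return the variants that appear on any line of the open file f."""
    found = []
    for variant in variants:
        if reset:
            f.seek(0)  # rewind so each variant scans the whole file
        for line in f:
            if variant in line:
                found.append(variant)
                break
    return found


sample = io.StringIO("title: A\ndescription: B\n")
print(find_variants(sample, ["description", "title"]))  # both variants found

sample.seek(0)
# Without the rewind, the search for "title" starts after the last line read.
print(find_variants(sample, ["description", "title"], reset=False))
```

With `reset=False`, the first search consumes the file up to the `description` line, so the search for `title` never sees the first line.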
Inspect code takes me to https://github.com/canonical/ubuntu.com/tree/main/templates/appliance/adguard/index.html
https://github.com/canonical/ubuntu.com/blob/6531a47a3df7ef927941c24c9b38b58930660199/templates/legal/motd.md?plain=1#L4
If you check those files in your local repository (/repositories/ubuntu.com/...), you will most probably find that they ran into the API rate limit, so they have no content and are not treated as valid Markdown files. Can you please check and confirm this?
The rule is the same for all files: a page must either be an index file or extend a valid index file. It must not be a partial or shared file.
Works well, thanks for your work.
🎉 This PR is included in version 1.0.0 🎉 The release is available on GitHub release. Your semantic-release bot 📦🚀




Problem Statement
Currently, the content system only supports and lists .html files. Some pages on our various sites are Markdown files. We need to be able to show those Markdown pages to the content team as well.
Done
QA
QA steps
In a different terminal, run
dotrun exec celery -A webapp.app.celery_app worker -B --loglevel=DEBUG
In another terminal, run
Fixes