@muhammad-ali-pk
Contributor

@muhammad-ali-pk muhammad-ali-pk commented Jun 25, 2025

Problem Statement

Currently, the content system supports and lists only .html files. Some pages on our sites are Markdown files, and the content team needs to be able to see those pages as well.

Done

  • Added .md files to the sites tree
  • Parsed .md files to extract metadata, e.g. title, description and copydoc link
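As a rough illustration of the metadata extraction described above, here is a minimal sketch. It assumes the Markdown pages open with a `---`-delimited front-matter block of `key: value` lines, and the key names (`meta_title`, `meta_description`, `meta_copydoc`) and the function name are hypothetical. The actual logic in webapp/parse_tree.py may differ.

```python
from typing import Dict

# Hypothetical mapping from output fields to front-matter keys (assumed names).
FIELD_KEYS = {
    "title": "meta_title",
    "description": "meta_description",
    "link": "meta_copydoc",
}


def extract_markdown_metadata(text: str) -> Dict[str, str]:
    """Return title, description and copydoc link found in a front-matter block."""
    lines = text.splitlines()
    # The metadata block is assumed to start on the first line with '---'.
    if not lines or lines[0].strip() != "---":
        return {}
    raw: Dict[str, str] = {}
    for line in lines[1:]:
        if line.strip() == "---":  # closing delimiter of the metadata block
            break
        # Split on the first colon only, so URLs in the value stay intact.
        key, sep, value = line.partition(":")
        if sep:
            raw[key.strip()] = value.strip().strip("\"'")
    return {field: raw[key] for field, key in FIELD_KEYS.items() if key in raw}
```

A file without a leading `---` line yields an empty dict, which a caller could treat as "no metadata".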

QA

QA steps

  • Check out this repo
  • Run the project:

    docker run -d -p 5432:5432 -e POSTGRES_PASSWORD=postgres postgres
    docker run -d -p 6379:6379 redis

    dotrun

    In a different terminal, run:

    dotrun exec celery -A webapp.app.celery_app worker -B --loglevel=DEBUG

    In another terminal, run:

    yarn dev
  • Access the application on localhost:8104/app
  • Wait for the projects to load
  • Select Table View > ubuntu.com
  • Verify that the following pages are listed and accessible:
  1. /appliance
  2. /appliance/adguard
  3. /appliance/openhab
  4. /appliance/lxd
  • Verify the title, description and copydoc link (if one exists) of each page

Fixes

@webteam-app

Copilot AI left a comment

Pull Request Overview

This PR adds support for including and parsing Markdown (.md) files in the content system’s project tree and updates developer tooling.

  • Extend parse_tree.py to recognize index.md, extract metadata blocks from Markdown, and surface file extensions.
  • Introduce a MARKDOWN_TEMPLATES whitelist and wrapper‐template parsing in is_valid_page.
  • Add lint-python and format-python scripts to package.json.
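To make the whitelist idea concrete, the following is a minimal sketch of how a wrapper-template check could work. It assumes each Markdown file declares its wrapper on a `wrapper_template:` line. The whitelist entry and the helper names are hypothetical, since the real contents of MARKDOWN_TEMPLATES are not shown in this conversation.

```python
import re
from typing import Optional

# Hypothetical entry; the actual whitelist lives in webapp/parse_tree.py.
MARKDOWN_TEMPLATES = [
    "shared/_base_markdown.html",
]

# Matches a line such as: wrapper_template: "shared/_base_markdown.html"
WRAPPER_RE = re.compile(
    r"^wrapper_template:\s*[\"']?([^\"'\s]+)[\"']?\s*$", re.MULTILINE
)


def wrapper_template(text: str) -> Optional[str]:
    """Return the declared wrapper template path, or None if there is none."""
    match = WRAPPER_RE.search(text)
    return match.group(1) if match else None


def uses_allowed_wrapper(text: str) -> bool:
    """True only when the file declares a wrapper that is on the whitelist."""
    return wrapper_template(text) in MARKDOWN_TEMPLATES
```

Keeping the whitelist as an explicit list means a Markdown file with an unknown or missing wrapper is excluded by default.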

Reviewed Changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated no comments.

File                   Description
webapp/parse_tree.py   Detect .md indexes, parse metadata from Markdown files, track extension in node objects
package.json           Add Python linting/formatting commands and a trailing comma for dev
Comments suppressed due to low confidence (3)

webapp/parse_tree.py:12

  • There are no tests covering the new Markdown parsing behavior (e.g., get_tags_rolling_buffer on .md files and is_valid_page wrapper logic). Consider adding unit tests for those scenarios.
MARKDOWN_TEMPLATES = [

webapp/parse_tree.py:253

  • The original is_valid_page check for path.is_file() and is_partial(path) was removed, allowing directories and partial templates to be treated as pages. Re-add those checks before filtering templates to ensure only real page files are accepted.
    if is_template(path):
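
The guard that this comment asks to restore could look roughly like the sketch below. The underscore rule inside `is_partial` and the name `passes_basic_checks` are assumptions made for illustration. The project's own `is_partial` may use a different rule.

```python
from pathlib import Path


def is_partial(path: Path) -> bool:
    """Hypothetical rule: partials have file names starting with an underscore."""
    return path.name.startswith("_")


def passes_basic_checks(path: Path) -> bool:
    """Reject anything that cannot be a page before looking at templates."""
    # Directories and missing paths are never pages.
    if not path.is_file():
        return False
    # Partial templates are fragments, not standalone pages.
    if is_partial(path):
        return False
    return True
```

Running these cheap checks first means the template filtering only ever sees real, non-partial files.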

webapp/parse_tree.py:223

  • In the Markdown branch of get_tags_rolling_buffer, the file pointer isn't reset before each variant search. You should call f.seek(0) before reading lines for every variant to ensure all tags are detected.
                    for line in f:
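
The file-pointer issue raised here can be shown with a small standalone sketch. The function name and the variant strings are hypothetical and this is not the code of get_tags_rolling_buffer. It only demonstrates why a rewind is needed when the same file object is scanned once per variant.

```python
import io
from typing import Dict, Iterable, TextIO


def find_tag_lines(f: TextIO, variants: Iterable[str]) -> Dict[str, str]:
    """Return the first line containing each variant, scanning the file per variant."""
    found: Dict[str, str] = {}
    for variant in variants:
        # Rewind first; otherwise the scan resumes where the previous one
        # stopped and can miss tags that appear earlier in the file.
        f.seek(0)
        for line in f:
            if variant in line:
                found[variant] = line.strip()
                break
    return found


sample = io.StringIO("meta_description: d\nmeta_title: t\n")
tags = find_tag_lines(sample, ["meta_title", "meta_description"])
```

Here `meta_title` is found on the last line, so without the `f.seek(0)` call the search for `meta_description` would start at end of file and find nothing.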

@immortalcodes
Member

immortalcodes commented Jul 2, 2025

Was able to load up MD files 👍


@immortalcodes
Member

The "Inspect code" link takes me to https://github.com/canonical/ubuntu.com/tree/main/templates/appliance/adguard/index.html
which I guess should point to the .md file instead.

@immortalcodes
Member

immortalcodes commented Jul 2, 2025

https://github.com/canonical/ubuntu.com/blob/6531a47a3df7ef927941c24c9b38b58930660199/templates/legal/motd.md?plain=1#L4
Should this be part of the pages?
I couldn't see this one in the list.
There are several other .md files that are not part of the tree. What are the criteria for keeping or excluding a page?

@muhammad-ali-pk
Contributor Author

There are several other md files which are not part of tree

If you check those files in your local repository (/repositories/ubuntu.com/...), you will most likely find that fetching them ran into the API rate limit, so they have no content and are not treated as valid Markdown files.


Can you please check and confirm this?

just wanted to know what is criteria of keeping and not keeping a page?

The criteria are the same for all files: a page must either be an index file or extend a valid index file, and it must not be a partial or shared file.
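
The "index file or extends a valid one" part of this rule can be sketched as follows. The sketch assumes Jinja-style `{% extends "..." %}` tags and an underscore prefix for partial or shared files. Both are assumptions, as are the names `extended_template` and `meets_criteria`.

```python
import re
from typing import Optional, Set

# Matches a Jinja-style tag such as: {% extends "appliance/base.html" %}
EXTENDS_RE = re.compile(r"\{%-?\s*extends\s+[\"']([^\"']+)[\"']\s*-?%\}")


def extended_template(source: str) -> Optional[str]:
    """Return the template named in an extends tag, or None if there is none."""
    match = EXTENDS_RE.search(source)
    return match.group(1) if match else None


def meets_criteria(name: str, source: str, valid_bases: Set[str]) -> bool:
    """Apply the stated rule to one file name and its source text."""
    if name.startswith("_"):  # hypothetical marker for partial/shared files
        return False
    if name.startswith("index."):  # index files qualify directly
        return True
    # Otherwise the file must extend a template already known to be valid.
    return extended_template(source) in valid_bases
```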

Member

@immortalcodes immortalcodes left a comment

Works well, thanks for your work.

@muhammad-ali-pk muhammad-ali-pk merged commit 146cc0c into main Jul 4, 2025
8 of 9 checks passed
@muhammad-ali-pk muhammad-ali-pk deleted the WD-22501 branch July 4, 2025 05:28
@github-actions

github-actions bot commented Sep 5, 2025

🎉 This PR is included in version 1.0.0 🎉

The release is available on GitHub release

Your semantic-release bot 📦🚀
