Add beautiful_soup_parser option #206

Merged
4 commits merged into matthewwithanm:develop on Mar 29, 2025

Conversation

vincentkelleher
Contributor

Hi 👋

As per my issue #205 and #58, I propose the following changes to allow users to choose which parser they want to use with BeautifulSoup 😇

As a quick refresher, html.parser is currently the only HTML parser available when using the markdownify(..) function. It's still possible to pick another parser by building the soup yourself and passing it to MarkdownConverter(options).convert_soup(..), but having both solutions is more convenient IMHO.

I haven't written any specific tests for other HTML parsers, as I didn't want to add extra dependencies to this project. If required, I could add development-only dependencies to run those tests.

Thanks @AlexVonB for your help with this subject 🙏

@AlexVonB
Collaborator

Hey Vincent, thanks for your PR! Looks great! Could you please add the parameter to the command line interface? https://github.com/matthewwithanm/python-markdownify/blob/develop/markdownify/main.py As soon as this is added I will merge this. Thanks again!

@vincentkelleher
Contributor Author

Hi @AlexVonB,

Thank you for your feedback, I've just taken it into account.

I used the following commands to test out the new argument in the CLI:

pip install ".[dev]"
pip install html5lib

markdownify -p html5lib <HTML_FILE>

Everything works fine with the examples of #205 👍
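
For readers following along, here is a minimal sketch of how a `-p` flag like the one used above could be wired into an argparse-based CLI and turned into the `beautiful_soup_parser` option. Only the `-p` short flag, the option name, and the `html.parser` default come from this thread; the long flag name, the positional argument, and the helper names `build_parser` and `options_from_argv` are assumptions for illustration and may not match the merged code in main.py.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Build a small CLI parser (hypothetical, not the project's actual main.py)."""
    parser = argparse.ArgumentParser(prog="markdownify")
    # Positional HTML input; the name and the "-" default are assumptions.
    parser.add_argument(
        "html",
        nargs="?",
        default="-",
        help="HTML file to convert, or '-' for stdin",
    )
    # "-p" appears in the thread; the long flag name is assumed.
    parser.add_argument(
        "-p",
        "--beautiful-soup-parser",
        dest="beautiful_soup_parser",
        default="html.parser",
        help="Beautiful Soup parser to use, e.g. html5lib or lxml "
             "(the chosen parser must be installed separately)",
    )
    return parser


def options_from_argv(argv: list[str]) -> dict[str, str]:
    """Translate command-line arguments into converter keyword options."""
    args = build_parser().parse_args(argv)
    return {"beautiful_soup_parser": args.beautiful_soup_parser}


if __name__ == "__main__":
    print(options_from_argv(["-p", "html5lib", "page.html"]))
```

The resulting dict could then be passed on as keyword options to the converter, so omitting `-p` keeps the existing html.parser behaviour.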

@AlexVonB merged commit 2d654a6 into matthewwithanm:develop on Mar 29, 2025
1 check passed
@AlexVonB
Collaborator

Thanks a lot! 🥳

Wuhall pushed a commit to Wuhall/python-markdownify that referenced this pull request May 21, 2025
* add beautiful_soup_parser option
* add Beautiful Soup parser argument to command line

---------

Co-authored-by: Vincent Kelleher <vincent.kelleher-ext@francetravail.fr>
Co-authored-by: AlexVonB <AlexVonB@users.noreply.github.com>
2 participants