Fix #347: Preserve HTML entities like &copy; during processing #425

Merged: 2 commits, May 21, 2025

devin-ai-integration[bot]

Fix HTML Entities Conversion Issue (#347)

Problem

When PreMailer.Net processes HTML containing entities such as &copy;, it converts them to the corresponding literal characters (such as ©). This causes display problems in Gmail: when the message contains the literal © character, Gmail shows a notice that the message was trimmed, whereas the same message displays correctly when the &copy; entity is used.

Root Cause

AngleSharp's HTML parser converts HTML entities to characters during parsing, and the default HtmlMarkupFormatter only re-encodes specific characters (&, <, >, ", etc.) during serialization. For other special characters like the copyright symbol (©), it doesn't convert them back to their entity form.
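
The round trip described above can be reproduced with a short sketch. It assumes the AngleSharp NuGet package is referenced; the `EntityRoundTripDemo` class and its `RoundTrip` helper are hypothetical names used only for illustration and are not part of PreMailer.Net.

```csharp
using System;
using AngleSharp.Html.Parser;

public static class EntityRoundTripDemo
{
    // Hypothetical helper: parse a fragment and serialize it back using
    // AngleSharp's default formatter (the one InnerHtml relies on).
    public static string RoundTrip(string html)
    {
        var document = new HtmlParser().ParseDocument(html);
        // Body may be null for some documents, so fall back to an empty string.
        return document.Body?.InnerHtml ?? string.Empty;
    }

    public static void Main()
    {
        // &copy; is decoded to the literal character while parsing and, per the
        // behaviour described above, is not re-encoded on output; &amp; is.
        Console.WriteLine(RoundTrip("<p>&copy; 2025 Tom &amp; Jerry</p>"));
    }
}
```

Comparing the printed markup with the input makes the asymmetry visible: the ampersand comes back as an entity, the copyright sign does not.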

Solution

  1. Created a custom PreserveEntitiesHtmlMarkupFormatter that extends AngleSharp's HtmlMarkupFormatter to convert special characters back to their HTML entity equivalents during serialization.
  2. Added a new preserveEntities parameter to all MoveCssInline method overloads to control whether HTML entities should be preserved.
  3. When preserveEntities is set to true, the custom formatter is used instead of the default one.
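
As a rough illustration of step 1, a formatter of this kind might look like the sketch below. This is not the `PreserveEntitiesHtmlMarkupFormatter` added by this PR: the class name `EntityPreservingFormatterSketch`, the `EncodeKnownEntities` helper, and the three-entry entity table are all assumptions, and the sketch assumes the AngleSharp version in use exposes `HtmlMarkupFormatter.Text(ICharacterData)` as an overridable method.

```csharp
using System.Collections.Generic;
using System.Text;
using AngleSharp.Dom;
using AngleSharp.Html;

// Illustrative sketch only; the name and entity table are assumptions,
// not the formatter shipped in this PR.
public class EntityPreservingFormatterSketch : HtmlMarkupFormatter
{
    // Small sample table; a real formatter would cover many more characters.
    private static readonly Dictionary<char, string> NamedEntities =
        new Dictionary<char, string>
        {
            ['\u00A9'] = "&copy;",  // copyright sign
            ['\u00AE'] = "&reg;",   // registered sign
            ['\u2122'] = "&trade;", // trade mark sign
        };

    // Let the base class escape &, <, > first, then map the remaining
    // special characters back to named entities.
    public override string Text(ICharacterData text) =>
        EncodeKnownEntities(base.Text(text));

    // Replaces each character found in the table with its named entity and
    // leaves every other character untouched.
    public static string EncodeKnownEntities(string value)
    {
        var builder = new StringBuilder(value.Length);
        foreach (var ch in value)
        {
            if (NamedEntities.TryGetValue(ch, out var entity))
            {
                builder.Append(entity);
            }
            else
            {
                builder.Append(ch);
            }
        }
        return builder.ToString();
    }
}
```

With a document parsed by AngleSharp, `document.ToHtml(new EntityPreservingFormatterSketch())` would serialize text nodes through this formatter. Attribute values are not handled in this sketch.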

Testing

  • Added unit tests for the custom formatter to verify that it correctly preserves HTML entities.
  • Added integration tests for the MoveCssInline method to verify that the preserveEntities parameter works correctly.
  • All existing tests continue to pass, ensuring no regressions were introduced.
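
An integration test along these lines could look roughly like the sketch below. The test framework (xUnit), the class and method names, and the sample markup are assumptions made for illustration; they are not taken from the tests in this PR. The sketch also assumes the returned result exposes the inlined markup through an `Html` property.

```csharp
using Xunit;

public class PreserveEntitiesTestsSketch
{
    [Fact]
    public void MoveCssInline_WithPreserveEntities_KeepsCopyrightEntity()
    {
        const string html =
            "<html><head><style>p { color: red; }</style></head>" +
            "<body><p>&copy; 2025</p></body></html>";

        // Fully qualified because the PreMailer type shares its name with
        // the root namespace.
        var result = global::PreMailer.Net.PreMailer.MoveCssInline(
            html, preserveEntities: true);

        // The entity should survive, and the literal character should not appear.
        Assert.Contains("&copy;", result.Html);
        Assert.DoesNotContain("\u00A9", result.Html);
    }
}
```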

Usage

// Preserve HTML entities like &copy;
var result = PreMailer.MoveCssInline(html, preserveEntities: true);

Link to Devin run

Requested by: m@martinnormark.com

Co-Authored-By: m@martinnormark.com <m@martinnormark.com>

🤖 Devin AI Engineer

I'll be helping with this pull request! Here's what you should know:

✅ I will automatically:

  • Address comments on this PR. Add '(aside)' to your comment to have me ignore it.
  • Look at CI failures and help fix them

Note: I can only respond to comments from users who have write access to this repository.

@martinnormark martinnormark merged commit 7af9511 into main May 21, 2025
2 checks passed
@martinnormark martinnormark deleted the devin/1747844860-fix-html-entities-issue-347 branch May 21, 2025 17:29