Conversation

bspratt
Member

@bspratt bspratt commented Sep 10, 2024

No description provided.

@bspratt bspratt requested a review from chambm September 10, 2024 22:04
@brendanx67
Contributor

brendanx67 commented Sep 10, 2024

This seems overly simplistic. What if there is an entity in the XML? This would double-escape the ampersand in the entity. Ideally, you would look for an ampersand in a specific attribute and even check for a semicolon within 5 characters (e.g. &quot;).

@brendanx67
Contributor

You don't want to create an instance of &amp;quot;
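
To make the double-escaping concern concrete, here is a minimal sketch in Python, used purely for illustration since the PR's actual code is not shown in this conversation. The `naive_escape` helper and the sample attribute text are hypothetical.

```python
def naive_escape(xml_text: str) -> str:
    # Hypothetical helper: blindly replaces every ampersand,
    # including those that already start an entity.
    return xml_text.replace("&", "&amp;")


# Sample text with one bare ampersand and two existing &quot; entities.
sample = 'name="A &quot;quoted&quot; label & more"'
escaped = naive_escape(sample)
print(escaped)
```

The bare ampersand is fixed, but each existing `&quot;` is corrupted into `&amp;quot;`, which is the failure mode described above.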

@bspratt
Member Author

bspratt commented Sep 10, 2024

Agreed, the Regex is looking only for unescaped ampersands: "&(?!amp;|lt;|gt;|quot;|apos;)"
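
As a rough illustration of how that pattern behaves, the sketch below applies the regex quoted above using Python's `re` module. The function name `escape_bare_ampersands` is an assumption for this example, not a name from the PR, and the real change may well differ in detail.

```python
import re

# The pattern quoted in the comment: an ampersand that is NOT followed by
# one of the five predefined XML entity names.
UNESCAPED_AMP = re.compile(r"&(?!amp;|lt;|gt;|quot;|apos;)")


def escape_bare_ampersands(text: str) -> str:
    # Hypothetical helper: escape only ampersands that do not already
    # begin a named entity.
    return UNESCAPED_AMP.sub("&amp;", text)


print(escape_bare_ampersands("R&D &quot;x&quot; &amp; y"))
```

Because the negative lookahead consumes no characters, existing entities such as `&quot;` and `&amp;` pass through unchanged while the bare ampersand in `R&D` is escaped.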

@bspratt
Member Author

bspratt commented Sep 10, 2024

[image not preserved]

@brendanx67
Contributor

Agreed, the Regex is looking only for unescaped ampersands: "&(?!amp;|lt;|gt;|quot;|apos;)"

I missed that subtlety. Thought it was a string.replace().

That covers the named entities. Maybe that is good enough. For completeness you would have to include:

&#\d\d;
&#x\x\x;
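
One way the suggested extension might look is sketched below in Python. This is an assumption about the shape of such a pattern, not the regex that was actually merged: it also leaves decimal (`&#38;`) and hexadecimal (`&#x26;`) character references alone, and it allows any number of digits rather than exactly two.

```python
import re

# Assumed pattern: skip ampersands that start a predefined named entity,
# a decimal character reference, or a hexadecimal character reference.
UNESCAPED_AMP = re.compile(
    r"&(?!(?:amp|lt|gt|quot|apos);|#[0-9]+;|#x[0-9A-Fa-f]+;)"
)


def escape_bare_ampersands(text: str) -> str:
    # Hypothetical helper name; escapes only ampersands that are not
    # already part of a recognized entity or character reference.
    return UNESCAPED_AMP.sub("&amp;", text)


print(escape_bare_ampersands("R&D &#38; &#x26; &quot;"))
```

Spelling out each alternative separately keeps the intent readable, at the cost of a longer pattern.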

@bspratt
Member Author

bspratt commented Sep 10, 2024

OK, added those and made the whole thing less subtle, hopefully

Member

@chambm chambm left a comment

Looks good.

@bspratt bspratt merged commit d464da9 into master Sep 11, 2024
10 checks passed
@bspratt bspratt deleted the Skyline/work/20240910_fix_unescaped_xml_msfragger branch September 11, 2024 20:23

3 participants