This repository was archived by the owner on Aug 10, 2024. It is now read-only.
I just stumbled upon a feed that uses control characters in the range 0x01 to 0x1F inside a CDATA description.
libxml2 isn't supposed to handle these (apart from tab, line feed, and carriage return, they are not legal XML 1.0 characters), but as a result RSParser breaks early and drops the remaining feed articles. When parsing the RSS below, only the first two items are returned.
Stripping them with a regex replace before parsing should be enough. However, I was wondering whether there is a libxml2 flag that could be used instead?
<?xml version="1.0" encoding="UTF-8"?><rss version="2.0">
<channel>
<title>Feed Title</title>
<item>
<title>1</title>
<link>http://someurl.com/1/</link>
<description><![CDATA[Description of first]]></description>
</item>
<item>
<title>2</title>
<link>http://someurl.com/2/</link>
<description><![CDATA[Description with � \0x04 values]]></description>
</item>
<item>
<title>3</title>
<link>http://someurl.com/3/</link>
<description><![CDATA[Description of third]]></description>
</item>
<item>
<title>4</title>
<link>http://someurl.com/4/</link>
<description><![CDATA[Description of fourth]]></description>
</item>
<item>
<title>5</title>
<link>http://someurl.com/5/</link>
<description><![CDATA[Description of fifth]]></description>
</item>
</channel>
</rss>
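For reference, the replace-before-parsing idea could look roughly like the sketch below. It is only an illustration under a few assumptions: the feed bytes are UTF-8 (as the sample declares), so every byte below 0x20 is a standalone control character, and the function name `sanitize_xml_control_chars` is made up here and is not part of RSParser or libxml2. It overwrites each disallowed control byte with a space and leaves tab, line feed, and carriage return alone.

```c
#include <stddef.h>

/* Hypothetical helper, not an RSParser or libxml2 API.
 * Returns nonzero for bytes below 0x20 that XML 1.0 does not allow,
 * i.e. everything except tab (0x09), LF (0x0A) and CR (0x0D).
 * Note that this also flags 0x00. */
static int is_invalid_xml_control(unsigned char c) {
    return c < 0x20 && c != 0x09 && c != 0x0A && c != 0x0D;
}

/* Replaces each disallowed control byte in buf[0..len) with a space,
 * in place, and returns how many bytes were replaced.
 * Assumes an ASCII-compatible encoding such as UTF-8, where bytes
 * below 0x20 never occur inside a multi-byte sequence. */
size_t sanitize_xml_control_chars(char *buf, size_t len) {
    size_t replaced = 0;
    for (size_t i = 0; i < len; i++) {
        if (is_invalid_xml_control((unsigned char)buf[i])) {
            buf[i] = ' ';
            replaced++;
        }
    }
    return replaced;
}
```

The idea would be to run this over the raw feed data before handing it to the parser. Replacing with a space instead of deleting keeps the buffer length unchanged, so no reallocation or copying is needed.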