The goal
My HTML contains <select> form controls. The parser extracts the text for every <option> in the menu. I want it to extract only what the control displays, i.e. the <option selected>.
Is there a configuration option that supports this? I can't find one in the docs.
Example:
<div>You have selected:</div><select><option>A</option><option>B</option><option selected>C</option></select>
Currently this outputs as:
You have selected: A B C
Desired output
You have selected: C
Best attempt
I can try to preprocess the HTML in a DOM parser to remove the other options from the menu prior to handing it to html-to-text.
I'd suggest creating a custom formatter for select tags.
In a formatter you have access to the child nodes, so you can inspect them and pick whatever you need instead of calling the walk function on all of them.
Start from this Readme section. elem is a DOM Element as parsed by htmlparser2. You can use astexplorer to see how it represents tags and attributes of interest.
Among the built-in formatters, the list and table formatters handle all their child tags on their own, although they are a lot more complex than the formatter for select is going to be.
I haven't implemented formatters for form tags myself yet because it seems a rather rare use case for html-to-text given the amount of code needed to support it. But with requests like this coming in I may reprioritize it. So thanks for asking.
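For illustration, here is a minimal sketch of what such a formatter might look like. The helper names (`collectOptions`, `pickDisplayedOption`, `selectedOptionFormatter`, `convertOptions`) are made up for this example and are not part of html-to-text. It assumes the htmlparser2 node shape (`type`, `name`, `attribs`, `children`) and the `(elem, walk, builder, formatOptions)` formatter signature from the Readme. It also assumes a single-choice select, and it ignores `disabled` options.

```javascript
// Sketch only: all names here are hypothetical helpers, not library API.

// Collect every <option> under a node, descending through wrappers
// such as <optgroup>.
function collectOptions(node, found = []) {
  for (const child of node.children || []) {
    if (child.type === 'tag' && child.name === 'option') {
      found.push(child);
    } else if (child.type === 'tag') {
      collectOptions(child, found);
    }
  }
  return found;
}

// Approximate what a single-choice <select> displays: the last option
// carrying a `selected` attribute, otherwise the first option.
function pickDisplayedOption(selectElem) {
  const options = collectOptions(selectElem);
  const selected = options.filter(
    (o) => o.attribs && Object.prototype.hasOwnProperty.call(o.attribs, 'selected')
  );
  return selected.pop() || options[0] || null;
}

// Custom formatter: walk only the children of the displayed option
// instead of walking every child of the <select>.
function selectedOptionFormatter(elem, walk, builder, formatOptions) {
  const displayed = pickDisplayedOption(elem);
  if (displayed) {
    walk(displayed.children, builder);
  }
}

// Options to pass as the second argument of convert().
// Assumption: a version of html-to-text that uses `selectors`;
// older versions assigned formatters through `tags` instead.
const convertOptions = {
  formatters: { selectedOptionFormatter },
  selectors: [{ selector: 'select', format: 'selectedOptionFormatter' }],
};
```

With the package installed, this would be wired up as `convert(html, convertOptions)`, where `convert` comes from `require('html-to-text')`. Keeping the option-picking logic in its own function makes it easy to check against hand-built nodes before plugging it into the converter.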