Commit a67871e

Improve UI descriptions for annotated text extraction and sorting
- Basic explanation now tries to educate the user more about what problem the feature solves
- Advanced help and tips were improved for more clarity
- Improved formatting to be in line with descriptions of other forms
1 parent be33519 commit a67871e

2 files changed: +34 -32 lines changed

changedetectionio/forms.py

Lines changed: 3 additions & 3 deletions
@@ -688,10 +688,10 @@ class processor_text_json_diff_form(commonSettingsForm):
  remove_duplicate_lines = BooleanField('Remove duplicate lines of text', default=False)
  sort_text_alphabetically = BooleanField('Sort text alphabetically', default=False)
  trim_text_whitespace = BooleanField('Trim whitespace before and after text', default=False)
- extraction_method = RadioField('Extraction method', choices=[('TEXT', 'Extract text only'),('ANNOTATED_TEXT', 'Extract annotated text')], default='TEXT')
- annotation_rules = StringSelectorTagDictField('Annotation Rules', [validators.Optional()])
+ extraction_method = RadioField('Extraction method', choices=[('TEXT', 'Extract plain text'),('ANNOTATED_TEXT', 'Extract text with custom annotations')], default='TEXT')
+ annotation_rules = StringSelectorTagDictField('Annotation rules', [validators.Optional()])

- annotated_sort_selectors = StringSelectorPairListField('Sort Annotated text by matched tags', [validators.Optional()])
+ annotated_sort_selectors = StringSelectorPairListField('Sort annotated text', [validators.Optional()])

  filter_text_added = BooleanField('Added lines', default=True)
  filter_text_replaced = BooleanField('Replaced/changed lines', default=True)

changedetectionio/templates/edit.html

Lines changed: 31 additions & 29 deletions
@@ -384,25 +384,24 @@ <h2 >Click here to Start</h2>
  <fieldset class="pure-control-group">
  <div class="pure-control-group inline-radio" class="extraction_method">
  {{ render_field(form.extraction_method) }}
+ <span class="pure-form-message-inline">Plain text produces raw text, while annotated text can preserve semantics found in original HTML source.</span>
  </div>
  </fieldset>
  <fieldset class="pure-control-group" data-visible-for="extraction_method=ANNOTATED_TEXT" >
  <div class="pure-control-group">
- {{ render_field(form.annotation_rules, rows=4, placeholder='{<CSS Selector>} <Comma-separated list of tags>
- {#example} example
- {tr[class*=\'submission\']} submission, tablerow
- {span.age > a} time') }}
- <div class="pure-form-message">Pair CSS selector and tag to annotate matches in the text per line.</div>
- <span data-target="#advanced-help-annotation-rules" class="toggle-show pure-button button-tag button-xsmall">Show advanced help and tips</span><br>
- <div id="advanced-help-annotation-rules" class="pure-form-message-inline" style="display: none;">
- <ul>
- <li> Syntax per Line: <code>{&lt;CSS Selector&gt;} &lt;Comma-separated list of tags&gt;</code> </li>
- <li> Each element matched by the CSS selector will be tagged in the text with all listed tags <code>&lt;foo&gt;&lt;bar&gt;Matched Text&lt;/bar&gt;&lt;/foo&gt;</code> </li>
- <li> Matched elements without any text will not be tagged. </li>
- <li> Tags will be matched after Include Filters were applied and elements were removed. </li>
- <li> Only available for HTML (no Json or pdf). </li>
+ {{ render_field(form.annotation_rules, rows=4, placeholder='{#example} example
+ {tr[class*=\'news-article\']} article
+ {span.title} title
+ {span.age > a} date') }}
+ <span class="pure-form-message-inline">Annotations map HTML elements to XML tags in the extracted text, preserving semantics found in the original HTML source (like title, author, price, or dates) that allow more effective downstream processing.<br>
+ <span data-target="#advanced-help-annotation-rules" class="toggle-show pure-button button-tag button-xsmall">Show advanced help and tips</span><br>
+ <ul id="advanced-help-annotation-rules" style="display: none;">
+ <li> Enter one rule per line: <code>{&lt;CSS Selector&gt;} &lt;Comma-separated list of tags&gt;</code> </li>
+ <li> Each matched element's text is wrapped in tags e.g <code>&lt;example&gt;Text&lt;/example&gt;</code> </li>
+ <li> Annotations happen after filters to include or remove elements. </li>
+ <li> Available only for HTML extraction. </li>
  </ul>
- </div>
+ </span>
  </div>
  </fieldset>
  <div class="text-filtering border-fieldset">
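
The annotation rule syntax described in the help text above (one `{<CSS Selector>} <Comma-separated list of tags>` rule per line) can be illustrated with a small Python sketch. This is not the code from this commit: `parse_annotation_rules` and `wrap_in_tags` are hypothetical names, the regular expression is only an assumed reading of the documented syntax, and CSS matching itself is left out. The nesting order follows the `<foo><bar>Matched Text</bar></foo>` example from the previous help text.

```python
import re
from xml.sax.saxutils import escape

# Assumed shape of one rule line: "{<CSS selector>} <comma-separated tags>".
RULE_RE = re.compile(r"^\{(?P<selector>.+)\}\s*(?P<tags>.+)$")


def parse_annotation_rules(text):
    """Parse rule lines into (selector, [tags]) pairs (hypothetical helper)."""
    rules = []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue  # ignore blank lines between rules
        match = RULE_RE.match(line)
        if match is None:
            raise ValueError(f"Malformed annotation rule: {line!r}")
        tags = [t.strip() for t in match.group("tags").split(",") if t.strip()]
        rules.append((match.group("selector"), tags))
    return rules


def wrap_in_tags(text, tags):
    """Wrap text in the listed tags, first tag outermost (hypothetical helper)."""
    if not text.strip():
        return text  # elements without any text are left untagged
    wrapped = escape(text)
    for tag in reversed(tags):
        wrapped = f"<{tag}>{wrapped}</{tag}>"
    return wrapped


rules = parse_annotation_rules("{#example} example\n{span.age > a} date, time")
print(rules)
print(wrap_in_tags("Matched Text", ["foo", "bar"]))
```

Escaping the text before wrapping keeps the result well-formed XML, which matters if the output is later parsed for sorting.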
@@ -434,21 +433,24 @@ <h3>Text filtering</h3>
  </fieldset>
  <fieldset class="pure-control-group" data-visible-for="extraction_method=ANNOTATED_TEXT" >
  <div class="pure-control-group">
- {{ render_field(form.annotated_sort_selectors, rows=4, placeholder='{<Element-to-Sort Selector>}({<Sort-Identifier Selector>})
- {item}{item-price}
- {title}
- {submission}{time}') }}
- <div class="pure-form-message">Sort tags in annotated text based on its contents or child tag contents.</div>
- <span data-target="#advanced-help-annotated-sort" class="toggle-show pure-button button-tag button-xsmall">Show advanced help and tips</span><br>
- <div id="advanced-help-annotated-sort" class="pure-form-message-inline" style="display: none;">
- <ul>
- <li> Syntax per Line: <code>{&lt;Element-to-Sort Selector&gt;}({&lt;Sort-Identifier Selector&gt;})</code></li>
- <li> <strong>Element-to-Sort</strong> CSS or XPath Selector matching the annotated tag <strong>to be sorted</strong>. </li>
- <li> Optional: <strong>Sort-Identifier</strong> CSS or XPath Selector relative to Element-to-Sort matching a child annotated tag <strong>containing the text to base the sorting on.</strong> </li>
- <li> Elements are sorted in ascending order. </li>
- <li> XPath: Begin selector with forward-slashes or explicitly prefix with <code>xpath:</code>. </li>
- </ul>
- </div>
+ {{ render_field(form.annotated_sort_selectors, rows=4, placeholder='{example}
+ {article}{date}
+ {article}{title}') }}
+ <span class="pure-form-message-inline">Sorting annotated text allows reordering annotated XML elements, e.g. sorting a list of articles by title or date <code>&lt;article&gt;&lt;title&gt;Title&lt;/title&gt;&lt;date&gt;1970-01-01&lt;/date&gt;&lt;/article&gt;</code>.<br>
+ <span data-target="#advanced-help-annotated-sort" class="toggle-show pure-button button-tag button-xsmall">Show advanced help and tips</span><br>
+ <ul id="advanced-help-annotated-sort" style="display: none;">
+ <li> Enter one rule per line:</li>
+ <ul>
+ <li><code>{&lt;Element-to-Sort Selector&gt;}</code> sorts annotated XML element by its own text. </li>
+ <li><code>{&lt;Element-to-Sort Selector&gt;}{&lt;Sort-Identifier Selector&gt;}</code> sorts annotated XML element by a child element's text. </li>
+ </ul>
+ <li> Use CSS or XPath selectors to match annotated tags. </li>
+ <ul>
+ <li>XPath: Begin selector with forward-slashes or explicitly prefix with <code>xpath:</code></li>
+ </ul>
+ <li> Elements are sorted in ascending order of the matched element's text. </li>
+ </ul>
+ </span>
  </div>
  </fieldset>
  <fieldset>
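
The sort rules described in the help text above (`{article}{date}` sorts `<article>` elements by the text of their `<date>` child, `{example}` sorts by the element's own text) can be sketched with Python's standard `xml.etree.ElementTree`. This is a simplified illustration, not the project's implementation: `parse_sort_rule` and `sort_annotated` are hypothetical names, selectors are assumed to be plain tag names rather than full CSS or XPath, and the comparison is a plain ascending string comparison.

```python
import re
import xml.etree.ElementTree as ET

# Assumed shape of one rule line: "{element}" or "{element}{sort-identifier}".
SORT_RULE_RE = re.compile(r"^\{(?P<element>[^{}]+)\}(?:\{(?P<key>[^{}]+)\})?$")


def parse_sort_rule(line):
    """Return (element_tag, key_tag or None) for one rule (hypothetical helper)."""
    match = SORT_RULE_RE.match(line.strip())
    if match is None:
        raise ValueError(f"Malformed sort rule: {line!r}")
    return match.group("element"), match.group("key")


def sort_annotated(xml_text, element_tag, key_tag=None):
    """Sort sibling elements named element_tag in ascending text order."""
    root = ET.fromstring(xml_text)

    def sort_key(elem):
        # Sort by the element's own text, or by a direct child's text.
        target = elem if key_tag is None else elem.find(key_tag)
        return "".join(target.itertext()).strip() if target is not None else ""

    # Snapshot the parents first so the tree is not mutated while iterating.
    for parent in list(root.iter()):
        children = list(parent)
        matched = [c for c in children if c.tag == element_tag]
        if len(matched) < 2:
            continue
        ordered = iter(sorted(matched, key=sort_key))
        # Matched elements are reordered in place; other children keep their slots.
        parent[:] = [next(ordered) if c.tag == element_tag else c for c in children]
    return ET.tostring(root, encoding="unicode")


doc = (
    "<root>"
    "<article><title>B</title><date>1970-01-02</date></article>"
    "<article><title>A</title><date>1970-01-03</date></article>"
    "</root>"
)
element_tag, key_tag = parse_sort_rule("{article}{title}")
print(sort_annotated(doc, element_tag, key_tag))
```

Because the keys are compared as strings, ISO-style dates such as `1970-01-01` sort chronologically, while other date formats or numeric prices would need a different key function.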
