
Commit 85de4ca

docs: Finish docs and tests for regex searches (#1051)
* docs,test: Add examples of searching for time stamps
* test: Add tests to show the text that is searched by description
* docs: Document how description's search text is constructed.
* test: Search description for short tags, excluding sub-tags
  It's more complicated than I expected! Roll on regex searching of tags
* test: Search description for sub-tags
* test: Simplify regexes by using String.raw
* test: Add missing 'it' block, for consistency
* test: More useful example for sub-tag search
  And remove irrelevant comment
* docs: Document searching short tags and sub-tags
  Roll on regex searching in tag and tags instructions.
* docs: Be consistent with case-insensitive flag
* docs: Heading regex searches are also case-sensitive by default
* docs: Add a regex demo to examples.md
* test: Use String.raw for all regex searches
  This makes it more likely that anyone added a future test that requires an escape character will use this form, instead of using the less readable '\\'.
* docs: Explain what [...] means
* docs: Add 'Escaping special characters' section
* docs,vault: Link between regex docs and sample page
  The links won't work until both this PR is merged, and the next release is done.
* docs: Improve the 'Important links' section
* docs: Link to tickets in 'known limitations' section
* test: Add regex demo of an example used for Boolean searches
* docs: Document pipe (|) in regex - and add example
* docs: Remove stray URL
1 parent 6a183d0 commit 85de4ca

File tree

8 files changed

+340
-15
lines changed

docs/queries/examples.md

Lines changed: 9 additions & 0 deletions
@@ -95,3 +95,12 @@ All open tasks that are due today or earlier, sorted by due date, then grouped t
9595
sort by due
9696
group by folder
9797
```
98+
99+
---
100+
101+
All open tasks that begin with a time stamp in `HH:mm` format, followed by any white space character:
102+
103+
```tasks
104+
not done
105+
description regex matches /^[012][0-9]:[0-5][0-9]\s/
106+
```
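As a rough check of what this pattern accepts, the TypeScript sketch below runs the same regular expression against a few sample descriptions. The helper name `startsWithTimeStamp` and the sample strings are invented for illustration and are not part of Tasks.

```typescript
// The pattern from the query above: HH:mm at the very start, then white space.
const timeStampAtStart = /^[012][0-9]:[0-5][0-9]\s/;

// Hypothetical helper, for illustration only: not part of the Tasks plugin.
function startsWithTimeStamp(description: string): boolean {
    return timeStampAtStart.test(description);
}

const samples: string[] = [
    '09:30 Stand-up meeting', // matches: time at the start, followed by a space
    'Call at 09:30 tomorrow', // no match: the time is not at the start
    '09:30',                  // no match: nothing follows the time
    '9:30 Stand-up meeting',  // no match: the hour needs two digits
];

// Keep only the descriptions that the pattern accepts.
const matching: string[] = samples.filter(startsWithTimeStamp);
```

Only the first sample is kept. A description consisting of nothing but a time cannot match, because the pattern requires a white space character after the minutes.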

docs/queries/filters.md

Lines changed: 16 additions & 1 deletion
@@ -140,6 +140,21 @@ As well as the date-related searches above, these filters search other propertie
140140

141141
> `regex matches` and `regex does not match` were introduced in Tasks 1.12.0.
142142
143+
For precise searches, it may help to know that `description`:
144+
145+
- first removes all of each task's signifier emojis and their values,
146+
- then removes any global filter,
147+
- then removes any trailing spaces,
148+
- and then searches the remaining text
149+
150+
For example:
151+
152+
| Global Filter | Task line | Text searched by `description` |
153+
| ---------------- | ------------------------------------------------------------------------ | -------------------------------- |
154+
| No global filter | `'- [ ] Do stuff ⏫ #tag1 ✅ 2022-08-12 #tag2/sub-tag '` | `'Do stuff #tag1 #tag2/sub-tag'` |
155+
| `#task` | `'- [ ] #task Do stuff ⏫ #tag1 ✅ 2022-08-12 #tag2/sub-tag '` | `'Do stuff #tag1 #tag2/sub-tag'` |
156+
| `global-filter` | `'- [ ] global-filter Do stuff ⏫ #tag1 ✅ 2022-08-12 #tag2/sub-tag '` | `'Do stuff #tag1 #tag2/sub-tag'` |
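The steps listed before this table can be sketched in TypeScript as follows. This is a simplified illustration under stated assumptions, not the plugin's implementation: the function name `descriptionSearchText` is hypothetical, it recognises only the two signifiers used in the table (`⏫` and `✅` with a date), and it trims white space at both ends rather than only at the end.

```typescript
// Hypothetical sketch, not the plugin's code. It only knows two signifiers:
// the high-priority emoji and the done-date emoji with its date value.
function descriptionSearchText(taskLine: string, globalFilter: string): string {
    // Drop the list marker and checkbox, e.g. "- [ ] ".
    let text = taskLine.replace(/^\s*- \[.\] /, '');

    // Step 1: remove signifier emojis and their values.
    text = text.replace(/\s*✅ \d{4}-\d{2}-\d{2}/g, '');
    text = text.replace(/\s*⏫/g, '');

    // Step 2: remove the global filter, if one is set (first occurrence only).
    if (globalFilter !== '') {
        text = text.replace(globalFilter, '');
    }

    // Step 3: remove surrounding white space; the remainder is what is searched.
    return text.trim();
}
```

With the three task lines from the table, and a global filter of `''`, `'#task'` and `'global-filter'` respectively, this sketch returns `'Do stuff #tag1 #tag2/sub-tag'` each time.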
157+
143158
### Priority
144159

145160
- `priority is (above|below)? (low|none|medium|high)`
@@ -209,7 +224,7 @@ These filters allow searching for tasks in particular files and sections of file
209224
- `does not include` will match a task that does not have a preceding heading in its file.
210225
- Matches case-insensitive (disregards capitalization).
211226
- `heading (regex matches|regex does not match) /<JavaScript-style Regex>/`
212-
- Whether or not the heading preceding the task includes the given regular expression.
227+
- Whether or not the heading preceding the task includes the given regular expression (case-sensitive by default).
213228
- Always tries to match the closest heading above the task, regardless of heading level.
214229
- `regex does not match` will match a task that does not have a preceding heading in its file.
215230
- Essential reading: [Regular Expression Searches]({{ site.baseurl }}{% link queries/regular-expressions.md %}).

docs/queries/regular-expressions.md

Lines changed: 114 additions & 6 deletions
@@ -78,27 +78,54 @@ description regex matches /pc_abigail|pc_edwina|at_work/i
7878
- Note that `tags` searches do not yet support regex searches.
7979
- As a workaround, search `description` instead.
8080

81+
## Escaping special characters
82+
83+
To search for any of the characters `[ \ ^ $ . | ? * + ( ) /` literally in Tasks, you should put a `\` character before each of them.
84+
85+
This is called 'escaping'. See [Escaping, special characters](https://javascript.info/regexp-escaping).
86+
87+
See the next section for the meaning of some of these characters.
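If the search text comes from elsewhere (a file path, for instance), the escaping can be done mechanically. The sketch below is one possible helper, written in TypeScript; `escapeForTasksRegex` is a hypothetical name, not something Tasks provides, and it escapes exactly the characters listed above.

```typescript
// Hypothetical helper, not part of Tasks: put a backslash before each of the
// characters listed above:  [ \ ^ $ . | ? * + ( ) /
function escapeForTasksRegex(literal: string): string {
    // '$&' in the replacement stands for the character that was matched.
    return literal.replace(/[\[\\^$.|?*+()\/]/g, '\\$&');
}

// Every '/' gains a preceding backslash, giving the text  a\/b\/c\/d
const escapedPath: string = escapeForTasksRegex('a/b/c/d');

// An escaped '.' only matches a literal dot, not "any character".
const literalDot: RegExp = new RegExp(escapeForTasksRegex('a.b'));
```

The escaped text can then be pasted between the slashes of a `regex matches` instruction.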
88+
8189
## Special characters
8290

8391
If using regex searches, it is important to be aware of the available special characters for several reasons:
8492

8593
1. They enable complex queries to be written in simple ways
8694
2. They can cause confusing results or broken searches, if not "escaped" in the search.
8795

88-
Here are a few examples of the [many special characters](https://www.rexegg.com/regex-quickstart.html):
96+
Here are a few examples of the [many special characters](https://javascript.info/regexp-escaping):
8997

9098
- `.` matches any character
91-
- `^` matches the start of the string (but when `[^inside brackets]`, it means "not")
92-
- `$` matches the end of the string
99+
- `[...]` means search for any of the characters in the square brackets.
100+
- For example, `[aeiou]` will match any of an `a`, `e`, `i`, `o` or `u`.
101+
- See [Sets and ranges \[...\]](https://javascript.info/regexp-character-sets-and-ranges)
102+
- Start and end
103+
- `^` matches the start of the string (but when `[^inside brackets]`, it means "not")
104+
- `$` matches the end of the string
105+
- See [Anchors: string start ^ and end $](https://javascript.info/regexp-anchors)
106+
- `|` is an `OR` in regular expressions
107+
- See [Alternation (OR) |](https://javascript.info/regexp-alternation)
93108
- `\` adds special meaning to some characters. For example:
94109
- `\d` matches one digit, from 0 to 9
95110
  - `\D` matches any character that is not a digit
111+
- See [Character classes](https://javascript.info/regexp-character-classes)
112+
113+
For a thorough, clear introduction to all the options, see [Regular expressions](https://javascript.info/regular-expressions) at JavaScript.info.
96114

97115
## Important links
98116

117+
Learning resources:
118+
119+
- [Regular expressions](https://javascript.info/regular-expressions) at JavaScript.info
99120
- [Regex Tutorial](https://regexone.com/)
100121
- [Regex Cheat Sheet](https://www.rexegg.com/regex-quickstart.html)
122+
123+
Online tools for experimenting with - and testing - regular expressions:
124+
101125
- [Regex Testing Tool: regex101](https://regex101.com/): Select the flavor 'ECMAScript (JavaScript)'
126+
127+
Implementation details:
128+
102129
- Implemented using [JavaScript's RegExp implementation](https://developer.mozilla.org/en-US/docs/Web/JavaScript/Guide/Regular_Expressions)
103130
- Supports [JavaScript RegExp Flags](https://developer.mozilla.org/en-US/docs/Web/JavaScript/Guide/Regular_Expressions#advanced_searching_with_flags), although not all of them are relevant in this context.
104131

@@ -109,22 +136,103 @@ Please be aware of the following limitations in Tasks' implementation of regular
109136
- The single error message `Tasks query: cannot parse regex (description); check your leading and trailing slashes for your query` may mean any of:
110137
- The opening or closing `/` is missing from the query.
111138
- The regular expression is not valid, for example `description regex matches /[123/`.
139+
- Logged in [#1038](https://github.com/obsidian-tasks-group/obsidian-tasks/issues/1038) and [#1039](https://github.com/obsidian-tasks-group/obsidian-tasks/issues/1039)
112140
- No error when part of the pattern is lost, for example because unescaped slashes are used inside the pattern.
113141
- For example, `path regex matches /a/b/c/d/` actually searches for `path regex matches /a/`.
114142
- In this case, the query should be `path regex matches /a\/b\/c\/d/`.
143+
- Logged in [#1037](https://github.com/obsidian-tasks-group/obsidian-tasks/issues/1037)
115144
- Illegal flags are ignored.
116145
- For example, the query `description regex matches /CASE/&` should give an error that `&` (and similar) are unrecognised flags.
117146
- The `tag` or `tags` instruction does not yet support regular expression searches.
147+
- Logged in [#1040](https://github.com/obsidian-tasks-group/obsidian-tasks/discussions/1040)
118148
- [Lookahead and Lookbehind](https://www.regular-expressions.info/lookaround.html) searches are untested, and are presumed not to work on Apple mobile devices, or to cause serious performance problems with slow searches.
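To illustrate why the unescaped-slash limitation matters, the TypeScript sketch below compares the intended pattern with `/a/`, which is what the list above says the unescaped query is reduced to. The sample paths are invented, and the code uses plain JavaScript regular expressions rather than any Tasks code.

```typescript
// What the user meant: the folder sequence a/b/c/d, with each slash escaped.
const intendedPattern = /a\/b\/c\/d/;

// What the unescaped query is described as searching for instead.
const truncatedPattern = /a/;

// Invented sample paths, for illustration only.
const paths: string[] = ['a/b/c/d/notes.md', 'archive/todo.md', 'inbox.md'];

// Paths accepted by each pattern.
const intendedHits: string[] = paths.filter((p) => intendedPattern.test(p));
const truncatedHits: string[] = paths.filter((p) => truncatedPattern.test(p));
```

Here the intended pattern accepts only `a/b/c/d/notes.md`, while the truncated one also accepts `archive/todo.md`, simply because that path contains the letter `a`.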
119149

120150
## Regular expression examples
121151

122-
Example searches:
152+
Below are some example regex searches, to give some ideas of what can be done.
153+
154+
There are some more examples in the [Tasks-Demo sample vault](https://github.com/obsidian-tasks-group/obsidian-tasks/tree/main/resources/sample_vaults/Tasks-Demo), in the file [Regular Expression Searches](https://github.com/obsidian-tasks-group/obsidian-tasks/blob/main/resources/sample_vaults/Tasks-Demo/Filters/Regular%20Expression%20Searches.md).
155+
156+
### Searching the start of a field
157+
158+
Find tasks whose description begins with Log, exact capitalisation:
123159

124160
```text
125-
# Find tasks whose description begins with Log, exact capitalisation
126161
description regex matches /^Log/
162+
```
163+
164+
---
127165

128-
# Find tasks whose description begins with Log, ignoring capitalisation
166+
Find tasks whose description begins with Log, ignoring capitalisation:
167+
168+
```text
129169
description regex matches /^Log/i
130170
```
171+
172+
### Finding tasks that are waiting
173+
174+
I want to find tasks that are waiting for something else. But 'waiting' can be spelled in several different ways:
175+
176+
```text
177+
description regex matches /waiting|waits|wartet/i
178+
```
179+
180+
### Finding times
181+
182+
Find tasks containing a time in the description - simple version. This matches invalid times, such as `99:99`, as `\d` means 'any digit'.
183+
184+
```text
185+
description regex matches /\d\d:\d\d/
186+
```
187+
188+
---
189+
190+
Find tasks containing a time in the description. This is more precise than the previous example, thanks to specifying which digits are allowed in each position.
191+
192+
```text
193+
description regex matches /[012][0-9]:[0-5][0-9]/
194+
```
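The difference between the two time patterns can be checked with a small TypeScript sketch. The sample descriptions are invented for illustration.

```typescript
// The two patterns from the examples above.
const simpleTime = /\d\d:\d\d/;
const stricterTime = /[012][0-9]:[0-5][0-9]/;

// Invented sample descriptions.
const candidates: string[] = ['Meet at 09:30', 'Ratio 99:99', 'Depart 23:59', 'Odd 29:00'];

// For each description, record which of the two patterns accept it.
const results = candidates.map((text) => ({
    text,
    simple: simpleTime.test(text),
    stricter: stricterTime.test(text),
}));
```

The simple pattern accepts all four descriptions. The stricter one rejects `99:99` but still accepts `29:00`, because `[012][0-9]` allows any hour from `00` to `29`.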
195+
196+
### Finding sub-tags
197+
198+
Currently `tag` and `tags` searches do not yet support regular expressions. Therefore, for precise searching of tags, use `description` instead.
199+
200+
Suppose you wanted to search for tags of this form: `#tag/subtag3/subsubtag5`, where the `3` and the `5` are allowed to be any single digit.
201+
202+
- We can use either `[0-9]` or `\d` to match a single digit.
203+
- To find a sub-tag, any `/` characters must be 'escaped' to prevent them truncating the rest of the search pattern.
204+
205+
Escaping the `/` leads us to this instruction, which we have made case-insensitive to find capitalised tags too:
206+
207+
```text
208+
description regex matches /#tag\/subtag[0-9]\/subsubtag[0-9]/i
209+
```
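As a quick check, the same pattern can be tried as a JavaScript regular expression. The TypeScript sketch below uses invented sample tags, and also shows the equivalent form with `\d`.

```typescript
// The pattern from the instruction above, and the same pattern using \d.
const subTagPattern = /#tag\/subtag[0-9]\/subsubtag[0-9]/i;
const subTagPatternWithD = /#tag\/subtag\d\/subsubtag\d/i;

// Invented sample descriptions.
const tagSamples: string[] = [
    'Do stuff #tag/subtag3/subsubtag5',  // matches
    'Do stuff #TAG/SubTag1/SubSubTag9',  // matches, thanks to the i flag
    'Do stuff #tag/subtagX/subsubtag5',  // no match: X is not a digit
    'Do stuff #tag/subtag12/subsubtag5', // no match: two digits before the next slash
    'Do stuff #tag/subtag3',             // no match: the sub-sub-tag is missing
];

// One true/false result per sample, for each form of the pattern.
const subTagResults: boolean[] = tagSamples.map((s) => subTagPattern.test(s));
const subTagResultsWithD: boolean[] = tagSamples.map((s) => subTagPatternWithD.test(s));
```

Both forms give the same results for these samples. Note that the pattern is not anchored at the end, so a tag such as `#tag/subtag3/subsubtag55` would also match.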
210+
211+
### Finding short tags
212+
213+
Currently `tag` and `tags` searches do not yet support regular expressions. Therefore, for precise searching of tags, use `description` instead.
214+
215+
Suppose you wanted to search for tasks with a very short tag, `#t`, without matching tags like `#task` and `#t/subtag`.
216+
217+
The most general query is:
218+
219+
```text
220+
(description regex matches /#t\s/i) OR (description regex matches /#t$/i)
221+
```
222+
223+
We have made it case-insensitive to find capitalised tags too.
224+
225+
The Boolean `OR` allows us to search for two different patterns, for a thorough search:
226+
227+
- `description regex matches /#t\s/i`
228+
- Matches `#t` or `#T`, followed by any white-space character. This might be a literal space, or it might be a tab, for example.
229+
- It won't be a newline character though, as task descriptions are always exactly one line, with no newline character at the end.
230+
- For example, this will match:
231+
- `- [ ] #t Do stuff`
232+
- `- [ ] Do #t stuff`
233+
- But it will not match:
234+
- `- [ ] Do stuff #t`
235+
- `description regex matches /#t$/i`
236+
  - Matches `#t` or `#T` at the very end of the task line (after all signifiers and trailing white space have been removed)
237+
- For example, this will match:
238+
- `- [ ] Do stuff #t`
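The combined effect of the two patterns can be sketched in TypeScript. The function name `hasShortTag` is hypothetical, and the inputs stand for the description text after Tasks has removed the checkbox, signifiers and trailing white space.

```typescript
// The two patterns joined by the Boolean OR in the query above.
const tagThenWhiteSpace = /#t\s/i;
const tagAtEnd = /#t$/i;

// Hypothetical helper, for illustration only: true if either pattern matches.
function hasShortTag(description: string): boolean {
    return tagThenWhiteSpace.test(description) || tagAtEnd.test(description);
}

// Invented sample descriptions.
const shortTagSamples: string[] = [
    '#t Do stuff',        // matches the first pattern
    'Do #t stuff',        // matches the first pattern
    'Do stuff #t',        // matches the second pattern
    'Do stuff #T',        // matches the second pattern, thanks to the i flag
    'Do stuff #task',     // no match
    'Do stuff #t/subtag', // no match
];

// One true/false result per sample.
const shortTagResults: boolean[] = shortTagSamples.map(hasShortTag);
```

As plain JavaScript, the two patterns could also be written as the single expression `/#t(\s|$)/i`; whether that form is preferable in a Tasks query is a matter of taste.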

resources/sample_vaults/Tasks-Demo/Filters/Regular Expression Searches.md

Lines changed: 5 additions & 0 deletions
@@ -1,5 +1,10 @@
11
# Regular Expression Searches
22

3+
This file contains a few examples that were useful when testing
4+
the feature to search for text using regular expressions.
5+
6+
Full documentation is available: see [Regular Expressions](https://obsidian-tasks-group.github.io/obsidian-tasks/queries/regular-expressions/).
7+
38
## Sample Tasks
49

510
- [ ] #task Do thing 1 #context/pc_abigail

src/Query/Filter/DescriptionField.ts

Lines changed: 4 additions & 2 deletions
@@ -15,10 +15,12 @@ export class DescriptionField extends TextField {
1515

1616
/**
1717
* Return the task's description, with any global tag removed
18+
*
19+
* Promoted to public, to enable testing.
1820
* @param task
19-
* @protected
21+
* @public
2022
*/
21-
protected value(task: Task): string {
23+
public value(task: Task): string {
2224
// Remove global filter from description match if present.
2325
// This is necessary to match only on the content of the task, not
2426
// the global filter.
