Commit 186f263

Merge pull request #5 from HP4k1h5/v0.0.4
V0.0.4
2 parents 2c5d924 + 47c65b2 commit 186f263

12 files changed

+861
-475
lines changed

.github/CONTRIBUTING.md

Lines changed: 14 additions & 4 deletions
Original file line number | Diff line number | Diff line change
@@ -27,9 +27,19 @@ When contributing code, please:
2727
and should be the only version branch available. Don't hesitate to ask if it
2828
is unclear.
2929
4) make changes
30-
5) add tests; currently using mochai/chai
31-
run tests with e.g. `yarn test tests/glob`
32-
6) add comments to your functions, and if possible, in the
30+
5) add tests; using mocha/chai
31+
run tests with e.g. `yarn test tests/glob`
32+
or
33+
`yarn tests` to run the suite
34+
see [testing](#testing) for more information.
35+
6) add comments, and if possible, in the
3336
[typedoc](https://github.com/TypeStrong/typedoc) style
3437
7) submit a merge request from your forked branch into the
35-
latest HP4k1h5/ephemeris `v.X.X.X` branch.
38+
latest HP4k1h5/AQLqueryBuilder `v.X.X.X` branch.
39+
40+
41+
### testing
42+
43+
all tests that require a live arango instance are run with root:"" no
44+
permissions on `localhost:8529`. This can be modified in test files that
45+
require db access.

.prettierrc

Lines changed: 6 additions & 0 deletions
@@ -0,0 +1,6 @@
1+
{
2+
"trailingComma": "all",
3+
"tabWidth": 2,
4+
"semi": false,
5+
"singleQuote": true,
6+
}

README.md

Lines changed: 75 additions & 33 deletions
@@ -12,43 +12,57 @@ all available Arango Search View capabilities, including, `PHRASE` and
1212
multi-lingual and language-specific, complex phrase, (proximity... TBD) and tokenized
1313
search terms.
1414

15-
For example, passing a search phrase like: `+mandatory -exclude ?"optional
16-
phrase"` to `buildAQL`'s query object as the `term` key, will produce a query
17-
like the following:
18-
19-
```aql
20-
FOR doc IN search_view
21-
SEARCH
22-
MIN_MATCH(
23-
ANALYZER(
24-
TOKENS(@value0, @value1)
25-
ALL IN doc.@value2, @value1),
26-
@value3) OR (MIN_MATCH(
27-
ANALYZER(
28-
TOKENS(@value0, @value1)
29-
ALL IN doc.@value2, @value1),
30-
@value3) AND (PHRASE(doc.@value2, @value4, @value1)))
31-
32-
AND
33-
34-
MIN_MATCH(
35-
ANALYZER(
36-
TOKENS(@value5, @value1)
37-
NONE IN doc.@value2, @value1),
38-
@value3)
39-
40-
OPTIONS @value6
41-
SORT TFIDF(doc) DESC
42-
43-
LIMIT @value7, @value8
44-
RETURN doc
15+
For example, passing a search phrase like: `some +words -not +"phrase search"
16+
-"not these" ?"could have"` to `buildAQL`'s query object as the `term` key,
17+
will produce a query like the following:
18+
19+
```c
20+
FOR doc IN view
21+
22+
SEARCH
23+
(PHRASE(doc.text, "phrase search", analyzer)) AND MIN_MATCH(
24+
ANALYZER(
25+
TOKENS("words", analyzer)
26+
ALL IN doc.text, analyzer),
27+
1) OR ((PHRASE(doc.text, "phrase search", analyzer)) AND MIN_MATCH(
28+
ANALYZER(
29+
TOKENS("words", analyzer)
30+
ALL IN doc.text, analyzer),
31+
1) AND (PHRASE(doc.text, "could have", analyzer)) OR MIN_MATCH(
32+
ANALYZER(
33+
TOKENS(other, analyzer)
34+
ANY IN doc.text, analyzer),
35+
1))
36+
37+
AND
38+
NOT (PHRASE(doc.text, "not these", analyzer))
39+
AND MIN_MATCH(
40+
ANALYZER(
41+
TOKENS("nor", analyzer)
42+
NONE IN doc.text, analyzer),
43+
1)
44+
45+
OPTIONS {"collections": ["col"]}
46+
SORT TFIDF(doc) DESC
47+
48+
LIMIT "phrase search"0, "phrase search"1
49+
RETURN doc
4550
```
51+
n.b. the above code block is styled with c but is .aql compatible.
52+
4653
This query will retrieve all documents that __include__ the term "mandatory"
4754
AND __do not include__ the term "exclude", AND whose ranking will be boosted by the
4855
presence of the phrase "optional phrase". If no mandatory or exclude terms are
4956
provided, optional terms are considered required, so as not to retrieve all
5057
documents.
5158
59+
See [default query syntax](#default-query-syntax) and this schematic
60+
[example](#example) for more details.
61+
62+
If multiple collections are passed, the above query is essentially
63+
replicated across all collections, see examples in 'tests/cols.ts'. In the
64+
future this will also accommodate multiple key searches.
65+
5266
## setup
5367
5468
1) running generated AQL queries will require a working arangodb instance. In
@@ -107,7 +121,7 @@ const queryObject = {
107121
}
108122
const aqlQuery = buildAQL(queryObject)
109123
// ... const cursor = await db.query(aqlQuery)
110-
// ... const cursor = await db.query(aqlQuery, {start:20, end:40})
124+
// ... const cursor = await db.query(buildAQL(queryObject, {start:20, end:40}))
111125
```
112126
`collections` is an array of `collection` objects. This allows searching and
113127
filtering across collections impacted by the search.
@@ -159,24 +173,33 @@ Example:
159173
```
160174

161175
### boolean search logic
176+
162177
Quoting [mit's Database Search Tips](https://libguides.mit.edu/c.php?g=175963&p=1158594):
178+
163179
> Boolean operators form the basis of mathematical sets and database logic.
164180
They connect your search words together to either narrow or broaden your
165181
set of results. The three basic boolean operators are: AND, OR, and NOT.
166182

167183
#### `+` AND
184+
168185
* Mandatory terms and phrases. All results MUST INCLUDE these terms and
169186
phrases.
187+
170188
#### `?` OR
189+
171190
* Optional terms and phrases. If there are ANDS or NOTS, these serve as match
172191
score "boosters". If there are no ANDS or NOTS, ORS become required in
173192
results.
193+
174194
#### `-` NOT
195+
175196
* Search results MUST NOT INCLUDE these terms and phrases. If a result that
176197
would otherwise have matched, contains one or more terms or phrases, it will
177-
not be included in the result set.
198+
not be included in the result set. If there are no required or optional
199+
terms, all results that do NOT match these terms will be returned.
178200

179201
### default query syntax
202+
180203
for more information on boolean search logic see
181204
[above](#boolean-search-logic)
182205

@@ -190,7 +213,7 @@ by one of the following symbols `+ ? -`, or the plus-sign, the question-mark,
190213
and the minus-sign. If a word has no operator prefix, it is considered
191214
optional and is counted as an `OR`.
192215

193-
Example:
216+
#### Example
194217
input `one +two -"buckle my shoe"` and the queryParser will interpret as
195218
follows:
196219

@@ -204,7 +227,26 @@ The generated AQL query, when run, will bring back only results that contain
204227
contain "one". In this case, documents that contain "one" will be likely to
205228
score higher than those that do not.
206229

230+
When the above phrase `one +two -"buckle my shoe"` is run against the
231+
following documents:
232+
233+
```boxcar
234+
┏━━━━━━━━━━━━━━━━━━┓ ┏━━━━━━━━━━━━━━━━━━┓ ┏━━━━━━━━━━━━━━━━━━┓
235+
┃ Document A ┃ ┃ Document B ┃ ┃ Document C ┃
236+
┃ ---------- ┃ ┃ ┃ ┃ ┃
237+
┃ ┃ ┃ three four ┃ ┃ one ┃
238+
┃ one two ┃ ┃ ┃ ┃ ┃
239+
┃ ┃ ┃ and two ┃ ┃ ┃
240+
┃ buckle my shoe┃ ┃ ┃ ┃ ┃
241+
┗━━━━━━━━━━━━━━━━━━┛ ┗━━━━━━━━━━━━━━━━━━┛ ┗━━━━━━━━━━━━━━━━━━┛
242+
```
243+
244+
only Document B is returned;
245+
Document A is excluded by the phrase "buckle my shoe"
246+
Document C does not contain the mandatory word "two"
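
The outcome above can be reproduced with a small in-memory sketch of the boolean rules described in this README. It is not the library's implementation: the real matching is done by ArangoSearch analyzers inside the generated AQL, whereas this sketch assumes plain substring matching. The names `Term`, `wouldMatch`, `sampleTerms` and `sampleDocs` are hypothetical and used only for illustration.

```typescript
type Op = '+' | '?' | '-'

// Hypothetical, simplified term shape (phrase values are stored unquoted here).
interface Term {
  op: Op
  val: string
}

// Assumption: substring matching stands in for analyzer-based matching.
function wouldMatch(text: string, terms: Term[]): boolean {
  const has = (t: Term): boolean => text.indexOf(t.val) !== -1
  const ands = terms.filter((t) => t.op === '+')
  const ors = terms.filter((t) => t.op === '?')
  const nots = terms.filter((t) => t.op === '-')

  // NOT: any excluded term or phrase rejects the document.
  if (nots.some(has)) return false
  // AND: every mandatory term must be present; ORs then only boost ranking.
  if (ands.length > 0) return ands.every(has)
  // With no ANDs and no NOTs, ORs become required.
  if (nots.length === 0 && ors.length > 0) return ors.some(has)
  // Only NOTs (or no terms at all): everything not excluded is returned.
  return true
}

// one +two -"buckle my shoe"
const sampleTerms: Term[] = [
  { op: '?', val: 'one' },
  { op: '+', val: 'two' },
  { op: '-', val: 'buckle my shoe' },
]

const sampleDocs = [
  { name: 'A', text: 'one two buckle my shoe' },
  { name: 'B', text: 'three four and two' },
  { name: 'C', text: 'one' },
]

const returned = sampleDocs
  .filter((d) => wouldMatch(d.text, sampleTerms))
  .map((d) => d.name)

console.log(returned) // [ 'B' ]
```

The sketch only decides inclusion. Ranking (the "booster" effect of optional terms) is left to `SORT TFIDF(doc) DESC` in the generated query and is not modelled here.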
247+
207248
## bugs
208249
please see [bugs](https://github.com/HP4k1h5/AQLqueryBuilder.js/issues/new?assignees=HP4k1h5&labels=bug&template=bug_report.md&title=basic)
209250
## contributing
210251
please see [./.github/CONTRIBUTING.md]
252+

package.json

Lines changed: 1 addition & 1 deletion
@@ -1,6 +1,6 @@
11
{
22
"name": "@hp4k1h5/aqlquerybuilder.js",
3-
"version": "0.0.3",
3+
"version": "0.0.4",
44
"license": "MIT",
55
"main": "./built/index.d.ts",
66
"scripts": {

src/index.ts

Lines changed: 10 additions & 4 deletions
@@ -4,9 +4,13 @@ import { buildSearch } from './search'
44
import { buildFilters } from './filter'
55

66
/** @returns an AQL query object. See @param query for details on required
7-
* values. @parm query .terms accepts either a string to be parsed or an array of @param term
7+
* values. @parm query .terms accepts either a string to be parsed or an array
8+
* of @param term
89
* */
9-
export function buildAQL(query: query, limit: any = { start: 0, end: 20 }): any {
10+
export function buildAQL(
11+
query: query,
12+
limit: any = { start: 0, end: 20 },
13+
): any {
1014
validateQuery(query)
1115

1216
const SEARCH = buildSearch(query)
@@ -22,6 +26,8 @@ export function buildAQL(query: query, limit: any = { start: 0, end: 20 }): any
2226
}
2327

2428
function validateQuery(query: query) {
25-
if (!query.view.length) throw Error('query.view must be a valid ArangoSearch View name')
26-
if (!query.collections.length) throw new Error('query.collections must have at least one name')
29+
if (!query.view.length)
30+
throw new Error('query.view must be a valid ArangoSearch View name')
31+
if (!query.collections.length)
32+
throw new Error('query.collections must have at least one name')
2733
}

src/parse.ts

Lines changed: 4 additions & 4 deletions
@@ -1,4 +1,4 @@
1-
import {term} from './lib/structs'
1+
import { term } from './lib/structs'
22

33
export function parseQuery(queryString: string): term[] {
44
const queryRgx: RegExp = /[+?-]?(["'(]).+?(\1|\))|[^"'()\s]+/g
@@ -9,12 +9,12 @@ export function parseQuery(queryString: string): term[] {
99
return matches.map(match => {
1010
/* strip op */
1111
let op = '?'
12-
if (/[+?-]/.test(match[0])) {
13-
op = match[0]
12+
if (/[+?-]/.test(match[ 0 ])) {
13+
op = match[ 0 ]
1414
match = match.substring(1)
1515
}
1616

17-
if (match[0] == '"' || match[0] == "'") {
17+
if (match[ 0 ] == '"' || match[ 0 ] == "'") {
1818
return {
1919
type: 'phr',
2020
val: match,
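
For readers following the hunk above, here is a self-contained sketch of what the parser appears to do with the README example input. The regular expression and the operator-stripping step are copied from the diff; everything the diff does not show is an assumption, namely the `Term` shape (the real `term` type lives in `./lib/structs`), the `op` field on the returned object, and the token branch. The function is named `parseQuerySketch` to make clear it is not the library's `parseQuery`.

```typescript
// Hypothetical stand-in for the `term` type from './lib/structs'.
interface Term {
  type: 'phr' | 'tok'
  val: string
  op: string
}

function parseQuerySketch(queryString: string): Term[] {
  // Regex as shown in the diff: an optional operator followed by a quoted or
  // parenthesised group, or otherwise a run of non-space, non-quote characters.
  const queryRgx = /[+?-]?(["'(]).+?(\1|\))|[^"'()\s]+/g
  const matches: string[] = queryString.match(queryRgx) ?? []

  return matches.map((raw): Term => {
    let match = raw
    /* strip op; terms without a prefix default to '?' (optional) */
    let op = '?'
    if (/[+?-]/.test(match.charAt(0))) {
      op = match.charAt(0)
      match = match.substring(1)
    }

    if (match.charAt(0) == '"' || match.charAt(0) == "'") {
      // Phrases keep their quotes here; buildPhrase strips them with slice(1, -1).
      return { type: 'phr', val: match, op }
    }
    // Assumption: everything else is treated as a token.
    return { type: 'tok', val: match, op }
  })
}

const parsed = parseQuerySketch('one +two -"buckle my shoe"')
console.log(parsed)
// [ { type: 'tok', val: 'one', op: '?' },
//   { type: 'tok', val: 'two', op: '+' },
//   { type: 'phr', val: '"buckle my shoe"', op: '-' } ]
```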

src/search.ts

Lines changed: 49 additions & 23 deletions
@@ -2,12 +2,10 @@ import { aql } from 'arangojs'
22
import { query, collection, term } from './lib/structs'
33
import { parseQuery } from './parse'
44

5-
65
export function buildSearch(query: query): any {
76
/* parse string query */
8-
query.terms = typeof query.terms == 'string'
9-
? parseQuery(query.terms)
10-
: query.terms
7+
query.terms =
8+
typeof query.terms == 'string' ? parseQuery(query.terms) : query.terms
119

1210
/* build boolean pieces */
1311
let ANDS = buildOPS(query.collections, query.terms, '+', query.key)
@@ -22,7 +20,9 @@ export function buildSearch(query: query): any {
2220
if (!!NOTS) {
2321
NOTS = aql`${ANDS || ORS ? aql.literal(' AND ') : undefined}
2422
${NOTS.phrases ? aql.literal(' NOT ') : undefined} ${NOTS.phrases}
25-
${NOTS.phrases && NOTS.tokens ? aql.literal(' AND ') : undefined} ${NOTS.tokens}`
23+
${NOTS.phrases && NOTS.tokens ? aql.literal(' AND ') : undefined} ${
24+
NOTS.tokens
25+
}`
2626
}
2727

2828
/* if an empty query.terms string or array is passed, SEARCH true, bringing
@@ -33,25 +33,24 @@ export function buildSearch(query: query): any {
3333
${ORS}
3434
${NOTS}
3535
${(!ANDS && !ORS && !NOTS) || undefined}
36-
OPTIONS ${{ collections: query.collections.map(c => c.name) }}
36+
OPTIONS ${{ collections: query.collections.map((c) => c.name) }}
3737
SORT TFIDF(doc) DESC`
3838
}
3939

40-
function buildOPS(collections: collection[], terms: term[], op: string, key:
41-
string = 'text'): any {
40+
function buildOPS(
41+
collections: collection[],
42+
terms: term[],
43+
op: string,
44+
key: string = 'text',
45+
): any {
4246
const opWord: string = op == '+' ? ' AND ' : ' OR '
4347

4448
let queryTerms: any = terms.filter((t: term) => t.op == op)
4549
if (!queryTerms.length) return
4650

4751
/* phrases */
4852
let phrases = queryTerms.filter((qT: term) => qT.type == 'phr')
49-
.map((phrase: any) => buildPhrase(phrase, collections, key))
50-
if (!phrases.length) {
51-
phrases = undefined
52-
} else {
53-
phrases = aql.join(phrases, opWord)
54-
}
53+
phrases = buildPhrases(phrases, collections, key, opWord)
5554

5655
/* tokens */
5756
let tokens = queryTerms.filter((qT: { type: string }) => qT.type === 'tok')
@@ -60,17 +59,39 @@ function buildOPS(collections: collection[], terms: term[], op: string, key:
6059
if (!phrases && !tokens) return
6160
if (op == '-') return { phrases, tokens }
6261
if (phrases && tokens) return aql.join([ phrases, tokens ], opWord)
63-
return (tokens || phrases)
62+
return tokens || phrases
63+
}
64+
65+
function buildPhrases(
66+
phrases: term[],
67+
collections: collection[],
68+
key: string,
69+
opWord: string,
70+
): any {
71+
if (!phrases.length) return undefined
72+
73+
return aql.join(
74+
phrases.map((phrase: any) => buildPhrase(phrase, collections, key)),
75+
opWord,
76+
)
6477
}
6578

66-
function buildPhrase(phrase: term, collections: collection[], key: string): any {
67-
const phrases = collections.map(coll => {
79+
function buildPhrase(
80+
phrase: term,
81+
collections: collection[],
82+
key: string,
83+
): any {
84+
const phrases = collections.map((coll) => {
6885
return aql`PHRASE(doc.${key}, ${phrase.val.slice(1, -1)}, ${coll.analyzer})`
6986
})
7087
return aql`(${aql.join(phrases, ' OR ')})`
7188
}
7289

73-
function buildTokens(tokens: term[], collections: collection[], key: string): any {
90+
function buildTokens(
91+
tokens: term[],
92+
collections: collection[],
93+
key: string,
94+
): any {
7495
if (!tokens.length) return
7596

7697
const opWordMap = {
@@ -85,19 +106,24 @@ function buildTokens(tokens: term[], collections: collection[], key: string): an
85106
return a
86107
}, {})
87108

88-
const makeTokenAnalyzers = (tokens: term[], op: string, analyzer: string,
89-
key: string) => {
109+
const makeTokenAnalyzers = (
110+
tokens: term[],
111+
op: string,
112+
analyzer: string,
113+
key: string,
114+
) => {
90115
return aql`
91116
ANALYZER(
92117
TOKENS(${tokens}, ${analyzer})
93118
${aql.literal(op)} IN doc.${key}, ${analyzer})`
94119
}
95120

96121
let remapped = []
97-
collections.forEach(coll => {
122+
collections.forEach((coll) => {
98123
remapped.push(
99-
...Object.keys(mapped).map(op => makeTokenAnalyzers(mapped[ op ], op,
100-
coll.analyzer, key))
124+
...Object.keys(mapped).map((op) =>
125+
makeTokenAnalyzers(mapped[ op ], op, coll.analyzer, key),
126+
),
101127
)
102128
})
103129
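
The token-building code above groups tokens by operator and emits one `ANALYZER(TOKENS(...) <op> IN ...)` clause per group. The sketch below illustrates that grouping with plain strings instead of the `aql` template tag. Several things are assumptions, because the diff elides them: the operator mapping (`+` to `ALL`, `?` to `ANY`, `-` to `NONE`) is inferred from the README sample query, token values are joined with a space, and `Token`, `opWordMap` and `renderTokenClauses` are illustrative names. Real queries should keep using `aql` bind parameters rather than string interpolation.

```typescript
type Op = '+' | '?' | '-'

// Hypothetical, simplified token shape.
interface Token {
  op: Op
  val: string
}

// Assumed mapping, inferred from the README sample query (ALL / ANY / NONE).
const opWordMap: Record<Op, string> = { '+': 'ALL', '?': 'ANY', '-': 'NONE' }

function renderTokenClauses(
  tokens: Token[],
  analyzer: string,
  key: string = 'text',
): string[] {
  const ops: Op[] = ['+', '?', '-']
  return ops
    .map((op) => ({
      word: opWordMap[op],
      vals: tokens.filter((t) => t.op === op).map((t) => t.val),
    }))
    // skip operators that have no tokens
    .filter((group) => group.vals.length > 0)
    .map(
      (group) =>
        `ANALYZER(TOKENS(${JSON.stringify(group.vals.join(' '))}, "${analyzer}") ` +
        `${group.word} IN doc.${key}, "${analyzer}")`,
    )
}

const clauses = renderTokenClauses(
  [
    { op: '?', val: 'one' },
    { op: '+', val: 'two' },
  ],
  'text_en',
)
console.log(clauses.join('\n'))
// ANALYZER(TOKENS("two", "text_en") ALL IN doc.text, "text_en")
// ANALYZER(TOKENS("one", "text_en") ANY IN doc.text, "text_en")
```

In the library each clause is produced once per collection, using that collection's analyzer, which is why the README query repeats when multiple collections are passed.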
