Commit bd246e0

fix: missing dot . in PHRASE searches
1 parent 8c4aecf commit bd246e0

5 files changed: +111 -52 lines changed

README.md

Lines changed: 41 additions & 38 deletions
@@ -4,16 +4,18 @@
 ##### !! warning !! experimental and unstable
 
 ## overview
-ArangoSearch provides a high-level API for interacting with Arango Search Views
+ArangoSearch provides a low-level API for interacting with Arango Search Views
 through the Arango Query Language (AQL). This library aims to provide a query
 parser and AQL query builder to enable full boolean search operations across
 all available Arango Search View capabilities, including, `PHRASE` and
 `TOKENS` operations. With minimal syntax overhead the user can generate
-multi-lingual and language-specific, complex phrase, proximity and tokenized
+multi-lingual and language-specific, complex phrase, (proximity... TBD) and tokenized
 search terms.
 
 For example, passing a search phrase like: `+mandatory -exclude ?"optional
-phrase"` to `buildAQL`, will produce the following query:
+phrase"` to `buildAQL`'s query object as the `term` key, will produce the
+following query:
+
 ```aql
 FOR doc IN search_view
 
@@ -26,7 +28,7 @@ FOR doc IN search_view
 ANALYZER(
 TOKENS(@value0, @value1)
 ALL IN doc.@value2, @value1),
-@value3) AND @value4)
+@value3) AND @value4) <--------------- @value4 here is the PHRASE search
 
 AND
 
@@ -50,7 +52,7 @@ documents.
 
 ## setup
 
-1) running generated AQL queries will require a working arangodb instance. in
+1) running generated AQL queries will require a working arangodb instance. In
 the future, it is hoped that this package can be imported and used in the
 `arangosh`, as well as client and server side. Currently there is only limited
 support for server-side use.
@@ -61,39 +63,42 @@ support for server-side use.
 !! significantly in the short-term
 1) clone this repository in your es6 compatible project.
 2) run `yarn install` from the project directory.
+3) unless you are importing into a typescript project, i believe you will have
+to run `yarn tsc` from the project directory, and possibly change the compile
+target in `tsconfig.json`
 
 ## usage
-for better documentation, run `yarn doc && serve docs/` from the project
-directory root.
+__for up-to-date documentation, run `yarn doc && serve docs/` from the project
+directory root.__
 
-AQLqueryBuilder aims to provide collection-agnostic and language-agnostic
-boolean search capabilities to the library's user. Currently, this library
-makes a number of assumptions about the way your data is stored and indexed,
-but these are hopefully compatible with a wide range of setups.
+AQLqueryBuilder aims to provide cross-collection and cross-language boolean
+search capabilities to the library's user. Currently, this library makes a
+number of assumptions about the way your data is stored and indexed, but these
+are hopefully compatible with a wide range of setups.
 
 The primary assumption this library makes is that the data you are trying to
 query against is indexed by an ArangoSearch View, and that all documents index
-the same exact field. This field can be indexed by any number of analyzers,
-and the search be will run against all supplied collections simultaneously. This
-allows for true multi-language search provided that each collection is
-restricted to just one language and all documents index the same key as all
-other documents in the view. While there are plans to expand on this
-functionality to provide multi-key search, this library is primarily built for
-academic and textual searches, and is ideally suited for documents like books,
-articles, and other media where most of the data resides in a single place.
+the same exact field. This field, passed to the builder as a key on the
+`query` object passed to e.g. `buildAQL()`, can be indexed by any number of
+analyzers, and the query will target all supplied collections simultaneously.
+This allows for true multi-language search provided all documents index the
+same key as all other documents in the view. While there are plans to expand
+on this functionality to provide multi-key search, this library is primarily
+built for academic and textual searches, and is ideally suited for documents
+like books, articles, and other media where most of the data resides in a
+single place, i.e. document `key`, or `field`.
 
 This works best as a document query tool. Leveraging ArangoSearch's built-in
 language stemming analyzers allows for complex search phrases to be run
 against any number of language-specific collections simultaneously.
 
-For an example of a multi-lingual document ingest/parser, please see
+For an example of a multi-lingual document ingest/parser/indexer, please see
 [ptolemy's curator](https://gitlab.com/HP4k1h5/nineveh/-/tree/master/ptolemy/dimitri/curator.js)
 
 __Example:__
 ```javascript
 import {buildAQL} from 'path/to/AQLqueryBuilder'
-const queryObject =
-{
+const queryObject = {
   "view": "the_arango-search_view-name",
   "collections": [{
     "name": "collection_name",
@@ -103,6 +108,7 @@ const queryObject =
 }
 const aqlQuery = buildAQL(queryObject)
 // ... const cursor = await db.query(aqlQuery)
+// ... const cursor = await db.query(aqlQuery, {start:20, end:40})
 ```
 `collections` is an array of `collection` objects. This allows searching and
 filtering across collections impacted by the search.
@@ -149,12 +155,7 @@ Example:
       "op": ">",
       "val": 0
     }
-  ],
-  "limit":
-  {
-    "start": 0,
-    "end": 20,
-  }
+  ]
 }
 ```
 
@@ -166,15 +167,15 @@ Quoting [mit's Database Search Tips](https://libguides.mit.edu/c.php?g=175963&p=
 
 #### `+` AND
 * Mandatory terms and phrases. All results MUST INCLUDE these terms and
-* phrases.
+phrases.
 #### `?` OR
-* Optional terms and phrases. If there are ANDS or NOTS, these serve as
-* match score "boosters". If there are no ANDS or NOTS, ORS become required
-* in results.
+* Optional terms and phrases. If there are ANDS or NOTS, these serve as match
+score "boosters". If there are no ANDS or NOTS, ORS become required in
+results.
 #### `-` NOT
 * Search results MUST NOT INCLUDE these terms and phrases. If a result that
-* would otherwise have matched, contains one or more terms or phrases, it
-* will not be included in the result set.
+would otherwise have matched, contains one or more terms or phrases, it will
+not be included in the result set.
 
 ### default query syntax
 for more information on boolean search logic see
@@ -199,10 +200,12 @@ follows:
 | PHRASE | | | "buckle my shoe" |
 | TOKENS | two | one | |
 
-The generated AQL query, when run will bring back only results that contain
-"two", that do not contain variations on the phrase "buckle my shoe", and that
-optionally contain "one". In this case, documents that contain "one" will be
-likely to score higher than those that do not.
+The generated AQL query, when run, will bring back only results that contain
+"two", that do not contain the phrase "buckle my shoe", and that optionally
+contain "one". In this case, documents that contain "one" will be likely to
+score higher than those that do not.
 
 ## bugs
+plase see [bugs](https://github.com/HP4k1h5/AQLqueryBuilder.js/issues/new?assignees=HP4k1h5&labels=bug&template=bug_report.md&title=basic)
 ## contributing
+plase see [./.github/CONTRIBUTING.md]
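
To make the `+` / `?` / `-` syntax described in the README concrete, here is a minimal sketch of how a search string could be split into `term`-shaped objects. The `parseTerms` helper is hypothetical and is not the library's `parseQuery()`; treating unprefixed words as `?` (OR) is an assumption of this sketch. Phrases keep their surrounding quotes in `val`, which is consistent with the `phrase.val.slice(1, -1)` call in `src/search.ts`.

```typescript
// Hypothetical sketch only: this is NOT the library's parseQuery().
interface Term {
  type: 'phr' | 'tok'   // PHRASE or TOKENS
  val: string           // phrases keep their double quotes
  op: '+' | '?' | '-'   // ANDS, ORS, NOTS
}

function parseTerms(input: string): Term[] {
  // an optional operator, then either a double-quoted phrase or a bare word
  const pattern = /([+?-])?("[^"]*"|\S+)/g
  const terms: Term[] = []
  let match: RegExpExecArray | null
  while ((match = pattern.exec(input)) !== null) {
    // assumption: a term without an operator is treated as optional ('?')
    const op = (match[1] ?? '?') as Term['op']
    const val = match[2]
    terms.push({ type: val.charAt(0) === '"' ? 'phr' : 'tok', val, op })
  }
  return terms
}

// the README's example grid: "two" is mandatory, "one" is optional,
// and the phrase "buckle my shoe" is excluded
const parsed = parseTerms('+two ?one -"buckle my shoe"')
```

Each resulting object lands in exactly one cell of the PHRASE/TOKENS by ANDS/ORS/NOTS table shown in the README.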

src/index.ts

Lines changed: 3 additions & 1 deletion
@@ -3,6 +3,9 @@ import { query } from './lib/structs'
 import { buildSearch } from './search'
 import { buildFilters } from './filter'
 
+/** @returns an AQL query object. See @param query for details on required
+ * values. @parm query .terms accepts either a string to be parsed or an array of @param term
+ * */
 export function buildAQL(query: query, limit: any = { start: 0, end: 20 }): any {
   validateQuery(query)
 
@@ -17,7 +20,6 @@ export function buildAQL(query: query, limit: any = { start: 0, end: 20 }): any
     LIMIT ${limit.start}, ${limit.end}
     RETURN doc`
 }
-exports.buildAQL = buildAQL
 
 function validateQuery(query: query) {
   if (!query.view.length) throw Error('query.view must be a valid ArangoSearch View name')
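
One detail worth keeping in mind when reading the `LIMIT ${limit.start}, ${limit.end}` line: AQL's `LIMIT` takes an offset and a count, so the second value acts as a maximum number of results rather than an end index. If an exclusive end index were intended, the conversion could look like the following sketch. `toOffsetCount` is a hypothetical helper and is not part of this commit.

```typescript
// Hypothetical helper, not part of the library.
interface Limit {
  start: number
  end: number
}

function toOffsetCount(limit: Limit): { offset: number; count: number } {
  if (limit.start < 0 || limit.end < limit.start) {
    throw new RangeError('limit must satisfy 0 <= start <= end')
  }
  // AQL: LIMIT offset, count
  return { offset: limit.start, count: limit.end - limit.start }
}

const page = toOffsetCount({ start: 20, end: 40 })
const clause = `LIMIT ${page.offset}, ${page.count}`
// clause is "LIMIT 20, 20": skip 20 results, then return at most 20
```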

src/lib/structs.ts

Lines changed: 51 additions & 6 deletions
@@ -1,27 +1,43 @@
+/**
+ * passed to buildAQL, i.e. `let generatedAQL = buildAQL(query)`. Properties
+ * mimic or match those familiar to AQL.
+ * */
 export interface query {
   /**
-   * the name of the ArangoSearch view the query will be run against
+   * the name of the ArangoSearch View the query will be run against
    * */
   view: string,
   /**
-   * the names of the collections indexed by @view to query
+   * the names of the collections indexed by @param view to query
    * */
   collections: collection[],
   /**
-   * either an array of @term interfaces or a string to be parsed by @parseQuery
+   * either an array of @param term interfaces or a string to be parsed by `parseQuery()`
    * */
   terms: term[] | string,
   /**
-   * the name of the document key to search, must be the same across all
-   * documents
+   * the name of the document key to search, currently must be the same across
+   * all documents. @default "text"
    * */
   key?: string,
   /**
-   * a list of @filter interfaces
+   * a list of @filter interfaces. All filters are implicitly AND'ed together.
    * */
   filters?: filter[],
 }
 
+/**
+ * Each collection referenced by the ArangoSearch that the user wishes to
+ * include in the query must be listed as a collection of the following shape.
+ *
+ * A collection can be referenced by several analyzers and each must have its
+ * own entry in `query.collections` in order to be included in the search.
+ *
+ * Alternatively, a document can be stored in several collections.
+ *
+ * In either case all desired collection/analyzer combinations must be
+ * specified.
+ * */
 export interface collection {
   /**
    * the name of the collection
@@ -33,14 +49,43 @@ export interface collection {
   analyzer: string,
 }
 
+/**
+ * A piece of search query text. Can be a phrase or an individual word, and will
+ * belong to one cell or other of the following grid:
+ * ```
+ *                     **ops**
+ *           |        | ANDS | ORS | NOTS  |
+ *           | -----  | ---- | --- | ----- |
+ * **types** | PHRASE |      |     |       |
+ *           | TOKENS |      |     |       |
+ *           | PROXIM | TODO | TODO| TODO  |
+ * ```
+ * */
 export interface term {
+  /**
+   * must be one of [ 'phr', 'tok' ], corresponding to `PHRASE` and
+   * `TOKENS` respectively.
+   **/
   type: string,
+  /**
+   * the search string
+   * */
   val: string,
+  /**
+   * must be one of [ '+', '?', '-' ] corresponding to `ANDS`, `ORS`, and
+   * `NOTS`, respectively.
+   * */
   op: string,
 }
 
+/**
+ * passed to AQL `FILTER`
+ * */
 export interface filter {
+  /** the arango document field name to filter on */
   field: string,
+  /** the high-level operator to filter by */
   op: string,
+  /** the query string to filter with */
   val: string | number | Date,
 }
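
As an illustration of the shapes documented in `src/lib/structs.ts`, the sketch below builds a `query` with an explicit `term[]` array instead of a string. The interfaces are re-declared locally (capitalised) only to keep the block self-contained; the view, collection, and filter field names are placeholders, and `checkTerm` is a hypothetical helper, not something the library exports. Keeping the quotes inside the phrase `val` is an assumption based on the `slice(1, -1)` call in `buildPhrase`.

```typescript
// Local copies of the documented shapes, so the sketch compiles on its own.
interface Collection { name: string; analyzer: string }
interface Term { type: string; val: string; op: string }
interface Filter { field: string; op: string; val: string | number | Date }
interface Query {
  view: string
  collections: Collection[]
  terms: Term[] | string
  key?: string
  filters?: Filter[]
}

const TERM_TYPES = ['phr', 'tok'] // PHRASE, TOKENS
const TERM_OPS = ['+', '?', '-']  // ANDS, ORS, NOTS

// Hypothetical check mirroring the constraints stated in the doc comments.
function checkTerm(term: Term): boolean {
  return TERM_TYPES.indexOf(term.type) !== -1 && TERM_OPS.indexOf(term.op) !== -1
}

const exampleQuery: Query = {
  view: 'search_view',
  collections: [
    // one entry per collection/analyzer combination
    { name: 'coll', analyzer: 'text_en' },
    { name: 'coll', analyzer: 'text_fr' },
  ],
  terms: [
    { type: 'tok', val: 'two', op: '+' },
    { type: 'tok', val: 'one', op: '?' },
    { type: 'phr', val: '"buckle my shoe"', op: '-' },
  ],
  key: 'text',
  filters: [{ field: 'pub_year', op: '>', val: 0 }], // placeholder field name
}

const allTermsValid = (exampleQuery.terms as Term[]).every(checkTerm)
```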

src/search.ts

Lines changed: 1 addition & 1 deletion
@@ -65,7 +65,7 @@ function buildOPS(collections: collection[], terms: term[], op: string, key:
 
 function buildPhrase(phrase: term, collections: collection[], key: string): any {
   return collections.map(coll => {
-    return aql`PHRASE(doc${key}, ${phrase.val.slice(1, -1)}, ${coll.analyzer})`
+    return aql`PHRASE(doc.${key}, ${phrase.val.slice(1, -1)}, ${coll.analyzer})`
   })
 }
 
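
The one-character fix is easier to see with the interpolation spelled out. The sketch below uses `miniAql`, a hypothetical stand-in that numbers bind parameters in the way the test expectation `PHRASE(doc.@value0, @value1, @value2)` suggests. It is not arangojs's `aql` tag, only an approximation for illustration.

```typescript
// Hypothetical stand-in for a bind-parameter template tag (NOT arangojs's aql).
function miniAql(strings: TemplateStringsArray, ...values: unknown[]) {
  const bindVars: Record<string, unknown> = {}
  let query = strings[0]
  values.forEach((value, i) => {
    bindVars[`value${i}`] = value
    // every interpolated value is replaced by a numbered placeholder
    query += `@value${i}` + strings[i + 1]
  })
  return { query, bindVars }
}

const key = 'text'
const phrase = 'optional phrase'
const analyzer = 'text_en'

// before the fix: no dot between `doc` and the interpolated key
const before = miniAql`PHRASE(doc${key}, ${phrase}, ${analyzer})`
// before.query === 'PHRASE(doc@value0, @value1, @value2)'

// after the fix: the key is an attribute access on `doc`
const after = miniAql`PHRASE(doc.${key}, ${phrase}, ${analyzer})`
// after.query === 'PHRASE(doc.@value0, @value1, @value2)'
```

Without the dot, the generated text `doc@value0` is not an attribute access on `doc`, so the `PHRASE` call cannot reference the intended field.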

tests/search.ts

Lines changed: 15 additions & 6 deletions
@@ -6,22 +6,31 @@ describe('search.js', () => {
     expect(buildSearch).to.be.a('function')
   })
 
-  it('should return SEARCH true when terms: is an empty string', () => {
-    const query = { view: 'search_view', collections: [ { name: 'coll', analyzer: 'text_en' } ], terms: '' }
-    const builtSearch = buildSearch(query)
+  it(`should return SEARCH true
+    when terms: is an empty string or array`, () => {
+
+    /* empty string */
+    let empty_string_query = { view: 'search_view', collections: [ { name: 'coll', analyzer: 'text_en' } ], terms: '' }
+    const builtSearch = buildSearch(empty_string_query)
 
     expect(builtSearch).to.be.an('object')
     expect(Object.keys(builtSearch.bindVars)).to.have.length(2)
     expect(builtSearch.bindVars.value0).to.equal(true)
     expect(builtSearch.bindVars.value1).to.deep.equal({ collections: [ 'coll' ] })
+
+    /* empty array */
+    let empty_array_query = { view: 'search_view', collections: [ { name: 'coll', analyzer: 'text_en' } ], terms: [] }
+    const builtSearch_from_array = buildSearch(empty_array_query)
+
+    expect(builtSearch_from_array.bindVars.value0).to.equal(true)
   })
 
   it('should return an array of aql objects', () => {
     const query = { view: 'search_view', collections: [ { name: 'coll', analyzer: 'text_en' } ], terms: '-a +"query string" ?token' }
     const builtSearch = buildSearch(query)
 
     expect(Object.keys(builtSearch.bindVars)).to.have.length(7)
-    expect(builtSearch.bindVars.value0[ 0 ].query).to.equal('PHRASE(doc.text, @value0, @value1)')
+    expect(builtSearch.bindVars.value0[ 0 ].query).to.equal('PHRASE(doc.@value0, @value1, @value2)')
     expect(builtSearch.bindVars.value1).to.deep.equal('token')
     expect(builtSearch.bindVars.value2).to.deep.equal('text_en')
     expect(builtSearch.bindVars.value3).to.deep.equal('text')
@@ -48,9 +57,9 @@ describe('search.js', () => {
     SORT TFIDF(doc) DESC`)
   })
 
-  it('should return an array of aql objects', () => {
+  it.skip('should return an array of aql objects', () => {
     const query = { view: 'search_view', collections: [ { name: 'coll', analyzer: 'text_en' } ], terms: '+mandatory -exclude ?"optional phrase"' }
     const builtSearch = buildSearch(query)
     expect(builtSearch.query).to.equal(``)
-  }).skip()
+  })
 })