Commit caa84ab

Adition of Group Finder (#14)
* added group finder
* updated readme
* added information of group finder to the main README
* minor fixes on documentation
1 parent 602c8ef commit caa84ab

21 files changed, with 2801 additions and 8 deletions.

BUILD.bazel

Lines changed: 10 additions & 0 deletions
@@ -2,3 +2,13 @@ load("@bazel_gazelle//:def.bzl", "gazelle")
 
 # gazelle:prefix github.com/pedroegsilva/gofindthem
 gazelle(name = "gazelle")
+
+gazelle(
+    name = "gazelle-update",
+    args = [
+        "-from_file=go.mod",
+        "-to_macro=deps.bzl%go_dependencies",
+        "-prune",
+    ],
+    command = "update-repos",
+)

README.md

Lines changed: 10 additions & 1 deletion
@@ -22,7 +22,7 @@ from which is heavily influenced by the [InfluxQL parser](https://github.com/inf
 
 ## Usage/Examples
 
-There are 2 libraries on this repository, the DSL and the Finder.
+There are 3 libraries/packages in this repository: the DSL, the Finder and the GroupFinder.
 
 ### Finder
 The finder is used to manage multiple expressions. It will use the DSL to extract the terms and regex from each expression and use them to process the text with the appropriate engine.
@@ -80,6 +80,15 @@ And finally you can check which expressions were match on each text.
 
 The full example can be found at `/examples/finder/main.go`
 
+### GroupFinder
+The Group Finder is a package that adds another DSL to improve the maintainability
+of the searched patterns and enables searches on specific fields of structured documents.
+It allows the configuration to be split into 2 categories (rules and tags) so that the tags
+can be used by multiple rules. Rules can also check whether a given tag was found on
+a specific field of a structured document.
+You can find more about the usage of the Group Finder in its [README](https://github.com/pedroegsilva/gofindthem/tree/main/group)
+
+
 ### DSL
 #### Definition
 The DSL uses 5 operators (AND, OR, NOT, R, INORD), terms (defined by "") and parentheses to form expressions. A valid expression can be:
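
Review note on the GroupFinder paragraph added above: the field-scoped check it describes ("was this tag found on this field?") can be pictured with a small standalone sketch. This is not the package's implementation. The `matched` map shape, the `tagFoundOn` helper and the rule that a field also covers its nested fields (so `tag3:Field3` would accept a hit on `Field3.SomeField1`) are assumptions made for illustration only.

```go
package main

import (
	"fmt"
	"strings"
)

// tagFoundOn reports whether tag was matched on field or on one of its
// nested fields. An empty field means "anywhere in the document".
// matched maps a tag to the field paths where it was found; this shape is
// hypothetical and not taken from the package.
func tagFoundOn(matched map[string][]string, tag, field string) bool {
	paths, ok := matched[tag]
	if !ok {
		return false
	}
	if field == "" {
		return len(paths) > 0
	}
	for _, p := range paths {
		// Assumption: "Field3" also covers "Field3.SomeField1".
		if p == field || strings.HasPrefix(p, field+".") {
			return true
		}
	}
	return false
}

func main() {
	matched := map[string][]string{
		"tag1": {"Field1"},
		"tag3": {"Field3.SomeField1", "Field3.SomeField2"},
	}
	fmt.Println(tagFoundOn(matched, "tag3", "Field3"))            // true
	fmt.Println(tagFoundOn(matched, "tag3", "Field3.SomeField1")) // true
	fmt.Println(tagFoundOn(matched, "tag1", "Field3"))            // false
	fmt.Println(tagFoundOn(matched, "tag4", ""))                  // false
}
```

Matching on `field + "."` rather than on a bare prefix keeps `Field3` from accidentally covering an unrelated field such as `Field30`.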

WORKSPACE

Lines changed: 7 additions & 7 deletions
@@ -2,19 +2,19 @@ load("@bazel_tools//tools/build_defs/repo:http.bzl", "http_archive")
 
 http_archive(
     name = "io_bazel_rules_go",
-    sha256 = "8e968b5fcea1d2d64071872b12737bbb5514524ee5f0a4f54f5920266c261acb",
+    sha256 = "685052b498b6ddfe562ca7a97736741d87916fe536623afb7da2824c0211c369",
     urls = [
-        "https://mirror.bazel.build/github.com/bazelbuild/rules_go/releases/download/v0.28.0/rules_go-v0.28.0.zip",
-        "https://github.com/bazelbuild/rules_go/releases/download/v0.28.0/rules_go-v0.28.0.zip",
+        "https://mirror.bazel.build/github.com/bazelbuild/rules_go/releases/download/v0.33.0/rules_go-v0.33.0.zip",
+        "https://github.com/bazelbuild/rules_go/releases/download/v0.33.0/rules_go-v0.33.0.zip",
     ],
 )
 
 http_archive(
     name = "bazel_gazelle",
-    sha256 = "62ca106be173579c0a167deb23358fdfe71ffa1e4cfdddf5582af26520f1c66f",
+    sha256 = "501deb3d5695ab658e82f6f6f549ba681ea3ca2a5fb7911154b5aa45596183fa",
     urls = [
-        "https://mirror.bazel.build/github.com/bazelbuild/bazel-gazelle/releases/download/v0.23.0/bazel-gazelle-v0.23.0.tar.gz",
-        "https://github.com/bazelbuild/bazel-gazelle/releases/download/v0.23.0/bazel-gazelle-v0.23.0.tar.gz",
+        "https://mirror.bazel.build/github.com/bazelbuild/bazel-gazelle/releases/download/v0.26.0/bazel-gazelle-v0.26.0.tar.gz",
+        "https://github.com/bazelbuild/bazel-gazelle/releases/download/v0.26.0/bazel-gazelle-v0.26.0.tar.gz",
     ],
 )
 
@@ -27,6 +27,6 @@ go_dependencies()
 
 go_rules_dependencies()
 
-go_register_toolchains(version = "1.16.7")
+go_register_toolchains(version = "1.18.3")
 
 gazelle_dependencies()

examples/group/BUILD.bazel

Whitespace-only changes.

examples/group/finder/BUILD.bazel

Lines changed: 18 additions & 0 deletions
@@ -0,0 +1,18 @@
+load("@io_bazel_rules_go//go:def.bzl", "go_binary", "go_library")
+
+go_library(
+    name = "finder_lib",
+    srcs = ["main.go"],
+    importpath = "github.com/pedroegsilva/gofindthem/examples/group/finder",
+    visibility = ["//visibility:private"],
+    deps = [
+        "//finder",
+        "//group/finder",
+    ],
+)
+
+go_binary(
+    name = "finder",
+    embed = [":finder_lib"],
+    visibility = ["//visibility:public"],
+)

examples/group/finder/main.go

Lines changed: 158 additions & 0 deletions
@@ -0,0 +1,158 @@
+package main
+
+import (
+	"fmt"
+
+	"github.com/pedroegsilva/gofindthem/finder"
+	gfinder "github.com/pedroegsilva/gofindthem/group/finder"
+)
+
+func main() {
+	gofindthemRules := map[string][]string{
+		"tag1": {
+			`"string1"`,
+			`"string2"`,
+		},
+		"tag2": {
+			`"string3"`,
+			`"string4"`,
+		},
+		"tag3": {
+			`"string5"`,
+			`"string6"`,
+		},
+		"tag4": {
+			`"string7"`,
+			`"string8"`,
+		},
+	}
+
+	rules := map[string][]string{
+		"rule1": {`"tag1" or "tag2"`},
+		"rule2": {`"tag3:Field3.SomeField1" or "tag4"`},
+		"rule3": {`"tag3:Field3" or "tag4"`},
+	}
+
+	gft, err := finder.NewFinderWithExpressions(
+		&finder.CloudflareForkEngine{},
+		&finder.RegexpEngine{},
+		false,
+		gofindthemRules,
+	)
+
+	if err != nil {
+		panic(err)
+	}
+
+	gftg, err := gfinder.NewFinderWithRules(gft, rules)
+	if err != nil {
+		panic(err)
+	}
+
+	someObject := struct {
+		Field1 string
+		Field2 int
+		Field3 struct {
+			SomeField1 string
+			SomeField2 []string
+		}
+	}{
+		Field1: "some pretty text with string1",
+		Field2: 42,
+		Field3: struct {
+			SomeField1 string
+			SomeField2 []string
+		}{
+			SomeField1: "some pretty text with string5",
+			SomeField2: []string{"some pretty text with string5", "some pretty text with string2", "some pretty text with string3"},
+		},
+	}
+
+	matchedExpByFieldByTag, err := gftg.TagObject(someObject, gftg.GetFieldNames(), nil)
+	if err != nil {
+		panic(err)
+	}
+
+	for tag, expressionsByField := range matchedExpByFieldByTag {
+		fmt.Println("Tag: ", tag)
+		for field, exprs := range expressionsByField {
+			fmt.Println(" Field: ", field)
+			for exp := range exprs {
+				fmt.Println(" Expressions: ", exp)
+			}
+		}
+	}
+
+	res, err := gftg.ProcessObject(someObject, gftg.GetFieldNames(), nil)
+	if err != nil {
+		panic(err)
+	}
+	fmt.Println("ProcessObject: ", res)
+
+	fmt.Println("-----------------------------")
+	arr := []struct {
+		FieldN string
+		FieldX string
+	}{
+		{FieldN: "some pretty text with string5"},
+		{FieldN: "some pretty text with string2"},
+		{FieldN: "some pretty text with string3"},
+	}
+
+	matchedExpByFieldByTag2, err := gftg.TagObject(arr, nil, nil)
+	if err != nil {
+		panic(err)
+	}
+	for tag, expressionsByField := range matchedExpByFieldByTag2 {
+		fmt.Println("Tag: ", tag)
+		for field, exprs := range expressionsByField {
+			fmt.Println(" Field: ", field)
+			for exp := range exprs {
+				fmt.Println(" Expressions: ", exp)
+			}
+		}
+	}
+
+	res2, err := gftg.ProcessObject(arr, nil, nil)
+	if err != nil {
+		panic(err)
+	}
+	fmt.Println("ProcessObject2: ", res2)
+
+	fmt.Println("-----------------------------")
+	rawJson := `
+	{
+		"Field1": "some pretty text with string1",
+		"Field2": 42,
+		"Field3":
+		{
+			"SomeField1": "some pretty text with string5",
+			"SomeField2":
+			[
+				"some pretty text with string5",
+				"some pretty text with string2",
+				"some pretty text with string3"
+			]
+		}
+	}
+	`
+
+	matchedExpByFieldByTag3, err := gftg.TagJson(rawJson, gftg.GetFieldNames(), nil)
+	if err != nil {
+		panic(err)
+	}
+	for tag, expressionsByField := range matchedExpByFieldByTag3 {
+		fmt.Println("Tag: ", tag)
+		for field, exprs := range expressionsByField {
+			fmt.Println(" Field: ", field)
+			for exp := range exprs {
+				fmt.Println(" Expressions: ", exp)
+			}
+		}
+	}
+	res3, err := gftg.ProcessJson(rawJson, gftg.GetFieldNames(), nil)
+	if err != nil {
+		panic(err)
+	}
+	fmt.Println("ProcessJson: ", res3)
+}
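
Review note on the example above: rules such as `"tag3:Field3.SomeField1"` address nested fields with dotted paths. A rough, standard-library-only sketch of how a raw JSON document could be flattened into such paths is given below. It does not reproduce what `TagJson` does internally; the `flatten` and `textsByField` helpers are hypothetical, and letting array elements share the path of the array itself is an assumption of this sketch.

```go
package main

import (
	"encoding/json"
	"fmt"
	"sort"
)

// flatten walks a decoded JSON value and records every string leaf under its
// dotted field path. Array elements share the path of the array itself
// (an assumption of this sketch). Non-string leaves are ignored.
func flatten(prefix string, v interface{}, out map[string][]string) {
	switch t := v.(type) {
	case map[string]interface{}:
		for k, child := range t {
			path := k
			if prefix != "" {
				path = prefix + "." + k
			}
			flatten(path, child, out)
		}
	case []interface{}:
		for _, child := range t {
			flatten(prefix, child, out)
		}
	case string:
		out[prefix] = append(out[prefix], t)
	}
}

// textsByField decodes rawJSON and returns the searchable texts per field path.
func textsByField(rawJSON string) (map[string][]string, error) {
	var doc interface{}
	if err := json.Unmarshal([]byte(rawJSON), &doc); err != nil {
		return nil, err
	}
	out := map[string][]string{}
	flatten("", doc, out)
	return out, nil
}

func main() {
	raw := `{"Field1": "a", "Field2": 42, "Field3": {"SomeField1": "b", "SomeField2": ["c", "d"]}}`
	fields, err := textsByField(raw)
	if err != nil {
		panic(err)
	}
	// Sort the paths so the output order does not depend on map iteration.
	paths := make([]string, 0, len(fields))
	for p := range fields {
		paths = append(paths, p)
	}
	sort.Strings(paths)
	for _, p := range paths {
		fmt.Println(p, fields[p])
	}
}
```

Each text collected under a path could then be handed to a substring or regex engine, with the path kept alongside the match so that a field-scoped rule can be checked afterwards.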

finder/finder.go

Lines changed: 19 additions & 0 deletions
@@ -55,6 +55,25 @@ func NewFinder(subEng SubstringEngine, rgxEng RegexEngine, caseSensitive bool) (
 	}
 }
 
+// NewFinderWithExpressions returns a new instance of Finder with the
+// expressions and tags given in expressionsByTag.
+func NewFinderWithExpressions(
+	subEng SubstringEngine,
+	rgxEng RegexEngine,
+	caseSensitive bool,
+	expressionsByTag map[string][]string,
+) (finder *Finder, err error) {
+	finder = NewFinder(subEng, rgxEng, caseSensitive)
+	for tag, expressions := range expressionsByTag {
+		err = finder.AddExpressionsWithTag(expressions, tag)
+		if err != nil {
+			return
+		}
+	}
+
+	return
+}
+
 // AddExpression adds the expression to the finder. It also collect
 // and store the terms that are going to be used by the substring engine
 // If the expression is malformed returns an error.
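
Review note on `NewFinderWithExpressions`: the constructor stops at the first error while ranging over a map, and Go does not specify map iteration order. If more than one tag holds a malformed expression, the tag named in the returned error may therefore differ between runs. The sketch below shows one possible way to make that reproducible by visiting tags in sorted order. The `store` type and its `addWithTag` method are hypothetical stand-ins and not part of the package.

```go
package main

import (
	"errors"
	"fmt"
	"sort"
)

var errEmpty = errors.New("empty expression")

// store is a hypothetical stand-in for the Finder; it only records expressions.
type store struct {
	byTag map[string][]string
}

// addWithTag rejects empty expressions so that the sketch has a failure path.
func (s *store) addWithTag(exprs []string, tag string) error {
	for _, e := range exprs {
		if e == "" {
			return fmt.Errorf("tag %q: %w", tag, errEmpty)
		}
	}
	s.byTag[tag] = append(s.byTag[tag], exprs...)
	return nil
}

// newStore registers tags in sorted order, so the first error reported is the
// same on every run even when several tags are invalid.
func newStore(expressionsByTag map[string][]string) (*store, error) {
	tags := make([]string, 0, len(expressionsByTag))
	for tag := range expressionsByTag {
		tags = append(tags, tag)
	}
	sort.Strings(tags)

	s := &store{byTag: map[string][]string{}}
	for _, tag := range tags {
		if err := s.addWithTag(expressionsByTag[tag], tag); err != nil {
			return nil, err
		}
	}
	return s, nil
}

func main() {
	_, err := newStore(map[string][]string{
		"tagB": {""},
		"tagA": {`"ok"`, ""},
	})
	fmt.Println(err) // prints: tag "tagA": empty expression
}
```

Returning `nil` together with the error, instead of a partially populated value, is a second design choice in this sketch; the committed constructor returns the partially built finder alongside the error.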

group/BUILD.bazel

Whitespace-only changes.

group/README.md

Lines changed: 97 additions & 0 deletions
@@ -0,0 +1,97 @@
+# Group Finder
+Group finder is a package that adds another DSL to improve the maintainability
+of the searched patterns and enables searches on specific fields of structured documents.
+
+## Finder
+The finder is used to manage multiple rules. It will use the DSL along with the gofindthem finder
+to verify if the tags were found on the specified fields.
+
+### Usage
+You will need 2 sets of expressions: one that defines, for each tag, the patterns
+to be searched, and a second one with the rules that define the relations between
+tags and fields.
+
+```golang
+gofindthemRules := map[string][]string{
+    "tag1": {
+        `"string1"`,
+        `"string2"`,
+    },
+    "tag2": {
+        `"string3"`,
+        `"string4"`,
+    },
+    "tag3": {
+        `"string5"`,
+        `"string6"`,
+    },
+    "tag4": {
+        `"string7"`,
+        `"string8"`,
+    },
+}
+
+rules := map[string][]string{
+    "rule1": {`"tag1" or "tag2" and not "tag3"`},
+    "rule2": {`"tag3:Field3.SomeField1" or "tag4"`},
+    "rule3": {`"tag3:Field3" or "tag4"`},
+}
+```
+
+With the 2 sets of expressions ready you will first need to create the
+gofindthem finder and the group finder:
+
+```golang
+gft, err := finder.NewFinderWithExpressions(
+    &finder.CloudflareForkEngine{},
+    &finder.RegexpEngine{},
+    false,
+    gofindthemRules,
+)
+
+gftg, err := gfinder.NewFinderWithRules(gft, rules)
+if err != nil {
+    panic(err)
+}
+```
+
+Now it's possible to check which rules were evaluated as true on a text
+or on a structured document:
+
+```golang
+// searching on a struct
+res, err := gftg.ProcessObject(someObject, gftg.GetFieldNames(), nil)
+if err != nil {
+    panic(err)
+}
+fmt.Println("ProcessObject: ", res)
+
+// searching on a raw json
+res3, err := gftg.ProcessJson(rawJson, gftg.GetFieldNames(), nil)
+if err != nil {
+    panic(err)
+}
+fmt.Println("ProcessJson: ", res3)
+```
+The full example can be found at `/examples/group/finder/main.go`
+
+## Group Finder DSL
+### Definition
+The DSL uses 3 operators (AND, OR, NOT), tags (defined by "tag:(field)",
+where the field is optional) and parentheses to form expressions.
+A valid expression can be:
+
+- A single tag with or without a specific field. Eg: `"tag1"` `"tag1:field1"`
+- The result of an operation. Eg: `"tag1" OR "tag2:field1"`
+- An expression enclosed by parentheses. Eg: `("tag1" OR "tag2:field1")`
+
+Each operator works as follows:
+
+- **AND** - Combines the expressions before and after it with a logical `AND`.
+> \<valid expression\> AND \<valid expression\> eg: `"tag1" AND "tag2"`
+
+- **OR** - Combines the expressions before and after it with a logical `OR`.
+> \<valid expression\> OR \<valid expression\> eg: `"tag1" OR "tag2"`
+
+- **NOT** - Negates the expression after it with a logical `NOT`.
+> NOT \<valid expression\> eg: `NOT "tag1"`
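
Review note on the DSL definition above: the grammar (quoted tags, `AND`, `OR`, `NOT`, parentheses) is small enough to evaluate with a short recursive-descent parser. The sketch below is not the parser in `group/dsl`. It assumes the conventional precedence `NOT` > `AND` > `OR`, which the README does not state, it treats keywords case-insensitively, and it assumes tags contain no spaces or parentheses. The `found` callback is a hypothetical hook that answers whether an atom such as `tag3:Field3` matched.

```go
package main

import (
	"fmt"
	"strings"
)

// parser evaluates a rule while parsing it. found reports whether an atom
// such as `tag1` or `tag3:Field3` matched the document (hypothetical hook).
type parser struct {
	toks  []string
	pos   int
	found func(atom string) bool
}

// peek returns the next token (keywords upper-cased), or "" at the end.
func (p *parser) peek() string {
	if p.pos >= len(p.toks) {
		return ""
	}
	if t := p.toks[p.pos]; strings.HasPrefix(t, `"`) {
		return t
	}
	return strings.ToUpper(p.toks[p.pos])
}

// parseOr handles OR, the lowest-precedence operator in this sketch.
func (p *parser) parseOr() (bool, error) {
	left, err := p.parseAnd()
	for err == nil && p.peek() == "OR" {
		p.pos++
		var right bool
		right, err = p.parseAnd()
		left = left || right
	}
	return left, err
}

// parseAnd handles AND, which binds tighter than OR.
func (p *parser) parseAnd() (bool, error) {
	left, err := p.parseUnary()
	for err == nil && p.peek() == "AND" {
		p.pos++
		var right bool
		right, err = p.parseUnary()
		left = left && right
	}
	return left, err
}

// parseUnary handles NOT, parenthesized groups and quoted atoms.
func (p *parser) parseUnary() (bool, error) {
	tok := p.peek()
	p.pos++
	switch {
	case tok == "NOT":
		v, err := p.parseUnary()
		return !v, err
	case tok == "(":
		v, err := p.parseOr()
		if err == nil && p.peek() != ")" {
			err = fmt.Errorf("missing closing parenthesis")
		}
		p.pos++
		return v, err
	case len(tok) >= 2 && strings.HasPrefix(tok, `"`) && strings.HasSuffix(tok, `"`):
		return p.found(tok[1 : len(tok)-1]), nil
	}
	return false, fmt.Errorf("unexpected token %q", tok)
}

// eval tokenizes on whitespace and parentheses, then evaluates the rule.
func eval(rule string, found func(string) bool) (bool, error) {
	spaced := strings.NewReplacer("(", " ( ", ")", " ) ").Replace(rule)
	p := &parser{toks: strings.Fields(spaced), found: found}
	v, err := p.parseOr()
	if err == nil && p.pos != len(p.toks) {
		err = fmt.Errorf("unexpected token %q", p.peek())
	}
	return v, err
}

func main() {
	hits := map[string]bool{"tag1": true, "tag3:Field3": true}
	found := func(atom string) bool { return hits[atom] }
	for _, rule := range []string{
		`"tag1" or "tag2" and not "tag3:Field3"`,
		`("tag1" OR "tag2") AND NOT "tag3:Field3"`,
	} {
		ok, err := eval(rule, found)
		fmt.Println(rule, "=>", ok, err)
	}
}
```

Under the assumed precedence the first rule evaluates to true and the second to false, which shows why parentheses matter in a rule like `"tag1" or "tag2" and not "tag3"`. If the package's parser resolves precedence differently, the first result would differ.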

group/dsl/BUILD.bazel

Lines changed: 23 additions & 0 deletions
@@ -0,0 +1,23 @@
+load("@io_bazel_rules_go//go:def.bzl", "go_library", "go_test")
+
+go_library(
+    name = "dsl",
+    srcs = [
+        "expression.go",
+        "parser.go",
+        "scanner.go",
+    ],
+    importpath = "github.com/pedroegsilva/gofindthem/group/dsl",
+    visibility = ["//visibility:public"],
+)
+
+go_test(
+    name = "dsl_test",
+    srcs = [
+        "expression_test.go",
+        "parser_test.go",
+        "scanner_test.go",
+    ],
+    embed = [":dsl"],
+    deps = ["@com_github_stretchr_testify//assert"],
+)
