Fast HTTP match

# HTTP Load balancing

Separated from #76, in particular from https://github.com/tempesta-tech/tempesta/issues/76#issuecomment-287572183. A faster implementation of HTTP field matching is required for HTTP load balancing and [filtering](https://github.com/tempesta-tech/tempesta/issues/731). One option is a hash table that lets us jump directly to a rule by its key, where the key is calculated from the string and the ID of the HTTP field. Alternatively, or in addition, the [BNDM with q-Grams (BG) algorithm](https://www.cs.hut.fi/u/tarhio/papers/jea.pdf) can be used to quickly process many strings with a common prefix.

Issue #76 addresses configurations with a massive number of backend servers, for example:
```
srv_group group_0 { server 127.0.0.1:9090 conns_n=1; }
srv_group group_1 { server 127.0.0.1:9090 conns_n=1; }
srv_group group_2 { server 127.0.0.1:9090 conns_n=1; }
....
srv_group group_999 { server 127.0.0.1:9090 conns_n=1; }

sched_http_rules {
match group_0  hdr_host eq "group-0.com";
match group_1  hdr_host eq "group-1.com";
match group_2  hdr_host eq "group-2.com";
....
match group_999  hdr_host eq "group-999.com";
}
```

Currently all 1000 or more `match` rules are matched sequentially. The example is quite realistic for massive hosting installations. The BG algorithm implemented in #901 must be applied to the matching. The matching syntax should probably be adjusted as follows (with #731 in mind):
```
host == {
    "group-0.com" -> group_0;
    "group-1.com" -> group_1;
    "group-2.com" -> group_2;
}
```
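The hash table idea from the description can be sketched in plain userspace C as follows. This is only an illustration under stated assumptions: `struct rule`, `rule_key()`, `rule_add()`, `rule_lookup()` and `RULE_BUCKETS` are hypothetical names, not Tempesta FW code, and the hash is an FNV-1a style mix chosen for brevity. A real implementation would also have to handle case-insensitive host names.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define RULE_BUCKETS 1024	/* assumed table size, must be a power of two */

/* Hypothetical rule entry, not a Tempesta FW structure. */
struct rule {
	int field_id;		/* ID of the HTTP field, e.g. Host */
	const char *str;	/* exact-match pattern */
	const char *group;	/* target server group name */
	struct rule *next;	/* collision chain inside a bucket */
};

static struct rule *buckets[RULE_BUCKETS];

/* FNV-1a style hash over the field ID and the string bytes. */
static uint32_t rule_key(int field_id, const char *s, size_t len)
{
	uint32_t h = 2166136261u;
	size_t i;

	h = (h ^ (uint32_t)field_id) * 16777619u;
	for (i = 0; i < len; i++)
		h = (h ^ (unsigned char)s[i]) * 16777619u;
	return h & (RULE_BUCKETS - 1);
}

/* Insert a rule at configuration time. */
static void rule_add(struct rule *r)
{
	uint32_t k = rule_key(r->field_id, r->str, strlen(r->str));

	r->next = buckets[k];
	buckets[k] = r;
}

/* Return the group name for an exact match, or NULL if no rule matches. */
static const char *rule_lookup(int field_id, const char *s, size_t len)
{
	struct rule *r;

	for (r = buckets[rule_key(field_id, s, len)]; r; r = r->next)
		if (r->field_id == field_id && strlen(r->str) == len
		    && !memcmp(r->str, s, len))
			return r->group;
	return NULL;
}
```

With such a table the cost of an exact `host ==` match depends on the length of the bucket chain rather than on the total number of rules.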

# HTTPtables

## Strings matching

Also, [the use case](https://github.com/tempesta-tech/tempesta/issues/731#issuecomment-364603894) from #731 must be processed in a more efficient way, e.g. using a hash table or a tree:
```
http_chain {
        mark == {
                2 -> backend_0;
                3 -> backend_1;
                4 -> backend_2;
                5 -> backend_3;
                ....
        }
}
```
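For integer keys such as `mark`, a sorted array with binary search is a simple stand-in for the tree option. The sketch below is userspace C built on the standard `qsort()` and `bsearch()`; `struct mark_rule`, `mark_table_init()` and `mark_lookup()` are hypothetical names introduced for illustration only.

```c
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical mapping entry; the names are illustrative only. */
struct mark_rule {
	unsigned int mark;
	const char *backend;
};

/* Three-way comparison by mark value, safe against overflow. */
static int mark_cmp(const void *a, const void *b)
{
	const struct mark_rule *x = a, *y = b;

	return (x->mark > y->mark) - (x->mark < y->mark);
}

/* Sort the table once, at configuration time. */
static void mark_table_init(struct mark_rule *tbl, size_t n)
{
	qsort(tbl, n, sizeof(*tbl), mark_cmp);
}

/* O(log n) lookup per request; returns NULL if no rule matches. */
static const char *mark_lookup(const struct mark_rule *tbl, size_t n,
			       unsigned int mark)
{
	struct mark_rule key = { mark, NULL };
	const struct mark_rule *r;

	r = bsearch(&key, tbl, n, sizeof(*tbl), mark_cmp);
	return r ? r->backend : NULL;
}
```

The array is also stored contiguously, so the lookup touches only a few cache lines even for large tables.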

## Memory spatial locality

At the moment `kzalloc()` is used a lot during the configuration phase, so spatial locality at run time can be improved by using more local data structures.
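One possible way to get more local data structures is to carve all rules and their strings out of a single contiguous block. The following is a minimal userspace sketch of such a bump allocator; `struct arena`, `arena_init()` and `arena_alloc()` are hypothetical names, and `calloc()` stands in for whatever kernel allocation would actually be used.

```c
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical bump allocator: objects are laid out back to back. */
struct arena {
	char *base;	/* start of the single contiguous block */
	size_t used;	/* bytes handed out so far */
	size_t size;	/* total block size */
};

/* Allocate the zeroed block once; returns 0 on success, -1 on failure. */
static int arena_init(struct arena *a, size_t size)
{
	a->base = calloc(1, size);
	a->used = 0;
	a->size = a->base ? size : 0;
	return a->base ? 0 : -1;
}

/* Return zeroed, 8-byte aligned memory, or NULL when the block is full. */
static void *arena_alloc(struct arena *a, size_t n)
{
	size_t off = (a->used + 7) & ~(size_t)7;

	if (off > a->size || n > a->size - off)
		return NULL;
	a->used = off + n;
	return a->base + off;
}
```

Rules allocated one after another this way sit next to each other in memory, so a sequential scan walks adjacent cache lines instead of chasing pointers to scattered allocations.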

## The chains

Currently HTTPtables sequentially scans all the rules in a chain, which isn't efficient. The first option is to run only one match per header using multi-pattern matching. There are probably other optimization opportunities as well.

We need some use cases with large chains to understand the typical workload, i.e. whether there are cases with many patterns for the same header or whether the rules mostly match different headers.


# Generic strings matching

Actually, Tempesta FW is full of multiple-string matching. E.g. the [caching policy](https://github.com/tempesta-tech/tempesta/wiki/Caching-Responses#caching-policy) for content type suffixes is evaluated with a `for` loop in `tfw_capolicy_match()`, while a powerful web resource [can have a lot of various suffixes](https://omergil.blogspot.com/2017/02/web-cache-deception-attack.html): aif, aiff, au, avi, bin, bmp, cab, carb, cct, cdf, class, css, doc, dcr, dtd, gcf, gff, gif, grv, hdml, hqx, ico, ini, jpeg, jpg, js, mov, mp3, nc, pct, ppc, pws, swa, swf, txt, vbs, w32, wav, wbmp, wml, wmlc, wmls, wmlsc, xsd, zip.
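As an illustration of replacing a linear loop over suffixes, the sketch below sorts the suffix table once and then uses binary search on the extension of the request path. It is userspace C and does not reproduce the logic of `tfw_capolicy_match()`; `sfx_cmp()`, `sfx_init()` and `sfx_match()` are hypothetical names, and taking the text after the last `.` as the suffix is a simplifying assumption.

```c
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Compare two table slots, each holding a pointer to a C string. */
static int sfx_cmp(const void *a, const void *b)
{
	return strcmp(*(const char *const *)a, *(const char *const *)b);
}

/* Sort the suffix table once, at configuration time. */
static void sfx_init(const char **tbl, size_t n)
{
	qsort(tbl, n, sizeof(*tbl), sfx_cmp);
}

/* Return 1 if the text after the last '.' in path is in the table. */
static int sfx_match(const char *const *tbl, size_t n, const char *path)
{
	const char *ext = strrchr(path, '.');

	if (!ext || !ext[1])
		return 0;
	ext++;	/* skip the dot */
	return bsearch(&ext, tbl, n, sizeof(*tbl), sfx_cmp) != NULL;
}
```

For the roughly 45 suffixes listed above this takes about six string comparisons per lookup instead of up to 45.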

# Testing

## Functional tests

TBD

## Performance

We need a solid estimate of the number of rules and/or chains at which performance degrades significantly.
