⏱ Superfast ^Advanced wildcards++? *,|,?,^,$,+,#,>,++??,##??,>c in addition to slow regex engines and more.
✔ regex-like quantifiers, amazing meta symbols, and speed...
Unique algorithms implemented in native unmanaged C++ but easily accessible in .NET through Conari (recommended due to caching of 0x29 opcodes + related optimizations), and from other environments such as Python.
| Samples ⏯ | regXwild filter | n |
|---|---|---|
| number = '1271'; | number = '????'; | 0 - 4 |
| year = '2020'; | '##'\|'####' | 2 \| 4 |
| year = '20'; | = '##??' | 2 \| 4 |
| number = 888; | number = +??; | 1 - 3 |

| Samples ⏯ | regXwild filter |
|---|---|
| everything is ok | ^everything*ok$ |
| systems | system? |
| systems | sys###s |
| A new 'X1' project | ^A*'+' pro?ect |
| professional system | pro*system |
| regXwild in action | pro?ect$\|open*source+act\|^regXwild |
It was designed to be faster than just fast for features that usually go beyond typical wildcards. Seriously, we love regex: I love it, you love it. 2013 is far behind us, but regXwild is still relevant for its speed and powerful wildcard-like features, such as ##?? (which means 2 or 4) ...
Unmanaged native C++ or managed .NET project. It doesn't matter, just use it:
C++

```cpp
#include <regXwild.h>
using namespace net::r_eg::regXwild;
...
EssRxW rxw;
if(rxw.match(_T("regXwild"), _T("reg?wild"))) {
    // ...
}
```

C# if Conari

```csharp
using dynamic l = new ConariX("regXwild.dll");
...
if(l.match<bool>("regXwild", "reg?wild")) {
    // ...
}
```

ESS version (advanced EXT version)
| metasymbol | meaning |
|---|---|
| * | {0, ~} |
| \| | str1 or str2 or ... |
| ? | {0, 1}, ??? {0, 3}, ... |
| ^ | [str... or [str1... |
| $ | ...str] or ...str1] |
| + | {1, ~}, +++ {3, ~}, ... |
| # | {1}, ## {2}, ### {3}, ... |
| > | Legacy > (F_LEGACY_ANYSP = 0x008) as [^/]*str \| [^/]*$ |
| >c | 1.4+ Modern > as [^**c**]*str \| [^**c**]*$ |
EXT version (more simplified than ESS)
| metasymbol | meaning |
|---|---|
| * | {0, ~} |
| > | as [^/\\]+ |
| \| | str1 or str2 or ... |
| ? | {0, 1}, ??? {0, 3}, ... |
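
To make the EXT meanings above concrete, here is a minimal sketch that rewrites an EXT-style filter as a `std::regex` pattern. It is only an illustration of the table, not how regXwild works internally (regXwild does not use a regex engine). The names `extToRegex` and `extLikeMatch` are hypothetical, and the substring-search behaviour is an assumption.

```cpp
#include <regex>
#include <string>

// Hypothetical helper (not part of regXwild): rewrites an EXT-style filter
// as an ECMAScript regex using the meanings listed in the table above.
std::string extToRegex(const std::string& filter)
{
    // Characters that must be escaped to be treated literally by std::regex.
    static const std::string special = R"re(\^$.+()[]{})re";

    std::string out;
    for(char c : filter)
    {
        switch(c)
        {
            case '*': out += ".*"; break;               // {0, ~}
            case '?': out += ".?"; break;               // {0, 1}; "???" gives {0, 3}
            case '>': out += R"re([^/\\]+)re"; break;   // as [^/\\]+
            case '|': out += "|"; break;                // str1 or str2 or ...
            default:
                if(special.find(c) != std::string::npos) out += '\\';
                out += c;
        }
    }
    return out;
}

// Assumption: the filter may match anywhere inside the input string.
bool extLikeMatch(const std::string& input, const std::string& filter)
{
    return std::regex_search(input, std::regex(extToRegex(filter)));
}
```

For example, `extLikeMatch("professional system", "pro*system")` succeeds because the filter becomes `pro.*system`. Going through `std::regex` like this is exactly the slow path that regXwild is meant to avoid.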

Quantifiers compared with their regex equivalents:

| regex | regXwild | n |
|---|---|---|
| .* | * | 0+ |
| .+ | + | 1+ |
| .? | ? | 0 \| 1 |
| .{1} | # | 1 |
| .{2} | ## | 2 |
| .{2, } | ++ | 2+ |
| .{0, 2} | ?? | 0 - 2 |
| .{2, 4} | ++?? | 2 - 4 |
| (?:.{2}\|.{4}) | ##?? | 2 \| 4 |
| .{3, 4} | +++? | 3 - 4 |
| (?:.{1}\|.{3}) | #?? | 1 \| 3 |
and similar ...
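
The pattern behind the table can be sketched as a small function that maps one run of quantifier symbols to the regex from the first column. This is a hedged illustration only: `quantifierToRegex` and `acceptsLength` are hypothetical names, not part of the regXwild API, and combinations not shown in the table are deliberately left unsupported.

```cpp
#include <cstddef>
#include <regex>
#include <string>

// Hypothetical helper (not part of regXwild): maps one run of quantifier
// symbols to the regex equivalent from the table above.
// Returns an empty string for combinations the table does not cover.
std::string quantifierToRegex(const std::string& run)
{
    if(run.empty()) return "";
    if(run == "*") return ".*";

    const char head = run[0];
    if(head != '#' && head != '+' && head != '?') return "";

    // n = length of the leading run of the same symbol
    std::size_t n = 0;
    while(n < run.size() && run[n] == head) ++n;

    // k = number of '?' that follow the leading run
    std::size_t k = 0;
    while(n + k < run.size() && run[n + k] == '?') ++k;

    if(n + k != run.size()) return ""; // not covered by this sketch

    const std::string lo = std::to_string(n);
    const std::string hi = std::to_string(n + k);

    if(head == '?') return ".{0," + lo + "}";                  // ??   -> 0 - 2
    if(head == '+') return k == 0 ? ".{" + lo + ",}"           // ++   -> 2+
                                  : ".{" + lo + "," + hi + "}"; // ++?? -> 2 - 4
    return k == 0 ? ".{" + lo + "}"                            // ##   -> 2
                  : "(?:.{" + lo + "}|.{" + hi + "})";         // ##?? -> 2 | 4
}

// Checks whether a string of `len` arbitrary characters satisfies the run.
bool acceptsLength(const std::string& run, std::size_t len)
{
    const std::string rx = quantifierToRegex(run);
    return !rx.empty() && std::regex_match(std::string(len, 'x'), std::regex(rx));
}
```

With this sketch, `acceptsLength("##??", 3)` is false while lengths 2 and 4 are accepted. That is the difference between `##??` (2 or 4) and `++??` (2 to 4).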
Play with our actual Unit-Tests.
- Up to ~2000 times faster than the C++11 regex engine when used from native C++ (see the results below).
- For .NET (including modern .NET Core), Conari provides optional caching of 0x29 opcodes (Calli) and more to get results as close to native C++ as possible.
1.4+
```cpp
EssRxW::MatchResult m;
rxw.match
(
    _T("number = '8888'; //TODO: up"),
    _T("'+'"),
    EssRxW::EngineOptions::F_MATCH_RESULT,
    &m
);
//m.start = 9
//m.end = 15
...
input.replace(m.start, m.end - m.start, _T("'9777'"));
```

```cpp
tstring str = _T("year = 2021; dd = 17;");
...
if(rxw.replace(str, _T(" ##;"), _T(" 00;"))) {
    // year = 2021; dd = 00;
}
```

Open Source project; MIT License, Enjoy 🎉
Copyright (c) 2013-2021 Denis Kuzmin <x-3F@outlook.com> github/3F
regXwild contributors: https://github.com/3F/regXwild/graphs/contributors
We're waiting for your awesome contributions!
- Use the `algo` subproject as tester of the main algorithms (Release cfg - x32 & x64)
- In general, calculation is simple and uses average as `i = (t2 - t1); (sum(i) / n)` where:
  - i - one iteration for searching by filter. Represents the delta of time `t2 - t1`
  - n - the number of repeats of the matching to get average.
e.g.:
```cpp
{
    Meter meter;
    int results = 0;
    for(int total = 0; total < average; ++total)
    {
        meter.start();
        for(int i = 0; i < iterations; ++i)
        {
            if((alg.*method)(data, filter)) {
                //...
            }
        }
        results += meter.delta();
    }
    TRACE((results / average) << "ms");
}
```

For regex results it also prepares an additional basic_regex from the filter, but of course only once for all iterations:

```cpp
meter.start();
auto rfilter = tregex(
    filter,
    regex_constants::icase | regex_constants::optimize
);
results += meter.delta();
...
```

Please note:
- +icase means ignore case sensitivity when matching the filter (pattern) within the searched string, i.e. `ignoreCase = true`. Without this, everything will be much faster of course. That is, icase always adds complexity.
- Below, MultiByte can be faster than Unicode (for the same platform and the same way of module use), but it depends on the specific architecture: about ~2 times faster in native C++, and about ~4 times faster with .NET + Conari and related.
- The results below can differ between machines. Look only at the difference (in milliseconds) between algorithms for a specific target.
- To calculate the data, as in the table below, you need to execute `algo.exe`
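
To try the same averaging scheme (`sum(i) / n`) outside of `algo`, a minimal sketch based on `std::chrono` could look like the one below. The `Meter` class here is a hypothetical stand-in and not the one from the `algo` subproject; `averageMs` is also an assumed name.

```cpp
#include <chrono>
#include <cstddef>

// Hypothetical stand-in for the Meter used by algo: measures elapsed
// milliseconds with std::chrono::steady_clock.
class Meter
{
public:
    void start() { t1 = std::chrono::steady_clock::now(); }

    // delta of time t2 - t1, in milliseconds
    double delta() const
    {
        const auto t2 = std::chrono::steady_clock::now();
        return std::chrono::duration<double, std::milli>(t2 - t1).count();
    }

private:
    std::chrono::steady_clock::time_point t1;
};

// Runs `action` `iterations` times per measurement and returns the
// average over `repeats` measurements: sum(i) / n.
template<typename Action>
double averageMs(Action action, std::size_t repeats, std::size_t iterations)
{
    if(repeats == 0) return 0.0;

    Meter meter;
    double sum = 0.0;
    for(std::size_t r = 0; r < repeats; ++r)
    {
        meter.start();
        for(std::size_t i = 0; i < iterations; ++i) action();
        sum += meter.delta();
    }
    return sum / static_cast<double>(repeats);
}
```

Passing a lambda that calls the matcher under test, for example `averageMs([&]{ rxw.match(data, filter); }, 10, 10000)`, gives a figure comparable in spirit to the tables below. Absolute values will differ between machines.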
340 Unicode Symbols and 10^4 iterations (340 x 10000); Filter: L"nime**haru*02*Magica"
| algorithms (see impl. from algo) | +icase [x32] | +icase [x64] |
|---|---|---|
| Find + Find | ~58ms | ~44ms |
| Iterator + Find | ~57ms | ~46ms |
| Getline + Find | ~59ms | ~54ms |
| Iterator + Substr | ~165ms | ~132ms |
| Iterator + Iterator | ~136ms | ~118ms |
| main :: based on Iterator + Find | ~53ms | ~45ms |
| | | |
| Final algorithm - EXT version: | ~50ms | ~26ms |
| Final algorithm - ESS version: | ~50ms | ~27ms |
| | | |
| regexp-c++11(regex_search) | ~59309ms | ~53334ms |
| regexp-c++11(only as ^match$ like a '==') | ~12ms | ~5ms |
| regexp-c++11(regex_match with endings .*) | ~59503ms | ~53817ms |
ESS vs EXT
350 Unicode Symbols and 10^4 iterations (350 x 10000);
| Operation (+icase) | EXT [x32] | ESS [x32] | EXT [x64] | ESS [x64] |
|---|---|---|---|---|
| ANY | ~54ms | ~55ms | ~32ms | ~34ms |
| ANYSP | ~60ms | ~59ms | ~37ms | ~38ms |
| ONE | ~56ms | ~56ms | ~33ms | ~35ms |
| SPLIT | ~92ms | ~94ms | ~58ms | ~63ms |
| BEGIN | --- | ~38ms | --- | ~19ms |
| END | --- | ~39ms | --- | ~21ms |
| MORE | --- | ~44ms | --- | ~23ms |
| SINGLE | --- | ~43ms | --- | ~22ms |
For .NET users through Conari engine:
Same test Data & Filter: 10^4 iterations
Release cfg; x32 or x64 regXwild (Unicode)
Attention: For more speed, upgrade to Conari 1.3 or higher!
| algorithms (see impl. from snet) | +icase [x32] | +icase [x64] | |
|---|---|---|---|
| regXwild via Conari v1.2 (Lambda) - ESS | ~1032ms | ~1418ms | x |
| regXwild via Conari v1.2 (DLR) - ESS | ~1238ms | ~1609ms | x |
| regXwild via Conari v1.2 (Lambda) - EXT | ~1117ms | ~1457ms | x |
| regXwild via Conari v1.2 (DLR) - EXT | ~1246ms | ~1601ms | x |
| | | | |
| regXwild via Conari v1.3 (Lambda) - ESS | ~58ms | ~42ms | << |
| regXwild via Conari v1.3 (DLR) - ESS | ~218ms | ~234ms | |
| regXwild via Conari v1.3 (Lambda) - EXT | ~54ms | ~35ms | << |
| regXwild via Conari v1.3 (DLR) - EXT | ~214ms | ~226ms | |
| | | | |
| .NET Regex engine [Compiled] | ~38310ms | ~37242ms | |
| .NET Regex engine [Compiled]{only ^match$} | < 1ms | ~3ms | |
| .NET Regex engine | ~31565ms | ~30975ms | |
| .NET Regex engine {only ^match$} | < 1ms | ~1ms | |
regXwild v1.1+ can also be installed through NuGet, in the same way for both unmanaged and managed projects.
For .NET it will put x32 & x64 regXwild into $(TargetDir). Use it with your .NET modules through Conari and so on.
x64 + x32 Unicode + MultiByte modules;
Please note: Modern regXwild packages will no longer be distributed together with Conari. Please use the Conari NuGet packages separately.
- regXwild NuGet:
- GetNuTool: `msbuild gnt.core /p:ngpackages="regXwild"` or `gnt /p:ngpackages="regXwild"`
- GitHub Releases [ latest ]
- 🎲 CI builds: CI /artifacts ( old CI )