Commit 6bad80d

improved docs, updated renaming in few places that were missed out
1 parent f40f510 commit 6bad80d

5 files changed (+33, -24 lines changed)

README.md

Lines changed: 32 additions & 23 deletions
@@ -3,7 +3,7 @@

 - [Regex Benchmark: Performance Visualization Tool for Regular Expressions](#regex-benchmark-performance-visualization-tool-for-regular-expressions)
 - [Description](#description)
-- [Why?](#why)
+- [Why Regex Performance Matters?](#why-regex-performance-matters)
 - [Examples](#examples)
 - [Installation](#installation)
 - [Linux](#linux)
@@ -14,18 +14,16 @@

 ## Description

-A utility to help visualise the performance of different regular expressions and how they scale with input size.
+Regex Benchmark is a utility tool aimed at demystifying the performance of regular expressions. It provides a visual representation of how different regex patterns perform, especially useful when dealing with complex expressions where performance implications are not immediately apparent.

-Typically, in software development we use Big O notation to describe performance. This is typically someone developers learn when first learning to code and is a great way to get a rough idea of how a program will scale with input size. However, this isn't always easy to do with regular expressions. For the mast majority, regular expressions feel like a black box that either works or doesn't. This tool aims to help visualise the performance of regular expressions and how they scale with input size.
+## Why Regex Performance Matters?

-## Why?
+Understanding the performance of regular expressions is crucial for several reasons:

-A good question to always ask is "why?". Why do we need to know the performance of regular expressions? It just works, right? Well, yes and no. Regular expressions are a very powerful tool, but they can also be very dangerous. It's very easy to write a regular expression that is very inefficient. This can lead to a number of issues:
-
-* **Security** - If a regular expression is inefficient, it can be used to perform a [ReDoS attack](https://owasp.org/www-community/attacks/Regular_expression_Denial_of_Service_-_ReDoS). This is where an attacker can send a malicious string to a server that will cause the server to hang or crash.
-* **Performance** - Can cause a program to run very slowly. This can cause a number of issues such as a poor user experience or a server that is unable to handle the load.
-* **Cost** - Slower performance typically leads to higher costs. In the world of cloud computing, we often just create a AWS lambda function and just forget about it. It's really easy to forget that the costs of lambda functions are based on the amount of time the function is running. If a function is running for longer than it needs to, this can lead to a higher than expected bill.
-* **Learn from Cloudflare's mistake** - [Cloudflare's Regular Expression Denial of Service](https://blog.cloudflare.com/details-of-the-cloudflare-outage-on-july-2-2019/) is a great article to read which explains how a poorly optimised regular expression caused a global outage.
+* **Security** - Avoiding [ReDoS](https://owasp.org/www-community/attacks/Regular_expression_Denial_of_Service_-_ReDoS) attacks.
+* **Performance** - Ensuring efficient execution, crucial for user experience and server load management.
+* **Cost** - Minimising computational resources, especially in cloud-based solutions.
+* **Real-World Lessons** - Learning from incidents like [Cloudflare's global outage](https://blog.cloudflare.com/details-of-the-cloudflare-outage-on-july-2-2019/).


 ## Examples
@@ -36,20 +34,20 @@ Let's suppose I want to test the performance of the following three request expr
 2. `.*(=).*`
 3. `.?(=).*`

-All of these expressions are valid, and they all do the same thing. In this example, it's easy to see that the first expression is the most efficient. However, in more complex expressions, it's not always easy to see which expression is the most efficient. This is where `regex-speed` comes in.
+All of these expressions are valid, and they all do the same thing. In this example, it's easy to see that the first expression is the most efficient. However, in more complex expressions, it's not always easy to see which expression is the most efficient. This is where `regex-benchmark` comes in.

-We'll run test each of the regex expressions defining the following options:
+We'll test each of the regex expressions using the following options:

 * `--max-length 100000` - The max string length to test will be 100,000.
 * `--step-size 10` - The step size will be 10, meaning the string length will increase by 10 each iteration.
 * `--num-tests 5` - We will run each test 5 times, plotting each result onto the graph.
 * `--required-str x=xxxxxxxxxxxxxx` - The random strings that are generated will have the string `x=xxxxxxxxxxxxxx` somewhere in them.
-* `--method find` - We want to actually locate the "=" in the string. This isn't at all useful in a real life example, but it's a nice way to demonstrate the performance of the regex expressions.
+* `--method find` - We want to locate the "=" in the string. This isn't at all useful in a real-life example, but it's a nice way to demonstrate the performance of the regex expressions.

 Let's run the tests:

 ```bash
-regex-speed \
+regex-benchmark \
 --regex '=' \
 --max-length 100000 \
 --step-size 10 \
@@ -61,7 +59,7 @@ regex-speed \
 ![Example 1 results](./docs/examples/img/example-test-1.png)

 ```bash
-regex-speed \
+regex-benchmark \
 --regex '.*(=).*' \
 --max-length 100000 \
 --step-size 10 \
@@ -73,7 +71,7 @@ regex-speed \
 ![Example 2 results](./docs/examples/img/example-test-2.png)

 ```bash
-regex-speed \
+regex-benchmark \
 --regex '.?(=).*' \
 --max-length 100000 \
 --step-size 10 \
@@ -88,7 +86,7 @@ Although all three expressions are valid and we can retrieve the "=" from the re

 The first expression is the most efficient. Although we do have a couple of outliers, the performance is very consistent, giving us a good indication that the result is constant time.

-The second expression is the least efficient. We can see a linear growth with a increase of spread as the input size increases. **This might look like a time complexity of O(n), however this isn't actually true. The time complexity is actually O(n^3) in terms of how many steps actually need to be executed. This graph instead represents the actual time to perform the regex search**.
+The second expression is the least efficient. We can see linear growth with an increase in spread as the input size increases. **This might look like a time complexity of O(n), however this isn't actually true. The time complexity is actually O(n^3) in terms of how many steps actually need to be executed. This graph instead represents the actual time to perform the regex search**.

 The final expression is the most interesting. We can see that it is slower than the first expression, but faster than the second. The biggest difference is that the spread is absolutely massive!

@@ -97,33 +95,44 @@ The final expression is the most interesting. We can see that it is slower than

 ### Linux

-1. Download the latest `regex-speed` binary from [releases page](https://github.com/Salaah01/regex-benchmark/releases).
+1. Download the latest `regex-benchmark` binary from the [releases page](https://github.com/Salaah01/regex-benchmark/releases).
 2. Ubuntu users will need to install additional dependencies:
 ```bash
 sudo apt install pkg-config libfreetype6-dev libfontconfig1-dev
 ```
 3. Run the binary in the terminal with the `--help` flag to see the available options:
 ```bash
-./regex-speed --help
+./regex-benchmark --help
 ```

 ### Windows

-1. Download the latest `regex-speed.exe` binary from [releases page](https://github.com/Salaah01/regex-benchmark/releases).
+1. Download the latest `regex-benchmark.exe` binary from the [releases page](https://github.com/Salaah01/regex-benchmark/releases).
 2. Run the binary in PowerShell/CMD with the `--help` flag to see the available options:
 ```powershell
-.\regex-speed.exe --help
+.\regex-benchmark.exe --help
 ```

 ## Usage

 Once you have installed the binary, you can run it with the `--help` flag to see the available options:

 ```bash
-./regex-speed --help
+./regex-benchmark --help
 ```

+For a quick start, you can try running one of the regex expressions from the [examples](#examples) above:
+
+```bash
+regex-benchmark \
+--regex '.?(=).*' \
+--max-length 100000 \
+--step-size 10 \
+--num-tests 5 \
+--required-str x=xxxxxxxxxxxxxx \
+--method find
+```

 ## Contributing

-If you would like to contribute to the project by either reporting a bug, requesting a feature or submitting a pull request, please read the [contributing guide](./CONTRIBUTING.md).
+Contributions are welcome! Whether it's reporting bugs, suggesting features, or submitting code changes, please refer to our [contributing guide](./CONTRIBUTING.md). We appreciate contributions from developers of all skill levels.
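
The README's Examples section describes the benchmark through its options (`--max-length`, `--step-size`, `--num-tests`, `--required-str`, `--method find`). As a rough illustration of what those options imply, here is a minimal sketch of such a measurement loop. It is not the project's actual implementation: the function names `build_input` and `benchmark` are hypothetical, the input uses a fixed filler character rather than random text, and the search is passed in as a closure so the sketch needs only the standard library.

```rust
use std::time::{Duration, Instant};

/// Build an input of `len` bytes (or `required.len()` if that is larger):
/// filler characters with `required` placed in the middle.
/// Assumption: the real tool generates random strings; a fixed filler
/// keeps this sketch dependency-free.
fn build_input(len: usize, required: &str) -> String {
    let len = len.max(required.len());
    let left = (len - required.len()) / 2;
    let right = len - required.len() - left;
    format!("{}{}{}", "a".repeat(left), required, "a".repeat(right))
}

/// Time `num_tests` searches at every input length from `step_size` up to
/// `max_length`, returning one (input length, elapsed time) sample per run.
fn benchmark<F>(
    max_length: usize,
    step_size: usize,
    num_tests: usize,
    required: &str,
    search: F,
) -> Vec<(usize, Duration)>
where
    F: Fn(&str) -> Option<usize>,
{
    assert!(step_size > 0, "step size must be positive");
    let mut samples = Vec::new();
    for len in (step_size..=max_length).step_by(step_size) {
        let input = build_input(len, required);
        for _ in 0..num_tests {
            // Only the search itself is timed, not the input generation.
            let start = Instant::now();
            let found = search(&input);
            let elapsed = start.elapsed();
            assert!(found.is_some(), "the required string should be found");
            samples.push((input.len(), elapsed));
        }
    }
    samples
}
```

Calling `benchmark(100_000, 10, 5, "x=xxxxxxxxxxxxxx", |s| s.find('='))` would mirror the option values used in the examples, with `str::find` standing in for a regex search. To time a real pattern, the closure could instead wrap a regex engine, for instance the `regex` crate's `Regex::find` mapped to the match's start offset. Each returned sample corresponds to one point on a plot of elapsed time against input length.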

docs/examples/img/example-test-1.png

1.04 KB

docs/examples/img/example-test-2.png

845 Bytes

docs/examples/img/example-test-3.png

-73.7 KB

src/graph.rs

Lines changed: 1 addition & 1 deletion
@@ -26,7 +26,7 @@ pub fn create(
 .x_label_area_size(40)
 .y_label_area_size(40)
 .margin(20)
-.caption("Regex Speed", ("sans-serif", 40).into_font())
+.caption("Regex Benchmark", ("sans-serif", 40).into_font())
 .build_cartesian_2d(0..max_x, min_y..max_y)?;
 graph
 .configure_mesh()
