Proposal for tests architecture. #96

# Updating test framework

## Requirements

There are several types of tests (some of them are only planned and not implemented yet):

- Message integrity. In these tests it's important to validate that the message received from Tempesta is equal to the expected one. This applies to both forwarded requests and responses. There should be an option to split messages sent to Tempesta into every possible set of skbs (see the sketch after this list).

- Invalid message processing. Tests to validate behaviour on invalid message parsing/processing. Tempesta may send a response to the sender or close the connection.

- Scheduler tests. It's important which server connection was used to serve the request.

- Cache tests. It's important whether a server connection was used to serve the request or not.

- Request-response chain integrity. Need to validate that every client receives its responses in the correct order.

- Fault injections. Based on other types of tests, but random errors are emulated using SystemTap or eBPF.

- Fuzzing. Generating random requests and responses; Tempesta must stay operational and serve as many requests as possible.

- Workload testing. Generate a fairly high request rate to confirm data safety under highly concurrent code execution. The more CPUs are used for Tempesta, the better. But CPU utilisation must stay below 80% to assure that no extreme conditions happen.

- Stress testing. Testing under extreme conditions, when the request rate is higher than Tempesta can serve.

- Low-level TCP testing. Some features can affect TCP behaviour, so we need to validate that Tempesta conforms to the RFCs and doesn't break kernel behaviour.

- TLS testing. Having an in-house TLS implementation gives great power and great responsibility. We must assure message integrity and encryption integrity of the TLS support. It differs from other tests, since it's important to work with the raw data lying under HTTP on the client and server sides.

- Tempesta-originated requests. Pretty specific tests, since Tempesta may generate requests on its own, e.g. background cache revalidation or health monitor requests. It differs from other tests, since it's important to validate the supporting TCP messages.

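To make the skb-splitting requirement concrete, here is a sketch of enumerating every split of a message into contiguous non-empty chunks (an illustration, not framework code; the number of splits grows as 2^(n-1) with message length, so real tests would sample a bounded subset):

```python
from itertools import combinations

def all_splits(data):
    """Yield every way to split `data` into contiguous non-empty chunks,
    emulating all possible skb boundaries for a message of this length."""
    for n_cuts in range(len(data)):
        for cuts in combinations(range(1, len(data)), n_cuts):
            bounds = (0,) + cuts + (len(data),)
            yield [data[bounds[i]:bounds[i + 1]]
                   for i in range(len(bounds) - 1)]
```
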
There are also some requirements on the testing process:

- Full, reduced and extra-stability sets of tests. The reduced set is for developers to run before making pull requests, the full set is run against pull requests, and the extra-stability set is for nightly testing to assure stability under unpredictable loads.

- Easy addition of new tests.

- Easy analysis of test case failures.

- Generation of several tests from a single test description. E.g. in a fuzzing test it's required to know how Tempesta would react to a message and what should be treated as an error. The 'splitting into different skbs' is actually a fuzzing test feature, not a functional one. Usually it also makes sense to generate a workload test based on a functional profile.


## Problems of current implementation

The main problem is having two frameworks at the same time. The old one is feature rich, but too sophisticated: one has to understand 5-6 files to understand what a test actually does. The new one is more friendly, but incomplete and lacking the most important features. The transition from the old framework to the new one wasn't completed, and the development of both was rather unorganized, thus the new framework is becoming more and more complex and losing its advantages.

In both frameworks every test type is implemented as a unique test, so it's required to write message integrity, fuzzing, workload and many other test cases for the same situation.

> **Review comment:** I still have no idea how to easily make the test framework split a message into different sets of skbs. I suppose the message integrity and probably the skb splitting can be done on the deproxy side. If we need the same for Nginx tests, then we can just implement the logic as a separate class/module. All in all, with https://www.youtube.com/watch?v=oO-FMAdjY68 and the requirements above in mind, I believe that the test framework should use relatively complex OO class inheritance for the helping functionality, while the tests should just turn on/off the necessary flags for message integrity calculation, sending in different skbs and other things. I.e. the helping framework must provide as declarative an API as possible, and it's not so important how complex it is internally.

Heavy dependencies between tests. When a new feature is added, it's added to the test framework core and affects a lot of tests, sometimes unpredictably and badly.

> **Review comment:** Yes, we had such cases. We need to discuss this in chat considering particular cases. There are different possibilities for what happens in this situation: either we grow the framework so that it can run many different things in the future, or we just keep case-specific logic in the framework. Let's talk about the cases in the chat; they should make the framework design problems clear.


## Proposed design

The proposed design improves the new test framework and intends to make it a full-featured test tool.

The whole test case is described in a single file; the configuration of each entity is listed as a plain text template with some parameters. No procedurally generated configs like in the old framework.

Most test cases should use the following description:
```
class SomeTest(BaseTestClass):

    backends = [
        # For all tests except stress tests:
        {
            'type'     : 'deproxy',
            'id'       : 'id1',
            'response' : gen_response_func,
        },
        # OR:
        {
            'type'   : 'wrk',
            'id'     : 'id1',
            'script' : script_name,
            # Other wrk-specific options.
        },
    ]
    clients = [
        # For all tests except stress tests:
        {
            'type'     : 'deproxy',
            'id'       : 'id1',
            'request'  : gen_request_func,
            'response' : response_cb,
        },
        # OR:
        {
            'type'   : 'wrk',
            'id'     : 'id1',
            'script' : script_name,
            # Other wrk-specific options.
        },
    ]
    tempesta = {
        'config' : '...'
    }
    test_variations = ['functional', 'workload', 'fuzzing']

    def test(self):
        < ... test procedure steps ... >
        pass
```
> **Review comment:** Update: it seems the callbacks are only needed to generate different requests and responses for a single deproxy/Tempesta run. Please review …

> **Review comment:** How the current tests in the new framework run: … How to make a test scalable: … The second case is more scalable: the test itself is tolerant to the number of concurrent connections. Maybe it's my misbelief, but I think that scalability from one connection to thousands is important. Issue #107 is very tightly connected here. It's a huge work while we have plenty of other tasks, I understand that. But I don't know more suitable tools here. Sure, some special changes will be required to make tests scalable, e.g. in cache tests each client must request unique URIs; in scheduler tests we would like to see the backend server ID in the response, which is anyway better than checking request counters across all backends.

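As an illustration, a concrete functional test under this scheme might look like the following (a hypothetical sketch: `CacheHit`, the Tempesta config snippet and the `start_all()`/`wait_clients()` helpers are assumptions, not existing framework API):

```python
# Hypothetical test built on the proposed declarative scheme.
class CacheHit(BaseTestClass):

    backends = [{
        'type'     : 'deproxy',
        'id'       : 'srv1',
        'response' : gen_response_func,
    }]
    clients = [{
        'type'     : 'deproxy',
        'id'       : 'cl1',
        'request'  : gen_request_func,
        'response' : response_cb,
    }]
    tempesta = {
        'config' : 'cache 2;\ncache_fulfill * *;\n',
    }
    test_variations = ['functional', 'workload']

    def test(self):
        self.start_all()     # assumed helper: spawn mocks, start Tempesta
        self.wait_clients()  # assumed helper: join client processes
```
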
It doesn't look very different from the current description of tests in the new framework, but it is completely different under the hood, so test procedures will change significantly.

### Client and Server Mocks

Each client and server is started in a separate process. This is required since Python has no real multithreading capabilities (see the Global Interpreter Lock). There already are multiple processes in the new framework, but the final assertions are done in the main process, making it difficult to check the used connections, clients, servers and their order.

With the proposed solution each mock client and server becomes a separate process with a configured request-response sequence, and all assertions are done inside those mock processes. With this approach, writing tests becomes closer to programming mock clients and servers than to programming test routines.

#### Mock Client

A separate process, started by the test framework. It has the following arguments (a CLI sketch follows the list):

- Script: filename with the request/response functions. Its content is described later.

- ID: client id.

- Server address: address of Tempesta, IP and port.

- Interfaces: interfaces the client sockets will be bound to (using the `SO_BINDTODEVICE` socket option).

- Concurrent connections: number of concurrent connections for each defined interface.

- Debug file: filename to write debug information to.

- Repeats: number of repeats of the request/response sequence before the connection is closed.

- RPS: maximum requests per second.

- Split: boolean value meaning that requests must be split into multiple skbs. How to implement this isn't known yet.

> **Review comment:** Please have a look at …
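
A minimal sketch of how this CLI could look (all flag names are assumptions, not a fixed interface):

```python
# Hypothetical CLI for the mock client; every flag name here is an assumption.
import argparse

def parse_args():
    p = argparse.ArgumentParser(description='Mock HTTP client')
    p.add_argument('--script', required=True,
                   help='file with gen_request_func()/response_cb()')
    p.add_argument('--id', required=True, help='client id')
    p.add_argument('--server', required=True, help='Tempesta address, ip:port')
    p.add_argument('--interface', action='append', default=[],
                   help='interface to bind client sockets to (SO_BINDTODEVICE)')
    p.add_argument('--connections', type=int, default=1,
                   help='concurrent connections per interface')
    p.add_argument('--debug-file', help='file to write debug information to')
    p.add_argument('--repeats', type=int, default=1,
                   help='repeats of the request/response sequence')
    p.add_argument('--rps', type=int, default=0,
                   help='maximum RPS, 0 means unlimited')
    p.add_argument('--split', action='store_true',
                   help='split requests into multiple skbs')
    return p.parse_args()
```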

The client has two structures to contain the required information: `client_data` for global client data and `conn_data` for per-connection data.

First, the client waits on a POSIX barrier until all mock servers are ready. Then for every connection it runs the following algorithm (a code sketch follows the list):

1. Open a new connection and bind it to an interface. Set `conn_data.request_id` to `0`.

2. Call `gen_request_func()` from the script to generate the request.

3. Copy the generated request to the send buffer. Push the request to `conn_data.request_queue` and increment `conn_data.request_id`.

4. If not `conn_data.pipeline`, go to step `5`, otherwise go to step `2`.

5. Send the current buffer to Tempesta. Start the `conn_data.req_timeout` timer.

6. Receive as much data as possible. If `conn_data.req_timeout` has expired, go to `err`.

7. Try to parse a response from the buffer. If a full response has been received, go to `8`, else go to `6`.

8. For the complete response run `response_cb()` from the script. If an error happens, go to `err`. Remove the request from `conn_data.request_queue`. If there are unreplied requests, go to `6`. Otherwise stop the `conn_data.req_timeout` timer.

9. If there are unsent requests, go to `2`.

10. Close the connection. If `conn_data.repeats` isn't exhausted, go to `1`.

11. Exit.

`err.` Build an error message, close the connection and exit.

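A condensed sketch of this per-connection loop (simplified: blocking sockets, no pipelining or RPS limiting; `try_parse_response()` and `TestError` are assumed helpers):

```python
# Illustrative sketch of the per-connection loop; pipelining, the RPS
# limiter and timer handling are simplified for brevity.
import socket

def run_connection(client_data, conn_data, script):
    sock = socket.create_connection(client_data.server_addr,
                                    timeout=conn_data.req_timeout)
    conn_data.request_id = 0
    try:
        while True:
            req = script.gen_request_func(client_data, conn_data)
            if req is None:
                break                        # nothing more to send
            conn_data.request_queue.append(req)
            conn_data.request_id += 1
            sock.sendall(req.raw)
            buf = b''
            while conn_data.request_queue:
                data = sock.recv(4096)       # socket timeout acts as req_timeout
                if not data:                 # peer closed the connection
                    script.response_cb(client_data, conn_data, None, True)
                    return
                buf += data
                resp, buf = try_parse_response(buf)  # assumed parser helper
                if resp is None:
                    continue                 # response not complete yet
                if not script.response_cb(client_data, conn_data, resp, False):
                    raise TestError('response mismatch')
                conn_data.request_queue.pop(0)
    finally:
        sock.close()
```
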
The algorithm mentions two functions: `gen_request_func()` and `response_cb()`. Let's take a closer look.

`def gen_request_func(client_data, conn_data)`. Returns a `request` if any, or `None`. A generic implementation should build the following request:
```
GET / HTTP/1.1
T-Client-Id: %client_data.client_id%
T-Conn-Id: %conn_data.conn_id%
T-Req-Id: %conn_data.request_id%

```
At the same time the function can modify the following variables:

- `conn_data.pipeline` - if set, the next request must be pipelined. `false` by default.
- `conn_data.expect_close` - the connection should be closed by the peer after the response to this request. `false` by default.
- `conn_data.expect_reset` - the connection should be closed immediately. `false` by default.
- `conn_data.close_timer` - the timer is set if the connection is to be closed.

If a response is expected for the connection, `request.expected_response` must be created.

Other variables may be added or set in `conn_data` or `client_data`.

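A minimal sketch of a generic `gen_request_func()` (the `Request` holder and the `make_expected_response()` helper are assumptions):

```python
# Minimal sketch; Request and make_expected_response() are assumed helpers,
# not existing framework API.
def gen_request_func(client_data, conn_data):
    if conn_data.request_id >= client_data.requests_per_conn:
        return None                # nothing more to send on this connection
    raw = ('GET / HTTP/1.1\r\n'
           'T-Client-Id: %s\r\n'
           'T-Conn-Id: %s\r\n'
           'T-Req-Id: %s\r\n'
           '\r\n' % (client_data.client_id, conn_data.conn_id,
                     conn_data.request_id))
    req = Request(raw.encode())
    # The client will match the real response against this expectation.
    req.expected_response = make_expected_response(client_data, conn_data)
    return req
```
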
`def response_cb(client_data, conn_data, response, closed)`. Returns a boolean success code. The client receives a response like:
```
HTTP/1.1 200 OK
T-Client-Id: %client_data.client_id%
T-Conn-Id: %conn_data.conn_id%
T-Req-Id: %conn_data.request_id%
< ... Standard Tempesta's headers ... >

```
All headers, including `T-Client-Id`, `T-Conn-Id` and `T-Req-Id`, must match `conn_data.requests[0].expected_response`. Not all headers may be matched exactly, especially `Date`, `Age` and other time headers; such headers can't have values smaller than expected. The `T-Srv-Id` and `T-Srv-Conn-Id` headers may be ignored in most tests.

> **Review comment:** Can't we do the same client/connection/request tracking in the current deproxy?

The boolean `closed` argument is set to true if the connection was closed by the peer. The function is also called if the connection was reset by the peer but no response was received. In this case `response` is set to `None`.

If the received response doesn't match the expected one, then an error message is generated:
```
Error on request-response processing!
For request:
---
<Request>
---
Response was expected:
---
<Expected response>
---
But received response was:
---
<Received response>
---
```
An error should also be generated if the connection was closed unexpectedly or wasn't closed when expected.
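
A matching sketch of `response_cb()` under the same assumptions (`headers_match()` and `log_mismatch()` are illustrative helpers):

```python
# Illustrative sketch; headers_match() is an assumed helper that treats
# Date/Age and similar time headers as lower bounds rather than exact values.
def response_cb(client_data, conn_data, response, closed):
    req = conn_data.request_queue[0]
    if response is None:
        # The connection was reset before any response arrived.
        return conn_data.expect_reset
    if not headers_match(req.expected_response, response,
                         ignore=('T-Srv-Id', 'T-Srv-Conn-Id')):
        log_mismatch(req, req.expected_response, response)  # assumed helper
        return False
    if closed and not conn_data.expect_close:
        return False               # connection closed unexpectedly
    return True
```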

If debug is enabled, any activity on the connection is written to the file `debug_filename_%connid%` with timestamps: received bytes, sent bytes, and parsed messages.

#### Mock Server

Follows the same concept as the mock client. Arguments:

- Script: filename with the request/response functions. Its content is described later.

- ID: server id.

- Interface:conns: interfaces the server sockets will listen on and the number of Tempesta's connections for each of them.

- Split: boolean value meaning that responses must be split into multiple skbs. How to implement this isn't known yet.

- Debug file: filename to write debug information to.

The server is stopped by a `TERM` signal from the framework.

The script contains a queue of expected requests, `server_data.requests[]`. If all requests have been received but a new request arrives, then the client is working in repeated mode, thus `conn_data.expected_request_num` must be reset to `0`.

The `def gen_response_func(server_data, conn_data, request)` function is used to give a response to the client. Returns a `response` or `None`. The received request is compared with `server_data.requests[T-Req-Id]`, but note that the `T-Client-Id`, `T-Conn-Id` and `T-Req-Id` values may be missing in the expected request.

The generated response **must** contain headers copied from the request: `T-Client-Id`, `T-Conn-Id`, `T-Req-Id`. `T-Srv-Id` and `T-Srv-Conn-Id` must also be generated.

If the server needs to close the connection in response to a request, it sets the `conn_data.close` variable. The variable `conn_data.expect_close` can also be set if the generated response is invalid and Tempesta will reset the connection.

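A sketch of a generic `gen_response_func()` under the same assumed helpers (`Response` and `requests_match()` are illustrations):

```python
# Illustrative sketch; Response and requests_match() are assumed helpers.
def gen_response_func(server_data, conn_data, request):
    expected = server_data.requests[int(request.headers['T-Req-Id'])]
    if not requests_match(expected, request):
        return None                       # mismatch is reported as an error
    resp = Response(status=200)
    # Echo the tracking headers so the client can verify the chain.
    for h in ('T-Client-Id', 'T-Conn-Id', 'T-Req-Id'):
        resp.headers[h] = request.headers.get(h, '')
    # Identify the serving backend and connection.
    resp.headers['T-Srv-Id'] = server_data.server_id
    resp.headers['T-Srv-Conn-Id'] = conn_data.conn_id
    return resp
```
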
### Framework

A test is now defined not as a list of actions but as HTTP flows on the client and server. This allows replacing Tempesta with other HTTP ADC software and easily comparing behavior between them.

> **Review comment:** Each (or almost each) test case must configure an ADC, and the configuration language and features are very different for different ADCs. Nginx already has its own, pretty large test suite, but we can't just use it, mostly because of differences in configurations and features.

This also allows creating multiple tests from the same description:

- Message integrity, invalid message processing, cache tests: single client, single client connection.

- Scheduler, workload, request-response chain integrity tests: multiple clients from many source interfaces, multiple concurrent connections for each client.

- Fuzzing tests: workload tests, but every generated message is split into a different number of skbs.

> **Review comment:** For me it seems that these things can be done in deproxy.

When the reduced set of tests is executed, only the first group of tests runs; the full set covers the first and second groups. All groups are executed in background stability tests.

> **Review comment:** Makes sense for the generic case, but not so much if a developer changes low-level logic without changing any features and is curious about network operation. I propose being able to set this in the config as well as for particular tests. In this way you can run, for example, a particular TLS test with message integrity checks and/or fuzzing (strictly speaking, skb splitting isn't real fuzzing); at the same time, some tests oriented on the low-level logic make sense only with one or a few options enabled.

At first sight cache tests can't be executed multiple times. But if the `URI` and/or `Host:` are mutated between different client connections and repeats, it becomes possible, so cache workload testing should be fine (see the sketch below).

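For instance, a cache workload variation could derive a unique cache key per client, connection and repeat (a hypothetical illustration; the field names are assumptions):

```python
# Hypothetical URI mutation for cache workload tests: every client,
# connection and repeat requests a unique cache key.
def cache_uri(client_data, conn_data):
    return '/cached/%s/%s/%s' % (client_data.client_id,
                                 conn_data.conn_id,
                                 conn_data.repeat_num)
```
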
Fault injection tests are very similar to message integrity tests, but a few prior actions to inject errors are required before starting the client.

#### Prepare to start

Functions defined in the `clients` and `backends` arrays of the test class are copied to a separate file to start clients and servers as distinct processes. Probably other Python interpreters can be used for that: `PyPy` is extremely fast compared to the standard interpreter, which would allow a pretty good RPS in workload tests.

The framework must create all required interfaces before the start.

A POSIX barrier is used to synchronize clients and servers: clients shouldn't start generating requests before all server connections are established (a sketch follows).

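A minimal start-up synchronization sketch with Python's `multiprocessing.Barrier` (the real framework may use a named POSIX barrier or semaphore instead; the process counts are illustrative):

```python
# Minimal sketch of start-up synchronization; a real implementation may
# use a POSIX named semaphore/barrier shared between unrelated processes.
import multiprocessing as mp

def server_proc(barrier):
    # ... open listening sockets, accept Tempesta's connections ...
    barrier.wait()      # signal readiness and wait for the other mocks

def client_proc(barrier):
    barrier.wait()      # don't send requests until all servers are ready
    # ... run the per-connection algorithm described above ...

if __name__ == '__main__':
    n_servers, n_clients = 2, 4
    barrier = mp.Barrier(n_servers + n_clients)
    procs = ([mp.Process(target=server_proc, args=(barrier,))
              for _ in range(n_servers)] +
             [mp.Process(target=client_proc, args=(barrier,))
              for _ in range(n_clients)])
    for p in procs:
        p.start()
    for p in procs:
        p.join()
```
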
## TBD

TLS tests - ?

Low-level TCP testing - ?

> **Review comment:** With the recent TLS changes affecting the TCP transmission process we have to test TCP operation (@i-rinat has questions regarding the TCP segment sizes generated by Tempesta). The question will be even harder for HTTP/3. Probably ScaPy isn't a good option for this, because it'd be too tricky to generate a TCP flow by hand. Probably iptables mangling and/or eBPF with usual TCP options like Nagle will suit best. We already have something in …

Code coverage - ?

There may be many things to discuss, feel free to raise questions.