H2 related corrections and optimizations. #1411
The current HTTP/2 implementation and its TODO issues must be revised according to RFC 9113.
Probably we should not do this. I tested the Firefox, Chrome and Edge browsers using https://www.webpagetest.org/ and only Firefox does not use. However, this is not the end of the story. Here is a waterfall emulation for Chrome (also for our website): and here is the HTTP/2 stream dependency graph for the same run: Streams 5 (jquery.min.js) and 7 (tempesta_technologies_logo.png) (rows 3 and 4 in the waterfall graph) share bandwidth, which can be observed in both the dependency and the waterfall graphs. They should not share bandwidth. The waterfall graph also reports both streams as. Late image downloading is not crucial, but the first resources before reaching LCP are very crucial, and people (e.g. me) do spend a lot of time on LCP optimization. It could make sense to download JS code before the image to let it execute while the image is being downloaded. On the frontend side I can specify the prefetch priority as. This requires more research, and the results must be published in the wiki.
I added debug messages to the Tempesta FW source code in several places (when we create a stream, when we make frames, when we change a stream dependency according to PRIORITY frames). I ran WebPageTest and looked at how Chromium creates and prioritizes streams for our site. First of all, Chromium creates streams with ids from 1 to 35, each new stream depending on the previous one, so we have this priority tree: 1->3->5->7->9->11->13->15->17->19->21->23->25->27->29->31->33->35. Then, after sending all data for these streams, Chromium creates stream 37, which depends on 35; 39, which depends on 37; 41, which depends on 33; 43->41; 45->43, so our tree is 27->29->31->33->41->43->45->35->37->39 (we delete streams 1 to 25 since we have sent all their data). Then Chromium creates streams 47->39, 49->47, 49->51, 51->53, 53->55 and then immediately changes the stream dependencies: 49->39 and 47->0, so the priority tree will be
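The chain-and-reparent behavior described above can be modeled with a few lines of C. This is a hypothetical sketch (the types are not Tempesta FW internals): each new stream depends on the previously created one, and a later PRIORITY frame re-parents a stream, e.g. 49->39 and 47->0 as observed.

```c
#include <assert.h>
#include <stdlib.h>

/* Minimal model of an RFC 7540 dependency chain (hypothetical types). */
struct stream {
	unsigned int id;
	struct stream *parent;	/* NULL models dependency on stream 0 */
};

static struct stream *stream_new(unsigned int id, struct stream *parent)
{
	struct stream *s = malloc(sizeof(*s));
	s->id = id;
	s->parent = parent;
	return s;
}

/* Re-parent @s as a PRIORITY frame would. */
static void stream_reprioritize(struct stream *s, struct stream *new_parent)
{
	s->parent = new_parent;
}

/* Depth of a stream in the chain (children of stream 0 have depth 1). */
static int stream_depth(const struct stream *s)
{
	int d = 0;
	for (; s; s = s->parent)
		++d;
	return d;
}
```

In this model Chromium's initial tree is just a deepening chain, and a single pointer update performs the dependency change the debug messages show.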
I explored youtube.com, ebay.com, facebook.com and images.google.com, and none of them use progressive JPEG. Some sources mention that the hyperscalers use various optimization techniques, and I may be observing no progressive JPEGs due to various factors, e.g. a good connection. But the point is that we never see a case where we actually make a decision based on a resource weight (except in Firefox). So why have ebtrees? Since we have linked lists of dependencies, we make decisions based solely on dependency, not weight. Aren't simple linked lists good enough instead of ebtrees? Also, I don't think we should invent a page with tens of AJAX resources - we need something real for core optimizations. UPDATE My concern is that in
We already have a dependency tree, because each stream contains
On the call we agreed to
As a result (w/o debug printing) we have much better timing for a smaller number of streams.
In our talk we discussed that both Firefox and Chrome have moved to RFC 9218, so there is no reason for us to maintain the RFC 7540 scheme. I propose to just remove the stream prioritization and replace it with RFC 9218. This way we'll maintain a smaller code base, the code will probably be a bit faster, and we will remove the borrowed ebtree code. It's a pity to remove the well engineered code though :(
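For reference, RFC 9218 replaces the dependency tree with a Priority header field carrying an urgency (0-7, default 3) and an incremental flag, e.g. "u=3, i". Below is a hypothetical sketch of extracting these two parameters; a real parser must follow the RFC 8941 structured-fields grammar, which this deliberately does not.

```c
#include <assert.h>

/* Parsed RFC 9218 priority parameters (hypothetical type). */
struct h2_priority {
	int urgency;		/* 0 (highest) .. 7 (lowest), default 3 */
	int incremental;	/* boolean, default 0 */
};

/*
 * Naive scan for "u=<digit>" and a bare "i" token in a Priority
 * field value; illustration only, not RFC 8941 compliant.
 */
static void priority_parse(const char *val, struct h2_priority *p)
{
	p->urgency = 3;
	p->incremental = 0;
	for (; *val; ++val) {
		if (val[0] == 'u' && val[1] == '=' &&
		    val[2] >= '0' && val[2] <= '7')
			p->urgency = val[2] - '0';
		else if (val[0] == 'i' &&
			 (val[1] == '\0' || val[1] == ',' || val[1] == ' '))
			p->incremental = 1;
	}
}
```

The appeal for the server side is visible here: scheduling reduces to 8 urgency buckets plus a round-robin flag, instead of a mutable dependency tree.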
According to information from ChatGPT, Huffman encoding gives a compression benefit of about 20-60%. According to the article from task #1411, Huffman gives a benefit of about 20%. We decided to allocate extra 50% space during Huffman decoding to prevent extra allocations (i.e. make only one allocation).
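The sizing decision above amounts to one line of arithmetic: since the encoded form is roughly 20-60% smaller than the decoded form, reserving 150% of the encoded length up front covers the common case with a single allocation. A sketch with a hypothetical helper name (not the Tempesta FW function):

```c
#include <assert.h>
#include <stddef.h>

/*
 * Buffer size to reserve before HPACK Huffman decoding: 150% of the
 * encoded length, so that for typical 20-40% compression ratios one
 * allocation suffices and no mid-decode reallocation is needed.
 */
static size_t huffman_decode_alloc_size(size_t encoded_len)
{
	return encoded_len + encoded_len / 2;
}
```

Note the trade-off: inputs compressed by more than 33% can still overflow this estimate and fall back to an extra allocation; the 50% margin only makes that path rare, not impossible.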
Bugs
The header 'age' is added unconditionally here; thus it looks like it should be skipped in tfw_cache_copy_resp(); MekhanikEvgenii - will be fixed in Fix double adding 'age' header. #2398
data = tfw_pool_alloc_not_align(it->pool, sz) should be moved down - after the static index processing; MekhanikEvgenii - Fixed by Alloc memory in tfw_hpack_hdr_name_set only for dynamic indexing #2350

The header 'host' can have an empty field-value here (this possibility is mentioned in RFC 7230 section 5.4); (Outdated code. The Host header can't be empty for HTTP/2. http2_general.test_h2_headers contains test_empty_host_header, which verifies this.)

Due to the nature of the internal implementation of the encoder dynamic table in TempestaFW (for performance purposes), its maximum size is limited - it cannot be greater than HPACK_ENC_TABLE_MAX_SIZE; so for now, if the client sends a SETTINGS frame with a new maximum table size greater than HPACK_ENC_TABLE_MAX_SIZE, a table inconsistency between the endpoints may arise; to avoid this problem, a dynamic table size update with HPACK_ENC_TABLE_MAX_SIZE should be sent to the client's decoder (within the first HEADERS frame following the SETTINGS acknowledgment, see RFC 7541 section 4.2); This is SETTINGS_HEADER_TABLE_SIZE > 4096 bytes does not work as expected #1807

The condition here should be - MekhanikEvgenii - Fixed by #2351
Duplicated cookie headers should be concatenated into one single header before forwarding to an HTTP/1.1 connection (RFC 7540 section 8.1.2 p. 5 "Compressing the Cookie Header Field"); Forwarding http2 splitted cookie header for #1736

It seems like there is a bug here, since the function exits in any case and the remainder of the eolen in the next fragment is not deleted; besides, the tfw_http_msg_del_eol() function is completely unused for now, as is skb_next_data(), so it looks like these procedures can be removed; MekhanikEvgenii - This function is already removed.
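The cookie concatenation rule referenced above (RFC 7540 section 8.1.2.5) joins the split cookie field values with "; " before downgrading to HTTP/1.1. A self-contained sketch with a hypothetical helper, not the Tempesta FW implementation:

```c
#include <stdio.h>

/*
 * Concatenate @n cookie field values into @dst separated by "; ",
 * as RFC 7540 section 8.1.2.5 requires before forwarding over
 * HTTP/1.1. Returns the resulting length, or 0 on truncation.
 */
static size_t cookie_concat(char *dst, size_t dst_len,
			    const char *const *vals, size_t n)
{
	size_t off = 0;
	size_t i;

	for (i = 0; i < n; ++i) {
		int r = snprintf(dst + off, dst_len - off, "%s%s",
				 i ? "; " : "", vals[i]);
		if (r < 0 || (size_t)r >= dst_len - off)
			return 0;	/* output buffer too small */
		off += (size_t)r;
	}
	return off;
}
```

So three HTTP/2 cookie fields "a=b", "c=d", "e=f" must be forwarded as the single header value "a=b; c=d; e=f".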
(TBD) For now TempestaFW closes the connection in case of any error during message parsing, but for an HTTP/2 connection malformed messages should be treated as a stream error, not a connection-wide error, which means that the connection should not be closed - only the stream (RFC 7540 section 8.1.2 p. 6 "Malformed Requests and Responses");
MekhanikEvgenii - We decided that we can't do this, because the client adds all sent headers to its HPACK dynamic table. When we can't parse a message, we don't save its headers in the HPACK dynamic table; moreover, we discard the remaining message data after an error occurs! So we can't serve new requests from such a client.
Looks like the assignment should be moved to TDBv0.2: Cache background revalidation and eviction #515

tfw_h2_stream_id_close() calls are close to each other (e.g. in the tfw_h2_resp_adjust_fwd() procedure) and stream_id can be saved locally between the calls; but this way is not suitable for cases where the tfw_h2_stream_id_close() calls are rather far apart - an example of such a case is failed redirection response generation: if tfw_h2_stream_id_close() had been called in tfw_h2_prep_redirect(), but then an error occurred before the response was actually created/sent, tfw_h2_send_resp() will not send an error response to the client, since req->stream is already NULL; to resolve the problem in a general way, the tfw_h2_stream_id_close() function should be changed a little to assign the id to a special new internal field of req->pit (instead of returning it as now) and to check that field during subsequent processing (i.e. when an error response should be generated after a failed response forwarding or a failed redirection response generation) - before calling tfw_h2_stream_id_close(); if this field is set (non-zero), then tfw_h2_stream_id_close() has already been called and need not be called again - just use the id saved in the field; if the field is not set (zero), then tfw_h2_stream_id_close() should be called; this field need not be synchronized, since (unlike the req->stream field) it can be accessed only from the one thread in which the current request is being processed (during forwarding the request to the server, or forwarding the corresponding response to the client); this approach should completely avoid the problem of multiple calls to tfw_h2_stream_id_close() during internal response generation (including failed redirection response generation) and during server response forwarding. MekhanikEvgenii - This problem was fixed in several different PRs. First of all, this part of the code was reworked when frame making was moved to the xmit callback, and when stream processing (the send part) was moved to the xmit callback as well. Finally, the problem was fixed in da24b86
Optimizations
It seems that just one call of the tfw_http_hdr_split() procedure can be used here instead of the tfw_http_msg_srvhdr_val() call and the tfw_h2_msg_hdr_length() call below (i.e. for now @h_val and @s_val are two variables for the same instance); MekhanikEvgenii - Will be fixed by Remove unnecessary tfw_http_hdr_split #2368. It seems that we can remove tfw_http_hdr_split instead of tfw_http_msg_srvhdr_val
In the case of response forwarding (the TFW_H2_TRANS_INPLACE type of H2 encoding operations), the tfw_http_hdr_split() procedure is called twice for the same header: here and here; maybe it is worth calling this procedure only once, but earlier - at the beginning of the @tfw_hpack_encode() function, before the tfw_hpack_encoder_index() procedure call. See also Rewrite tfw_hpack_node_compare to make it clean & fast #1920 (comment) and #1920 (comment). MekhanikEvgenii - fixed in Call tfw_http_hdr_split only during hpack encoding #2367
In the case of a Huffman-encoded header name and/or value, it is worth allocating (and assigning to it->rspace) a bit more memory here; e.g., as mentioned in Exploring HTTP/2 Header Compression section 3.2, Huffman can compress by 20% on average, which means that allocating about 125% of the initial Huffman-encoded size (i.e. res_len = len + len/4) will in most cases avoid additional space allocations and TfwStr expansions in the tfw_hpack_huffman_write() procedure (note that in this case the warnings for it->rspace here and in other similar places should be removed); MekhanikEvgenii - will be fixed in Mekhanik evgenii/1411 7 #2403
If the tfw_hpack_exp_hdr() function is called here, then an intersection with the it->parsed_hdr descriptor allocations in req->pool occurs, and the subsequent parsing of the header's value will require full re-allocations in req->pool (instead of the fast path) in the tfw_pool_realloc() procedure; this problem can be avoided either via an additional pool (specially for it->hdr descriptors), or via re-initialization of the it->hdr descriptor; in the latter variant it->hdr will be reinitialized in any case here - with the new values of it->pos and length (as in the BUFFER_NAME_OPEN() macro), and the old descriptor(s) in it->hdr for the header's name will be just forgotten (as they are not needed any more, since the name is parsed by now and all necessary information is already in it->parsed_hdr); note, however, that in this variant the intersection with the it->parsed_hdr descriptor allocations in req->pool can still take place if it->hdr becomes compound during Huffman decoding of the header's value, but considering the changes proposed in p.3 above, the case of tfw_hpack_exp_hdr() being called from tfw_hpack_huffman_write() should be very rare, so in most cases it->hdr should remain plain and intersections will not occur;

For now TempestaFW does not know the static indexes for headers which are defined in the configuration file for adding, substituting or appending, but these headers are fully known at the configuration stage, so it is worth trying to determine static indexes for them; two ways look possible: either allow the specification of static indexes for the corresponding headers directly in the configuration file, or implement something like the tag search from Exploring HTTP/2 Header Compression section 3.1 (in the part devoted to headers defined in the static table) at the configuration stage. Done in Fixed framing error for responses received from cache with header modification #1831.
Since the tfw_h2_stream_id() function is called only from the request receive path, and there must not be concurrent access to req->stream until the request is passed to the server connection, the locking in tfw_h2_stream_id() can be removed; MekhanikEvgenii - there is PR Mekhanik evgenii/1411 3 #2365, which should fix this problem. In fact this point is not correct: when the cache is not sharded, we don't need to call it at all, because stream_id can't be 0 here. If the cache is sharded, there can be concurrent access to req->stream, because cache processing can be called on another CPU. But there is also no sense in checking stream_id, because the stream can be closed at any time! For example, if stream_id is not equal to zero before tfw_cache_build_resp, there is no guarantee that it will not become zero during this function call! We check that req->stream is not NULL when calling tfw_h2_stream_init_for_xmit, where we access the req->stream pointer.

Perhaps it would be wise to create a version of the tfw_h2_stream_id_close() procedure for the cases when it is called from the request receive path (e.g. from request processing in the cache - here and here), since there must not be concurrent access to req->stream, and it must be valid (non-NULL) until the request is passed to the server connection. MekhanikEvgenii - there is PR Mekhanik evgenii/1411 3 #2365, which should fix this problem. In fact this point is not correct: when the cache is not sharded, we don't need to call it at all, because stream_id can't be 0 here. If the cache is sharded, there can be concurrent access to req->stream, because cache processing can be called on another CPU. But there is also no sense in checking stream_id, because the stream can be closed at any time! For example, if stream_id is not equal to zero before tfw_cache_build_resp, there is no guarantee that it will not become zero during this function call! We check that req->stream is not NULL when calling tfw_h2_stream_init_for_xmit, where we access the req->stream pointer.