Skip to content

Fixed framing error for resposnses received from cache with header modification #1831

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 11 commits into from
Jun 7, 2023

Conversation

const-t
Copy link
Contributor

@const-t const-t commented Mar 1, 2023

tfw_hpack_cache_decode_expand: fixed http2 framing. However, function looks overcomplicated and combines http1 and http2 related logic.

tfw_cache_h2_decode_write: after decoding header part of TfwDecodeCacheIter was cleared including acc_len and
because of this caller of tfw_cache_h2_decode_write couldn't receive a total length of header. Fixed by clearing only hdr_data and h2_data.

Removed __off from TfwDecodeCacheIter, because we need to clear only two last members in this structure and it can be done via inner struct.

Removed desc from TfwDecodeCacheIter. It's not used.

@const-t const-t marked this pull request as draft March 1, 2023 12:21
@const-t const-t force-pushed the kt-1451-add-hdr-cache branch from fd80c88 to 3af6091 Compare March 1, 2023 13:22
@const-t const-t marked this pull request as ready for review March 1, 2023 15:10
@const-t const-t force-pushed the kt-1451-add-hdr-cache branch 5 times, most recently from 2634d36 to 6fca217 Compare March 1, 2023 17:10
@const-t const-t requested review from krizhanovsky and s0nx March 1, 2023 18:46
@const-t const-t force-pushed the kt-1451-add-hdr-cache branch from 6fca217 to 680e5e0 Compare March 2, 2023 08:37
Copy link
Contributor

@krizhanovsky krizhanovsky left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We should not fix the code, we must rewrite it. At some point we decided to store HTTP/2 encoded responses in the cache, i.e. HTTP/2 must be the most optimized in our code, probably sacrificing HTTP/1 performance.

However, for now we decode responses fetched from the cache if we have a headers modification logic (quite useful option - it's even used in our current static website for Strict-Transport-Security) and encode them back in tfw_hpack_encode(), i.e. storing responses in HTTP/2 format is not just senseless, but probably makes performance even worse if we'd store responses in HTTP/1 format.

This is essentially #1411 point 15. I think we don't need to introduce a reverse index for now: on the configuration phase we can even just scan the static table (probably starting from accept-charset) to determine the static indexes. The modified headers, which aren't in the static table, we still need to check the dynamic table, but only in tfw_h2_resp_adjust(), and there we can check for duplicate header.

The main point is that we must not decode and encode responses from the cache.

@const-t I'm assigning #1411 to you, at least for this point.

@const-t
Copy link
Contributor Author

const-t commented Mar 25, 2023

Mentioned in #1820 (comment)

@const-t const-t force-pushed the kt-1451-add-hdr-cache branch 3 times, most recently from 251f182 to 1c04b76 Compare May 4, 2023 14:22
Copy link
Contributor

@krizhanovsky krizhanovsky left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This version looks good. I'd only love to make several optimizations to make the code faster

@const-t const-t force-pushed the kt-1451-add-hdr-cache branch 3 times, most recently from 4fa51f3 to 4c96a03 Compare May 19, 2023 14:02
@const-t const-t requested a review from krizhanovsky May 26, 2023 15:30
@const-t const-t force-pushed the kt-1451-add-hdr-cache branch from 4c96a03 to 5499e1b Compare May 31, 2023 09:40
Copy link
Contributor

@krizhanovsky krizhanovsky left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM with a couple of cleanups.

len : 56;
name_len: 11,
name_len_sz : 2,
len : 35,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

len should be commented above

else
/* Static table starts from first index */
return mid + 1;
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No need to fix this for now, but around the code we use bunch of binary searches, which work on log(n) over arrays just like heaps, but have less friendly CPU prefetching and anticipation access pattern (see e.g. https://en.algorithmica.org/hpc/data-structures/binary-search/#eytzinger-layout). I think in similar cases we should go with heaps in the future.

fw/http_parser.c Outdated
@@ -9621,6 +9629,12 @@ tfw_h2_parse_req_hdr(unsigned char *data, unsigned long len, TfwHttpReq *req,
* previous states trying known header names.
*/
__msg_hdr_chunk_fixup(data, len);
if (parser->hdr.len > HTTP_MAX_HDR_NAME_LEN) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
if (parser->hdr.len > HTTP_MAX_HDR_NAME_LEN) {
if (unlikely(parser->hdr.len > HTTP_MAX_HDR_NAME_LEN)) {

@const-t const-t force-pushed the kt-1451-add-hdr-cache branch 4 times, most recently from f3598c7 to c40f624 Compare June 5, 2023 15:03
const-t added 11 commits June 6, 2023 15:42
first for decoding HPACK integer second for finding
static index of header by its name.
Header modification table separated logically separated to
two parts, first part for special headers, second for raw headers.
Special headers always located first, thus we can lookup specail
header in newly added lookup-table and if we have raw headers to
modify we must start to scan headers modification table at offset
where raw headers starts therefore skips special headers and don't
traversed twice.
We remove this logic, because `tfw_cache_h2_decode_write()`
not used for HTTP2 reposnses anymore. Now we use only
`tfw_cache_h2_write()` for HTTP2 responses, avoiding
unnecessary "decoding".
fast lookup static headers in TfwHdrMods.

For further optimization we can sort static headers(
as we do with special headers) and save number of it.
Thus we can skip static headers and start iterate raw
headers.
Also added ability for `tfw_cache_build_resp_hdr()`
to go to next chunk if end of previous header located
and the end of the chunk. Before only writing functions
could go to next chunk.
now we place header name without splitting
between record chunks, thus we able to
chnge header name comparision logic
to chunkless comparision and use simple
c strings comparision instead of TfwStr.

Because the header name must be stored
contiguous added limit for maximum header
name length.

Optimizations:
 -Got rid off `tfw_hpack_decode_int()`.
 Length of header name and its size(HPACK int)
 strored in `TfwCStr`.
because header name stored in cache linearly.
@const-t const-t force-pushed the kt-1451-add-hdr-cache branch from c40f624 to 4f528ae Compare June 6, 2023 12:42
@const-t const-t merged commit 1fb13f8 into master Jun 7, 2023
@const-t const-t deleted the kt-1451-add-hdr-cache branch June 7, 2023 14:13
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

h2 framing error if response is served cache and header modifications are enabled
4 participants