-
Notifications
You must be signed in to change notification settings - Fork 106
Mekhanik evgenii/1411 7 #2403
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Mekhanik evgenii/1411 7 #2403
Conversation
6df4adf
to
659ade7
Compare
908e709
to
845ba21
Compare
845ba21
to
ac308cf
Compare
ac308cf
to
6154350
Compare
ee8b2c0
to
466c5de
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There is no crucial problems, but rejected to have a discussion
fw/hpack.c
Outdated
* decoded header (it->to_alloc != 0) we should expand | ||
* header. | ||
*/ | ||
if (!np && !it->to_alloc) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The only one thing is not clear for me. Why do we do tfw_hpack_exp_hdr()
when it->pos points to the buffer, which was allocated to store previously decoded header
? From my point of view, in this case we can assume that previously allocated buffer(rspace
indicates how many space in this buffer) being allocated at the same page as new buffer(if not np
of course). What I mean:
- We allocate buffer.
- Process header.
- Go to to the next header.
- Use remaining
rspace
, it's not zero, but not enough to process current header completely. rspace
exhausted. We dotfw_pool_alloc_not_align_np()
if!np
it allocates data on the same page where previousrspace
being allocated before.- Adjust the length instead of creating new chunk.
Or there is the case when new buffer will be allocated on the different page? But if so in this case we can't safely use rspace
as well.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Another assumption, we do allocate the buffer from the pool, that allocated per message. It means, we allocate whole page for each message(except few bytes for pool itself), however we trying to predict how many memory pre-allocate and do "small" pre-allocations(actually dealing with pointer arithmetic) per header name/value in this pre-allocated page using BUFFER_GET
. It seems instead of that we can pre-allocate whole page(as we do now) and use it continuously without small allocations of small buffers, just write header continuously in the page and do new allocations if page is exhausted. Isn't?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If we allocate a whole page we loose a lot of memory for a small headers i think
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If we allocate a whole page we loose a lot of memory for a small headers i think
We always doing it allocating pool
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This pool is used in several places. So for your assumption it seems we should create another pool.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
According to our discussion we decide to use one strategy for all places where it->pool
is used. and use rspace if it is available
e9e990b
to
29eada6
Compare
fw/hpack.c
Outdated
T_DBG3("%s: it->rspace=%lu, sym=%c\n", | ||
__func__, it->rspace, sym); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
T_DBG3("%s: it->rspace=%lu, sym=%c\n", | |
__func__, it->rspace, sym); | |
T_DBG3("%s: it->rspace=%lu, sym=%c\n", __func__, it->rspace, sym); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes fixed
@@ -1275,10 +1348,17 @@ tfw_hpack_hdr_set(TfwHPack *__restrict hp, TfwHttpReq *__restrict req, | |||
|
|||
d = d_hdr->chunks; | |||
TFW_STR_FOR_EACH_CHUNK(s, s_hdr, end) { | |||
if (unlikely(it->rspace < s->len)) { | |||
r = tfw_hpack_buffer_get(it, it->to_alloc, true); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Here we have a discussion with @const-t and can't find a solution. Previously we allocate memory only once data = tfw_pool_alloc_not_align(it->pool, s_hdr->len)
but this could lead to memory loss if there is no enough memory in pool and we allocate new pages. Now I add extra if in the loop and don't loose memory. Whats better? We can call tfw_hpack_buffer_get
only once with true flag at the beginning of the function but loose memory as it was before. @krizhanovsky what do you think?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There shouldn't be a lot of chunks, so the loop should be short. I think we're fine, at least none of us is able to say what is better with confedence and it doesn't make sense to microbenchmark this place
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM, just please make memory optimization of TfwMsgParseIter
and a couple of minor cleanups
fw/hpack.c
Outdated
} | ||
return T_COMPRESSION; | ||
|
||
#undef ADJUST_EXTRA_RSPACE |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The macro was removed
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes fixed
fw/hpack.c
Outdated
* `len` bytes; | ||
* - it->rspace is not equal to zero. This means that we | ||
* already have buffer with rspace bytes available. | ||
* If this buffer is rather big to contains the whole |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
* If this buffer is rather big to contains the whole | |
* If this buffer is rather big to contain the whole |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes fixed
fw/hpack.c
Outdated
* - it->rspace is not equal to zero. This means that we | ||
* already have buffer with rspace bytes available. | ||
* If this buffer is rather big to contains the whole | ||
* header or it is greater then sizeof(TfwStr) we use |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
* header or it is greater then sizeof(TfwStr) we use | |
* header or it is greater than sizeof(TfwStr) we use |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes fixed
fw/msg.h
Outdated
unsigned long nm_len; | ||
unsigned long to_alloc; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why do we need 8 bytes for hdrs_len
, nm_len
, to_alloc
, rspace
- is it OK if the values ever exceed 4G? If we extend the structure, let's make some work in optimizing it a little bit - ideally if it becomes smaller than it was
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes fixed
fw/hpack.c
Outdated
{ | ||
unsigned int order = get_order(sizeof(TfwPoolChunk) + len); | ||
|
||
return (PAGE_SIZE << order) - sizeof(TfwPoolChunk) - 1; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
what -1 stands for?
@@ -1275,10 +1348,17 @@ tfw_hpack_hdr_set(TfwHPack *__restrict hp, TfwHttpReq *__restrict req, | |||
|
|||
d = d_hdr->chunks; | |||
TFW_STR_FOR_EACH_CHUNK(s, s_hdr, end) { | |||
if (unlikely(it->rspace < s->len)) { | |||
r = tfw_hpack_buffer_get(it, it->to_alloc, true); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There shouldn't be a lot of chunks, so the loop should be short. I think we're fine, at least none of us is able to say what is better with confedence and it doesn't make sense to microbenchmark this place
According information from ChatGPT huffman encoding gives compression benefit about 20 - 60 %. According the article from the task #1411 huffman gives benefit about 20 %. We decide to allocate extra 50 % space during huffan decoding to prevent extra allocations (make only one allocation).
- Do not allocate new memory during hpack decoding if some memory is already available. (It can be available because we allocate 150 % memory for each header). - Move BUFFER_GET macros to hpack.c - Some codestyle fixes according checkpatch.pl
There are two cases when we allocate space for header: - it->rspace is equal to zero. In this case we allocate `len` bytes; - it->rspace is not equal to zero. This means that we already have buffer with rspace bytes available. If this buffer is rather big to contains the whole header or it is greater then sizeof(TfwStr) we use it. (Save delta = len - it->rspace bytes. Later when `it->rspace` will be exceeded we use it to allocate new chunk). Otherwise allocate new buffer (if we have some free bytes in old buffer, but is is not enough for the new header and is less then sizeof(TfwStr)), we allocate new buffer to prevent extra header chunking. - When we restore header from hpack table we use force flag for the case when there is no enough space to restore header chunk. In this case we allocate new space in pool to prevent extra header chunking. - Since we use pool only for header during hpack deconding we always reserve the whole page during memory allocation to prevent extra work with pool. Also make codestyle fixes.
- Codestyle and comment fixes. - Decrease size of `TfwMsgParseIter` structure. (max_header_list_size - which is used to limit the size of http/http2 headers is unsigned int, so we can safely change type all variables inside `TfwMsgParseIter` to unsigned int also). - Fix `tfw_hpack_buffer_sz` and `tfw_hpack_huffman_write`.
29eada6
to
a4647fd
Compare
No description provided.