Mekhanik evgenii/1411 7 #2403

EvgeniiMekhanik · 2025-04-13T21:33:05Z

No description provided.

const-t

There is no crucial problems, but rejected to have a discussion

fw/hpack.c

const-t · 2025-05-30T13:00:05Z

fw/hpack.c

+	 * decoded header (it->to_alloc != 0) we should expand
+	 * header.
+	 */
+	if (!np && !it->to_alloc) {


The only one thing is not clear for me. Why do we do tfw_hpack_exp_hdr() when it->pos points to the buffer, which was allocated to store previously decoded header? From my point of view, in this case we can assume that previously allocated buffer(rspace indicates how many space in this buffer) being allocated at the same page as new buffer(if not np of course). What I mean:

We allocate buffer.

Process header.

Go to to the next header.

Use remaining rspace, it's not zero, but not enough to process current header completely.

rspace exhausted. We do tfw_pool_alloc_not_align_np() if !np it allocates data on the same page where previous rspace being allocated before.

Adjust the length instead of creating new chunk.

Or there is the case when new buffer will be allocated on the different page? But if so in this case we can't safely use rspace as well.

Another assumption, we do allocate the buffer from the pool, that allocated per message. It means, we allocate whole page for each message(except few bytes for pool itself), however we trying to predict how many memory pre-allocate and do "small" pre-allocations(actually dealing with pointer arithmetic) per header name/value in this pre-allocated page using BUFFER_GET. It seems instead of that we can pre-allocate whole page(as we do now) and use it continuously without small allocations of small buffers, just write header continuously in the page and do new allocations if page is exhausted. Isn't?

If we allocate a whole page we loose a lot of memory for a small headers i think

If we allocate a whole page we loose a lot of memory for a small headers i think

We always doing it allocating pool

This pool is used in several places. So for your assumption it seems we should create another pool.

According to our discussion we decide to use one strategy for all places where it->pool is used. and use rspace if it is available

const-t · 2025-06-05T13:41:46Z

fw/hpack.c

+	T_DBG3("%s: it->rspace=%lu, sym=%c\n",
+	       __func__, it->rspace, sym);


Suggested change

T_DBG3("%s: it->rspace=%lu, sym=%c\n",

__func__, it->rspace, sym);

T_DBG3("%s: it->rspace=%lu, sym=%c\n", __func__, it->rspace, sym);

EvgeniiMekhanik · 2025-06-05T18:10:40Z

fw/hpack.c

@@ -1275,10 +1348,17 @@ tfw_hpack_hdr_set(TfwHPack *__restrict hp, TfwHttpReq *__restrict req,

 	d = d_hdr->chunks;
 	TFW_STR_FOR_EACH_CHUNK(s, s_hdr, end) {
+		if (unlikely(it->rspace < s->len)) {
+			r = tfw_hpack_buffer_get(it, it->to_alloc, true);


Here we have a discussion with @const-t and can't find a solution. Previously we allocate memory only once data = tfw_pool_alloc_not_align(it->pool, s_hdr->len) but this could lead to memory loss if there is no enough memory in pool and we allocate new pages. Now I add extra if in the loop and don't loose memory. Whats better? We can call tfw_hpack_buffer_get only once with true flag at the beginning of the function but loose memory as it was before. @krizhanovsky what do you think?

There shouldn't be a lot of chunks, so the loop should be short. I think we're fine, at least none of us is able to say what is better with confedence and it doesn't make sense to microbenchmark this place

krizhanovsky

LGTM, just please make memory optimization of TfwMsgParseIter and a couple of minor cleanups

krizhanovsky · 2025-06-11T14:43:38Z

fw/hpack.c

 	}
 	return T_COMPRESSION;
+
+#undef ADJUST_EXTRA_RSPACE


The macro was removed

krizhanovsky · 2025-06-11T15:25:03Z

fw/hpack.c

+ *   `len` bytes;
+ * - it->rspace is not equal to zero. This means that we
+ *   already have buffer with rspace bytes available.
+ *   If this buffer is rather big to contains the whole


Suggested change

* If this buffer is rather big to contains the whole

* If this buffer is rather big to contain the whole

krizhanovsky · 2025-06-11T15:28:27Z

fw/hpack.c

+ * - it->rspace is not equal to zero. This means that we
+ *   already have buffer with rspace bytes available.
+ *   If this buffer is rather big to contains the whole
+ *   header or it is greater then sizeof(TfwStr) we use


Suggested change

* header or it is greater then sizeof(TfwStr) we use

* header or it is greater than sizeof(TfwStr) we use

krizhanovsky · 2025-06-11T16:37:51Z

fw/msg.h

 	unsigned long	nm_len;
+	unsigned long	to_alloc;


Why do we need 8 bytes for hdrs_len, nm_len, to_alloc, rspace - is it OK if the values ever exceed 4G? If we extend the structure, let's make some work in optimizing it a little bit - ideally if it becomes smaller than it was

krizhanovsky · 2025-06-11T20:34:09Z

fw/hpack.c

+{
+	unsigned int order = get_order(sizeof(TfwPoolChunk) + len);
+
+	return (PAGE_SIZE << order) - sizeof(TfwPoolChunk) - 1;


what -1 stands for?

krizhanovsky · 2025-06-11T20:56:17Z

fw/hpack.c

@@ -1275,10 +1348,17 @@ tfw_hpack_hdr_set(TfwHPack *__restrict hp, TfwHttpReq *__restrict req,

 	d = d_hdr->chunks;
 	TFW_STR_FOR_EACH_CHUNK(s, s_hdr, end) {
+		if (unlikely(it->rspace < s->len)) {
+			r = tfw_hpack_buffer_get(it, it->to_alloc, true);


There shouldn't be a lot of chunks, so the loop should be short. I think we're fine, at least none of us is able to say what is better with confedence and it doesn't make sense to microbenchmark this place

According information from ChatGPT huffman encoding gives compression benefit about 20 - 60 %. According the article from the task #1411 huffman gives benefit about 20 %. We decide to allocate extra 50 % space during huffan decoding to prevent extra allocations (make only one allocation).

- Do not allocate new memory during hpack decoding if some memory is already available. (It can be available because we allocate 150 % memory for each header). - Move BUFFER_GET macros to hpack.c - Some codestyle fixes according checkpatch.pl

There are two cases when we allocate space for header: - it->rspace is equal to zero. In this case we allocate `len` bytes; - it->rspace is not equal to zero. This means that we already have buffer with rspace bytes available. If this buffer is rather big to contains the whole header or it is greater then sizeof(TfwStr) we use it. (Save delta = len - it->rspace bytes. Later when `it->rspace` will be exceeded we use it to allocate new chunk). Otherwise allocate new buffer (if we have some free bytes in old buffer, but is is not enough for the new header and is less then sizeof(TfwStr)), we allocate new buffer to prevent extra header chunking. - When we restore header from hpack table we use force flag for the case when there is no enough space to restore header chunk. In this case we allocate new space in pool to prevent extra header chunking. - Since we use pool only for header during hpack deconding we always reserve the whole page during memory allocation to prevent extra work with pool. Also make codestyle fixes.

- Codestyle and comment fixes. - Decrease size of `TfwMsgParseIter` structure. (max_header_list_size - which is used to limit the size of http/http2 headers is unsigned int, so we can safely change type all variables inside `TfwMsgParseIter` to unsigned int also). - Fix `tfw_hpack_buffer_sz` and `tfw_hpack_huffman_write`.

EvgeniiMekhanik requested review from const-t and krizhanovsky and removed request for const-t April 13, 2025 21:33

krizhanovsky removed their request for review April 14, 2025 12:15

EvgeniiMekhanik force-pushed the MekhanikEvgenii/1411-7 branch 2 times, most recently from 6df4adf to 659ade7 Compare April 15, 2025 01:18

EvgeniiMekhanik mentioned this pull request Apr 15, 2025

H2 related corrections and optimizations. #1411

Open

17 tasks

EvgeniiMekhanik requested a review from krizhanovsky April 15, 2025 02:50

EvgeniiMekhanik force-pushed the MekhanikEvgenii/1411-7 branch 2 times, most recently from 908e709 to 845ba21 Compare April 17, 2025 05:57

EvgeniiMekhanik marked this pull request as draft April 17, 2025 15:32

EvgeniiMekhanik force-pushed the MekhanikEvgenii/1411-7 branch from 845ba21 to ac308cf Compare April 18, 2025 05:10

EvgeniiMekhanik marked this pull request as ready for review April 18, 2025 05:22

EvgeniiMekhanik force-pushed the MekhanikEvgenii/1411-7 branch from ac308cf to 6154350 Compare April 18, 2025 14:16

EvgeniiMekhanik marked this pull request as draft May 2, 2025 20:12

EvgeniiMekhanik force-pushed the MekhanikEvgenii/1411-7 branch 2 times, most recently from ee8b2c0 to 466c5de Compare May 2, 2025 22:04

EvgeniiMekhanik marked this pull request as ready for review May 2, 2025 22:04

const-t requested changes May 30, 2025

View reviewed changes

EvgeniiMekhanik force-pushed the MekhanikEvgenii/1411-7 branch 2 times, most recently from e9e990b to 29eada6 Compare June 2, 2025 19:12

EvgeniiMekhanik requested a review from const-t June 2, 2025 19:12

const-t approved these changes Jun 5, 2025

View reviewed changes

EvgeniiMekhanik commented Jun 5, 2025

View reviewed changes

krizhanovsky approved these changes Jun 11, 2025

View reviewed changes

EvgeniiMekhanik added 4 commits June 12, 2025 19:03

Imrove memory usage during huffman decoding

2ef546d

- Do not allocate new memory during hpack decoding if some memory is already available. (It can be available because we allocate 150 % memory for each header). - Move BUFFER_GET macros to hpack.c - Some codestyle fixes according checkpatch.pl

EvgeniiMekhanik force-pushed the MekhanikEvgenii/1411-7 branch from 29eada6 to a4647fd Compare June 12, 2025 19:57

EvgeniiMekhanik merged commit 32d357a into master Jun 13, 2025
1 check passed

EvgeniiMekhanik deleted the MekhanikEvgenii/1411-7 branch June 13, 2025 03:41

		T_DBG3("%s: it->rspace=%lu, sym=%c\n",
		__func__, it->rspace, sym);

	T_DBG3("%s: it->rspace=%lu, sym=%c\n",
	__func__, it->rspace, sym);
	T_DBG3("%s: it->rspace=%lu, sym=%c\n", __func__, it->rspace, sym);

	* If this buffer is rather big to contains the whole
	* If this buffer is rather big to contain the whole

	* header or it is greater then sizeof(TfwStr) we use
	* header or it is greater than sizeof(TfwStr) we use

Mekhanik evgenii/1411 7 #2403

Mekhanik evgenii/1411 7 #2403

Uh oh!

Conversation

EvgeniiMekhanik commented Apr 13, 2025

Uh oh!

const-t left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

const-t May 30, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

EvgeniiMekhanik Jun 5, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

krizhanovsky left a comment

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

const-t May 30, 2025 •

edited

Loading

EvgeniiMekhanik Jun 5, 2025 •

edited

Loading