Standalone HttpClient source code #96938
-
I'd like to make some changes to HttpClient, specifically around more efficient handling of headers: even the relatively recently added NonValidated view ( #53555 ) is still very allocation-heavy in scenarios where all headers need to be handled (or just saved in bulk for processing later).

I was planning to make the modification as part of the whole framework, but all my attempts to build my own .NET 8 on Windows and Linux failed, and trying to extract just the bits relevant to HttpClient also failed because it drags in an endless number of other files from the framework. Putting them into a different namespace for modification also breaks access to some internals in their own namespace.

In the past I think there was standalone HttpClient source code that could be compiled separately. Is that still the case, or is the whole thing now so tightly linked into the framework that it can't be improved on its own? So far it feels like building your own version of the framework is the only path forward, but that's less than ideal for sure, even if it worked perfectly.
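To make the header scenario concrete, this is roughly the kind of bulk handling I have in mind (a sketch only; `CaptureAllHeaders` is just an illustrative name, not an existing API):

```csharp
using System;
using System.Collections.Generic;
using System.Net.Http;
using System.Net.Http.Headers;

// Sketch only: save every response header in bulk via the NonValidated view.
// Even here, each header still surfaces as a string key plus HeaderStringValues,
// which is where the allocations pile up when all headers are needed.
static Dictionary<string, string> CaptureAllHeaders(HttpResponseMessage response)
{
    var captured = new Dictionary<string, string>(StringComparer.OrdinalIgnoreCase);

    foreach (KeyValuePair<string, HeaderStringValues> header in response.Headers.NonValidated)
        captured[header.Key] = string.Join(", ", header.Value);

    foreach (KeyValuePair<string, HeaderStringValues> header in response.Content.Headers.NonValidated)
        captured[header.Key] = string.Join(", ", header.Value);

    return captured;
}
```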
-
The repo structure is not optimized for easily extracting all the sources for a single library. We share a lot of sources between different libraries, as you have found.
The HttpClient sources were always part of a larger repo, with the build system integrated with the rest of the repo.
It depends on how much work you want to put into your build system. You should be able to extract the required sources and build your own version of HttpClient only. It is "just work".
-
We will step up and do what it takes, thank you.
-
I am forcing HTTP/1.1 - the connection should be pooled. I've set these -
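Roughly along these lines (the exact values are illustrative; the 10-second idle timeout is the relevant one):

```csharp
using System;
using System.Net.Http;

// Illustrative handler settings - a pooled connection that sits idle
// for 10 seconds should be closed and its buffers released.
var handler = new SocketsHttpHandler
{
    PooledConnectionIdleTimeout = TimeSpan.FromSeconds(10),
    PooledConnectionLifetime = TimeSpan.FromMinutes(5),
    MaxConnectionsPerServer = 10
};
var client = new HttpClient(handler);
```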
So I was expecting buffers to be released after 10 seconds, but I don't think this is happening even over 50 seconds, which should be enough for some of the connections to give up and return their buffers. Even if the connections are hanging around, why keep an empty buffer? It should be returned to the pool and re-requested when new data comes in. I understand the protocol might require some "pinging", but does that need a full 4K buffer occupied all the time? Also, ideally there should be a way for the user to indicate that a particular request's connection does NOT need to be pooled - the user may know there will be only one connection to a given site, so pooling should not be used for it.

A lot of these headers are added when the request is made -
They are even defaults, yet this creates a lot of allocations. Instead of creating Key/Value pairs they (in my opinion) should have been kept in a byte[] - even when a value is changed by the user (how often does that happen?), it's cheap CPU-wise to recompact the whole buffer with the updated value, then write the whole lot into the request without messing with strings.

GC collections for the above test run - my understanding is that they happen mostly because the high allocation count leads the GC to run a cycle with lots of survivors, which is not surprising if Key/Value pairs get stuck in objects. GC settings -
I don't want to use server GC for this; in any case it won't make those allocations disappear.
-
I've read your link, thank you, very interesting, but how do I use those zero-byte reads? I am getting the stream like this -
I read the Stream until the bytes returned are 0. Should I explicitly ask to read 0 bytes into the buffer after no more bytes are available, and would that release the internal buffers?
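If I understand the technique correctly, the loop would look something like this sketch (`response` and `cancellationToken` come from my surrounding code; `Process` is just a placeholder):

```csharp
using System;
using System.Buffers;
using System.IO;

// Sketch of the zero-byte-read pattern as I understand it: wait for data with
// an empty buffer first, then do the real read, so neither my buffer nor the
// handler's internal one has to sit full while the connection is idle.
await using Stream stream = await response.Content.ReadAsStreamAsync(cancellationToken);

byte[] buffer = ArrayPool<byte>.Shared.Rent(4096);
try
{
    while (true)
    {
        // Completes when data is available (or the stream has ended) without
        // requiring a full buffer to be held during the wait.
        await stream.ReadAsync(Memory<byte>.Empty, cancellationToken);

        int bytesRead = await stream.ReadAsync(buffer, cancellationToken);
        if (bytesRead == 0)
            break; // end of the response body

        Process(buffer.AsSpan(0, bytesRead));
    }
}
finally
{
    ArrayPool<byte>.Shared.Return(buffer);
}

// Placeholder for whatever the real code does with each chunk.
static void Process(ReadOnlySpan<byte> data) { }
```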
-
By the way, I've migrated code from HttpWebRequest with its callbacks (not tasks), and I was unpleasantly surprised to see a LOT of task allocations compared to the previous solution; it's number 2 among the top allocations in my case. IMHO the good old async callbacks would have been FAR preferable - are they still available in HttpClient? It would also have been VERY helpful to be able to provide DNS info to a request - we always resolve it first to check some things, and even though the DNS server will have it cached, there is no sense doing it a second time if the data is already available. Perhaps a callback to custom-resolve DNS would have been possible?
-
Ah great about DNS, will give it a go, thanks!
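What I'm planning to try is roughly this (a sketch, assuming SocketsHttpHandler.ConnectCallback is the hook in question; in our code the plain DNS lookup would be replaced with addresses we already resolved elsewhere):

```csharp
using System;
using System.Net;
using System.Net.Http;
using System.Net.Sockets;

var handler = new SocketsHttpHandler
{
    // Take over connection establishment so we can supply addresses we have
    // already resolved instead of letting the handler do its own DNS lookup.
    ConnectCallback = async (context, cancellationToken) =>
    {
        // Placeholder lookup: in practice we would reuse the addresses from
        // our earlier resolution instead of querying DNS again.
        IPAddress[] addresses = await Dns.GetHostAddressesAsync(
            context.DnsEndPoint.Host, cancellationToken);

        var socket = new Socket(SocketType.Stream, ProtocolType.Tcp) { NoDelay = true };
        try
        {
            await socket.ConnectAsync(addresses, context.DnsEndPoint.Port, cancellationToken);
            return new NetworkStream(socket, ownsSocket: true);
        }
        catch
        {
            socket.Dispose();
            throw;
        }
    }
};
var client = new HttpClient(handler);
```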
-
My much-smarter-than-me colleague got the .NET source code compiling, so we'll shortly have a go at modifications and report back. In the meantime I've noticed one odd thing - I've set up HttpClient to accept HTTP/1.1 or lower as follows -
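Roughly this (illustrative, from memory):

```csharp
using System.Net;
using System.Net.Http;

// Illustrative: ask for HTTP/1.1 and only allow a downgrade, never an upgrade.
var client = new HttpClient(new SocketsHttpHandler())
{
    DefaultRequestVersion = HttpVersion.Version11,
    DefaultVersionPolicy = HttpVersionPolicy.RequestVersionOrLower
};
```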
Connections, when accepted, indeed declare themselves as HTTP/1.1. However, I've noticed in the profiler a lot of small byte[] allocations coming from HPack and QPack, which I believe are used in HTTP/2 and HTTP/3. To me this seems to violate the specified policy - HttpVersionPolicy.RequestVersionOrLower - and it causes a LOT of unnecessary small allocations. Is this intentional, and how is it supposed to work? Using these settings seems to stop it from happening -
I've stepped into the HttpClient source and it seems to have used HTTP/1.1 correctly, so I have no idea why the HPack/QPack allocations were happening; they seem to have been consistent with the number of requests being made.