SqlClient 6.0.1 DataReader hangs trying to read 1 row with a 150Mb value in an nvarchar(max) column - consumes a huge amount of memory and does not react to the cancellationToken #3274
Comments
I have tried doing the same read operation synchronously, but the memory consumption of this operation is also inadequate: a 159Mb UTF-8 string loaded from the DB is converted to a 318Mb UTF-16 .NET string. That is not good for the app, but this behavior is predictable and logical.
Hope the current usage of ArrayPool is also improved by the mentioned PR. PS: all of this looks like good material for @stephentoub's articles about performance improvements.
Performance and memory usage change slightly; see the benchmarks in #3161 (comment) for some information.
Storing large strings isn't a good use case for a relational database. If you want to do that there are many Document database options. The async performance was always poor because it used a replay mechanism. Why? I don't know.
It's really really hard to fix without breaking anyone or changing behaviour. Look through #2690, #2714 and #3161 for how much work it took.
This is the recommended and supported driver for connecting to Sql Server from .net apps. It is open source. From the time that it became available for contributors to improve as part of netcore people have been improving it. It's a complex library with few contributors and very high compatibility requirements.
If you want to read large amounts of text data from the database there are better ways to do it than reading it into a string. Consider using the streaming support instead: https://learn.microsoft.com/en-us/dotnet/framework/data/adonet/sqlclient-streaming-support
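For illustration, a minimal sketch of that streaming approach could look like the following (the connection string, table and column names here are hypothetical):

```csharp
// A minimal sketch of the streaming approach from the linked article
// (hypothetical connection string, table and column names).
using System;
using System.Data;
using Microsoft.Data.SqlClient;

const string connectionString = "...";   // assumption: your connection string

await using var connection = new SqlConnection(connectionString);
await connection.OpenAsync();

await using var command = new SqlCommand("SELECT BigJson FROM dbo.Repro WHERE Id = 1", connection);

// SequentialAccess streams column data instead of buffering the whole row.
await using var reader = await command.ExecuteReaderAsync(CommandBehavior.SequentialAccess);

if (await reader.ReadAsync())
{
    using var textReader = reader.GetTextReader(0);
    var buffer = new char[81920];
    int read;
    while ((read = await textReader.ReadAsync(buffer, 0, buffer.Length)) > 0)
    {
        // Process each chunk incrementally (e.g. feed it to a streaming JSON parser)
        // instead of holding the whole value in memory at once.
    }
}
```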
This is a pathological case, and the longer the string the worse it gets, because the async multi-packet string behaviour is factorial. Most people will not see anywhere near that performance problem because storing 100Mib strings in a database is quite rare. Combine that with it being fairly easy to work around, and it may not have been reported by many people, and the people that do report it don't sustain pressure.
Can you please check whether the cancellationToken is checked from time to time while the reader does its work? It is hard for me to imagine how multi-packet processing would have to be implemented to have factorial complexity. The complexity of such an algorithm should be O(data size), with a memory overhead of 1..2*[data size] if the last step can do the conversion without an intermediate buffer, or 1..4*[data size] if all intermediate buffers have to be copied into some additional final buffer. Where am I wrong?
That looks like how it should work, but that isn't how it currently works in the released code. In the next release, with #3161, it will continue instead of replaying from the beginning each time it receives a packet. It's known that it wasn't working well, and that is why it has been changed. For all the details, read the PRs I've linked in this thread and the code changes they make.
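To make the replay-versus-continue distinction concrete, here is a rough model of the two behaviours (not SqlClient code, just an illustration of the cost difference):

```csharp
// A rough illustration (not actual SqlClient code) of the difference between
// replaying from the beginning on every packet and continuing from the last
// position. With P packets, the replay variant does O(P^2) work while the
// continuation variant does O(P).
using System.Collections.Generic;

static class MultiPacketReadModel
{
    public static long ReplayFromStart(IReadOnlyList<char[]> packets)
    {
        long charsProcessed = 0;
        for (int arrived = 1; arrived <= packets.Count; arrived++)
        {
            // Every new packet triggers re-processing of everything received so far.
            for (int i = 0; i < arrived; i++)
                charsProcessed += packets[i].Length;
        }
        return charsProcessed;
    }

    public static long ContinueFromLastPosition(IReadOnlyList<char[]> packets)
    {
        long charsProcessed = 0;
        foreach (var packet in packets)   // each packet is processed exactly once
            charsProcessed += packet.Length;
        return charsProcessed;
    }
}
```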
Yes, the next preview build should address the slowness when retrieving large data. However, the memory usage will vary based on how large a string you want loaded. Feel free to continue tracking this issue with #593.
I have tried the same 150M data with 6.1-preview1. Same result in sync mode.
Ok. Can you provide a reproduction of your exact use case please? I'd consider the new test being added here https://github.com/dotnet/SqlClient/pull/3337/files#diff-f4c04283a867c8e709055a97e591759044146af3aa616101f1cbb53eca382f39 a reasonable template.
The following is enough to reproduce both the long execution and the ignored cancellationToken:
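(The original snippet is not preserved here; below is a minimal sketch of that kind of reproduction, with a hypothetical connection string and table name.)

```csharp
// A minimal sketch of that kind of reproduction (hypothetical connection string
// and table name; not the reporter's original code): read a single row whose
// nvarchar(max) column holds ~150Mb of text and try to cancel it via a token.
using System;
using System.Threading;
using Microsoft.Data.SqlClient;

const string connectionString = "...";   // assumption: points at the local SQL Server

using var cts = new CancellationTokenSource(TimeSpan.FromSeconds(10));   // cancel after 10s

await using var connection = new SqlConnection(connectionString);
await connection.OpenAsync(cts.Token);

await using var command = new SqlCommand("SELECT BigJson FROM dbo.Repro WHERE Id = 1", connection);
await using var reader = await command.ExecuteReaderAsync(cts.Token);

while (await reader.ReadAsync(cts.Token))   // reported to keep running long after cancellation
{
    string value = reader.GetString(0);     // materializes the whole ~150Mb value as a string
    Console.WriteLine(value.Length);
}
```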
I took your reproduction above and made a class which can be used with BenchmarkDotNet; that requires small changes but the work done remains the same. I then benchmarked it for 5 iterations using a local SQL Server instance.

CPU profiling: I've highlighted the Dispatch node because that's the one that contains the work for this benchmark. If you drill down several layers and filter to just that working time, you can see that of all the time spent, the majority is taken by copying from one place in memory to another and in a framework method. From my perspective this is quite good, because the SqlClient code is taking as little time as it can to do calculations and getting the required work done as soon as possible. We're spending most of the time waiting for packets of data from the server; as soon as we get them, we take the data and then ask for more. This is what I want to see in profiles. The speed is based on how fast we can get data, so it will be affected by latency. The time taken on this profile was 1283 seconds, which is ~22 minutes. That's for 5 iterations, so each iteration took ~5:15, which does feel a bit long, but it's 5 times faster than your measured time of 21 minutes for a single iteration, so I have to assume you're using a data source with latency, like an offsite SQL database.

Next, a memory profile. I used dotTrace and found that there was around 3TB of memory traffic, which is far too much, but it was being attributed to TryReadSqlValue, which doesn't allocate, so I wasn't getting enough information. So I switched to the built-in memory profiler in VS, which has an allocation tracing mode. The traffic is largely in char[]s, and it's coming from TryReadPlpUnicodeChars, which makes sense because that's the place where the data is read from the packet. But why so much traffic? This example is using a varchar(max), which means it's being read using PLP, and that means the code may not know the final size ahead of time. The current structure of the code tries to work out if it can allocate a large enough buffer, but if it can't, it falls back to allocating a buffer of exactly the size needed for the existing data plus the new data, copies the existing data in, copies the new data in, and returns it. Then that repeats for each packet. This explains the amount of time spent copying and the memory traffic.

Now I need to investigate whether I can safely allocate the target buffer less often. The obvious solution is usually to double the existing buffer size, but in constrained environments like Docker containers doing that can easily hit out-of-memory conditions.

@benrr101 can we re-open and assign this to me please?
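As an illustration of the two growth strategies discussed above (a simplified model, not the actual TryReadPlpUnicodeChars code):

```csharp
// A simplified model of the two buffer-growth strategies described above
// (not the actual TryReadPlpUnicodeChars implementation).
using System;

static class PlpBufferGrowthModel
{
    // Exact-size growth: every packet allocates a new buffer and re-copies all
    // previously received data, so total copying is O(n^2) in the value size.
    public static char[] AppendExactSize(char[] existing, ReadOnlySpan<char> packet)
    {
        var grown = new char[existing.Length + packet.Length];
        existing.CopyTo(grown, 0);                        // re-copy everything received so far
        packet.CopyTo(grown.AsSpan(existing.Length));     // append the new data
        return grown;
    }

    // Geometric growth: the buffer doubles when needed, so each element is re-copied
    // at most O(log n) times, at the cost of temporarily over-allocating, which can
    // hurt in memory-constrained environments such as small containers.
    public static char[] AppendWithDoubling(char[] buffer, ref int used, ReadOnlySpan<char> packet)
    {
        if (used + packet.Length > buffer.Length)
        {
            var grown = new char[Math.Max(buffer.Length * 2, used + packet.Length)];
            buffer.AsSpan(0, used).CopyTo(grown);
            buffer = grown;
        }
        packet.CopyTo(buffer.AsSpan(used));
        used += packet.Length;
        return buffer;
    }
}
```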
Thank you for your efforts. The 21 min was the experiment with 150Mb of data. In my case everything is local: SQL Server 2022 Dev edition 16.0.1135.2, latest VS2022.
Good. Equal results between us both means I'm looking at and measuring the right things. I've reduced that time to ~800ms locally; it needs #3337 merged first for correctness.
Describe the bug
In the app I am developing an error happened, and as a result it stored several huge strings in an nvarchar(max) column - 150Mb of JSON.
When trying to read such a record with a DataReader, the application starts to consume a huge amount of memory and does not react to cancellation attempts via the cancellationToken.
It looks like this

The picture shows resource consumption by the app with only one operation running - reading this row.
Trying to read one row with a 150Mb column, it consumes up to 10G of memory.
Let's assume the app receives 150Mb from the DB and then converts it to a 300Mb UTF-16 string.
What does it consume the other 9.5Gb for?
The data was saved to the DB with the same client, quickly and with adequate resource consumption.
When trying to debug, the following state can be observed:

In our app, code execution stops on await _reader.ReadAsync(cancellationToken),
and the cancellationToken is activated by closing the web page.
At this time, control is in a TP Worker thread processing the read request.
There is some cancellationToken checked only at the beginning of the operation, and it seems to be a different token, as it is not activated.
Most of the time the thread can be stopped in the following methods called from SqlDataReader:
Expected behavior