
SqlClient 6.0.1 DataReader hangs when reading 1 row with a 150 MB value in an nvarchar(max) column - consumes a huge amount of memory and does not react to the cancellationToken #3274


Closed
AlexanderKot opened this issue Apr 12, 2025 · 15 comments
Labels
Triage Needed 🆕 For new issues, not triaged yet.

Comments

@AlexanderKot

Describe the bug

In the app I am developing, an error resulted in several huge strings being stored in an nvarchar(max) column - 150 MB of JSON each.
When trying to read such a record with a DataReader, the application starts to consume a huge amount of memory and does not react to cancellation attempts via the cancellationToken.

It looks like this:
Image
The picture shows resource consumption by the app with only one operation running - reading this row.
Trying to read 1 row with a 150 MB column, it consumes up to 10 GB of memory.

Let's assume the app receives 150 MB from the DB and then converts it to a 300 MB UTF-16 string.
What does it consume the other 9.5 GB for?

The data was saved to the DB with the same client, quickly and with adequate resource consumption.

When trying to debug, the following state can be observed:
Image

In our app code, execution stops on await _reader.ReadAsync(cancellationToken),
and the cancellationToken is triggered by closing the web page.

At this time control is in a TP worker thread processing the read request:

Image
There, some cancellation token is checked only at the beginning of the operation, and it seems to be a different token, as it is not triggered.

Most of the time the thread can be found stopped in the following methods called from SqlDataReader:

Image

Image

Expected behavior

  1. React to the cancellation token.
  2. Perform the read operation with adequate memory consumption.
@Wraith2
Contributor

Wraith2 commented Apr 14, 2025

You're hitting #593
Speed and memory consumption should be improved for large strings by #3161 which will be in the next preview release.

@cheenamalhotra cheenamalhotra added the Triage Needed 🆕 For new issues, not triaged yet. label Apr 16, 2025
@AlexanderKot
Author

I have tried doing the same read operation synchronously.
Instead of 25 minutes without success, it completes the read under the debugger in ~1-2 sec.
Image

But memory consumption by this operation is also inadequate.

Image

Image

A 159 MB UTF-8 string is loaded from the DB and converted to a 318 MB UTF-16 .NET string; that is not good for the app, but this behavior is expected and logical.
At the same time, during this read operation SqlClient allocated 1 GB of buffers from the shared ArrayPool.

Image
It looks like some code requests a char array from the pool to build this huge string without knowing the final string size, then requests a buffer of double the size, and so on up to 536 MB. I do not know how long arrays can live in the pool, but when the app was left overnight, the array pool objects were only released the next day.
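
For illustration, a minimal sketch (my guess at the pattern, not SqlClient's actual code) of the kind of grow-by-doubling over ArrayPool<char>.Shared that would produce buffer sizes like these:

using System;
using System.Buffers;

// Illustrative only: grow-by-doubling over ArrayPool<char>.Shared. Each time the
// incoming data no longer fits, a buffer of at least twice the current size is
// rented and the existing chars are copied over, which yields the roughly
// doubling sizes (up to ~536 MB) seen in the pool.
internal static class PooledCharBuffer
{
    public static char[] Append(char[] buffer, ref int used, ReadOnlySpan<char> chunk)
    {
        if (used + chunk.Length > buffer.Length)
        {
            char[] bigger = ArrayPool<char>.Shared.Rent(
                Math.Max(buffer.Length * 2, used + chunk.Length));
            Array.Copy(buffer, bigger, used);
            ArrayPool<char>.Shared.Return(buffer);
            buffer = bigger;
        }
        chunk.CopyTo(buffer.AsSpan(used));
        used += chunk.Length;
        return buffer;
    }
}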

I hope the current usage of the ArrayPool is also improved by the mentioned PR.

PS
Some questions arise when exploring this behavior:
How can the async operation be so inefficient and dangerous in the current implementation?
How can it happen that such a problem has been known for practically 5 years and not been fixed?
This is not just some OSS project, but the main driver for MSSQL, recommended by MS.
A 159 MB string is an error for our app, but not something unreal for a DB usage scenario, and neither is the use of async when querying the DB.

All this looks like good material for @stephentoub's articles about performance improvements.

@Wraith2
Contributor

Wraith2 commented Apr 16, 2025

Performance and memory usage change slightly; see the benchmarks in #3161 (comment) for some information.
The memory usage is never going to be a lot better while reading into a string. The best I can ever do is 2x the string length + some overhead. It is currently more like 3x + overhead.
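(If "string length" here means the 318 MB UTF-16 representation from above, 2x is roughly 640 MB and 3x is roughly 950 MB plus overhead, which is in the same ballpark as the ~1 GB of pooled buffers observed in the sync run.)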

How can the async operation be so inefficient and dangerous in the current implementation?

Storing large strings isn't a good use case for a relational database. If you want to do that, there are many document database options. The async performance was always poor because it used a replay mechanism. Why? I don't know.

How can it happen that such a problem has been known for practically 5 years and not been fixed?

It's really really hard to fix without breaking anyone or changing behaviour. Look through #2690, #2714 and #3161 for how much work it took.

This is not just some OSS project, but the main driver for MSSQL, recommended by MS.

This is the recommended and supported driver for connecting to SQL Server from .NET apps. It is open source. From the time it became available for contributors to improve as part of .NET Core, people have been improving it. It's a complex library with few contributors and very high compatibility requirements.

A 159 MB string is an error for our app, but not something unreal for a DB usage scenario, and neither is the use of async when querying the DB.

If you want to read large amounts of text data from the database, there are better ways to do it than reading into a string. Consider using streaming support instead: https://learn.microsoft.com/en-us/dotnet/framework/data/adonet/sqlclient-streaming-support
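
A minimal sketch of that streaming approach (connection string, table and column names are illustrative; error handling omitted):

using System;
using System.Data;
using System.IO;
using System.Threading;
using System.Threading.Tasks;
using Microsoft.Data.SqlClient;

internal static class StreamingReadExample
{
    // Reads a large NVARCHAR(MAX) column in chunks instead of materializing one huge string.
    public static async Task ReadLargeTextAsync(string connectionString, CancellationToken cancellationToken)
    {
        using var connection = new SqlConnection(connectionString);
        await connection.OpenAsync(cancellationToken);

        using var command = new SqlCommand("SELECT Data FROM [TT] WHERE Id = 1", connection);

        // SequentialAccess lets the reader stream column data instead of buffering the whole row.
        using var reader = await command.ExecuteReaderAsync(CommandBehavior.SequentialAccess, cancellationToken);
        if (await reader.ReadAsync(cancellationToken))
        {
            using TextReader text = reader.GetTextReader(0);
            var buffer = new char[81920];
            int read;
            while ((read = await text.ReadAsync(buffer, 0, buffer.Length)) > 0)
            {
                cancellationToken.ThrowIfCancellationRequested();
                // Process the chunk here (write it to a file, a Stream, a parser, etc.).
            }
        }
    }
}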

@AlexanderKot
Author

AlexanderKot commented Apr 16, 2025

I have rebuilt the app with System.Data.SqlClient just as an experiment.

At least Microsoft.Data.SqlClient has better behavior in async mode than the old client :)
Both hang and consume resources, but the new one does it a little bit more slowly.

This is the behavior in async mode - I did not manage to see the result of the reader, and there was no reaction to the cancellationToken.
Image
159/320 MB of problem string -> 23 GB of memory consumed trying to read it, without success.

Something interesting can be observed in the parallel stacks during this read:
Image

Behavior in sync mode:
Image
At least there the GC releases the memory as soon as the huge string goes out of scope.

Storing large strings isn't a good use case for a relational database

I know and absolutely agree. For our app this situation happened because of an already-fixed error.
But if something like this happens, the app starts consuming resources, and even though we have some protection through cancellation, nothing can be done: cancellation is not respected by SqlClient.

This is the recommended and supported driver for connecting to SQL Server from .NET apps. It is open source.

Sorry, I was not clear enough.
Yes, it is OSS. But it is not a project of some third-party enthusiasts; it is the official client developed by the DB vendor. And this client allows apps that depend on it to become unusable and to consume resources in a situation where nothing extraordinary has happened. It can read this value and convert it to a string - that is shown by the sync mode. Async must do the same. Especially since async mode was always promoted by MS as a way to make ASP.NET MVC apps more responsive and reliable by returning worker threads back to the pool to serve other parallel requests.
What we can observe here is NO profit from async, only overhead from its usage plus a serious potential problem. And this situation, as I understand it, has existed for many years, has been known, and nothing was done.

In any case,
thank you for the current fix; I am waiting to try it with this problem scenario.

@Wraith2
Contributor

Wraith2 commented Apr 16, 2025

This is a pathological case, and the longer the string the worse it gets, because the async multi-packet string behaviour is factorial. Most people will not see anywhere near that performance problem, because storing 100 MiB strings in a database is quite rare. Combine that with it being fairly easy to work around, and it may not have been reported by many people, and the people who do report it don't sustain pressure.

@AlexanderKot
Author

Can you please check whether the cancellationToken is checked from time to time while the reader does its work?
For example, in the case of a multi-packet response from the DB, on receiving each packet, before processing that packet.

For me it is hard to imagine how multi-packet processing must be implemented to end up with factorial complexity.
Based on common sense, it should be something like this (see the sketch below):
a. The DB sends back packets of approximately the same size;
b. On receiving each packet, the runtime resumes some async continuation to process the packet;
c. Hopefully it is possible, based on the packet data, to understand whether it is the last packet or not;
d. If it is not the last packet, just request a buffer from the pool and make a simple copy of the partial data;
if packet sizes are stable, such buffers will also have a stable size, with some overhead because of the shared pool implementation. The reference to the buffer is placed in some list. Once the buffer is copied, await the next packet.
(There can be memory usage of up to 2 * [data size] in the worst case, when the data in each packet is 1 byte more than some shared-pool buffer size and the pool allocates a buffer of the next doubled size.)
e. On receiving the last packet, reuse the same code as used now in sync mode for converting the raw data to the final string, doing the conversion to UTF-16. Ideally this code should use some low-level mechanism for constructing the final string without additional intermediate buffers, or at least something based on String.Create().

The complexity of such an algorithm would be O(data size), and the memory overhead 1..2 * [data size] if the last step can do the conversion without an intermediate buffer, or 1..4 * [data size] if all the intermediate buffers have to be copied into some additional final buffer.
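
A rough sketch of steps d and e (purely illustrative; PacketAccumulator and its methods are made-up names, not SqlClient code):

using System;
using System.Buffers;
using System.Collections.Generic;

// Each packet's chars go into their own pooled buffer; the final string is built
// once with string.Create, so each packet's data is copied only a constant number
// of times, regardless of how many packets follow it.
internal sealed class PacketAccumulator
{
    private readonly List<(char[] Buffer, int Length)> _chunks = new();
    private int _totalLength;

    public void Add(ReadOnlySpan<char> packetChars)
    {
        char[] buffer = ArrayPool<char>.Shared.Rent(packetChars.Length);
        packetChars.CopyTo(buffer);
        _chunks.Add((buffer, packetChars.Length));
        _totalLength += packetChars.Length;
    }

    public string BuildAndRelease()
    {
        string result = string.Create(_totalLength, _chunks, static (destination, chunks) =>
        {
            foreach (var (buffer, length) in chunks)
            {
                buffer.AsSpan(0, length).CopyTo(destination);
                destination = destination.Slice(length);
            }
        });

        foreach (var (buffer, _) in _chunks)
        {
            ArrayPool<char>.Shared.Return(buffer);
        }
        _chunks.Clear();
        _totalLength = 0;
        return result;
    }
}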

Where am I wrong?

@Wraith2
Contributor

Wraith2 commented Apr 17, 2025

That looks like how it should work, but that isn't how it currently works in released code. In the next release, with #3161, it will continue instead of replaying from the beginning each time it receives a packet.

It's known that it wasn't working well, and that is why it has been changed. For all the details, read the PRs I've linked in this thread and the code changes they make.

@benrr101
Contributor

Yes, the next preview build should address the issue with slowness when retrieving large data. However, the memory usage will vary based on how large of a string you want loaded.

Feel free to continue tracking this issue with #593

@AlexanderKot
Author

I have tried the same 150 MB data with 6.1-preview1.
No significant change detected.
The process hangs, eating memory, and does not respect the cancellationToken.
I did not manage to wait for the async operation to complete.
The picture in Process Explorer looks somewhat different - that seems to be the main effect of #3161 in our case.
Image

The same in sync mode:

Image
~1.2 GB used

@Wraith2
Contributor

Wraith2 commented May 9, 2025

Ok. Can you provide a reproduction of your exact use-case please? I'd consider the new test being added here https://github.com/dotnet/SqlClient/pull/3337/files#diff-f4c04283a867c8e709055a97e591759044146af3aa616101f1cbb53eca382f39 a reasonable template.

@AlexanderKot
Author

The following is enough to reproduce both the long execution and the ignored cancellationToken.
With a 50 MB string it is possible to wait for the async execution to end.

Image

using System;
using System.Data;
using System.Security.Cryptography;
using System.Threading;
using System.Threading.Tasks;
using Microsoft.Data.SqlClient;

internal class Program
{
	public static void Main(string[] args)
	{
		Console.WriteLine("Big nvarchar(max) test");

		const int LEN = 1024*1024*50;

		string data = RandomNumberGenerator.GetString(
			choices: "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789",
			length: LEN
		); 

		var tableName = "TT";
		using var connection = new SqlConnection("Data Source=localhost;Initial Catalog=test;Encrypt=False;Integrated Security=SSPI;");
		connection.Open();

		using var createCommand = connection.CreateCommand();

		createCommand.CommandText
			= $"""
			   DROP TABLE IF EXISTS [{tableName}]
			   CREATE TABLE [{tableName}] (Id INT PRIMARY KEY, Data NVARCHAR(MAX));
			   INSERT INTO [{tableName}] (Id, Data) VALUES (1, @data);
			   """;
		createCommand.Parameters.Add(new SqlParameter("@data", SqlDbType.NVarChar) { Value = data });

		createCommand.ExecuteNonQuery();


		using var cmd1 = connection.CreateCommand();
		cmd1.CommandText = $"SELECT Id, Data FROM [{tableName}]";

		using var rdr1 = cmd1.ExecuteReader();
		rdr1.Read();
		var str1 = rdr1.GetString(1);
		rdr1.Close();

		var ctsTs = new CancellationTokenSource(TimeSpan.FromSeconds(10));

		var taskCheck = Task.Run(() => ReaderAsync(cmd1, ctsTs.Token));
		try
		{
			taskCheck.GetAwaiter().GetResult();
		}
		catch (TaskCanceledException e)
		{
			Console.WriteLine(e);
			throw;
		}
		catch (Exception e)
		{
			Console.WriteLine(e);
		}
	}

	private static async Task ReaderAsync(SqlCommand cmd, CancellationToken cancellationToken)
	{
		using var rdr1 = await cmd.ExecuteReaderAsync(cancellationToken).ConfigureAwait(false);

		await rdr1.ReadAsync(cancellationToken).ConfigureAwait(false);
		var str1 = rdr1.GetString(1);
	}
}

@AlexanderKot
Author

I also succeeded in waiting for the 150 MB case - 21 min.
Image

@Wraith2
Contributor

Wraith2 commented May 11, 2025

I took your reproduction above and made a class which can be used with BenchmarkDotNet; that requires small changes, but the work done remains the same. I then benchmarked it for 5 iterations using a local SQL Server instance. CPU profiling shows this information:

Image

I've highlighted the Dispatch node because that's the one that contains the work for this benchmark; if you drill down several layers you get to ContinueAsyncCall, which is where the read logic picks up. If you look at the percentages you'll see that only 10% of the time spent doing work was taken by the work that we actually want: the addition of packet data to the output.

Filtering to just that working time we get this:

Image

Here you can see that, of all the time spent, the majority is taken by copying from one place in memory to another, and in a framework method. From my perspective this is quite good, because the SqlClient code is taking as little time as it can to do calculations and getting the required work done as soon as possible.

We're spending most of the time waiting for packets of data from the server. As soon as we get them we take the data and then ask for more data. This is what I want to see in profiles. The speed is based on how fast we can get data, so it will be affected by latency. The time taken for this profile was 1283 seconds, which is ~22 minutes. That's for 5 iterations, so each iteration took ~5:15, which does feel a bit long, but it's roughly 5 times faster than your measured time of 21 minutes for a single iteration, so I have to assume you're using a data source with latency, like an offsite SQL database.

Next, a memory profile. I used dotTrace and found that there was around 3 TB of memory traffic, which is far too much, but it was being attributed to TryReadSqlValue, which doesn't allocate, so I wasn't getting enough information. So I switched to the built-in memory profiler in VS, which has an allocation tracing mode.

Image

The traffic is largely in char[]s and it's coming from TryReadPlpUnicodeChars, which makes sense because that's the place where the data is read from the packet. But why so much traffic? This example uses an nvarchar(max), which means it's being read using PLP, and that means the code may not know the final size ahead of time. The current structure of the code tries to work out if it can allocate a large enough buffer, but if it can't, it falls back to allocating a buffer of exactly the size needed for the existing data plus the new data, copies the existing data in, copies the new data in, and returns it. Then that repeats for each packet. This explains the amount of time spent in copying and the memory traffic.
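
A simplified sketch of why that per-packet exact-size regrowth is so expensive compared to doubling (illustrative only, not the actual TryReadPlpUnicodeChars code):

using System;

// With p packets of s chars each, exact-size regrowth re-copies everything read so
// far on every packet (O(p^2 * s) chars in total), while doubling re-copies existing
// data only when the capacity doubles (a small constant multiple of the final length).
internal static class GrowthCost
{
    public static (long ExactCopies, long DoublingCopies) Estimate(int packetCount, int packetChars)
    {
        long exact = 0;
        for (int i = 1; i <= packetCount; i++)
        {
            exact += (long)(i - 1) * packetChars; // re-copy all existing data into the new exact-size buffer
            exact += packetChars;                 // copy the new packet's data
        }

        long doubling = 0;
        long capacity = packetChars;
        long used = 0;
        for (int i = 1; i <= packetCount; i++)
        {
            if (used + packetChars > capacity)
            {
                capacity *= 2;
                doubling += used;                 // re-copy existing data only when the buffer doubles
            }
            used += packetChars;
            doubling += packetChars;              // copy the new packet's data
        }

        // For tens of thousands of packets the exact-size total is quadratic, which is
        // consistent with the terabytes of memory traffic observed above.
        return (exact, doubling);
    }
}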

Now I need to investigate whether I can safely allocate the target buffer less often. The obvious solution to this is usually to double the existing buffer size, but in constrained environments like Docker containers doing that can easily hit out-of-memory conditions.

@benrr101 can we re-open and assign this to me please?

@AlexanderKot
Author

AlexanderKot commented May 11, 2025

Thank you for your efforts.

The 21 min was the experiment with the 150 MB data.
~5 min for the 50 MB case.

In my case everything is local: SQL Server 2022 Dev edition 16.0.1135.2, latest VS2022.
CPU: 12th Gen Intel(R) Core(TM) i7-12700H 2.30 GHz
RAM: 32 GB

@Wraith2
Contributor

Wraith2 commented May 11, 2025

Good. Equal results between us both means I'm looking at and measuring the right things. I've reduced that time to ~800 ms locally; it needs #3337 merged first for correctness.
