
KMeansTrainer.Options NumberOfThreads has no effect? Low CPU load during .Fit() #7477

@mikeKuester

Description


System Information:

  • Windows 11 22H2
  • ML.NET Version: Microsoft.ML 4.0.2
  • .NET Version: .NET 9.0

Describe the bug
I have a large dataset with approx. 500,000 spectras with 200 data points each. I use the kMeansTrainer for the clustering. It works - but during the "training" (which takes ~75 s) ...

var model = pipeline.Fit(trainingData);

... the CPU load is less than 20% on an i7-11850H @ 2.5 GHz with 16 logical threads. If I read the ML.NET code correctly, it allows at most 16 / 2 = 8 threads. But setting NumberOfThreads in the trainer options to anything from 1 to 8 doesn't change anything.
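The 16 / 2 = 8 cap mentioned above is my reading of the ML.NET source, not a documented guarantee; a minimal sketch of that assumed computation (class and variable names are illustrative):

```csharp
using System;

class ThreadCapCheck
{
    static void Main()
    {
        // Assumption from reading the ML.NET source: the effective
        // training concurrency appears to be capped at ProcessorCount / 2.
        int logicalCores = Environment.ProcessorCount; // 16 on the machine above
        int effectiveCap = Math.Max(1, logicalCores / 2);
        Console.WriteLine($"Expected thread cap: {effectiveCap}");
    }
}
```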

var options = new KMeansTrainer.Options
{
    NumberOfClusters = numberOfClusters,
    // InitializationAlgorithm = KMeansTrainer.InitializationAlgorithm.KMeansYinyang,
    // OptimizationTolerance = 1e-7f, // default 1e-7f
    NumberOfThreads = 8,  // default 8
    // MaximumNumberOfIterations = 1000
};
var pipeline = mlContext.Clustering.Trainers.KMeans(options);

The elapsed time doesn't change, and the CPU load stays low.

If I reduce, e.g., MaximumNumberOfIterations or OptimizationTolerance, I can see an effect on the training time (and on the results).

For the normalization of the data I use Parallel.For loops, and during these loops I can clearly see that all threads ramp up to ~80% load for a short time.
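For comparison, a self-contained sketch of the kind of normalization step that does saturate the cores (the data shape matches the description above; the max-abs normalization and all names are illustrative stand-ins, not the actual app code):

```csharp
using System;
using System.Threading.Tasks;

class NormalizeDemo
{
    static void Main()
    {
        // Illustrative stand-in for the real data: 500,000 spectra x 200 points.
        var rng = new Random(0);
        float[][] spectra = new float[500_000][];
        for (int i = 0; i < spectra.Length; i++)
        {
            spectra[i] = new float[200];
            for (int j = 0; j < 200; j++)
                spectra[i][j] = (float)rng.NextDouble();
        }

        // Max-abs normalization per spectrum; the rows are independent,
        // so Parallel.For spreads the work across all available threads.
        Parallel.For(0, spectra.Length, i =>
        {
            float max = 0f;
            foreach (float v in spectra[i])
                max = Math.Max(max, Math.Abs(v));
            if (max > 0f)
                for (int j = 0; j < spectra[i].Length; j++)
                    spectra[i][j] /= max;
        });
    }
}
```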

If I dig into the code: in KMeansLloydsYinYangTrain.Train, the chunks should be processed in parallel in every iteration, right? With the maximum of 8 threads selected, I would expect 8 threads running at full load.

Expected behavior
I expect the KMeans training to use the maximum allowed CPU power in order to reduce the training time.

Additional context
We compared our C# app with MATLAB's kmeans on the same data: ML.NET is faster than the MATLAB implementation, but if the CPU is 80% idle during training, it could be even faster. ;)

Thanks,
Mike
