Skip to content

Use reference counting on factories #2048

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 1 commit into from
Jan 9, 2025

Conversation

RossBrunton
Copy link
Contributor

Previously the factories used by ur_ldrddi (used when there are multiple backends) would add newly created objects to a map, but never release them. This patch adds reference counting semantics to the allocation, retention and release methods.

No tests were added; this should be covered by tests for urThingRetain.

Closes: #1784 .

@github-actions github-actions bot added loader Loader related feature/bug common Changes or additions to common utilities labels Sep 4, 2024
@RossBrunton
Copy link
Contributor Author

Currently as a draft as I want to run some performance tests to see if this affects performance at all.

Copy link
Contributor

github-actions bot commented Sep 4, 2024

Compute Benchmarks level_zero run (with params: ):
https://github.com/oneapi-src/unified-runtime/actions/runs/10705862942

Copy link
Contributor

github-actions bot commented Sep 4, 2024

Compute Benchmarks level_zero run ():
https://github.com/oneapi-src/unified-runtime/actions/runs/10705862942
Job status: success. Test status: success.

Summary

result is better

Benchmark This PR baseline
api_overhead_benchmark_sycl SubmitKernel out of order 48.64 50.631
api_overhead_benchmark_sycl SubmitKernel in order 46.76 49.385
api_overhead_benchmark_ur SubmitKernel out of order 31.176 31.93
api_overhead_benchmark_ur SubmitKernel in order 25.163 28.586
memory_benchmark_sycl QueueInOrderMemcpy from Device to Device, size 1024 415.26 423.457
memory_benchmark_sycl QueueInOrderMemcpy from Host to Device, size 1024 206.845 253.906
memory_benchmark_sycl QueueMemcpy from Device to Device, size 1024 9.95 9.179
memory_benchmark_sycl StreamMemory, placement Device, type Triad, size 10240 1.903 1.854
api_overhead_benchmark_sycl ExecImmediateCopyQueue out of order from Device to Device, size 1024 4.598 4.506
api_overhead_benchmark_sycl ExecImmediateCopyQueue in order from Device to Host, size 1024 3.575 3.613
miscellaneous_benchmark_sycl VectorSum 862.055 863.651
Velocity-Bench Hashtable 243.842459 178.291413
Velocity-Bench Bitcracker 35.7031 35.8407
Velocity-Bench CudaSift 280.397 283.294
Velocity-Bench Easywave 457 457.0
Velocity-Bench QuickSilver 116.11 115.63
Velocity-Bench Sobel Filter 798.662 934.963

Charts

api_overhead_benchmark_sycl SubmitKernel out of order
---
config:
    gantt:
        rightPadding: 10
        leftPadding: 120
        sectionFontSize: 10
        numberSectionStyles: 2
---
gantt
    title api_overhead_benchmark_sycl SubmitKernel out of order
    todayMarker off
    dateFormat  X
    axisFormat %s

    section SubmitKernel(api=sycl<br>Profiling=0<br>Ioq=0<br>DiscardEvents=0<br>NumKernels=10<br>KernelExecTime=1<br>MeasureCompletion=0)

        This PR (48.64 μs)   : crit, 0, 48

        baseline (50.631 μs)   :  0, 50

    -   : 0, 0

    -   : 0, 0

Loading
api_overhead_benchmark_sycl SubmitKernel in order
---
config:
    gantt:
        rightPadding: 10
        leftPadding: 120
        sectionFontSize: 10
        numberSectionStyles: 2
---
gantt
    title api_overhead_benchmark_sycl SubmitKernel in order
    todayMarker off
    dateFormat  X
    axisFormat %s

    section SubmitKernel(api=sycl<br>Profiling=0<br>Ioq=1<br>DiscardEvents=0<br>NumKernels=10<br>KernelExecTime=1<br>MeasureCompletion=0)

        This PR (46.76 μs)   : crit, 0, 46

        baseline (49.385 μs)   :  0, 49

    -   : 0, 0

    -   : 0, 0

Loading
api_overhead_benchmark_ur SubmitKernel out of order
---
config:
    gantt:
        rightPadding: 10
        leftPadding: 120
        sectionFontSize: 10
        numberSectionStyles: 2
---
gantt
    title api_overhead_benchmark_ur SubmitKernel out of order
    todayMarker off
    dateFormat  X
    axisFormat %s

    section SubmitKernel(api=ur<br>Profiling=0<br>Ioq=0<br>DiscardEvents=0<br>NumKernels=10<br>KernelExecTime=1<br>MeasureCompletion=0)

        This PR (31.176 μs)   : crit, 0, 31

        baseline (31.93 μs)   :  0, 31

    -   : 0, 0

    -   : 0, 0

Loading
api_overhead_benchmark_ur SubmitKernel in order
---
config:
    gantt:
        rightPadding: 10
        leftPadding: 120
        sectionFontSize: 10
        numberSectionStyles: 2
---
gantt
    title api_overhead_benchmark_ur SubmitKernel in order
    todayMarker off
    dateFormat  X
    axisFormat %s

    section SubmitKernel(api=ur<br>Profiling=0<br>Ioq=1<br>DiscardEvents=0<br>NumKernels=10<br>KernelExecTime=1<br>MeasureCompletion=0)

        This PR (25.163 μs)   : crit, 0, 25

        baseline (28.586 μs)   :  0, 28

    -   : 0, 0

    -   : 0, 0

Loading
memory_benchmark_sycl QueueInOrderMemcpy from Device to Device, size 1024
---
config:
    gantt:
        rightPadding: 10
        leftPadding: 120
        sectionFontSize: 10
        numberSectionStyles: 2
---
gantt
    title memory_benchmark_sycl QueueInOrderMemcpy from Device to Device, size 1024
    todayMarker off
    dateFormat  X
    axisFormat %s

    section QueueInOrderMemcpy(api=sycl<br>IsCopyOnly=0<br>sourcePlacement=Device<br>destinationPlacement=Device<br>size=1KB<br>count=100)

        This PR (415.26 μs)   : crit, 0, 415

        baseline (423.457 μs)   :  0, 423

    -   : 0, 0

    -   : 0, 0

Loading
memory_benchmark_sycl QueueInOrderMemcpy from Host to Device, size 1024
---
config:
    gantt:
        rightPadding: 10
        leftPadding: 120
        sectionFontSize: 10
        numberSectionStyles: 2
---
gantt
    title memory_benchmark_sycl QueueInOrderMemcpy from Host to Device, size 1024
    todayMarker off
    dateFormat  X
    axisFormat %s

    section QueueInOrderMemcpy(api=sycl<br>IsCopyOnly=0<br>sourcePlacement=Host<br>destinationPlacement=Device<br>size=1KB<br>count=100)

        This PR (206.845 μs)   : crit, 0, 206

        baseline (253.906 μs)   :  0, 253

    -   : 0, 0

    -   : 0, 0

Loading
memory_benchmark_sycl QueueMemcpy from Device to Device, size 1024
---
config:
    gantt:
        rightPadding: 10
        leftPadding: 120
        sectionFontSize: 10
        numberSectionStyles: 2
---
gantt
    title memory_benchmark_sycl QueueMemcpy from Device to Device, size 1024
    todayMarker off
    dateFormat  X
    axisFormat %s

    section QueueMemcpy(api=sycl<br>sourcePlacement=Device<br>destinationPlacement=Device<br>size=1KB)

        This PR (9.95 μs)   : crit, 0, 9

        baseline (9.179 μs)   :  0, 9

    -   : 0, 0

    -   : 0, 0

Loading
memory_benchmark_sycl StreamMemory, placement Device, type Triad, size 10240
---
config:
    gantt:
        rightPadding: 10
        leftPadding: 120
        sectionFontSize: 10
        numberSectionStyles: 2
---
gantt
    title memory_benchmark_sycl StreamMemory, placement Device, type Triad, size 10240
    todayMarker off
    dateFormat  X
    axisFormat %s

    section StreamMemory(api=sycl<br>type=Triad<br>size=10KB<br>useEvents=0<br>contents=Zeros<br>memoryPlacement=Device)

        This PR (1.903 μs)   : crit, 0, 1

        baseline (1.854 μs)   :  0, 1

    -   : 0, 0

    -   : 0, 0

Loading
api_overhead_benchmark_sycl ExecImmediateCopyQueue out of order from Device to Device, size 1024
---
config:
    gantt:
        rightPadding: 10
        leftPadding: 120
        sectionFontSize: 10
        numberSectionStyles: 2
---
gantt
    title api_overhead_benchmark_sycl ExecImmediateCopyQueue out of order from Device to Device, size 1024
    todayMarker off
    dateFormat  X
    axisFormat %s

    section ExecImmediateCopyQueue(api=sycl<br>IsCopyOnly=1<br>MeasureCompletionTime=0<br>src=Device<br>dst=Device<br>size=1KB<br>ioq=0)

        This PR (4.598 μs)   : crit, 0, 4

        baseline (4.506 μs)   :  0, 4

    -   : 0, 0

    -   : 0, 0

Loading
api_overhead_benchmark_sycl ExecImmediateCopyQueue in order from Device to Host, size 1024
---
config:
    gantt:
        rightPadding: 10
        leftPadding: 120
        sectionFontSize: 10
        numberSectionStyles: 2
---
gantt
    title api_overhead_benchmark_sycl ExecImmediateCopyQueue in order from Device to Host, size 1024
    todayMarker off
    dateFormat  X
    axisFormat %s

    section ExecImmediateCopyQueue(api=sycl<br>IsCopyOnly=1<br>MeasureCompletionTime=0<br>src=Host<br>dst=Host<br>size=1KB<br>ioq=1)

        This PR (3.575 μs)   : crit, 0, 3

        baseline (3.613 μs)   :  0, 3

    -   : 0, 0

    -   : 0, 0

Loading
miscellaneous_benchmark_sycl VectorSum
---
config:
    gantt:
        rightPadding: 10
        leftPadding: 120
        sectionFontSize: 10
        numberSectionStyles: 2
---
gantt
    title miscellaneous_benchmark_sycl VectorSum
    todayMarker off
    dateFormat  X
    axisFormat %s

    section VectorSum(api=sycl<br>numberOfElementsX=512<br>numberOfElementsY=256<br>numberOfElementsZ=256)

        This PR (862.055 μs)   : crit, 0, 862

        baseline (863.651 μs)   :  0, 863

    -   : 0, 0

    -   : 0, 0

Loading
Velocity-Bench Hashtable
---
config:
    gantt:
        rightPadding: 10
        leftPadding: 120
        sectionFontSize: 10
        numberSectionStyles: 2
---
gantt
    title Velocity-Bench Hashtable
    todayMarker off
    dateFormat  X
    axisFormat %s

    section hashtable

        This PR (243.842459 M keys/sec)   : crit, 0, 243

        baseline (178.291413 M keys/sec)   :  0, 178

    -   : 0, 0

    -   : 0, 0

Loading
Velocity-Bench Bitcracker
---
config:
    gantt:
        rightPadding: 10
        leftPadding: 120
        sectionFontSize: 10
        numberSectionStyles: 2
---
gantt
    title Velocity-Bench Bitcracker
    todayMarker off
    dateFormat  X
    axisFormat %s

    section bitcracker

        This PR (35.7031 s)   : crit, 0, 35

        baseline (35.8407 s)   :  0, 35

    -   : 0, 0

    -   : 0, 0

Loading
Velocity-Bench CudaSift
---
config:
    gantt:
        rightPadding: 10
        leftPadding: 120
        sectionFontSize: 10
        numberSectionStyles: 2
---
gantt
    title Velocity-Bench CudaSift
    todayMarker off
    dateFormat  X
    axisFormat %s

    section cudaSift

        This PR (280.397 ms)   : crit, 0, 280

        baseline (283.294 ms)   :  0, 283

    -   : 0, 0

    -   : 0, 0

Loading
Velocity-Bench Easywave
---
config:
    gantt:
        rightPadding: 10
        leftPadding: 120
        sectionFontSize: 10
        numberSectionStyles: 2
---
gantt
    title Velocity-Bench Easywave
    todayMarker off
    dateFormat  X
    axisFormat %s

    section easywave

        This PR (457 ms)   : crit, 0, 457

        baseline (457.0 ms)   :  0, 457

    -   : 0, 0

    -   : 0, 0

Loading
Velocity-Bench QuickSilver
---
config:
    gantt:
        rightPadding: 10
        leftPadding: 120
        sectionFontSize: 10
        numberSectionStyles: 2
---
gantt
    title Velocity-Bench QuickSilver
    todayMarker off
    dateFormat  X
    axisFormat %s

    section QuickSilver

        This PR (116.11 MMS/CTT)   : crit, 0, 116

        baseline (115.63 MMS/CTT)   :  0, 115

    -   : 0, 0

    -   : 0, 0

Loading
Velocity-Bench Sobel Filter
---
config:
    gantt:
        rightPadding: 10
        leftPadding: 120
        sectionFontSize: 10
        numberSectionStyles: 2
---
gantt
    title Velocity-Bench Sobel Filter
    todayMarker off
    dateFormat  X
    axisFormat %s

    section sobel_filter

        This PR (798.662 ms)   : crit, 0, 798

        baseline (934.963 ms)   :  0, 934

    -   : 0, 0

    -   : 0, 0

Loading

Details

SubmitKernel(api=sycl Profiling=0 Ioq=0 DiscardEvents=0 NumKernels=10 KernelExecTime=1 MeasureCompletion=0)

Environment Variables:

Command:

/home/test-user/bench_workdir/compute-benchmarks-build/bin/api_overhead_benchmark_sycl --test=SubmitKernel --csv --noHeaders --Ioq=0 --DiscardEvents=0 --MeasureCompletion=0 --iterations=100000 --Profiling=0 --NumKernels=10 --KernelExecTime=1

Output:

TestCase,Mean,Median,StdDev,Min,Max,Type
SubmitKernel(api=sycl Profiling=0 Ioq=0 DiscardEvents=0 NumKernels=10 KernelExecTime=1 MeasureCompletion=0),48.640,47.916,7.52%,45.954,522.877,[CPU],[us]

SubmitKernel(api=sycl Profiling=0 Ioq=1 DiscardEvents=0 NumKernels=10 KernelExecTime=1 MeasureCompletion=0)

Environment Variables:

Command:

/home/test-user/bench_workdir/compute-benchmarks-build/bin/api_overhead_benchmark_sycl --test=SubmitKernel --csv --noHeaders --Ioq=1 --DiscardEvents=0 --MeasureCompletion=0 --iterations=100000 --Profiling=0 --NumKernels=10 --KernelExecTime=1

Output:

TestCase,Mean,Median,StdDev,Min,Max,Type
SubmitKernel(api=sycl Profiling=0 Ioq=1 DiscardEvents=0 NumKernels=10 KernelExecTime=1 MeasureCompletion=0),46.760,46.319,7.53%,44.195,607.888,[CPU],[us]

SubmitKernel(api=ur Profiling=0 Ioq=0 DiscardEvents=0 NumKernels=10 KernelExecTime=1 MeasureCompletion=0)

Environment Variables:

Command:

/home/test-user/bench_workdir/compute-benchmarks-build/bin/api_overhead_benchmark_ur --test=SubmitKernel --csv --noHeaders --Ioq=0 --DiscardEvents=0 --MeasureCompletion=0 --iterations=100000 --Profiling=0 --NumKernels=10 --KernelExecTime=1

Output:

TestCase,Mean,Median,StdDev,Min,Max,Type
SubmitKernel(api=ur Profiling=0 Ioq=0 DiscardEvents=0 NumKernels=10 KernelExecTime=1 MeasureCompletion=0),31.176,30.721,8.20%,29.235,564.821,[CPU],[us]

SubmitKernel(api=ur Profiling=0 Ioq=1 DiscardEvents=0 NumKernels=10 KernelExecTime=1 MeasureCompletion=0)

Environment Variables:

Command:

/home/test-user/bench_workdir/compute-benchmarks-build/bin/api_overhead_benchmark_ur --test=SubmitKernel --csv --noHeaders --Ioq=1 --DiscardEvents=0 --MeasureCompletion=0 --iterations=100000 --Profiling=0 --NumKernels=10 --KernelExecTime=1

Output:

TestCase,Mean,Median,StdDev,Min,Max,Type
SubmitKernel(api=ur Profiling=0 Ioq=1 DiscardEvents=0 NumKernels=10 KernelExecTime=1 MeasureCompletion=0),25.163,29.258,29.71%,12.912,539.681,[CPU],[us]

QueueInOrderMemcpy(api=sycl IsCopyOnly=0 sourcePlacement=Device destinationPlacement=Device size=1KB count=100)

Environment Variables:

Command:

/home/test-user/bench_workdir/compute-benchmarks-build/bin/memory_benchmark_sycl --test=QueueInOrderMemcpy --csv --noHeaders --iterations=10000 --IsCopyOnly=0 --sourcePlacement=Device --destinationPlacement=Device --size=1024 --count=100

Output:

TestCase,Mean,Median,StdDev,Min,Max,Type
QueueInOrderMemcpy(api=sycl IsCopyOnly=0 sourcePlacement=Device destinationPlacement=Device size=1KB count=100),415.260,478.308,25.29%,224.724,1063.758,[CPU],[us]

QueueInOrderMemcpy(api=sycl IsCopyOnly=0 sourcePlacement=Host destinationPlacement=Device size=1KB count=100)

Environment Variables:

Command:

/home/test-user/bench_workdir/compute-benchmarks-build/bin/memory_benchmark_sycl --test=QueueInOrderMemcpy --csv --noHeaders --iterations=10000 --IsCopyOnly=0 --sourcePlacement=Host --destinationPlacement=Device --size=1024 --count=100

Output:

TestCase,Mean,Median,StdDev,Min,Max,Type
QueueInOrderMemcpy(api=sycl IsCopyOnly=0 sourcePlacement=Host destinationPlacement=Device size=1KB count=100),206.845,218.853,40.96%,112.767,1062.518,[CPU],[us]

QueueMemcpy(api=sycl sourcePlacement=Device destinationPlacement=Device size=1KB)

Environment Variables:

Command:

/home/test-user/bench_workdir/compute-benchmarks-build/bin/memory_benchmark_sycl --test=QueueMemcpy --csv --noHeaders --iterations=10000 --sourcePlacement=Device --destinationPlacement=Device --size=1024

Output:

TestCase,Mean,Median,StdDev,Min,Max,Type
QueueMemcpy(api=sycl sourcePlacement=Device destinationPlacement=Device size=1KB),9.950,9.815,18.57%,7.735,139.169,[CPU],[us]

StreamMemory(api=sycl type=Triad size=10KB useEvents=0 contents=Zeros memoryPlacement=Device)

Environment Variables:

Command:

/home/test-user/bench_workdir/compute-benchmarks-build/bin/memory_benchmark_sycl --test=StreamMemory --csv --noHeaders --iterations=10000 --type=Triad --size=10240 --memoryPlacement=Device --useEvents=0 --contents=Zeros

Output:

TestCase,Mean,Median,StdDev,Min,Max,Type
StreamMemory(api=sycl type=Triad size=10KB useEvents=0 contents=Zeros memoryPlacement=Device),1.903,1.916,3.57%,0.212,2.036,[CPU],[GB/s]

ExecImmediateCopyQueue(api=sycl IsCopyOnly=1 MeasureCompletionTime=0 src=Device dst=Device size=1KB ioq=0)

Environment Variables:

Command:

/home/test-user/bench_workdir/compute-benchmarks-build/bin/api_overhead_benchmark_sycl --test=ExecImmediateCopyQueue --csv --noHeaders --iterations=100000 --ioq=0 --IsCopyOnly=1 --MeasureCompletionTime=0 --src=Device --dst=Device --size=1024

Output:

TestCase,Mean,Median,StdDev,Min,Max,Type
ExecImmediateCopyQueue(api=sycl IsCopyOnly=1 MeasureCompletionTime=0 src=Device dst=Device size=1KB ioq=0),4.598,4.475,17.93%,4.028,191.319,[CPU],[us]

ExecImmediateCopyQueue(api=sycl IsCopyOnly=1 MeasureCompletionTime=0 src=Host dst=Host size=1KB ioq=1)

Environment Variables:

Command:

/home/test-user/bench_workdir/compute-benchmarks-build/bin/api_overhead_benchmark_sycl --test=ExecImmediateCopyQueue --csv --noHeaders --iterations=100000 --ioq=1 --IsCopyOnly=1 --MeasureCompletionTime=0 --src=Host --dst=Host --size=1024

Output:

TestCase,Mean,Median,StdDev,Min,Max,Type
ExecImmediateCopyQueue(api=sycl IsCopyOnly=1 MeasureCompletionTime=0 src=Host dst=Host size=1KB ioq=1),3.575,3.439,17.24%,3.249,97.405,[CPU],[us]

VectorSum(api=sycl numberOfElementsX=512 numberOfElementsY=256 numberOfElementsZ=256)

Environment Variables:

Command:

/home/test-user/bench_workdir/compute-benchmarks-build/bin/miscellaneous_benchmark_sycl --test=VectorSum --csv --noHeaders --iterations=1000 --numberOfElementsX=512 --numberOfElementsY=256 --numberOfElementsZ=256

Output:

TestCase,Mean,Median,StdDev,Min,Max,Type
VectorSum(api=sycl numberOfElementsX=512 numberOfElementsY=256 numberOfElementsZ=256),862.055,862.434,0.45%,825.380,888.310,[GPU],bw [GB/s]

hashtable

Environment Variables:

Command:

/home/test-user/bench_workdir/hashtable/hashtable_sycl --no-verify

Output:

hashtable - total time for whole calculation: 0.550428 s
243.842459 million keys/second

bitcracker

Environment Variables:

Command:

/home/test-user/bench_workdir/bitcracker/bitcracker -f /home/test-user/bench_workdir/velocity-bench-repo/bitcracker/hash_pass/img_win8_user_hash.txt -d /home/test-user/bench_workdir/velocity-bench-repo/bitcracker/hash_pass/user_passwords_60000.txt -b 60000

Output:

---------> BitCracker: BitLocker password cracking tool <---------

==================================
Retrieving Info

Reading hash file "/home/test-user/bench_workdir/velocity-bench-repo/bitcracker/hash_pass/img_win8_user_hash.txt"

              Attack

================================================
Type of attack: User Password
Psw per thread: 1
max_num_pswd_per_read: 60000
Dictionary: /home/test-user/bench_workdir/velocity-bench-repo/bitcracker/hash_pass/user_passwords_60000.txt
MAC Comparison (-m): Yes

Iter: 1, num passwords read: 60000
Kernel execution:
Effective passwords: 60000
Passwords Range:
npknpByH7N2m3OnLNH1X9DJxLrzIFWk
.....
dL_7uuf3QCz-c6K3xDu0

================================================
Bitcracker attack completed
Total passwords evaluated: 60000
Password not found!

time to subtract from total: 0.0152895 s
bitcracker - total time for whole calculation: 35.7031 s

cudaSift

Environment Variables:

Command:

/home/test-user/bench_workdir/cudaSift/cudaSift

Output:

Image size = (1920,1080)
Initializing data...
Number of original features: 3683 3933
Number of matching features: 1212 1260 32.908% 1 2

Performing data verification
Data verification is SUCCESSFUL.

Image size = (1920,1080)
Initializing data...
Number of original features: 3683 3933
Number of matching features: 1201 1256 32.6093% 1 2

Performing data verification
Data verification is SUCCESSFUL.

Image size = (1920,1080)
Initializing data...
Number of original features: 3683 3933
Number of matching features: 1229 1262 33.3695% 1 2

Performing data verification
Data verification is SUCCESSFUL.

Image size = (1920,1080)
Initializing data...
Number of original features: 3683 3933
Number of matching features: 1103 1266 29.9484% 1 2

Performing data verification
Data verification is SUCCESSFUL.

Image size = (1920,1080)
Initializing data...
Number of original features: 3683 3933
Number of matching features: 1231 1264 33.4238% 1 2

Performing data verification
Data verification is SUCCESSFUL.

Image size = (1920,1080)
Initializing data...
Number of original features: 3683 3933
Number of matching features: 1198 1252 32.5278% 1 2

Performing data verification
Data verification is SUCCESSFUL.

Image size = (1920,1080)
Initializing data...
Number of original features: 3683 3933
Number of matching features: 1164 1261 31.6047% 1 2

Performing data verification
Data verification is SUCCESSFUL.

Image size = (1920,1080)
Initializing data...
Number of original features: 3683 3933
Number of matching features: 1043 1279 28.3193% 1 2

Performing data verification
Data verification is SUCCESSFUL.

Image size = (1920,1080)
Initializing data...
Number of original features: 3683 3933
Number of matching features: 1219 1264 33.098% 1 2

Performing data verification
Data verification is SUCCESSFUL.

Image size = (1920,1080)
Initializing data...
Number of original features: 3683 3933
Number of matching features: 1120 1261 30.41% 1 2

Performing data verification
Data verification is SUCCESSFUL.

Image size = (1920,1080)
Initializing data...
Number of original features: 3683 3933
Number of matching features: 1067 1260 28.9709% 1 2

Performing data verification
Data verification is SUCCESSFUL.

Image size = (1920,1080)
Initializing data...
Number of original features: 3683 3933
Number of matching features: 1229 1264 33.3695% 1 2

Performing data verification
Data verification is SUCCESSFUL.

Image size = (1920,1080)
Initializing data...
Number of original features: 3683 3933
Number of matching features: 1105 1266 30.0027% 1 2

Performing data verification
Data verification is SUCCESSFUL.

Image size = (1920,1080)
Initializing data...
Number of original features: 3683 3933
Number of matching features: 1083 1271 29.4054% 1 2

Performing data verification
Data verification is SUCCESSFUL.

Image size = (1920,1080)
Initializing data...
Number of original features: 3683 3933
Number of matching features: 1225 1258 33.2609% 1 2

Performing data verification
Data verification is SUCCESSFUL.

Image size = (1920,1080)
Initializing data...
Number of original features: 3683 3933
Number of matching features: 1234 1269 33.5053% 1 2

Performing data verification
Data verification is SUCCESSFUL.

Image size = (1920,1080)
Initializing data...
Number of original features: 3683 3933
Number of matching features: 1186 1279 32.202% 1 2

Performing data verification
Data verification is SUCCESSFUL.

Image size = (1920,1080)
Initializing data...
Number of original features: 3683 3933
Number of matching features: 1227 1259 33.3152% 1 2

Performing data verification
Data verification is SUCCESSFUL.

Image size = (1920,1080)
Initializing data...
Number of original features: 3683 3933
Number of matching features: 1096 1275 29.7583% 1 2

Performing data verification
Data verification is SUCCESSFUL.

Image size = (1920,1080)
Initializing data...
Number of original features: 3683 3933
Number of matching features: 1094 1264 29.704% 1 2

Performing data verification
Data verification is SUCCESSFUL.

Image size = (1920,1080)
Initializing data...
Number of original features: 3683 3933
Number of matching features: 1234 1267 33.5053% 1 2

Performing data verification
Data verification is SUCCESSFUL.

Image size = (1920,1080)
Initializing data...
Number of original features: 3683 3933
Number of matching features: 1153 1259 31.306% 1 2

Performing data verification
Data verification is SUCCESSFUL.

Image size = (1920,1080)
Initializing data...
Number of original features: 3683 3933
Number of matching features: 1097 1268 29.7855% 1 2

Performing data verification
Data verification is SUCCESSFUL.

Image size = (1920,1080)
Initializing data...
Number of original features: 3683 3933
Number of matching features: 1109 1263 30.1113% 1 2

Performing data verification
Data verification is SUCCESSFUL.

Image size = (1920,1080)
Initializing data...
Number of original features: 3683 3933
Number of matching features: 1107 1262 30.057% 1 2

Performing data verification
Data verification is SUCCESSFUL.

Image size = (1920,1080)
Initializing data...
Number of original features: 3683 3933
Number of matching features: 1135 1264 30.8173% 1 2

Performing data verification
Data verification is SUCCESSFUL.

Image size = (1920,1080)
Initializing data...
Number of original features: 3683 3933
Number of matching features: 1231 1267 33.4238% 1 2

Performing data verification
Data verification is SUCCESSFUL.

Image size = (1920,1080)
Initializing data...
Number of original features: 3683 3933
Number of matching features: 1241 1271 33.6954% 1 2

Performing data verification
Data verification is SUCCESSFUL.

Image size = (1920,1080)
Initializing data...
Number of original features: 3683 3933
Number of matching features: 1116 1271 30.3014% 1 2

Performing data verification
Data verification is SUCCESSFUL.

Image size = (1920,1080)
Initializing data...
Number of original features: 3683 3933
Number of matching features: 1087 1260 29.514% 1 2

Performing data verification
Data verification is SUCCESSFUL.

Image size = (1920,1080)
Initializing data...
Number of original features: 3683 3933
Number of matching features: 1125 1267 30.5458% 1 2

Performing data verification
Data verification is SUCCESSFUL.

Image size = (1920,1080)
Initializing data...
Number of original features: 3683 3933
Number of matching features: 1116 1270 30.3014% 1 2

Performing data verification
Data verification is SUCCESSFUL.

Image size = (1920,1080)
Initializing data...
Number of original features: 3683 3933
Number of matching features: 1051 1267 28.5365% 1 2

Performing data verification
Data verification is SUCCESSFUL.

Image size = (1920,1080)
Initializing data...
Number of original features: 3683 3933
Number of matching features: 1224 1259 33.2338% 1 2

Performing data verification
Data verification is SUCCESSFUL.

Image size = (1920,1080)
Initializing data...
Number of original features: 3683 3933
Number of matching features: 1091 1260 29.6226% 1 2

Performing data verification
Data verification is SUCCESSFUL.

Image size = (1920,1080)
Initializing data...
Number of original features: 3683 3933
Number of matching features: 1189 1278 32.2835% 1 2

Performing data verification
Data verification is SUCCESSFUL.

Image size = (1920,1080)
Initializing data...
Number of original features: 3683 3933
Number of matching features: 1106 1276 30.0299% 1 2

Performing data verification
Data verification is SUCCESSFUL.

Image size = (1920,1080)
Initializing data...
Number of original features: 3683 3933
Number of matching features: 1091 1264 29.6226% 1 2

Performing data verification
Data verification is SUCCESSFUL.

Image size = (1920,1080)
Initializing data...
Number of original features: 3683 3933
Number of matching features: 1065 1263 28.9166% 1 2

Performing data verification
Data verification is SUCCESSFUL.

Image size = (1920,1080)
Initializing data...
Number of original features: 3683 3933
Number of matching features: 1137 1267 30.8716% 1 2

Performing data verification
Data verification is SUCCESSFUL.

Image size = (1920,1080)
Initializing data...
Number of original features: 3683 3933
Number of matching features: 1232 1266 33.451% 1 2

Performing data verification
Data verification is SUCCESSFUL.

Image size = (1920,1080)
Initializing data...
Number of original features: 3683 3933
Number of matching features: 1231 1266 33.4238% 1 2

Performing data verification
Data verification is SUCCESSFUL.

Image size = (1920,1080)
Initializing data...
Number of original features: 3683 3933
Number of matching features: 1109 1268 30.1113% 1 2

Performing data verification
Data verification is SUCCESSFUL.

Image size = (1920,1080)
Initializing data...
Number of original features: 3683 3933
Number of matching features: 1126 1273 30.5729% 1 2

Performing data verification
Data verification is SUCCESSFUL.

Image size = (1920,1080)
Initializing data...
Number of original features: 3683 3933
Number of matching features: 1224 1258 33.2338% 1 2

Performing data verification
Data verification is SUCCESSFUL.

Image size = (1920,1080)
Initializing data...
Number of original features: 3683 3933
Number of matching features: 1238 1269 33.6139% 1 2

Performing data verification
Data verification is SUCCESSFUL.

Image size = (1920,1080)
Initializing data...
Number of original features: 3683 3933
Number of matching features: 1127 1266 30.6001% 1 2

Performing data verification
Data verification is SUCCESSFUL.

Image size = (1920,1080)
Initializing data...
Number of original features: 3683 3933
Number of matching features: 1164 1255 31.6047% 1 2

Performing data verification
Data verification is SUCCESSFUL.

Image size = (1920,1080)
Initializing data...
Number of original features: 3683 3933
Number of matching features: 1235 1269 33.5324% 1 2

Performing data verification
Data verification is SUCCESSFUL.

Image size = (1920,1080)
Initializing data...
Number of original features: 3683 3933
Number of matching features: 1131 1271 30.7087% 1 2

Performing data verification
Data verification is SUCCESSFUL.

Avg workload time = 280.397 ms

easywave

Environment Variables:

Command:

/home/test-user/bench_workdir/easywave/easyWave_sycl -grid /home/test-user/bench_workdir/data/easywave/examples/e2Asean.grd -source /home/test-user/bench_workdir/data/easywave/examples/BengkuluSept2007.flt -time 120

Output:

MAIN: Starting SYCL main program
MAIN: Attempting to clean up previous eWave tsunami files
MAIN: Clean up completed
SYCL: SYCL Queue initialization successful
SYCL: Using SYCL device : Intel(R) Data Center GPU Max 1100 (Driver version 1.3.29735+27)
SYCL: Platform : Intel(R) oneAPI Unified Runtime over Level-Zero
MAIN: Program successfully completed

QuickSilver

Environment Variables:

QS_DEVICE=GPU

Command:

/home/test-user/bench_workdir/QuickSilver/qs -i /home/test-user/bench_workdir/velocity-bench-repo/QuickSilver/Examples/AllScattering/scatteringOnly.inp

Output:

Copyright (c) 2016
Lawrence Livermore National Security, LLC
All Rights Reserved
Quicksilver Version :
Quicksilver Git Hash :
MPI Version : 3.0
Number of MPI ranks : 1
Number of OpenMP Threads: 1
Number of OpenMP CPUs : 1

Loading params
Finished loading params
Simulation:
dt: 1e-08
fMax: 0.1
inputFile: /home/test-user/bench_workdir/velocity-bench-repo/QuickSilver/Examples/AllScattering/scatteringOnly.inp
energySpectrum:
boundaryCondition: octant
loadBalance: 1
cycleTimers: 0
debugThreads: 0
lx: 100
ly: 100
lz: 100
nParticles: 10000000
batchSize: 0
nBatches: 10
nSteps: 10
nx: 10
ny: 10
nz: 10
seed: 1029384756
xDom: 0
yDom: 0
zDom: 0
eMax: 20
eMin: 1e-09
nGroups: 230
lowWeightCutoff: 0.001
bTally: 1
fTally: 1
cTally: 1
coralBenchmark: 0
crossSectionsOut:

Geometry:
material: sourceMaterial
shape: brick
xMax: 100
xMin: 0
yMax: 100
yMin: 0
zMax: 100
zMin: 0

Material:
name: sourceMaterial
mass: 1000
nIsotopes: 10
nReactions: 9
sourceRate: 1e+10
totalCrossSection: 0.1
absorptionCrossSection: flat
fissionCrossSection: flat
scatteringCrossSection: flat
absorptionCrossSectionRatio: 0
fissionCrossSectionRatio: 0
scatteringCrossSectionRatio: 1

CrossSection:
name: flat
A: 0
B: 0
C: 0
D: 0
E: 1
nuBar: 2.4
setting GPU
setting parameters
Building partition 0
Building partition 1
Building partition 2
Building partition 3
Building MC_Domain 0
Building MC_Domain 1
Building MC_Domain 2
Building MC_Domain 3
Starting Consistency Check
Finished Consistency Check
Finished initMesh
Started copyMaterialDatabase_device
Finished copyMaterialDatabase_device
Finished copyNuclearData_device
Finished copyDomainDevice
cycle start source rr split absorb scatter fission produce collisn escape census num_seg scalar_flux cycleInit cycleTracking cycleFinalize
0 0 1000000 0 9000000 0 18533189 0 0 18533189 1151780 8848220 55527935 1.854923e+09 6.258700e-01 6.283290e-01 1.000000e-06
1 8848220 1000000 0 151478 0 34281997 0 0 34281997 1664159 8335539 94633679 5.047651e+09 5.801670e-01 7.672430e-01 0.000000e+00
2 8335539 1000000 0 663717 0 34354432 0 0 34354432 1366771 8632485 95010375 7.705930e+09 5.438190e-01 7.805580e-01 0.000000e+00
3 8632485 1000000 0 367978 0 34302727 0 0 34302727 1242216 8758247 94953591 9.992076e+09 6.018570e-01 8.557190e-01 1.000000e-06
4 8758247 1000000 0 242076 0 34141236 0 0 34141236 1168452 8831871 94599337 1.199834e+10 5.379460e-01 8.048110e-01 0.000000e+00
5 8831871 1000000 0 168070 0 33948724 0 0 33948724 1121156 8878785 94148236 1.377636e+10 3.420880e-01 7.701220e-01 0.000000e+00
6 8878785 1000000 0 120572 0 33760567 0 0 33760567 1089103 8910254 93689264 1.535668e+10 3.458810e-01 7.752290e-01 0.000000e+00
7 8910254 1000000 0 89810 0 33552179 0 0 33552179 1065203 8934861 93216931 1.676993e+10 5.314260e-01 7.995410e-01 0.000000e+00
8 8934861 1000000 0 65491 0 33384605 0 0 33384605 1047720 8952632 92768273 1.804559e+10 5.402280e-01 7.998120e-01 1.000000e-06
9 8952632 1000000 0 47165 0 33198494 0 0 33198494 1033968 8965829 92324678 1.920208e+10 5.320050e-01 7.772580e-01 0.000000e+00

Timer Cumulative Cumulative Cumulative Cumulative Cumulative Cumulative
Name number microSecs microSecs microSecs microSecs Efficiency
of calls min avg max stddev Rating
main 1 1.294e+07 1.294e+07 1.294e+07 0.000e+00 100.00
cycleInit 10 5.181e+06 5.181e+06 5.181e+06 0.000e+00 100.00
cycleTracking 10 7.759e+06 7.759e+06 7.759e+06 0.000e+00 100.00
cycleTracking_Kernel 104 4.947e+06 4.947e+06 4.947e+06 0.000e+00 100.00
cycleTracking_MPI 117 2.210e+05 2.210e+05 2.210e+05 0.000e+00 100.00
cycleTracking_Test_Done 0 0.000e+00 0.000e+00 0.000e+00 0.000e+00 0.00
cycleFinalize 20 6.740e+02 6.740e+02 6.740e+02 0.000e+00 100.00
Figure Of Merit 116.11 [Num Mega Segments / Cycle Tracking Time]

sobel_filter

Environment Variables:

OPENCV_IO_MAX_IMAGE_PIXELS=1677721600

Command:

/home/test-user/bench_workdir/sobel_filter/sobel_filter -i /home/test-user/bench_workdir/data/sobel_filter/sobel_filter_data/silverfalls_32Kx32K.png -n 5

Output:

SYMN: Welcome to the SYCL version of Sobel filter workload.
SYMN: Input image file: /home/test-user/bench_workdir/data/sobel_filter/sobel_filter_data/silverfalls_32Kx32K.png
SYMN: Launching SYCL kernel with # of iterations: 5
time to subtract from total: 15.0151 s
sobelfilter - total time for whole calculation: 0.798662 s

@RossBrunton
Copy link
Contributor Author

@pbalcer thanks for kicking off the benchmark build. Looks like at the very least there are no major regressions. I also did a few local rough tests on a few synthetic cases and didn't find significant slowdown there.

I'll undraft this and create an associated llvm MR.

@RossBrunton RossBrunton marked this pull request as ready for review September 5, 2024 10:06
@RossBrunton RossBrunton requested a review from a team as a code owner September 5, 2024 10:06
RossBrunton added a commit to RossBrunton/intel-llvm that referenced this pull request Sep 5, 2024
@pbalcer
Copy link
Contributor

pbalcer commented Sep 5, 2024

e2e tests show many regressions.

edit: the fact that all UR tests pass, while a lot of SYCL tests are failing, indiactes to me that we have a pretty big gap in tests. Maybe add one after you root cause the failures?

RossBrunton added a commit to RossBrunton/intel-llvm that referenced this pull request Sep 6, 2024
RossBrunton added a commit to RossBrunton/intel-llvm that referenced this pull request Sep 6, 2024
RossBrunton added a commit to RossBrunton/intel-llvm that referenced this pull request Sep 9, 2024
RossBrunton added a commit to RossBrunton/intel-llvm that referenced this pull request Sep 10, 2024
RossBrunton added a commit to RossBrunton/intel-llvm that referenced this pull request Sep 10, 2024
@RossBrunton RossBrunton marked this pull request as draft October 24, 2024 14:12
@github-actions github-actions bot added ci/cd Continuous integration/devliery conformance Conformance test suite issues. labels Nov 5, 2024
@RossBrunton RossBrunton force-pushed the ross/refc branch 2 times, most recently from 16eea0a to 972e943 Compare November 6, 2024 11:08
RossBrunton added a commit to intel/llvm that referenced this pull request Nov 14, 2024
RossBrunton added a commit to RossBrunton/intel-llvm that referenced this pull request Dec 3, 2024
RossBrunton added a commit to RossBrunton/intel-llvm that referenced this pull request Dec 9, 2024
Copy link
Contributor

github-actions bot commented Jan 6, 2025

Compute Benchmarks level_zero run (with params: ):
https://github.com/oneapi-src/unified-runtime/actions/runs/12636134854

Copy link
Contributor

github-actions bot commented Jan 6, 2025

Compute Benchmarks level_zero run ():
https://github.com/oneapi-src/unified-runtime/actions/runs/12636134854
Job status: success. Test status: success.

Summary

Total 124 benchmarks in mean.
Geomean 99.969%.
Improved 14 Regressed 10 (threshold 2.00%)

(result is better)

Performance change in benchmark groups

Relative perf in group api (9): 101.236%
Benchmark This PR baseline Relative perf Change -
api_overhead_benchmark_sycl ExecImmediateCopyQueue in order from Device to Host, size 1024 1.613000 μs 1.668 μs 103.41% 3.41% ++
api_overhead_benchmark_sycl SubmitKernel out of order 23.031000 μs 23.471 μs 101.91% 1.91% .
api_overhead_benchmark_ur SubmitKernel in order 15.888000 μs 16.185 μs 101.87% 1.87% .
api_overhead_benchmark_sycl SubmitKernel in order 24.587000 μs 24.993 μs 101.65% 1.65% .
api_overhead_benchmark_l0 SubmitKernel out of order 11.208000 μs 11.363 μs 101.38% 1.38% .
api_overhead_benchmark_sycl ExecImmediateCopyQueue out of order from Device to Device, size 1024 2.119000 μs 2.144 μs 101.18% 1.18% .
api_overhead_benchmark_ur SubmitKernel out of order CPU count 101833.000000 instr 101833.000 instr 100.00% 0.00% .
api_overhead_benchmark_ur SubmitKernel in order CPU count 106951.000000 instr 106951.000 instr 100.00% 0.00% .
api_overhead_benchmark_ur SubmitKernel out of order 15.633 μs 15.598000 μs 99.78% -0.22% .
Relative perf in group memory (4): 100.607%
Benchmark This PR baseline Relative perf Change -
memory_benchmark_sycl QueueInOrderMemcpy from Host to Device, size 1024 131.636000 μs 134.668 μs 102.30% 2.30% ++
memory_benchmark_sycl QueueMemcpy from Device to Device, size 1024 5.583000 μs 5.649 μs 101.18% 1.18% .
memory_benchmark_sycl StreamMemory, placement Device, type Triad, size 10240 3.183000 GB/s 3.180 GB/s 100.09% 0.09% .
memory_benchmark_sycl QueueInOrderMemcpy from Device to Device, size 1024 252.214 μs 249.388000 μs 98.88% -1.12% .
Relative perf in group miscellaneous (1): 106.989%
Benchmark This PR baseline Relative perf Change -
miscellaneous_benchmark_sycl VectorSum 801.970000 bw GB/s 858.023 bw GB/s 106.99% 6.99% +++++
Relative perf in group multithread (10): 99.689%
Benchmark This PR baseline Relative perf Change -
multithread_benchmark_ur MemcpyExecute opsPerThread:100, numThreads:8, allocSize:102400 srcUSM:0 dstUSM:1 8604.356000 μs 8715.682 μs 101.29% 1.29% .
multithread_benchmark_ur MemcpyExecute opsPerThread:4096, numThreads:1, allocSize:1024 srcUSM:0 dstUSM:1 without events 40367.797000 μs 40575.087 μs 100.51% 0.51% .
multithread_benchmark_ur MemcpyExecute opsPerThread:400, numThreads:1, allocSize:102400 srcUSM:1 dstUSM:1 6925.213000 μs 6957.222 μs 100.46% 0.46% .
multithread_benchmark_ur MemcpyExecute opsPerThread:10, numThreads:16, allocSize:1024 srcUSM:0 dstUSM:1 1165.576000 μs 1167.773 μs 100.19% 0.19% .
multithread_benchmark_ur MemcpyExecute opsPerThread:100, numThreads:8, allocSize:102400 srcUSM:1 dstUSM:1 17365.414 μs 17338.068000 μs 99.84% -0.16% .
multithread_benchmark_ur MemcpyExecute opsPerThread:400, numThreads:1, allocSize:102400 srcUSM:0 dstUSM:1 7434.794 μs 7415.529000 μs 99.74% -0.26% .
multithread_benchmark_ur MemcpyExecute opsPerThread:400, numThreads:8, allocSize:1024 srcUSM:0 dstUSM:1 25797.526 μs 25664.225000 μs 99.48% -0.52% .
multithread_benchmark_ur MemcpyExecute opsPerThread:4096, numThreads:4, allocSize:1024 srcUSM:0 dstUSM:1 without events 112027.609 μs 111020.128000 μs 99.10% -0.90% .
multithread_benchmark_ur MemcpyExecute opsPerThread:10, numThreads:16, allocSize:1024 srcUSM:1 dstUSM:1 2053.234 μs 2026.113000 μs 98.68% -1.32% .
multithread_benchmark_ur MemcpyExecute opsPerThread:400, numThreads:8, allocSize:1024 srcUSM:1 dstUSM:1 48062.201 μs 46926.982000 μs 97.64% -2.36% --
Relative perf in group Velocity-Bench (9): 100.151%
Benchmark This PR baseline Relative perf Change -
Velocity-Bench Hashtable 381.145668 M keys/sec 375.420 M keys/sec 101.53% 1.53% .
Velocity-Bench dl-cifar 23.200100 s 23.315 s 100.50% 0.50% .
Velocity-Bench Sobel Filter 532.796000 ms 533.891 ms 100.21% 0.21% .
Velocity-Bench Bitcracker 35.094100 s 35.157 s 100.18% 0.18% .
Velocity-Bench Easywave 239.000000 ms 239.000 ms 100.00% 0.00% .
Velocity-Bench dl-mnist 2.720000 s 2.720 s 100.00% 0.00% .
Velocity-Bench CudaSift 203.186 ms 203.135000 ms 99.97% -0.03% .
Velocity-Bench QuickSilver 117.960 MMS/CTT 118.190000 MMS/CTT 99.81% -0.19% .
Velocity-Bench svm 0.136 s 0.134600 s 99.19% -0.81% .
Relative perf in group Runtime (8): 100.317%
Benchmark This PR baseline Relative perf Change -
Runtime_DAGTaskThroughput_SingleTask 1653.391000 ms 1680.412 ms 101.63% 1.63% .
Runtime_IndependentDAGTaskThroughput_NDRangeParallelFor 274.650000 ms 277.709 ms 101.11% 1.11% .
Runtime_DAGTaskThroughput_NDRangeParallelFor 1664.018000 ms 1680.302 ms 100.98% 0.98% .
Runtime_DAGTaskThroughput_BasicParallelFor 1725.546000 ms 1741.939 ms 100.95% 0.95% .
Runtime_DAGTaskThroughput_HierarchicalParallelFor 1702.714000 ms 1715.545 ms 100.75% 0.75% .
Runtime_IndependentDAGTaskThroughput_HierarchicalParallelFor 275.008000 ms 276.767 ms 100.64% 0.64% .
Runtime_IndependentDAGTaskThroughput_SingleTask 264.020 ms 259.459000 ms 98.27% -1.73% .
Runtime_IndependentDAGTaskThroughput_BasicParallelFor 279.757 ms 274.880000 ms 98.26% -1.74% .
Relative perf in group MicroBench (14): 100.088%
Benchmark This PR baseline Relative perf Change -
MicroBench_HostDeviceBandwidth_1D_H2D_Strided 4.319000 ms 4.411 ms 102.13% 2.13% +
MicroBench_HostDeviceBandwidth_2D_H2D_Contiguous 4.447000 ms 4.509 ms 101.39% 1.39% .
MicroBench_HostDeviceBandwidth_1D_H2D_Contiguous 4.419000 ms 4.441 ms 100.50% 0.50% .
MicroBench_HostDeviceBandwidth_3D_H2D_Contiguous 4.493000 ms 4.505 ms 100.27% 0.27% .
MicroBench_HostDeviceBandwidth_1D_D2H_Strided 4.614000 ms 4.623 ms 100.20% 0.20% .
MicroBench_LocalMem_fp32_4096 29.912000 ms 29.922 ms 100.03% 0.03% .
MicroBench_HostDeviceBandwidth_3D_D2H_Strided 617.337000 ms 617.487 ms 100.02% 0.02% .
MicroBench_HostDeviceBandwidth_2D_D2H_Strided 617.440000 ms 617.490 ms 100.01% 0.01% .
MicroBench_HostDeviceBandwidth_3D_D2H_Contiguous 618.166 ms 618.144000 ms 100.00% -0.00% .
MicroBench_HostDeviceBandwidth_2D_D2H_Contiguous 618.152 ms 618.098000 ms 99.99% -0.01% .
MicroBench_LocalMem_int32_4096 29.911 ms 29.857000 ms 99.82% -0.18% .
MicroBench_HostDeviceBandwidth_2D_H2D_Strided 4.572 ms 4.537000 ms 99.23% -0.77% .
MicroBench_HostDeviceBandwidth_3D_H2D_Strided 4.604 ms 4.552000 ms 98.87% -1.13% .
MicroBench_HostDeviceBandwidth_1D_D2H_Contiguous 4.640 ms 4.585000 ms 98.81% -1.19% .
Relative perf in group Pattern (10): 100.160%
Benchmark This PR baseline Relative perf Change -
Pattern_Reduction_NDRange_int32 16.870000 ms 17.108 ms 101.41% 1.41% .
Pattern_Reduction_Hierarchical_int32 16.824000 ms 16.921 ms 100.58% 0.58% .
Pattern_SegmentedReduction_Hierarchical_int32 11.597000 ms 11.606 ms 100.08% 0.08% .
Pattern_SegmentedReduction_Hierarchical_int16 11.806000 ms 11.812 ms 100.05% 0.05% .
Pattern_SegmentedReduction_NDRange_int32 2.169000 ms 2.169 ms 100.00% 0.00% .
Pattern_SegmentedReduction_Hierarchical_fp32 11.597000 ms 11.597 ms 100.00% 0.00% .
Pattern_SegmentedReduction_NDRange_int16 2.269 ms 2.268000 ms 99.96% -0.04% .
Pattern_SegmentedReduction_NDRange_fp32 2.175 ms 2.172000 ms 99.86% -0.14% .
Pattern_SegmentedReduction_Hierarchical_int64 11.791 ms 11.773000 ms 99.85% -0.15% .
Pattern_SegmentedReduction_NDRange_int64 2.345 ms 2.341000 ms 99.83% -0.17% .
Relative perf in group ScalarProduct (6): 99.655%
Benchmark This PR baseline Relative perf Change -
ScalarProduct_Hierarchical_int64 11.488000 ms 11.508 ms 100.17% 0.17% .
ScalarProduct_Hierarchical_int32 10.543 ms 10.541000 ms 99.98% -0.02% .
ScalarProduct_NDRange_fp32 3.806 ms 3.803000 ms 99.92% -0.08% .
ScalarProduct_Hierarchical_fp32 10.180 ms 10.155000 ms 99.75% -0.25% .
ScalarProduct_NDRange_int32 3.902 ms 3.890000 ms 99.69% -0.31% .
ScalarProduct_NDRange_int64 5.553 ms 5.465000 ms 98.42% -1.58% .
Relative perf in group USM (7): 97.801%
Benchmark This PR baseline Relative perf Change -
USM_Instr_Mix_fp32_device_1:1mix_no_init_no_prefetch 1.816000 ms 1.857 ms 102.26% 2.26% ++
USM_Allocation_latency_fp32_host 37.328000 ms 37.588 ms 100.70% 0.70% .
USM_Instr_Mix_fp32_host_1:1mix_with_init_no_prefetch 1.042000 ms 1.043 ms 100.10% 0.10% .
USM_Allocation_latency_fp32_device 0.069000 ms 0.069 ms 100.00% 0.00% .
USM_Instr_Mix_fp32_host_1:1mix_no_init_no_prefetch 1.202000 ms 1.202 ms 100.00% 0.00% .
USM_Instr_Mix_fp32_device_1:1mix_with_init_no_prefetch 1.710 ms 1.664000 ms 97.31% -2.69% --
USM_Allocation_latency_fp32_shared 0.075 ms 0.064000 ms 85.33% -14.67% ----------
Relative perf in group VectorAddition (3): 102.856%
Benchmark This PR baseline Relative perf Change -
VectorAddition_fp32 1.449000 ms 1.594 ms 110.01% 10.01% +++++++
VectorAddition_int32 1.599 ms 1.592000 ms 99.56% -0.44% .
VectorAddition_int64 3.090 ms 3.070000 ms 99.35% -0.65% .
Relative perf in group Polybench (3): 100.113%
Benchmark This PR baseline Relative perf Change -
Polybench_Atax 6.863000 ms 6.899 ms 100.52% 0.52% .
Polybench_2mm 1.227000 ms 1.229 ms 100.16% 0.16% .
Polybench_3mm 1.732 ms 1.726000 ms 99.65% -0.35% .
Relative perf in group Kmeans (1): 99.956%
Benchmark This PR baseline Relative perf Change -
Kmeans_fp32 16.054 ms 16.047000 ms 99.96% -0.04% .
Relative perf in group LinearRegressionCoeff (1): cannot calculate
Benchmark This PR baseline Relative perf Change -
LinearRegressionCoeff_fp32 842.146000 ms -
Relative perf in group MolecularDynamics (1): 100.000%
Benchmark This PR baseline Relative perf Change -
MolecularDynamics 0.029000 ms 0.029 ms 100.00% 0.00% .
Relative perf in group llama.cpp (6): 99.969%
Benchmark This PR baseline Relative perf Change -
llama.cpp Prompt Processing Batched 512 449.366041 token/s 445.735 token/s 100.81% 0.81% .
llama.cpp Text Generation Batched 512 62.669559 token/s 62.604 token/s 100.10% 0.10% .
llama.cpp Text Generation Batched 128 62.596607 token/s 62.566 token/s 100.05% 0.05% .
llama.cpp Text Generation Batched 256 62.550 token/s 62.613568 token/s 99.90% -0.10% .
llama.cpp Prompt Processing Batched 256 887.447 token/s 889.978071 token/s 99.72% -0.28% .
llama.cpp Prompt Processing Batched 128 806.955 token/s 813.139695 token/s 99.24% -0.76% .
Relative perf in group alloc/max (20): 99.036%
Benchmark This PR baseline Relative perf Change -
alloc/max_allocs:10000/pre_allocs:0/size:4096/iterations:200000/threads:4 scalable_pool<os_provider> 280.251000 ns 294.519 ns 105.09% 5.09% +++
alloc/max_allocs:1000/pre_allocs:100000/size:4096/iterations:200000/threads:1 proxy_pool<os_provider> 286.407000 ns 295.670 ns 103.23% 3.23% ++
alloc/max_allocs:10000/pre_allocs:100000/size:4096/iterations:200000/threads:4 scalable_pool<os_provider> 255.465000 ns 262.656 ns 102.81% 2.81% ++
alloc/max_allocs:10000/pre_allocs:0/min size:8/max size:65536/granularity:8/iterations:200000/threads:4 scalable_pool<os_provider> 923.541000 ns 947.128 ns 102.55% 2.55% ++
alloc/max_allocs:10000/pre_allocs:0/size:4096/iterations:200000/threads:4 glibc 2622.280000 ns 2675.590 ns 102.03% 2.03% +
alloc/max_allocs:10000/pre_allocs:0/min size:8/max size:65536/granularity:8/iterations:200000/threads:1 glibc 176.764000 ns 178.724 ns 101.11% 1.11% .
alloc/max_allocs:10000/pre_allocs:0/size:4096/iterations:200000/threads:1 scalable_pool<os_provider> 211.330000 ns 213.275 ns 100.92% 0.92% .
alloc/max_allocs:10000/pre_allocs:100000/size:4096/iterations:200000/threads:1 glibc 752.119000 ns 754.356 ns 100.30% 0.30% .
alloc/max_allocs:10000/pre_allocs:100000/size:4096/iterations:200000/threads:1 scalable_pool<os_provider> 207.764 ns 207.579000 ns 99.91% -0.09% .
alloc/max_allocs:10000/pre_allocs:0/size:4096/iterations:200000/threads:1 os_provider 188.257 ns 187.859000 ns 99.79% -0.21% .
alloc/max_allocs:10000/pre_allocs:100000/size:4096/iterations:200000/threads:4 os_provider 2018.540 ns 2014.190000 ns 99.78% -0.22% .
alloc/max_allocs:1000/pre_allocs:0/size:4096/iterations:200000/threads:1 proxy_pool<os_provider> 256.591 ns 256.008000 ns 99.77% -0.23% .
alloc/max_allocs:10000/pre_allocs:0/min size:8/max size:65536/granularity:8/iterations:200000/threads:1 scalable_pool<os_provider> 968.169 ns 965.028000 ns 99.68% -0.32% .
alloc/max_allocs:10000/pre_allocs:100000/size:4096/iterations:200000/threads:1 os_provider 191.853 ns 190.687000 ns 99.39% -0.61% .
alloc/max_allocs:10000/pre_allocs:0/size:4096/iterations:200000/threads:1 glibc 719.927 ns 713.193000 ns 99.06% -0.94% .
alloc/max_allocs:10000/pre_allocs:0/min size:8/max size:65536/granularity:8/iterations:200000/threads:4 glibc 877.239 ns 844.289000 ns 96.24% -3.76% ---
alloc/max_allocs:10000/pre_allocs:0/size:4096/iterations:200000/threads:4 os_provider 2232.820 ns 2084.530000 ns 93.36% -6.64% -----
alloc/max_allocs:1000/pre_allocs:100000/size:4096/iterations:200000/threads:4 proxy_pool<os_provider> 3416.310 ns 3183.710000 ns 93.19% -6.81% -----
alloc/max_allocs:10000/pre_allocs:100000/size:4096/iterations:200000/threads:4 glibc 1408.080 ns 1311.860000 ns 93.17% -6.83% -----
alloc/max_allocs:1000/pre_allocs:0/size:4096/iterations:200000/threads:4 proxy_pool<os_provider> 4213.550 ns 3824.110000 ns 90.76% -9.24% ------
Relative perf in group multiple (12): 100.063%
Benchmark This PR baseline Relative perf Change -
multiple_malloc_free/max_allocs:10000/size:4096/iterations:2000/threads:1 glibc 4124.620000 ns 4236.590 ns 102.71% 2.71% ++
multiple_malloc_free/max_allocs:10000/size:4096/iterations:2000/threads:4 proxy_pool<os_provider> 1158270.000000 ns 1181860.000 ns 102.04% 2.04% +
multiple_malloc_free/max_allocs:10000/min size:8/max size:65536/granularity:8/iterations:2000/threads:1 glibc 30428.000000 ns 31046.000 ns 102.03% 2.03% +
multiple_malloc_free/max_allocs:10000/min size:8/max size:65536/granularity:8/iterations:2000/threads:4 scalable_pool<os_provider> 71084.100000 ns 72186.500 ns 101.55% 1.55% .
multiple_malloc_free/max_allocs:10000/min size:8/max size:65536/granularity:8/iterations:2000/threads:1 scalable_pool<os_provider> 25490.700000 ns 25779.400 ns 101.13% 1.13% .
multiple_malloc_free/max_allocs:10000/size:4096/iterations:2000/threads:4 scalable_pool<os_provider> 41387.700000 ns 41730.400 ns 100.83% 0.83% .
multiple_malloc_free/max_allocs:10000/size:4096/iterations:2000/threads:4 os_provider 1169140.000000 ns 1170930.000 ns 100.15% 0.15% .
multiple_malloc_free/max_allocs:10000/size:4096/iterations:2000/threads:1 proxy_pool<os_provider> 160262.000 ns 160188.000000 ns 99.95% -0.05% .
multiple_malloc_free/max_allocs:10000/size:4096/iterations:2000/threads:1 os_provider 141142.000 ns 140152.000000 ns 99.30% -0.70% .
multiple_malloc_free/max_allocs:10000/min size:8/max size:65536/granularity:8/iterations:2000/threads:4 glibc 138250.000 ns 136636.000000 ns 98.83% -1.17% .
multiple_malloc_free/max_allocs:10000/size:4096/iterations:2000/threads:1 scalable_pool<os_provider> 15346.300 ns 15020.600000 ns 97.88% -2.12% -
multiple_malloc_free/max_allocs:10000/size:4096/iterations:2000/threads:4 glibc 31343.800 ns 29660.000000 ns 94.63% -5.37% ----

Details

Benchmark details - environment, command, output...
api_overhead_benchmark_l0 SubmitKernel out of order

Environment Variables:

Command:

/home/pmdk/bench_workdir/compute-benchmarks-build/bin/api_overhead_benchmark_l0 --test=SubmitKernel --csv --noHeaders --Ioq=0 --DiscardEvents=0 --MeasureCompletion=0 --iterations=100000 --Profiling=0 --NumKernels=10 --KernelExecTime=1

Output:

TestCase,Mean,Median,StdDev,Min,Max,Type
SubmitKernel(api=l0 Profiling=0 Ioq=0 DiscardEvents=0 NumKernels=10 KernelExecTime=1 MeasureCompletion=0),11.215,11.208,2.74%,10.448,75.562,[CPU],[us]

api_overhead_benchmark_sycl SubmitKernel out of order

Environment Variables:

Command:

/home/pmdk/bench_workdir/compute-benchmarks-build/bin/api_overhead_benchmark_sycl --test=SubmitKernel --csv --noHeaders --Ioq=0 --DiscardEvents=0 --MeasureCompletion=0 --iterations=100000 --Profiling=0 --NumKernels=10 --KernelExecTime=1

Output:

TestCase,Mean,Median,StdDev,Min,Max,Type
SubmitKernel(api=sycl Profiling=0 Ioq=0 DiscardEvents=0 NumKernels=10 KernelExecTime=1 MeasureCompletion=0),23.056,23.031,1.71%,22.264,95.705,[CPU],[us]

api_overhead_benchmark_sycl SubmitKernel in order

Environment Variables:

Command:

/home/pmdk/bench_workdir/compute-benchmarks-build/bin/api_overhead_benchmark_sycl --test=SubmitKernel --csv --noHeaders --Ioq=1 --DiscardEvents=0 --MeasureCompletion=0 --iterations=100000 --Profiling=0 --NumKernels=10 --KernelExecTime=1

Output:

TestCase,Mean,Median,StdDev,Min,Max,Type
SubmitKernel(api=sycl Profiling=0 Ioq=1 DiscardEvents=0 NumKernels=10 KernelExecTime=1 MeasureCompletion=0),24.605,24.587,2.93%,23.648,225.895,[CPU],[us]

memory_benchmark_sycl QueueInOrderMemcpy from Device to Device, size 1024

Environment Variables:

Command:

/home/pmdk/bench_workdir/compute-benchmarks-build/bin/memory_benchmark_sycl --test=QueueInOrderMemcpy --csv --noHeaders --iterations=10000 --IsCopyOnly=0 --sourcePlacement=Device --destinationPlacement=Device --size=1024 --count=100

Output:

TestCase,Mean,Median,StdDev,Min,Max,Type
QueueInOrderMemcpy(api=sycl IsCopyOnly=0 sourcePlacement=Device destinationPlacement=Device size=1KB count=100),252.338,252.214,1.08%,248.030,493.337,[CPU],[us]

memory_benchmark_sycl QueueInOrderMemcpy from Host to Device, size 1024

Environment Variables:

Command:

/home/pmdk/bench_workdir/compute-benchmarks-build/bin/memory_benchmark_sycl --test=QueueInOrderMemcpy --csv --noHeaders --iterations=10000 --IsCopyOnly=0 --sourcePlacement=Host --destinationPlacement=Device --size=1024 --count=100

Output:

TestCase,Mean,Median,StdDev,Min,Max,Type
QueueInOrderMemcpy(api=sycl IsCopyOnly=0 sourcePlacement=Host destinationPlacement=Device size=1KB count=100),137.704,131.636,19.99%,130.201,332.876,[CPU],[us]

memory_benchmark_sycl QueueMemcpy from Device to Device, size 1024

Environment Variables:

Command:

/home/pmdk/bench_workdir/compute-benchmarks-build/bin/memory_benchmark_sycl --test=QueueMemcpy --csv --noHeaders --iterations=10000 --sourcePlacement=Device --destinationPlacement=Device --size=1024

Output:

TestCase,Mean,Median,StdDev,Min,Max,Type
QueueMemcpy(api=sycl sourcePlacement=Device destinationPlacement=Device size=1KB),5.777,5.583,13.65%,5.205,61.617,[CPU],[us]

memory_benchmark_sycl StreamMemory, placement Device, type Triad, size 10240

Environment Variables:

Command:

/home/pmdk/bench_workdir/compute-benchmarks-build/bin/memory_benchmark_sycl --test=StreamMemory --csv --noHeaders --iterations=10000 --type=Triad --size=10240 --memoryPlacement=Device --useEvents=0 --contents=Zeros --multiplier=1

Output:

TestCase,Mean,Median,StdDev,Min,Max,Type
StreamMemory(api=sycl type=Triad size=10KB useEvents=0 contents=Zeros memoryPlacement=Device multiplier=1),3.174,3.183,3.07%,0.300,3.435,[CPU],[GB/s]

api_overhead_benchmark_sycl ExecImmediateCopyQueue out of order from Device to Device, size 1024

Environment Variables:

Command:

/home/pmdk/bench_workdir/compute-benchmarks-build/bin/api_overhead_benchmark_sycl --test=ExecImmediateCopyQueue --csv --noHeaders --iterations=100000 --ioq=0 --IsCopyOnly=1 --MeasureCompletionTime=0 --src=Device --dst=Device --size=1024

Output:

TestCase,Mean,Median,StdDev,Min,Max,Type
ExecImmediateCopyQueue(api=sycl IsCopyOnly=1 MeasureCompletionTime=0 src=Device dst=Device size=1KB ioq=0),2.124,2.119,4.60%,1.940,12.220,[CPU],[us]

api_overhead_benchmark_sycl ExecImmediateCopyQueue in order from Device to Host, size 1024

Environment Variables:

Command:

/home/pmdk/bench_workdir/compute-benchmarks-build/bin/api_overhead_benchmark_sycl --test=ExecImmediateCopyQueue --csv --noHeaders --iterations=100000 --ioq=1 --IsCopyOnly=1 --MeasureCompletionTime=0 --src=Host --dst=Host --size=1024

Output:

TestCase,Mean,Median,StdDev,Min,Max,Type
ExecImmediateCopyQueue(api=sycl IsCopyOnly=1 MeasureCompletionTime=0 src=Host dst=Host size=1KB ioq=1),1.620,1.613,4.01%,1.528,6.447,[CPU],[us]

miscellaneous_benchmark_sycl VectorSum

Environment Variables:

Command:

/home/pmdk/bench_workdir/compute-benchmarks-build/bin/miscellaneous_benchmark_sycl --test=VectorSum --csv --noHeaders --iterations=1000 --numberOfElementsX=512 --numberOfElementsY=256 --numberOfElementsZ=256

Output:

TestCase,Mean,Median,StdDev,Min,Max,Type
VectorSum(api=sycl numberOfElementsX=512 numberOfElementsY=256 numberOfElementsZ=256),801.176,801.970,0.53%,767.017,809.972,[GPU],bw [GB/s]

multithread_benchmark_ur MemcpyExecute opsPerThread:400, numThreads:1, allocSize:102400 srcUSM:1 dstUSM:1

Environment Variables:

Command:

/home/pmdk/bench_workdir/compute-benchmarks-build/bin/multithread_benchmark_ur --test=MemcpyExecute --csv --noHeaders --Ioq=1 --UseEvents=1 --MeasureCompletion=1 --UseQueuePerThread=1 --AllocSize=102400 --NumThreads=1 --NumOpsPerThread=400 --iterations=10 --SrcUSM=1 --DstUSM=1

Output:

TestCase,Mean,Median,StdDev,Min,Max,Type
MemcpyExecute(api=ur Ioq=1 NumOpsPerThread=400 NumThreads=1 AllocSize=102400 MeasureCompletion=1 UseEvents=1 UseQueuePerThread=1 SrcUSM=1 DstUSM=1),6951.033,6925.213,0.98%,6898.338,7120.239,[CPU],[us]

multithread_benchmark_ur MemcpyExecute opsPerThread:100, numThreads:8, allocSize:102400 srcUSM:1 dstUSM:1

Environment Variables:

Command:

/home/pmdk/bench_workdir/compute-benchmarks-build/bin/multithread_benchmark_ur --test=MemcpyExecute --csv --noHeaders --Ioq=1 --UseEvents=1 --MeasureCompletion=1 --UseQueuePerThread=1 --AllocSize=102400 --NumThreads=8 --NumOpsPerThread=100 --iterations=10 --SrcUSM=1 --DstUSM=1

Output:

TestCase,Mean,Median,StdDev,Min,Max,Type
MemcpyExecute(api=ur Ioq=1 NumOpsPerThread=100 NumThreads=8 AllocSize=102400 MeasureCompletion=1 UseEvents=1 UseQueuePerThread=1 SrcUSM=1 DstUSM=1),17390.642,17365.414,3.08%,16380.188,18313.733,[CPU],[us]

multithread_benchmark_ur MemcpyExecute opsPerThread:400, numThreads:8, allocSize:1024 srcUSM:1 dstUSM:1

Environment Variables:

Command:

/home/pmdk/bench_workdir/compute-benchmarks-build/bin/multithread_benchmark_ur --test=MemcpyExecute --csv --noHeaders --Ioq=1 --UseEvents=1 --MeasureCompletion=1 --UseQueuePerThread=1 --AllocSize=1024 --NumThreads=8 --NumOpsPerThread=400 --iterations=1000 --SrcUSM=1 --DstUSM=1

Output:

TestCase,Mean,Median,StdDev,Min,Max,Type
MemcpyExecute(api=ur Ioq=1 NumOpsPerThread=400 NumThreads=8 AllocSize=1024 MeasureCompletion=1 UseEvents=1 UseQueuePerThread=1 SrcUSM=1 DstUSM=1),48244.042,48062.201,2.18%,46034.409,55293.616,[CPU],[us]

multithread_benchmark_ur MemcpyExecute opsPerThread:10, numThreads:16, allocSize:1024 srcUSM:1 dstUSM:1

Environment Variables:

Command:

/home/pmdk/bench_workdir/compute-benchmarks-build/bin/multithread_benchmark_ur --test=MemcpyExecute --csv --noHeaders --Ioq=1 --UseEvents=1 --MeasureCompletion=1 --UseQueuePerThread=1 --AllocSize=1024 --NumThreads=16 --NumOpsPerThread=10 --iterations=10000 --SrcUSM=1 --DstUSM=1

Output:

TestCase,Mean,Median,StdDev,Min,Max,Type
MemcpyExecute(api=ur Ioq=1 NumOpsPerThread=10 NumThreads=16 AllocSize=1024 MeasureCompletion=1 UseEvents=1 UseQueuePerThread=1 SrcUSM=1 DstUSM=1),2104.296,2053.234,24.00%,1499.772,16538.909,[CPU],[us]

multithread_benchmark_ur MemcpyExecute opsPerThread:400, numThreads:1, allocSize:102400 srcUSM:0 dstUSM:1

Environment Variables:

Command:

/home/pmdk/bench_workdir/compute-benchmarks-build/bin/multithread_benchmark_ur --test=MemcpyExecute --csv --noHeaders --Ioq=1 --UseEvents=1 --MeasureCompletion=1 --UseQueuePerThread=1 --AllocSize=102400 --NumThreads=1 --NumOpsPerThread=400 --iterations=10 --SrcUSM=0 --DstUSM=1

Output:

TestCase,Mean,Median,StdDev,Min,Max,Type
MemcpyExecute(api=ur Ioq=1 NumOpsPerThread=400 NumThreads=1 AllocSize=102400 MeasureCompletion=1 UseEvents=1 UseQueuePerThread=1 SrcUSM=0 DstUSM=1),7477.409,7434.794,1.74%,7329.287,7736.878,[CPU],[us]

multithread_benchmark_ur MemcpyExecute opsPerThread:100, numThreads:8, allocSize:102400 srcUSM:0 dstUSM:1

Environment Variables:

Command:

/home/pmdk/bench_workdir/compute-benchmarks-build/bin/multithread_benchmark_ur --test=MemcpyExecute --csv --noHeaders --Ioq=1 --UseEvents=1 --MeasureCompletion=1 --UseQueuePerThread=1 --AllocSize=102400 --NumThreads=8 --NumOpsPerThread=100 --iterations=10 --SrcUSM=0 --DstUSM=1

Output:

TestCase,Mean,Median,StdDev,Min,Max,Type
MemcpyExecute(api=ur Ioq=1 NumOpsPerThread=100 NumThreads=8 AllocSize=102400 MeasureCompletion=1 UseEvents=1 UseQueuePerThread=1 SrcUSM=0 DstUSM=1),8727.882,8604.356,4.12%,8400.179,9720.852,[CPU],[us]

multithread_benchmark_ur MemcpyExecute opsPerThread:400, numThreads:8, allocSize:1024 srcUSM:0 dstUSM:1

Environment Variables:

Command:

/home/pmdk/bench_workdir/compute-benchmarks-build/bin/multithread_benchmark_ur --test=MemcpyExecute --csv --noHeaders --Ioq=1 --UseEvents=1 --MeasureCompletion=1 --UseQueuePerThread=1 --AllocSize=1024 --NumThreads=8 --NumOpsPerThread=400 --iterations=1000 --SrcUSM=0 --DstUSM=1

Output:

TestCase,Mean,Median,StdDev,Min,Max,Type
MemcpyExecute(api=ur Ioq=1 NumOpsPerThread=400 NumThreads=8 AllocSize=1024 MeasureCompletion=1 UseEvents=1 UseQueuePerThread=1 SrcUSM=0 DstUSM=1),25993.468,25797.526,2.05%,24795.513,27351.061,[CPU],[us]

multithread_benchmark_ur MemcpyExecute opsPerThread:10, numThreads:16, allocSize:1024 srcUSM:0 dstUSM:1

Environment Variables:

Command:

/home/pmdk/bench_workdir/compute-benchmarks-build/bin/multithread_benchmark_ur --test=MemcpyExecute --csv --noHeaders --Ioq=1 --UseEvents=1 --MeasureCompletion=1 --UseQueuePerThread=1 --AllocSize=1024 --NumThreads=16 --NumOpsPerThread=10 --iterations=10000 --SrcUSM=0 --DstUSM=1

Output:

TestCase,Mean,Median,StdDev,Min,Max,Type
MemcpyExecute(api=ur Ioq=1 NumOpsPerThread=10 NumThreads=16 AllocSize=1024 MeasureCompletion=1 UseEvents=1 UseQueuePerThread=1 SrcUSM=0 DstUSM=1),1241.472,1165.576,49.23%,896.689,15236.910,[CPU],[us]

multithread_benchmark_ur MemcpyExecute opsPerThread:4096, numThreads:1, allocSize:1024 srcUSM:0 dstUSM:1 without events

Environment Variables:

Command:

/home/pmdk/bench_workdir/compute-benchmarks-build/bin/multithread_benchmark_ur --test=MemcpyExecute --csv --noHeaders --Ioq=1 --UseEvents=0 --MeasureCompletion=1 --UseQueuePerThread=1 --AllocSize=1024 --NumThreads=1 --NumOpsPerThread=4096 --iterations=10 --SrcUSM=0 --DstUSM=1

Output:

TestCase,Mean,Median,StdDev,Min,Max,Type
MemcpyExecute(api=ur Ioq=1 NumOpsPerThread=4096 NumThreads=1 AllocSize=1024 MeasureCompletion=1 UseEvents=0 UseQueuePerThread=1 SrcUSM=0 DstUSM=1),40398.887,40367.797,0.58%,40120.149,41032.889,[CPU],[us]

multithread_benchmark_ur MemcpyExecute opsPerThread:4096, numThreads:4, allocSize:1024 srcUSM:0 dstUSM:1 without events

Environment Variables:

Command:

/home/pmdk/bench_workdir/compute-benchmarks-build/bin/multithread_benchmark_ur --test=MemcpyExecute --csv --noHeaders --Ioq=1 --UseEvents=0 --MeasureCompletion=1 --UseQueuePerThread=1 --AllocSize=1024 --NumThreads=4 --NumOpsPerThread=4096 --iterations=10 --SrcUSM=0 --DstUSM=1

Output:

TestCase,Mean,Median,StdDev,Min,Max,Type
MemcpyExecute(api=ur Ioq=1 NumOpsPerThread=4096 NumThreads=4 AllocSize=1024 MeasureCompletion=1 UseEvents=0 UseQueuePerThread=1 SrcUSM=0 DstUSM=1),111945.663,112027.609,0.31%,111302.424,112491.286,[CPU],[us]

api_overhead_benchmark_ur SubmitKernel out of order CPU count

Environment Variables:

Command:

/home/pmdk/bench_workdir/compute-benchmarks-build/bin/api_overhead_benchmark_ur --test=SubmitKernel --csv --noHeaders --Ioq=0 --DiscardEvents=0 --MeasureCompletion=0 --iterations=100000 --Profiling=0 --NumKernels=10 --KernelExecTime=1

Output:

TestCase,Mean,Median,StdDev,Min,Max,Type
SubmitKernel(api=ur Profiling=0 Ioq=0 DiscardEvents=0 NumKernels=10 KernelExecTime=1 MeasureCompletion=0),101892.560,101833.000,6.02%,101725.000,2037188.000,[CPU],hw instructions [count]
SubmitKernel(api=ur Profiling=0 Ioq=0 DiscardEvents=0 NumKernels=10 KernelExecTime=1 MeasureCompletion=0),15.830,15.633,344.85%,15.170,17277.622,[CPU],time [us]

api_overhead_benchmark_ur SubmitKernel out of order

Environment Variables:

Command:

/home/pmdk/bench_workdir/compute-benchmarks-build/bin/api_overhead_benchmark_ur --test=SubmitKernel --csv --noHeaders --Ioq=0 --DiscardEvents=0 --MeasureCompletion=0 --iterations=100000 --Profiling=0 --NumKernels=10 --KernelExecTime=1

Output:

TestCase,Mean,Median,StdDev,Min,Max,Type
SubmitKernel(api=ur Profiling=0 Ioq=0 DiscardEvents=0 NumKernels=10 KernelExecTime=1 MeasureCompletion=0),101892.560,101833.000,6.02%,101725.000,2037188.000,[CPU],hw instructions [count]
SubmitKernel(api=ur Profiling=0 Ioq=0 DiscardEvents=0 NumKernels=10 KernelExecTime=1 MeasureCompletion=0),15.830,15.633,344.85%,15.170,17277.622,[CPU],time [us]

api_overhead_benchmark_ur SubmitKernel in order CPU count

Environment Variables:

Command:

/home/pmdk/bench_workdir/compute-benchmarks-build/bin/api_overhead_benchmark_ur --test=SubmitKernel --csv --noHeaders --Ioq=1 --DiscardEvents=0 --MeasureCompletion=0 --iterations=100000 --Profiling=0 --NumKernels=10 --KernelExecTime=1

Output:

TestCase,Mean,Median,StdDev,Min,Max,Type
SubmitKernel(api=ur Profiling=0 Ioq=1 DiscardEvents=0 NumKernels=10 KernelExecTime=1 MeasureCompletion=0),107017.796,106951.000,3.97%,106951.000,1447342.000,[CPU],hw instructions [count]
SubmitKernel(api=ur Profiling=0 Ioq=1 DiscardEvents=0 NumKernels=10 KernelExecTime=1 MeasureCompletion=0),16.034,15.866,307.56%,15.208,15610.462,[CPU],time [us]

api_overhead_benchmark_ur SubmitKernel in order

Environment Variables:

Command:

/home/pmdk/bench_workdir/compute-benchmarks-build/bin/api_overhead_benchmark_ur --test=SubmitKernel --csv --noHeaders --Ioq=1 --DiscardEvents=0 --MeasureCompletion=0 --iterations=100000 --Profiling=0 --NumKernels=10 --KernelExecTime=1

Output:

TestCase,Mean,Median,StdDev,Min,Max,Type
SubmitKernel(api=ur Profiling=0 Ioq=1 DiscardEvents=0 NumKernels=10 KernelExecTime=1 MeasureCompletion=0),107017.796,106951.000,3.97%,106951.000,1447342.000,[CPU],hw instructions [count]
SubmitKernel(api=ur Profiling=0 Ioq=1 DiscardEvents=0 NumKernels=10 KernelExecTime=1 MeasureCompletion=0),16.047,15.888,283.94%,15.275,14423.851,[CPU],time [us]

Velocity-Bench Hashtable

Environment Variables:

Command:

/home/pmdk/bench_workdir/hashtable/hashtable_sycl --no-verify

Output:

hashtable - total time for whole calculation: 0.352143 s
381.145668 million keys/second

Velocity-Bench Bitcracker

Environment Variables:

Command:

/home/pmdk/bench_workdir/bitcracker/bitcracker -f /home/pmdk/bench_workdir/velocity-bench-repo/bitcracker/hash_pass/img_win8_user_hash.txt -d /home/pmdk/bench_workdir/velocity-bench-repo/bitcracker/hash_pass/user_passwords_60000.txt -b 60000

Output:

---------> BitCracker: BitLocker password cracking tool <---------

==================================
Retrieving Info

Reading hash file "/home/pmdk/bench_workdir/velocity-bench-repo/bitcracker/hash_pass/img_win8_user_hash.txt"

              Attack

================================================
Type of attack: User Password
Psw per thread: 1
max_num_pswd_per_read: 60000
Dictionary: /home/pmdk/bench_workdir/velocity-bench-repo/bitcracker/hash_pass/user_passwords_60000.txt
MAC Comparison (-m): Yes

Iter: 1, num passwords read: 60000
Kernel execution:
Effective passwords: 60000
Passwords Range:
npknpByH7N2m3OnLNH1X9DJxLrzIFWk
.....
dL_7uuf3QCz-c6K3xDu0

================================================
Bitcracker attack completed
Total passwords evaluated: 60000
Password not found!

time to subtract from total: 0.00401978 s
bitcracker - total time for whole calculation: 35.0941 s

Velocity-Bench CudaSift

Environment Variables:

Command:

/home/pmdk/bench_workdir/cudaSift/cudaSift

Output:

UNKN:

UNKN: ==================================================
UNKN: User input parameters:
UNKN: Trace: ../../inputData
UNKN: ==================================================
UNKN:

Image size = (1920,1080)
Initializing data...
Number of original features: 3683 3933
Number of matching features: 1158 1266 31.4418% 1 2

Performing data verification
Data verification is SUCCESSFUL.

Image size = (1920,1080)
Initializing data...
Number of original features: 3683 3933
Number of matching features: 1228 1264 33.3424% 1 2

Performing data verification
Data verification is SUCCESSFUL.

Image size = (1920,1080)
Initializing data...
Number of original features: 3683 3933
Number of matching features: 1229 1261 33.3695% 1 2

Performing data verification
Data verification is SUCCESSFUL.

Image size = (1920,1080)
Initializing data...
Number of original features: 3683 3933
Number of matching features: 1092 1260 29.6497% 1 2

Performing data verification
Data verification is SUCCESSFUL.

Image size = (1920,1080)
Initializing data...
Number of original features: 3683 3933
Number of matching features: 1097 1271 29.7855% 1 2

Performing data verification
Data verification is SUCCESSFUL.

Image size = (1920,1080)
Initializing data...
Number of original features: 3683 3933
Number of matching features: 1165 1279 31.6318% 1 2

Performing data verification
Data verification is SUCCESSFUL.

Image size = (1920,1080)
Initializing data...
Number of original features: 3683 3933
Number of matching features: 1127 1263 30.6001% 1 2

Performing data verification
Data verification is SUCCESSFUL.

Image size = (1920,1080)
Initializing data...
Number of original features: 3683 3933
Number of matching features: 1079 1261 29.2968% 1 2

Performing data verification
Data verification is SUCCESSFUL.

Image size = (1920,1080)
Initializing data...
Number of original features: 3683 3933
Number of matching features: 1058 1265 28.7266% 1 2

Performing data verification
Data verification is SUCCESSFUL.

Image size = (1920,1080)
Initializing data...
Number of original features: 3683 3933
Number of matching features: 1103 1259 29.9484% 1 2

Performing data verification
Data verification is SUCCESSFUL.

Image size = (1920,1080)
Initializing data...
Number of original features: 3683 3933
Number of matching features: 1145 1273 31.0888% 1 2

Performing data verification
Data verification is SUCCESSFUL.

Image size = (1920,1080)
Initializing data...
Number of original features: 3683 3933
Number of matching features: 1234 1266 33.5053% 1 2

Performing data verification
Data verification is SUCCESSFUL.

Image size = (1920,1080)
Initializing data...
Number of original features: 3683 3933
Number of matching features: 1228 1263 33.3424% 1 2

Performing data verification
Data verification is SUCCESSFUL.

Image size = (1920,1080)
Initializing data...
Number of original features: 3683 3933
Number of matching features: 1124 1263 30.5186% 1 2

Performing data verification
Data verification is SUCCESSFUL.

Image size = (1920,1080)
Initializing data...
Number of original features: 3683 3933
Number of matching features: 1236 1269 33.5596% 1 2

Performing data verification
Data verification is SUCCESSFUL.

Image size = (1920,1080)
Initializing data...
Number of original features: 3683 3933
Number of matching features: 1240 1273 33.6682% 1 2

Performing data verification
Data verification is SUCCESSFUL.

Image size = (1920,1080)
Initializing data...
Number of original features: 3683 3933
Number of matching features: 1231 1268 33.4238% 1 2

Performing data verification
Data verification is SUCCESSFUL.

Image size = (1920,1080)
Initializing data...
Number of original features: 3683 3933
Number of matching features: 1063 1257 28.8623% 1 2

Performing data verification
Data verification is SUCCESSFUL.

Image size = (1920,1080)
Initializing data...
Number of original features: 3683 3933
Number of matching features: 1115 1249 30.2742% 1 2

Performing data verification
Data verification is SUCCESSFUL.

Image size = (1920,1080)
Initializing data...
Number of original features: 3683 3933
Number of matching features: 1238 1272 33.6139% 1 2

Performing data verification
Data verification is SUCCESSFUL.

Image size = (1920,1080)
Initializing data...
Number of original features: 3683 3933
Number of matching features: 1084 1253 29.4325% 1 2

Performing data verification
Data verification is SUCCESSFUL.

Image size = (1920,1080)
Initializing data...
Number of original features: 3683 3933
Number of matching features: 1204 1254 32.6907% 1 2

Performing data verification
Data verification is SUCCESSFUL.

Image size = (1920,1080)
Initializing data...
Number of original features: 3683 3933
Number of matching features: 1234 1271 33.5053% 1 2

Performing data verification
Data verification is SUCCESSFUL.

Image size = (1920,1080)
Initializing data...
Number of original features: 3683 3933
Number of matching features: 1228 1263 33.3424% 1 2

Performing data verification
Data verification is SUCCESSFUL.

Image size = (1920,1080)
Initializing data...
Number of original features: 3683 3933
Number of matching features: 1074 1257 29.161% 1 2

Performing data verification
Data verification is SUCCESSFUL.

Image size = (1920,1080)
Initializing data...
Number of original features: 3683 3933
Number of matching features: 1115 1274 30.2742% 1 2

Performing data verification
Data verification is SUCCESSFUL.

Image size = (1920,1080)
Initializing data...
Number of original features: 3683 3933
Number of matching features: 1095 1261 29.7312% 1 2

Performing data verification
Data verification is SUCCESSFUL.

Image size = (1920,1080)
Initializing data...
Number of original features: 3683 3933
Number of matching features: 1239 1274 33.6411% 1 2

Performing data verification
Data verification is SUCCESSFUL.

Image size = (1920,1080)
Initializing data...
Number of original features: 3683 3933
Number of matching features: 1108 1258 30.0842% 1 2

Performing data verification
Data verification is SUCCESSFUL.

Image size = (1920,1080)
Initializing data...
Number of original features: 3683 3933
Number of matching features: 1108 1268 30.0842% 1 2

Performing data verification
Data verification is SUCCESSFUL.

Image size = (1920,1080)
Initializing data...
Number of original features: 3683 3933
Number of matching features: 1139 1271 30.9259% 1 2

Performing data verification
Data verification is SUCCESSFUL.

Image size = (1920,1080)
Initializing data...
Number of original features: 3683 3933
Number of matching features: 1138 1268 30.8987% 1 2

Performing data verification
Data verification is SUCCESSFUL.

Image size = (1920,1080)
Initializing data...
Number of original features: 3683 3933
Number of matching features: 1065 1264 28.9166% 1 2

Performing data verification
Data verification is SUCCESSFUL.

Image size = (1920,1080)
Initializing data...
Number of original features: 3683 3933
Number of matching features: 1135 1269 30.8173% 1 2

Performing data verification
Data verification is SUCCESSFUL.

Image size = (1920,1080)
Initializing data...
Number of original features: 3683 3933
Number of matching features: 1080 1260 29.3239% 1 2

Performing data verification
Data verification is SUCCESSFUL.

Image size = (1920,1080)
Initializing data...
Number of original features: 3683 3933
Number of matching features: 1228 1263 33.3424% 1 2

Performing data verification
Data verification is SUCCESSFUL.

Image size = (1920,1080)
Initializing data...
Number of original features: 3683 3933
Number of matching features: 1229 1263 33.3695% 1 2

Performing data verification
Data verification is SUCCESSFUL.

Image size = (1920,1080)
Initializing data...
Number of original features: 3683 3933
Number of matching features: 1237 1272 33.5868% 1 2

Performing data verification
Data verification is SUCCESSFUL.

Image size = (1920,1080)
Initializing data...
Number of original features: 3683 3933
Number of matching features: 1224 1258 33.2338% 1 2

Performing data verification
Data verification is SUCCESSFUL.

Image size = (1920,1080)
Initializing data...
Number of original features: 3683 3933
Number of matching features: 1216 1277 33.0166% 1 2

Performing data verification
Data verification is SUCCESSFUL.

Image size = (1920,1080)
Initializing data...
Number of original features: 3683 3933
Number of matching features: 1144 1273 31.0616% 1 2

Performing data verification
Data verification is SUCCESSFUL.

Image size = (1920,1080)
Initializing data...
Number of original features: 3683 3933
Number of matching features: 1237 1269 33.5868% 1 2

Performing data verification
Data verification is SUCCESSFUL.

Image size = (1920,1080)
Initializing data...
Number of original features: 3683 3933
Number of matching features: 1115 1260 30.2742% 1 2

Performing data verification
Data verification is SUCCESSFUL.

Image size = (1920,1080)
Initializing data...
Number of original features: 3683 3933
Number of matching features: 1058 1263 28.7266% 1 2

Performing data verification
Data verification is SUCCESSFUL.

Image size = (1920,1080)
Initializing data...
Number of original features: 3683 3933
Number of matching features: 1236 1272 33.5596% 1 2

Performing data verification
Data verification is SUCCESSFUL.

Image size = (1920,1080)
Initializing data...
Number of original features: 3683 3933
Number of matching features: 1179 1268 32.0119% 1 2

Performing data verification
Data verification is SUCCESSFUL.

Image size = (1920,1080)
Initializing data...
Number of original features: 3683 3933
Number of matching features: 1221 1263 33.1523% 1 2

Performing data verification
Data verification is SUCCESSFUL.

Image size = (1920,1080)
Initializing data...
Number of original features: 3683 3933
Number of matching features: 1220 1258 33.1252% 1 2

Performing data verification
Data verification is SUCCESSFUL.

Image size = (1920,1080)
Initializing data...
Number of original features: 3683 3933
Number of matching features: 1244 1277 33.7768% 1 2

Performing data verification
Data verification is SUCCESSFUL.

Image size = (1920,1080)
Initializing data...
Number of original features: 3683 3933
Number of matching features: 1233 1269 33.4781% 1 2

Performing data verification
Data verification is SUCCESSFUL.

Avg workload time = 203.186 ms

Velocity-Bench Easywave

Environment Variables:

Command:

/home/pmdk/bench_workdir/easywave/easyWave_sycl -grid /home/pmdk/bench_workdir/data/easywave/examples/e2Asean.grd -source /home/pmdk/bench_workdir/data/easywave/examples/BengkuluSept2007.flt -time 120

Output:

MAIN: Starting SYCL main program
MAIN: Attempting to clean up previous eWave tsunami files
MAIN: Clean up completed
SYCL: SYCL Queue initialization successful
SYCL: Using SYCL device : Intel(R) Data Center GPU Max 1100 (Driver version 1.3.30049+10)
SYCL: Platform : Intel(R) oneAPI Unified Runtime over Level-Zero
MAIN: Program successfully completed

Velocity-Bench QuickSilver

Environment Variables:

QS_DEVICE=GPU

Command:

/home/pmdk/bench_workdir/QuickSilver/qs -i /home/pmdk/bench_workdir/velocity-bench-repo/QuickSilver/Examples/AllScattering/scatteringOnly.inp

Output:

Copyright (c) 2016
Lawrence Livermore National Security, LLC
All Rights Reserved
Quicksilver Version :
Quicksilver Git Hash :
MPI Version : 3.0
Number of MPI ranks : 1
Number of OpenMP Threads: 1
Number of OpenMP CPUs : 1

Loading params
Finished loading params
Simulation:
dt: 1e-08
fMax: 0.1
inputFile: /home/pmdk/bench_workdir/velocity-bench-repo/QuickSilver/Examples/AllScattering/scatteringOnly.inp
energySpectrum:
boundaryCondition: octant
loadBalance: 1
cycleTimers: 0
debugThreads: 0
lx: 100
ly: 100
lz: 100
nParticles: 10000000
batchSize: 0
nBatches: 10
nSteps: 10
nx: 10
ny: 10
nz: 10
seed: 1029384756
xDom: 0
yDom: 0
zDom: 0
eMax: 20
eMin: 1e-09
nGroups: 230
lowWeightCutoff: 0.001
bTally: 1
fTally: 1
cTally: 1
coralBenchmark: 0
crossSectionsOut:

Geometry:
material: sourceMaterial
shape: brick
xMax: 100
xMin: 0
yMax: 100
yMin: 0
zMax: 100
zMin: 0

Material:
name: sourceMaterial
mass: 1000
nIsotopes: 10
nReactions: 9
sourceRate: 1e+10
totalCrossSection: 0.1
absorptionCrossSection: flat
fissionCrossSection: flat
scatteringCrossSection: flat
absorptionCrossSectionRatio: 0
fissionCrossSectionRatio: 0
scatteringCrossSectionRatio: 1

CrossSection:
name: flat
A: 0
B: 0
C: 0
D: 0
E: 1
nuBar: 2.4
setting GPU
setting parameters
Building partition 0
Building partition 1
Building partition 2
Building partition 3
Building MC_Domain 0
Building MC_Domain 1
Building MC_Domain 2
Building MC_Domain 3
Starting Consistency Check
Finished Consistency Check
Finished initMesh
Started copyMaterialDatabase_device
Finished copyMaterialDatabase_device
Finished copyNuclearData_device
Finished copyDomainDevice
cycle start source rr split absorb scatter fission produce collisn escape census num_seg scalar_flux cycleInit cycleTracking cycleFinalize
0 0 1000000 0 9000000 0 18533189 0 0 18533189 1151780 8848220 55527935 1.854923e+09 4.301550e-01 6.131320e-01 0.000000e+00
1 8848220 1000000 0 151478 0 34281997 0 0 34281997 1664159 8335539 94633679 5.047651e+09 3.652360e-01 7.542150e-01 0.000000e+00
2 8335539 1000000 0 663717 0 34354432 0 0 34354432 1366771 8632485 95010375 7.705930e+09 3.607410e-01 7.676120e-01 0.000000e+00
3 8632485 1000000 0 367978 0 34302727 0 0 34302727 1242216 8758247 94953591 9.992076e+09 3.912330e-01 8.148650e-01 0.000000e+00
4 8758247 1000000 0 242076 0 34141236 0 0 34141236 1168452 8831871 94599337 1.199834e+10 3.434610e-01 7.891050e-01 0.000000e+00
5 8831871 1000000 0 168070 0 33948724 0 0 33948724 1121156 8878785 94148236 1.377636e+10 3.449110e-01 7.717540e-01 0.000000e+00
6 8878785 1000000 0 120572 0 33760567 0 0 33760567 1089103 8910254 93689264 1.535668e+10 3.453660e-01 7.807020e-01 0.000000e+00
7 8910254 1000000 0 89810 0 33552179 0 0 33552179 1065203 8934861 93216931 1.676993e+10 3.448190e-01 8.054990e-01 0.000000e+00
8 8934861 1000000 0 65491 0 33384605 0 0 33384605 1047720 8952632 92768273 1.804559e+10 3.409150e-01 7.824040e-01 0.000000e+00
9 8952632 1000000 0 47165 0 33198494 0 0 33198494 1033968 8965829 92324678 1.920208e+10 3.428270e-01 7.576950e-01 0.000000e+00

Timer Cumulative Cumulative Cumulative Cumulative Cumulative Cumulative
Name number microSecs microSecs microSecs microSecs Efficiency
of calls min avg max stddev Rating
main 1 1.125e+07 1.125e+07 1.125e+07 0.000e+00 100.00
cycleInit 10 3.610e+06 3.610e+06 3.610e+06 0.000e+00 100.00
cycleTracking 10 7.637e+06 7.637e+06 7.637e+06 0.000e+00 100.00
cycleTracking_Kernel 104 4.922e+06 4.922e+06 4.922e+06 0.000e+00 100.00
cycleTracking_MPI 117 2.175e+05 2.175e+05 2.175e+05 0.000e+00 100.00
cycleTracking_Test_Done 0 0.000e+00 0.000e+00 0.000e+00 0.000e+00 0.00
cycleFinalize 20 4.070e+02 4.070e+02 4.070e+02 0.000e+00 100.00
Figure Of Merit 117.96 [Num Mega Segments / Cycle Tracking Time]

Velocity-Bench Sobel Filter

Environment Variables:

OPENCV_IO_MAX_IMAGE_PIXELS=1677721600

Command:

/home/pmdk/bench_workdir/sobel_filter/sobel_filter -i /home/pmdk/bench_workdir/data/sobel_filter/sobel_filter_data/silverfalls_32Kx32K.png -n 5

Output:

SYMN: Welcome to the SYCL version of Sobel filter workload.
SYMN: Input image file: /home/pmdk/bench_workdir/data/sobel_filter/sobel_filter_data/silverfalls_32Kx32K.png
SYMN: Launching SYCL kernel with # of iterations: 5
time to subtract from total: 7.43251 s
sobelfilter - total time for whole calculation: 0.532796 s

Velocity-Bench dl-cifar

Environment Variables:

Command:

/home/pmdk/bench_workdir/dl-cifar/dl-cifar_sycl

Output:

	Welcome to DL-CIFAR workload: SYCL version.

=======================================================================
SYCL: SYCL Queue initialization successful
SYCL: Using SYCL device : Intel(R) Data Center GPU Max 1100 (Driver version 1.3.30049+10)
SYCL: Platform : Intel(R) oneAPI Unified Runtime over Level-Zero
SYCL: Using SYCL device : Intel(R) Data Center GPU Max 1100 (Driver version 1.3.30049+10)
SYCL: Platform : Intel(R) oneAPI Unified Runtime over Level-Zero

WL PARAMS:

WL PARAMS: ==================================================
WL PARAMS: User input parameters:
WL PARAMS: Trace: notrace
WL PARAMS: DL NW size type: WORKLOAD_DEFAULT_SIZE
WL PARAMS: ==================================================
WL PARAMS:

dataFileReadTimer->getTotalOpTime(): 8.8e-05 s
dl-cifar - total time for whole calculation: 23.2001 s

Velocity-Bench dl-mnist

Environment Variables:

NEOReadDebugKeys=1
DisableScratchPages=0

Command:

/home/pmdk/bench_workdir/dl-mnist/dl-mnist-sycl -conv_algo ONEDNN_AUTO

Output:

	Welcome to DL-MNIST workload: SYCL version.

=======================================================================
SYCL: SYCL Queue initialization successful
SYCL: Using SYCL device : Intel(R) Data Center GPU Max 1100 (Driver version 1.3.30049+10)
SYCL: Platform : Intel(R) oneAPI Unified Runtime over Level-Zero
SYCL: Using SYCL device : Intel(R) Data Center GPU Max 1100 (Driver version 1.3.30049+10)
SYCL: Platform : Intel(R) oneAPI Unified Runtime over Level-Zero

WL PARAMS:

WL PARAMS: ==================================================
WL PARAMS: User input parameters:
WL PARAMS: Trace: notrace
WL PARAMS: Tensor management policy: per_layer
WL PARAMS: Convolution algorithm: ONEDNN_AUTO
WL PARAMS: Dataset reader format: NCHW
WL PARAMS: Dry run: YES
WL PARAMS: OneDNN Conv PD memory format: ONEDNN_CONVPD_ANY
WL PARAMS: No of iterations for inference: 400
WL PARAMS: ==================================================
WL PARAMS:

dl-mnist - total time for whole calculation: 2.72 s

Velocity-Bench svm

Environment Variables:

Command:

/home/pmdk/bench_workdir/svm/svm_sycl /home/pmdk/bench_workdir/velocity-bench-repo/svm/SYCL/a9a /home/pmdk/bench_workdir/velocity-bench-repo/svm/SYCL/a.m

Output:

Number of args 3
Using cuSVM (Carpenter)...

Buffering input text file (6989624 B).
Load Done
Starting Training
_C 1.000000
Workgroup Size: 1024
nbrCtas 80
elemsPerCta 1248
threadsPerCta 128
Total run time: 0.064021 seconds
Iter:100
M:97683
N:123
Train done. Calulate Vector counts
Training done

Loading elapsed time : 0.0644 s
Processing elapsed time : 0.0692 s
Storing elapsed time : 0.0021 s
Total elapsed time : 0.1357 s
Result's are correct: 0.0551

Runtime_IndependentDAGTaskThroughput_SingleTask

Environment Variables:

Command:

/home/pmdk/bench_workdir/sycl-bench-build/dag_task_throughput_independent --warmup-run --num-runs=3 --output=/home/pmdk/bench_workdir/IndependentDAGTaskThroughput_multi.csv --size=32768

Output:

['Runtime_IndependentDAGTaskThroughput_SingleTask', 'PASS', 'Intel(R) Data Center GPU Max 1100', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', '256', '32768', '0.263062', '0.264020', '0.260993', '0.260993 0.264020 0.264174', '0.001794', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'LLVM (Intel DPC++)', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A']

Runtime_IndependentDAGTaskThroughput_BasicParallelFor

Environment Variables:

Command:

/home/pmdk/bench_workdir/sycl-bench-build/dag_task_throughput_independent --warmup-run --num-runs=3 --output=/home/pmdk/bench_workdir/IndependentDAGTaskThroughput_multi.csv --size=32768

Output:

['Runtime_IndependentDAGTaskThroughput_BasicParallelFor', 'PASS', 'Intel(R) Data Center GPU Max 1100', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', '256', '32768', '0.280337', '0.279757', '0.271462', '0.271462 0.279757 0.289791', '0.009178', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'LLVM (Intel DPC++)', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A']

Runtime_IndependentDAGTaskThroughput_HierarchicalParallelFor

Environment Variables:

Command:

/home/pmdk/bench_workdir/sycl-bench-build/dag_task_throughput_independent --warmup-run --num-runs=3 --output=/home/pmdk/bench_workdir/IndependentDAGTaskThroughput_multi.csv --size=32768

Output:

['Runtime_IndependentDAGTaskThroughput_HierarchicalParallelFor', 'PASS', 'Intel(R) Data Center GPU Max 1100', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', '256', '32768', '0.275339', '0.275008', '0.272290', '0.272290 0.275008 0.278719', '0.003227', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'LLVM (Intel DPC++)', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A']

Runtime_IndependentDAGTaskThroughput_NDRangeParallelFor

Environment Variables:

Command:

/home/pmdk/bench_workdir/sycl-bench-build/dag_task_throughput_independent --warmup-run --num-runs=3 --output=/home/pmdk/bench_workdir/IndependentDAGTaskThroughput_multi.csv --size=32768

Output:

['Runtime_IndependentDAGTaskThroughput_NDRangeParallelFor', 'PASS', 'Intel(R) Data Center GPU Max 1100', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', '256', '32768', '0.281537', '0.274650', '0.272179', '0.272179 0.274650 0.297782', '0.014123', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'LLVM (Intel DPC++)', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A']

Runtime_DAGTaskThroughput_SingleTask

Environment Variables:

Command:

/home/pmdk/bench_workdir/sycl-bench-build/dag_task_throughput_sequential --warmup-run --num-runs=3 --output=/home/pmdk/bench_workdir/DAGTaskThroughput_multi.csv --size=327680

Output:

['Runtime_DAGTaskThroughput_SingleTask', 'PASS', 'Intel(R) Data Center GPU Max 1100', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', '256', '327680', '1.651149', '1.653391', '1.640569', '1.640569 1.653391 1.659488', '0.009657', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'LLVM (Intel DPC++)', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A']

Runtime_DAGTaskThroughput_BasicParallelFor

Environment Variables:

Command:

/home/pmdk/bench_workdir/sycl-bench-build/dag_task_throughput_sequential --warmup-run --num-runs=3 --output=/home/pmdk/bench_workdir/DAGTaskThroughput_multi.csv --size=327680

Output:

['Runtime_DAGTaskThroughput_BasicParallelFor', 'PASS', 'Intel(R) Data Center GPU Max 1100', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', '256', '327680', '1.726565', '1.725546', '1.722776', '1.722776 1.725546 1.731373', '0.004389', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'LLVM (Intel DPC++)', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A']

Runtime_DAGTaskThroughput_HierarchicalParallelFor

Environment Variables:

Command:

/home/pmdk/bench_workdir/sycl-bench-build/dag_task_throughput_sequential --warmup-run --num-runs=3 --output=/home/pmdk/bench_workdir/DAGTaskThroughput_multi.csv --size=327680

Output:

['Runtime_DAGTaskThroughput_HierarchicalParallelFor', 'PASS', 'Intel(R) Data Center GPU Max 1100', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', '256', '327680', '1.701583', '1.702714', '1.694444', '1.694444 1.702714 1.707591', '0.006646', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'LLVM (Intel DPC++)', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A']

Runtime_DAGTaskThroughput_NDRangeParallelFor

Environment Variables:

Command:

/home/pmdk/bench_workdir/sycl-bench-build/dag_task_throughput_sequential --warmup-run --num-runs=3 --output=/home/pmdk/bench_workdir/DAGTaskThroughput_multi.csv --size=327680

Output:

['Runtime_DAGTaskThroughput_NDRangeParallelFor', 'PASS', 'Intel(R) Data Center GPU Max 1100', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', '256', '327680', '1.662935', '1.664018', '1.659255', '1.659255 1.664018 1.665531', '0.003275', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'LLVM (Intel DPC++)', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A']

MicroBench_HostDeviceBandwidth_1D_H2D_Contiguous

Environment Variables:

Command:

/home/pmdk/bench_workdir/sycl-bench-build/host_device_bandwidth --warmup-run --num-runs=3 --output=/home/pmdk/bench_workdir/HostDeviceBandwidth_multi.csv --size=512

Output:

['MicroBench_HostDeviceBandwidth_1D_H2D_Contiguous', 'PASS', 'Intel(R) Data Center GPU Max 1100', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', '256', '512', '0.004933', '0.004419', '0.004396', '0.004396 0.004419 0.005983', '0.000910', '28.436765', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'LLVM (Intel DPC++)', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', '0.125000']

MicroBench_HostDeviceBandwidth_2D_H2D_Contiguous

Environment Variables:

Command:

/home/pmdk/bench_workdir/sycl-bench-build/host_device_bandwidth --warmup-run --num-runs=3 --output=/home/pmdk/bench_workdir/HostDeviceBandwidth_multi.csv --size=512

Output:

['MicroBench_HostDeviceBandwidth_2D_H2D_Contiguous', 'PASS', 'Intel(R) Data Center GPU Max 1100', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', '256', '512', '0.004436', '0.004447', '0.004351', '0.004351 0.004447 0.004512', '0.000081', '28.732250', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'LLVM (Intel DPC++)', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', '0.125000']

MicroBench_HostDeviceBandwidth_3D_H2D_Contiguous

Environment Variables:

Command:

/home/pmdk/bench_workdir/sycl-bench-build/host_device_bandwidth --warmup-run --num-runs=3 --output=/home/pmdk/bench_workdir/HostDeviceBandwidth_multi.csv --size=512

Output:

['MicroBench_HostDeviceBandwidth_3D_H2D_Contiguous', 'PASS', 'Intel(R) Data Center GPU Max 1100', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', '256', '512', '0.004519', '0.004493', '0.004491', '0.004491 0.004493 0.004572', '0.000046', '27.831629', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'LLVM (Intel DPC++)', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', '0.125000']

MicroBench_HostDeviceBandwidth_1D_D2H_Contiguous

Environment Variables:

Command:

/home/pmdk/bench_workdir/sycl-bench-build/host_device_bandwidth --warmup-run --num-runs=3 --output=/home/pmdk/bench_workdir/HostDeviceBandwidth_multi.csv --size=512

Output:

['MicroBench_HostDeviceBandwidth_1D_D2H_Contiguous', 'PASS', 'Intel(R) Data Center GPU Max 1100', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', '256', '512', '0.004672', '0.004640', '0.004635', '0.004635 0.004640 0.004739', '0.000059', '26.968681', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'LLVM (Intel DPC++)', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', '0.125000']

MicroBench_HostDeviceBandwidth_2D_D2H_Contiguous

Environment Variables:

Command:

/home/pmdk/bench_workdir/sycl-bench-build/host_device_bandwidth --warmup-run --num-runs=3 --output=/home/pmdk/bench_workdir/HostDeviceBandwidth_multi.csv --size=512

Output:

['MicroBench_HostDeviceBandwidth_2D_D2H_Contiguous', 'PASS', 'Intel(R) Data Center GPU Max 1100', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', '256', '512', '0.618089', '0.618152', '0.617938', '0.617938 0.618152 0.618178', '0.000132', '0.202286', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'LLVM (Intel DPC++)', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', '0.125000']

MicroBench_HostDeviceBandwidth_3D_D2H_Contiguous

Environment Variables:

Command:

/home/pmdk/bench_workdir/sycl-bench-build/host_device_bandwidth --warmup-run --num-runs=3 --output=/home/pmdk/bench_workdir/HostDeviceBandwidth_multi.csv --size=512

Output:

['MicroBench_HostDeviceBandwidth_3D_D2H_Contiguous', 'PASS', 'Intel(R) Data Center GPU Max 1100', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', '256', '512', '0.618168', '0.618166', '0.618157', '0.618157 0.618166 0.618182', '0.000013', '0.202214', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'LLVM (Intel DPC++)', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', '0.125000']

MicroBench_HostDeviceBandwidth_1D_H2D_Strided

Environment Variables:

Command:

/home/pmdk/bench_workdir/sycl-bench-build/host_device_bandwidth --warmup-run --num-runs=3 --output=/home/pmdk/bench_workdir/HostDeviceBandwidth_multi.csv --size=512

Output:

['MicroBench_HostDeviceBandwidth_1D_H2D_Strided', 'PASS', 'Intel(R) Data Center GPU Max 1100', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', '256', '512', '0.004337', '0.004319', '0.004268', '0.004268 0.004319 0.004423', '0.000079', '29.286666', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'LLVM (Intel DPC++)', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', '0.125000']

MicroBench_HostDeviceBandwidth_2D_H2D_Strided

Environment Variables:

Command:

/home/pmdk/bench_workdir/sycl-bench-build/host_device_bandwidth --warmup-run --num-runs=3 --output=/home/pmdk/bench_workdir/HostDeviceBandwidth_multi.csv --size=512

Output:

['MicroBench_HostDeviceBandwidth_2D_H2D_Strided', 'PASS', 'Intel(R) Data Center GPU Max 1100', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', '256', '512', '0.004550', '0.004572', '0.004472', '0.004472 0.004572 0.004607', '0.000070', '27.950206', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'LLVM (Intel DPC++)', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', '0.125000']

MicroBench_HostDeviceBandwidth_3D_H2D_Strided

Environment Variables:

Command:

/home/pmdk/bench_workdir/sycl-bench-build/host_device_bandwidth --warmup-run --num-runs=3 --output=/home/pmdk/bench_workdir/HostDeviceBandwidth_multi.csv --size=512

Output:

['MicroBench_HostDeviceBandwidth_3D_H2D_Strided', 'PASS', 'Intel(R) Data Center GPU Max 1100', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', '256', '512', '0.004609', '0.004604', '0.004598', '0.004598 0.004604 0.004626', '0.000014', '27.183652', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'LLVM (Intel DPC++)', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', '0.125000']

MicroBench_HostDeviceBandwidth_1D_D2H_Strided

Environment Variables:

Command:

/home/pmdk/bench_workdir/sycl-bench-build/host_device_bandwidth --warmup-run --num-runs=3 --output=/home/pmdk/bench_workdir/HostDeviceBandwidth_multi.csv --size=512

Output:

['MicroBench_HostDeviceBandwidth_1D_D2H_Strided', 'PASS', 'Intel(R) Data Center GPU Max 1100', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', '256', '512', '0.004621', '0.004614', '0.004611', '0.004611 0.004614 0.004638', '0.000015', '27.110110', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'LLVM (Intel DPC++)', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', '0.125000']

MicroBench_HostDeviceBandwidth_2D_D2H_Strided

Environment Variables:

Command:

/home/pmdk/bench_workdir/sycl-bench-build/host_device_bandwidth --warmup-run --num-runs=3 --output=/home/pmdk/bench_workdir/HostDeviceBandwidth_multi.csv --size=512

Output:

['MicroBench_HostDeviceBandwidth_2D_D2H_Strided', 'PASS', 'Intel(R) Data Center GPU Max 1100', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', '256', '512', '0.617409', '0.617440', '0.617324', '0.617324 0.617440 0.617461', '0.000074', '0.202487', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'LLVM (Intel DPC++)', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', '0.125000']

MicroBench_HostDeviceBandwidth_3D_D2H_Strided

Environment Variables:

Command:

/home/pmdk/bench_workdir/sycl-bench-build/host_device_bandwidth --warmup-run --num-runs=3 --output=/home/pmdk/bench_workdir/HostDeviceBandwidth_multi.csv --size=512

Output:

['MicroBench_HostDeviceBandwidth_3D_D2H_Strided', 'PASS', 'Intel(R) Data Center GPU Max 1100', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', '256', '512', '0.617402', '0.617337', '0.617320', '0.617320 0.617337 0.617549', '0.000128', '0.202488', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'LLVM (Intel DPC++)', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', '0.125000']

MicroBench_LocalMem_int32_4096

Environment Variables:

Command:

/home/pmdk/bench_workdir/sycl-bench-build/local_mem --warmup-run --num-runs=3 --output=/home/pmdk/bench_workdir/LocalMem_multi.csv --size=10240000

Output:

['MicroBench_LocalMem_int32_4096', 'PASS', 'Intel(R) Data Center GPU Max 1100', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', '256', '10240000', '0.029914', '0.029911', '0.029895', '0.029895 0.029911 0.029935', '0.000020', '10436.433939', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'LLVM (Intel DPC++)', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', '312.000000']

MicroBench_LocalMem_fp32_4096

Environment Variables:

Command:

/home/pmdk/bench_workdir/sycl-bench-build/local_mem --warmup-run --num-runs=3 --output=/home/pmdk/bench_workdir/LocalMem_multi.csv --size=10240000

Output:

['MicroBench_LocalMem_fp32_4096', 'PASS', 'Intel(R) Data Center GPU Max 1100', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', '256', '10240000', '0.029908', '0.029912', '0.029827', '0.029827 0.029912 0.029985', '0.000079', '10460.215977', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'LLVM (Intel DPC++)', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', '312.000000']

Pattern_Reduction_NDRange_int32

Environment Variables:

Command:

/home/pmdk/bench_workdir/sycl-bench-build/reduction --warmup-run --num-runs=3 --output=/home/pmdk/bench_workdir/Pattern_Reduction_multi.csv --size=10240000

Output:

['Pattern_Reduction_NDRange_int32', 'PASS', 'Intel(R) Data Center GPU Max 1100', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', '256', '10240000', '0.016785', '0.016870', '0.016360', '0.016360 0.016870 0.017127', '0.000390', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'LLVM (Intel DPC++)', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A']

Pattern_Reduction_Hierarchical_int32

Environment Variables:

Command:

/home/pmdk/bench_workdir/sycl-bench-build/reduction --warmup-run --num-runs=3 --output=/home/pmdk/bench_workdir/Pattern_Reduction_multi.csv --size=10240000

Output:

['Pattern_Reduction_Hierarchical_int32', 'PASS', 'Intel(R) Data Center GPU Max 1100', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', '256', '10240000', '0.016900', '0.016824', '0.016810', '0.016810 0.016824 0.017065', '0.000143', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'LLVM (Intel DPC++)', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A']

ScalarProduct_NDRange_int32

Environment Variables:

Command:

/home/pmdk/bench_workdir/sycl-bench-build/scalar_prod --warmup-run --num-runs=3 --output=/home/pmdk/bench_workdir/ScalarProduct_multi.csv --size=102400000

Output:

['ScalarProduct_NDRange_int32', 'PASS', 'Intel(R) Data Center GPU Max 1100', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', '256', '102400000', '0.003884', '0.003902', '0.003824', '0.003824 0.003902 0.003926', '0.000053', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'LLVM (Intel DPC++)', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A']

ScalarProduct_NDRange_int64

Environment Variables:

Command:

/home/pmdk/bench_workdir/sycl-bench-build/scalar_prod --warmup-run --num-runs=3 --output=/home/pmdk/bench_workdir/ScalarProduct_multi.csv --size=102400000

Output:

['ScalarProduct_NDRange_int64', 'PASS', 'Intel(R) Data Center GPU Max 1100', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', '256', '102400000', '0.005540', '0.005553', '0.005512', '0.005512 0.005553 0.005554', '0.000024', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'LLVM (Intel DPC++)', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A']

ScalarProduct_NDRange_fp32

Environment Variables:

Command:

/home/pmdk/bench_workdir/sycl-bench-build/scalar_prod --warmup-run --num-runs=3 --output=/home/pmdk/bench_workdir/ScalarProduct_multi.csv --size=102400000

Output:

['ScalarProduct_NDRange_fp32', 'PASS', 'Intel(R) Data Center GPU Max 1100', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', '256', '102400000', '0.003835', '0.003806', '0.003746', '0.003746 0.003806 0.003952', '0.000106', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'LLVM (Intel DPC++)', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A']

ScalarProduct_Hierarchical_int32

Environment Variables:

Command:

/home/pmdk/bench_workdir/sycl-bench-build/scalar_prod --warmup-run --num-runs=3 --output=/home/pmdk/bench_workdir/ScalarProduct_multi.csv --size=102400000

Output:

['ScalarProduct_Hierarchical_int32', 'PASS', 'Intel(R) Data Center GPU Max 1100', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', '256', '102400000', '0.010540', '0.010543', '0.010516', '0.010516 0.010543 0.010560', '0.000022', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'LLVM (Intel DPC++)', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A']

ScalarProduct_Hierarchical_int64

Environment Variables:

Command:

/home/pmdk/bench_workdir/sycl-bench-build/scalar_prod --warmup-run --num-runs=3 --output=/home/pmdk/bench_workdir/ScalarProduct_multi.csv --size=102400000

Output:

['ScalarProduct_Hierarchical_int64', 'PASS', 'Intel(R) Data Center GPU Max 1100', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', '256', '102400000', '0.011489', '0.011488', '0.011487', '0.011487 0.011488 0.011492', '0.000003', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'LLVM (Intel DPC++)', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A']

ScalarProduct_Hierarchical_fp32

Environment Variables:

Command:

/home/pmdk/bench_workdir/sycl-bench-build/scalar_prod --warmup-run --num-runs=3 --output=/home/pmdk/bench_workdir/ScalarProduct_multi.csv --size=102400000

Output:

['ScalarProduct_Hierarchical_fp32', 'PASS', 'Intel(R) Data Center GPU Max 1100', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', '256', '102400000', '0.010171', '0.010180', '0.010138', '0.010138 0.010180 0.010195', '0.000030', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'LLVM (Intel DPC++)', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A']

Pattern_SegmentedReduction_NDRange_int16

Environment Variables:

Command:

/home/pmdk/bench_workdir/sycl-bench-build/segmentedreduction --warmup-run --num-runs=3 --output=/home/pmdk/bench_workdir/Pattern_SegmentedReduction_multi.csv --size=102400000

Output:

['Pattern_SegmentedReduction_NDRange_int16', 'PASS', 'Intel(R) Data Center GPU Max 1100', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', '256', '102400000', '0.002276', '0.002269', '0.002269', '0.002269 0.002269 0.002290', '0.000012', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'LLVM (Intel DPC++)', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A']

Pattern_SegmentedReduction_NDRange_int32

Environment Variables:

Command:

/home/pmdk/bench_workdir/sycl-bench-build/segmentedreduction --warmup-run --num-runs=3 --output=/home/pmdk/bench_workdir/Pattern_SegmentedReduction_multi.csv --size=102400000

Output:

['Pattern_SegmentedReduction_NDRange_int32', 'PASS', 'Intel(R) Data Center GPU Max 1100', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', '256', '102400000', '0.002172', '0.002169', '0.002167', '0.002167 0.002169 0.002181', '0.000008', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'LLVM (Intel DPC++)', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A']

Pattern_SegmentedReduction_NDRange_int64

Environment Variables:

Command:

/home/pmdk/bench_workdir/sycl-bench-build/segmentedreduction --warmup-run --num-runs=3 --output=/home/pmdk/bench_workdir/Pattern_SegmentedReduction_multi.csv --size=102400000

Output:

['Pattern_SegmentedReduction_NDRange_int64', 'PASS', 'Intel(R) Data Center GPU Max 1100', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', '256', '102400000', '0.002348', '0.002345', '0.002344', '0.002344 0.002345 0.002354', '0.000005', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'LLVM (Intel DPC++)', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A']

Pattern_SegmentedReduction_NDRange_fp32

Environment Variables:

Command:

/home/pmdk/bench_workdir/sycl-bench-build/segmentedreduction --warmup-run --num-runs=3 --output=/home/pmdk/bench_workdir/Pattern_SegmentedReduction_multi.csv --size=102400000

Output:

['Pattern_SegmentedReduction_NDRange_fp32', 'PASS', 'Intel(R) Data Center GPU Max 1100', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', '256', '102400000', '0.002173', '0.002175', '0.002167', '0.002167 0.002175 0.002178', '0.000006', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'LLVM (Intel DPC++)', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A']

Pattern_SegmentedReduction_Hierarchical_int16

Environment Variables:

Command:

/home/pmdk/bench_workdir/sycl-bench-build/segmentedreduction --warmup-run --num-runs=3 --output=/home/pmdk/bench_workdir/Pattern_SegmentedReduction_multi.csv --size=102400000

Output:

['Pattern_SegmentedReduction_Hierarchical_int16', 'PASS', 'Intel(R) Data Center GPU Max 1100', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', '256', '102400000', '0.011811', '0.011806', '0.011799', '0.011799 0.011806 0.011826', '0.000014', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'LLVM (Intel DPC++)', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A']

Pattern_SegmentedReduction_Hierarchical_int32

Environment Variables:

Command:

/home/pmdk/bench_workdir/sycl-bench-build/segmentedreduction --warmup-run --num-runs=3 --output=/home/pmdk/bench_workdir/Pattern_SegmentedReduction_multi.csv --size=102400000

Output:

['Pattern_SegmentedReduction_Hierarchical_int32', 'PASS', 'Intel(R) Data Center GPU Max 1100', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', '256', '102400000', '0.011599', '0.011597', '0.011592', '0.011592 0.011597 0.011608', '0.000008', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'LLVM (Intel DPC++)', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A']

Pattern_SegmentedReduction_Hierarchical_int64

Environment Variables:

Command:

/home/pmdk/bench_workdir/sycl-bench-build/segmentedreduction --warmup-run --num-runs=3 --output=/home/pmdk/bench_workdir/Pattern_SegmentedReduction_multi.csv --size=102400000

Output:

['Pattern_SegmentedReduction_Hierarchical_int64', 'PASS', 'Intel(R) Data Center GPU Max 1100', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', '256', '102400000', '0.011791', '0.011791', '0.011789', '0.011789 0.011791 0.011794', '0.000002', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'LLVM (Intel DPC++)', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A']

Pattern_SegmentedReduction_Hierarchical_fp32

Environment Variables:

Command:

/home/pmdk/bench_workdir/sycl-bench-build/segmentedreduction --warmup-run --num-runs=3 --output=/home/pmdk/bench_workdir/Pattern_SegmentedReduction_multi.csv --size=102400000

Output:

['Pattern_SegmentedReduction_Hierarchical_fp32', 'PASS', 'Intel(R) Data Center GPU Max 1100', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', '256', '102400000', '0.011600', '0.011597', '0.011596', '0.011596 0.011597 0.011607', '0.000006', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'LLVM (Intel DPC++)', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A']

USM_Allocation_latency_fp32_device

Environment Variables:

Command:

/home/pmdk/bench_workdir/sycl-bench-build/usm_allocation_latency --warmup-run --num-runs=3 --output=/home/pmdk/bench_workdir/USM_Allocation_latency_multi.csv --size=1024000000

Output:

['USM_Allocation_latency_fp32_device', 'PASS', 'Intel(R) Data Center GPU Max 1100', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', '256', '1024000000', '0.000063', '0.000069', '0.000046', '0.000046 0.000069 0.000073', '0.000015', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'LLVM (Intel DPC++)', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A']

USM_Allocation_latency_fp32_host

Environment Variables:

Command:

/home/pmdk/bench_workdir/sycl-bench-build/usm_allocation_latency --warmup-run --num-runs=3 --output=/home/pmdk/bench_workdir/USM_Allocation_latency_multi.csv --size=1024000000

Output:

['USM_Allocation_latency_fp32_host', 'PASS', 'Intel(R) Data Center GPU Max 1100', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', '256', '1024000000', '0.037429', '0.037328', '0.037322', '0.037322 0.037328 0.037638', '0.000181', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'LLVM (Intel DPC++)', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A']

USM_Allocation_latency_fp32_shared

Environment Variables:

Command:

/home/pmdk/bench_workdir/sycl-bench-build/usm_allocation_latency --warmup-run --num-runs=3 --output=/home/pmdk/bench_workdir/USM_Allocation_latency_multi.csv --size=1024000000

Output:

['USM_Allocation_latency_fp32_shared', 'PASS', 'Intel(R) Data Center GPU Max 1100', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', '256', '1024000000', '0.000070', '0.000075', '0.000056', '0.000056 0.000075 0.000078', '0.000012', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'LLVM (Intel DPC++)', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A']

USM_Instr_Mix_fp32_device_1:1mix_with_init_no_prefetch

Environment Variables:

Command:

/home/pmdk/bench_workdir/sycl-bench-build/usm_instr_mix --warmup-run --num-runs=3 --output=/home/pmdk/bench_workdir/USM_Instr_Mix_multi.csv --size=8192

Output:

['USM_Instr_Mix_fp32_device_1:1mix_with_init_no_prefetch', 'PASS', 'Intel(R) Data Center GPU Max 1100', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', '256', '8192', '0.002203', '0.001710', '0.001650', '0.001650 0.001710 0.003249', '0.000907', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'LLVM (Intel DPC++)', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A']

USM_Instr_Mix_fp32_host_1:1mix_with_init_no_prefetch

Environment Variables:

Command:

/home/pmdk/bench_workdir/sycl-bench-build/usm_instr_mix --warmup-run --num-runs=3 --output=/home/pmdk/bench_workdir/USM_Instr_Mix_multi.csv --size=8192

Output:

['USM_Instr_Mix_fp32_host_1:1mix_with_init_no_prefetch', 'PASS', 'Intel(R) Data Center GPU Max 1100', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', '256', '8192', '0.001048', '0.001042', '0.001040', '0.001040 0.001042 0.001061', '0.000012', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'LLVM (Intel DPC++)', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A']

USM_Instr_Mix_fp32_device_1:1mix_no_init_no_prefetch

Environment Variables:

Command:

/home/pmdk/bench_workdir/sycl-bench-build/usm_instr_mix --warmup-run --num-runs=3 --output=/home/pmdk/bench_workdir/USM_Instr_Mix_multi.csv --size=8192

Output:

['USM_Instr_Mix_fp32_device_1:1mix_no_init_no_prefetch', 'PASS', 'Intel(R) Data Center GPU Max 1100', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', '256', '8192', '0.001837', '0.001816', '0.001809', '0.001809 0.001816 0.001886', '0.000042', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'LLVM (Intel DPC++)', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A']

USM_Instr_Mix_fp32_host_1:1mix_no_init_no_prefetch

Environment Variables:

Command:

/home/pmdk/bench_workdir/sycl-bench-build/usm_instr_mix --warmup-run --num-runs=3 --output=/home/pmdk/bench_workdir/USM_Instr_Mix_multi.csv --size=8192

Output:

['USM_Instr_Mix_fp32_host_1:1mix_no_init_no_prefetch', 'PASS', 'Intel(R) Data Center GPU Max 1100', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', '256', '8192', '0.001199', '0.001202', '0.001192', '0.001192 0.001202 0.001204', '0.000006', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'LLVM (Intel DPC++)', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A']

VectorAddition_int32

Environment Variables:

Command:

/home/pmdk/bench_workdir/sycl-bench-build/vec_add --warmup-run --num-runs=3 --output=/home/pmdk/bench_workdir/VectorAddition_multi.csv --size=102400000

Output:

['VectorAddition_int32', 'PASS', 'Intel(R) Data Center GPU Max 1100', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', '256', '102400000', '0.001570', '0.001599', '0.001450', '0.001450 0.001599 0.001659', '0.000108', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'LLVM (Intel DPC++)', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A']

VectorAddition_int64

Environment Variables:

Command:

/home/pmdk/bench_workdir/sycl-bench-build/vec_add --warmup-run --num-runs=3 --output=/home/pmdk/bench_workdir/VectorAddition_multi.csv --size=102400000

Output:

['VectorAddition_int64', 'PASS', 'Intel(R) Data Center GPU Max 1100', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', '256', '102400000', '0.003107', '0.003090', '0.003062', '0.003062 0.003090 0.003167', '0.000054', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'LLVM (Intel DPC++)', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A']

VectorAddition_fp32

Environment Variables:

Command:

/home/pmdk/bench_workdir/sycl-bench-build/vec_add --warmup-run --num-runs=3 --output=/home/pmdk/bench_workdir/VectorAddition_multi.csv --size=102400000

Output:

['VectorAddition_fp32', 'PASS', 'Intel(R) Data Center GPU Max 1100', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', '256', '102400000', '0.001513', '0.001449', '0.001446', '0.001446 0.001449 0.001644', '0.000114', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'LLVM (Intel DPC++)', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A']

Polybench_2mm

Environment Variables:

Command:

/home/pmdk/bench_workdir/sycl-bench-build/2mm --warmup-run --num-runs=3 --output=/home/pmdk/bench_workdir/2mm.csv --size=512

Output:

['Polybench_2mm', 'PASS', 'Intel(R) Data Center GPU Max 1100', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', '256', '512', '0.001226', '0.001227', '0.001218', '0.001218 0.001227 0.001231', '0.000006', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'LLVM (Intel DPC++)', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A']

Polybench_3mm

Environment Variables:

Command:

/home/pmdk/bench_workdir/sycl-bench-build/3mm --warmup-run --num-runs=3 --output=/home/pmdk/bench_workdir/3mm.csv --size=512

Output:

['Polybench_3mm', 'PASS', 'Intel(R) Data Center GPU Max 1100', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', '256', '512', '0.001734', '0.001732', '0.001726', '0.001726 0.001732 0.001743', '0.000008', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'LLVM (Intel DPC++)', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A']

Polybench_Atax

Environment Variables:

Command:

/home/pmdk/bench_workdir/sycl-bench-build/atax --warmup-run --num-runs=3 --output=/home/pmdk/bench_workdir/Atax.csv --size=8192

Output:

['Polybench_Atax', 'PASS', 'Intel(R) Data Center GPU Max 1100', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', '256', '8192', '0.006812', '0.006863', '0.006685', '0.006685 0.006863 0.006887', '0.000111', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'LLVM (Intel DPC++)', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A']

Kmeans_fp32

Environment Variables:

Command:

/home/pmdk/bench_workdir/sycl-bench-build/kmeans --warmup-run --num-runs=3 --output=/home/pmdk/bench_workdir/Kmeans.csv --size=700000000

Output:

['Kmeans_fp32', 'PASS', 'Intel(R) Data Center GPU Max 1100', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', '256', '700000000', '0.016052', '0.016054', '0.016039', '0.016039 0.016054 0.016063', '0.000012', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'LLVM (Intel DPC++)', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A']

LinearRegressionCoeff_fp32

Environment Variables:

Command:

/home/pmdk/bench_workdir/sycl-bench-build/lin_reg_coeff --warmup-run --num-runs=3 --output=/home/pmdk/bench_workdir/LinearRegressionCoeff.csv --size=1638400000

Output:

['LinearRegressionCoeff_fp32', 'PASS', 'Intel(R) Data Center GPU Max 1100', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', '256', '1638400000', '0.851504', '0.842146', '0.842133', '0.842133 0.842146 0.870234', '0.016221', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'LLVM (Intel DPC++)', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A']

MolecularDynamics

Environment Variables:

Command:

/home/pmdk/bench_workdir/sycl-bench-build/mol_dyn --warmup-run --num-runs=3 --output=/home/pmdk/bench_workdir/MolecularDynamics.csv --size=8196

Output:

['MolecularDynamics', 'PASS', 'Intel(R) Data Center GPU Max 1100', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', '256', '8196', '0.000037', '0.000029', '0.000026', '0.000026 0.000029 0.000054', '0.000015', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'LLVM (Intel DPC++)', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A', 'N/A']

llama.cpp Prompt Processing Batched 128

Environment Variables:

Command:

/home/pmdk/bench_workdir/llamacpp-build/bin/llama-bench --output csv -n 128 -p 512 -b 128,256,512 --numa isolate -t 56 --model /home/pmdk/bench_workdir/models/Phi-3-mini-4k-instruct-q4.gguf

Output:

build_commit,build_number,cuda,vulkan,kompute,metal,sycl,rpc,gpu_blas,blas,cpu_info,gpu_info,model_filename,model_type,model_size,model_n_params,n_batch,n_ubatch,n_threads,cpu_mask,cpu_strict,poll,type_k,type_v,n_gpu_layers,split_mode,main_gpu,no_kv_offload,flash_attn,tensor_split,use_mmap,embeddings,n_prompt,n_gen,test_time,avg_ns,stddev_ns,avg_ts,stddev_ts
"1ee9eea0","4073","0","0","0","0","1","0","1","1","INTEL(R) XEON(R) PLATINUM 8580","Intel(R) Data Center GPU Max 1100","/home/pmdk/bench_workdir/models/Phi-3-mini-4k-instruct-q4.gguf","phi3 3B Q4_K - Medium","2392493568","3821079552","128","512","56","0x0","0","50","f16","f16","99","layer","0","0","0","0.00","1","0","512","0","2025-01-06T16:47:15Z","635752903","32081498","806.955130","39.904814"
"1ee9eea0","4073","0","0","0","0","1","0","1","1","INTEL(R) XEON(R) PLATINUM 8580","Intel(R) Data Center GPU Max 1100","/home/pmdk/bench_workdir/models/Phi-3-mini-4k-instruct-q4.gguf","phi3 3B Q4_K - Medium","2392493568","3821079552","128","512","56","0x0","0","50","f16","f16","99","layer","0","0","0","0.00","1","0","0","128","2025-01-06T16:47:22Z","2041314604","4039239","62.704886","0.123910"
"1ee9eea0","4073","0","0","0","0","1","0","1","1","INTEL(R) XEON(R) PLATINUM 8580","Intel(R) Data Center GPU Max 1100","/home/pmdk/bench_workdir/models/Phi-3-mini-4k-instruct-q4.gguf","phi3 3B Q4_K - Medium","2392493568","3821079552","256","512","56","0x0","0","50","f16","f16","99","layer","0","0","0","0.00","1","0","512","0","2025-01-06T16:47:32Z","598392288","43481354","858.995674","58.012237"
"1ee9eea0","4073","0","0","0","0","1","0","1","1","INTEL(R) XEON(R) PLATINUM 8580","Intel(R) Data Center GPU Max 1100","/home/pmdk/bench_workdir/models/Phi-3-mini-4k-instruct-q4.gguf","phi3 3B Q4_K - Medium","2392493568","3821079552","256","512","56","0x0","0","50","f16","f16","99","layer","0","0","0","0.00","1","0","0","128","2025-01-06T16:47:36Z","2043667206","1406952","62.632530","0.043083"
"1ee9eea0","4073","0","0","0","0","1","0","1","1","INTEL(R) XEON(R) PLATINUM 8580","Intel(R) Data Center GPU Max 1100","/home/pmdk/bench_workdir/models/Phi-3-mini-4k-instruct-q4.gguf","phi3 3B Q4_K - Medium","2392493568","3821079552","512","512","56","0x0","0","50","f16","f16","99","layer","0","0","0","0.00","1","0","512","0","2025-01-06T16:47:46Z","1138233667","19868126","449.928254","7.769838"
"1ee9eea0","4073","0","0","0","0","1","0","1","1","INTEL(R) XEON(R) PLATINUM 8580","Intel(R) Data Center GPU Max 1100","/home/pmdk/bench_workdir/models/Phi-3-mini-4k-instruct-q4.gguf","phi3 3B Q4_K - Medium","2392493568","3821079552","512","512","56","0x0","0","50","f16","f16","99","layer","0","0","0","0.00","1","0","0","128","2025-01-06T16:47:53Z","2042379691","1255018","62.672009","0.038493"

llama.cpp Text Generation Batched 128

Environment Variables:

Command:

/home/pmdk/bench_workdir/llamacpp-build/bin/llama-bench --output csv -n 128 -p 512 -b 128,256,512 --numa isolate -t 56 --model /home/pmdk/bench_workdir/models/Phi-3-mini-4k-instruct-q4.gguf

Output:

build_commit,build_number,cuda,vulkan,kompute,metal,sycl,rpc,gpu_blas,blas,cpu_info,gpu_info,model_filename,model_type,model_size,model_n_params,n_batch,n_ubatch,n_threads,cpu_mask,cpu_strict,poll,type_k,type_v,n_gpu_layers,split_mode,main_gpu,no_kv_offload,flash_attn,tensor_split,use_mmap,embeddings,n_prompt,n_gen,test_time,avg_ns,stddev_ns,avg_ts,stddev_ts
"1ee9eea0","4073","0","0","0","0","1","0","1","1","INTEL(R) XEON(R) PLATINUM 8580","Intel(R) Data Center GPU Max 1100","/home/pmdk/bench_workdir/models/Phi-3-mini-4k-instruct-q4.gguf","phi3 3B Q4_K - Medium","2392493568","3821079552","128","512","56","0x0","0","50","f16","f16","99","layer","0","0","0","0.00","1","0","512","0","2025-01-06T16:48:04Z","622983625","26529008","823.012168","34.080745"
"1ee9eea0","4073","0","0","0","0","1","0","1","1","INTEL(R) XEON(R) PLATINUM 8580","Intel(R) Data Center GPU Max 1100","/home/pmdk/bench_workdir/models/Phi-3-mini-4k-instruct-q4.gguf","phi3 3B Q4_K - Medium","2392493568","3821079552","128","512","56","0x0","0","50","f16","f16","99","layer","0","0","0","0.00","1","0","0","128","2025-01-06T16:48:11Z","2044840632","1875478","62.596607","0.057337"
"1ee9eea0","4073","0","0","0","0","1","0","1","1","INTEL(R) XEON(R) PLATINUM 8580","Intel(R) Data Center GPU Max 1100","/home/pmdk/bench_workdir/models/Phi-3-mini-4k-instruct-q4.gguf","phi3 3B Q4_K - Medium","2392493568","3821079552","256","512","56","0x0","0","50","f16","f16","99","layer","0","0","0","0.00","1","0","512","0","2025-01-06T16:48:21Z","578479573","14885617","885.550092","22.896270"
"1ee9eea0","4073","0","0","0","0","1","0","1","1","INTEL(R) XEON(R) PLATINUM 8580","Intel(R) Data Center GPU Max 1100","/home/pmdk/bench_workdir/models/Phi-3-mini-4k-instruct-q4.gguf","phi3 3B Q4_K - Medium","2392493568","3821079552","256","512","56","0x0","0","50","f16","f16","99","layer","0","0","0","0.00","1","0","0","128","2025-01-06T16:48:25Z","2046364142","2099504","62.550015","0.064107"
"1ee9eea0","4073","0","0","0","0","1","0","1","1","INTEL(R) XEON(R) PLATINUM 8580","Intel(R) Data Center GPU Max 1100","/home/pmdk/bench_workdir/models/Phi-3-mini-4k-instruct-q4.gguf","phi3 3B Q4_K - Medium","2392493568","3821079552","512","512","56","0x0","0","50","f16","f16","99","layer","0","0","0","0.00","1","0","512","0","2025-01-06T16:48:35Z","1188141769","16854461","430.994207","6.098300"
"1ee9eea0","4073","0","0","0","0","1","0","1","1","INTEL(R) XEON(R) PLATINUM 8580","Intel(R) Data Center GPU Max 1100","/home/pmdk/bench_workdir/models/Phi-3-mini-4k-instruct-q4.gguf","phi3 3B Q4_K - Medium","2392493568","3821079552","512","512","56","0x0","0","50","f16","f16","99","layer","0","0","0","0.00","1","0","0","128","2025-01-06T16:48:42Z","2043195392","849987","62.646978","0.026033"

llama.cpp Prompt Processing Batched 256

Environment Variables:

Command:

/home/pmdk/bench_workdir/llamacpp-build/bin/llama-bench --output csv -n 128 -p 512 -b 128,256,512 --numa isolate -t 56 --model /home/pmdk/bench_workdir/models/Phi-3-mini-4k-instruct-q4.gguf

Output:

build_commit,build_number,cuda,vulkan,kompute,metal,sycl,rpc,gpu_blas,blas,cpu_info,gpu_info,model_filename,model_type,model_size,model_n_params,n_batch,n_ubatch,n_threads,cpu_mask,cpu_strict,poll,type_k,type_v,n_gpu_layers,split_mode,main_gpu,no_kv_offload,flash_attn,tensor_split,use_mmap,embeddings,n_prompt,n_gen,test_time,avg_ns,stddev_ns,avg_ts,stddev_ts
"1ee9eea0","4073","0","0","0","0","1","0","1","1","INTEL(R) XEON(R) PLATINUM 8580","Intel(R) Data Center GPU Max 1100","/home/pmdk/bench_workdir/models/Phi-3-mini-4k-instruct-q4.gguf","phi3 3B Q4_K - Medium","2392493568","3821079552","128","512","56","0x0","0","50","f16","f16","99","layer","0","0","0","0.00","1","0","512","0","2025-01-06T16:46:28Z","638082041","54844718","806.736839","63.050707"
"1ee9eea0","4073","0","0","0","0","1","0","1","1","INTEL(R) XEON(R) PLATINUM 8580","Intel(R) Data Center GPU Max 1100","/home/pmdk/bench_workdir/models/Phi-3-mini-4k-instruct-q4.gguf","phi3 3B Q4_K - Medium","2392493568","3821079552","128","512","56","0x0","0","50","f16","f16","99","layer","0","0","0","0.00","1","0","0","128","2025-01-06T16:46:33Z","2047362362","1913145","62.519509","0.058373"
"1ee9eea0","4073","0","0","0","0","1","0","1","1","INTEL(R) XEON(R) PLATINUM 8580","Intel(R) Data Center GPU Max 1100","/home/pmdk/bench_workdir/models/Phi-3-mini-4k-instruct-q4.gguf","phi3 3B Q4_K - Medium","2392493568","3821079552","256","512","56","0x0","0","50","f16","f16","99","layer","0","0","0","0.00","1","0","512","0","2025-01-06T16:46:43Z","576964097","4508453","887.446647","6.891248"
"1ee9eea0","4073","0","0","0","0","1","0","1","1","INTEL(R) XEON(R) PLATINUM 8580","Intel(R) Data Center GPU Max 1100","/home/pmdk/bench_workdir/models/Phi-3-mini-4k-instruct-q4.gguf","phi3 3B Q4_K - Medium","2392493568","3821079552","256","512","56","0x0","0","50","f16","f16","99","layer","0","0","0","0.00","1","0","0","128","2025-01-06T16:46:47Z","2047864697","1342971","62.504151","0.040931"
"1ee9eea0","4073","0","0","0","0","1","0","1","1","INTEL(R) XEON(R) PLATINUM 8580","Intel(R) Data Center GPU Max 1100","/home/pmdk/bench_workdir/models/Phi-3-mini-4k-instruct-q4.gguf","phi3 3B Q4_K - Medium","2392493568","3821079552","512","512","56","0x0","0","50","f16","f16","99","layer","0","0","0","0.00","1","0","512","0","2025-01-06T16:46:57Z","1131731137","20869555","452.524837","8.173082"
"1ee9eea0","4073","0","0","0","0","1","0","1","1","INTEL(R) XEON(R) PLATINUM 8580","Intel(R) Data Center GPU Max 1100","/home/pmdk/bench_workdir/models/Phi-3-mini-4k-instruct-q4.gguf","phi3 3B Q4_K - Medium","2392493568","3821079552","512","512","56","0x0","0","50","f16","f16","99","layer","0","0","0","0.00","1","0","0","128","2025-01-06T16:47:04Z","2043865473","1994235","62.626479","0.061072"

llama.cpp Text Generation Batched 256

Environment Variables:

Command:

/home/pmdk/bench_workdir/llamacpp-build/bin/llama-bench --output csv -n 128 -p 512 -b 128,256,512 --numa isolate -t 56 --model /home/pmdk/bench_workdir/models/Phi-3-mini-4k-instruct-q4.gguf

Output:

build_commit,build_number,cuda,vulkan,kompute,metal,sycl,rpc,gpu_blas,blas,cpu_info,gpu_info,model_filename,model_type,model_size,model_n_params,n_batch,n_ubatch,n_threads,cpu_mask,cpu_strict,poll,type_k,type_v,n_gpu_layers,split_mode,main_gpu,no_kv_offload,flash_attn,tensor_split,use_mmap,embeddings,n_prompt,n_gen,test_time,avg_ns,stddev_ns,avg_ts,stddev_ts
"1ee9eea0","4073","0","0","0","0","1","0","1","1","INTEL(R) XEON(R) PLATINUM 8580","Intel(R) Data Center GPU Max 1100","/home/pmdk/bench_workdir/models/Phi-3-mini-4k-instruct-q4.gguf","phi3 3B Q4_K - Medium","2392493568","3821079552","128","512","56","0x0","0","50","f16","f16","99","layer","0","0","0","0.00","1","0","512","0","2025-01-06T16:49:43Z","662552056","85282427","782.180419","91.630092"
"1ee9eea0","4073","0","0","0","0","1","0","1","1","INTEL(R) XEON(R) PLATINUM 8580","Intel(R) Data Center GPU Max 1100","/home/pmdk/bench_workdir/models/Phi-3-mini-4k-instruct-q4.gguf","phi3 3B Q4_K - Medium","2392493568","3821079552","128","512","56","0x0","0","50","f16","f16","99","layer","0","0","0","0.00","1","0","0","128","2025-01-06T16:49:48Z","2048522873","2567646","62.484126","0.078235"
"1ee9eea0","4073","0","0","0","0","1","0","1","1","INTEL(R) XEON(R) PLATINUM 8580","Intel(R) Data Center GPU Max 1100","/home/pmdk/bench_workdir/models/Phi-3-mini-4k-instruct-q4.gguf","phi3 3B Q4_K - Medium","2392493568","3821079552","256","512","56","0x0","0","50","f16","f16","99","layer","0","0","0","0.00","1","0","512","0","2025-01-06T16:49:59Z","570003905","4464065","898.283165","6.976100"
"1ee9eea0","4073","0","0","0","0","1","0","1","1","INTEL(R) XEON(R) PLATINUM 8580","Intel(R) Data Center GPU Max 1100","/home/pmdk/bench_workdir/models/Phi-3-mini-4k-instruct-q4.gguf","phi3 3B Q4_K - Medium","2392493568","3821079552","256","512","56","0x0","0","50","f16","f16","99","layer","0","0","0","0.00","1","0","0","128","2025-01-06T16:50:02Z","2046362580","1077905","62.550024","0.032933"
"1ee9eea0","4073","0","0","0","0","1","0","1","1","INTEL(R) XEON(R) PLATINUM 8580","Intel(R) Data Center GPU Max 1100","/home/pmdk/bench_workdir/models/Phi-3-mini-4k-instruct-q4.gguf","phi3 3B Q4_K - Medium","2392493568","3821079552","512","512","56","0x0","0","50","f16","f16","99","layer","0","0","0","0.00","1","0","512","0","2025-01-06T16:50:12Z","1144299094","23972700","447.592082","9.345232"
"1ee9eea0","4073","0","0","0","0","1","0","1","1","INTEL(R) XEON(R) PLATINUM 8580","Intel(R) Data Center GPU Max 1100","/home/pmdk/bench_workdir/models/Phi-3-mini-4k-instruct-q4.gguf","phi3 3B Q4_K - Medium","2392493568","3821079552","512","512","56","0x0","0","50","f16","f16","99","layer","0","0","0","0.00","1","0","0","128","2025-01-06T16:50:19Z","2047195630","1234497","62.524575","0.037664"

llama.cpp Prompt Processing Batched 512

Environment Variables:

Command:

/home/pmdk/bench_workdir/llamacpp-build/bin/llama-bench --output csv -n 128 -p 512 -b 128,256,512 --numa isolate -t 56 --model /home/pmdk/bench_workdir/models/Phi-3-mini-4k-instruct-q4.gguf

Output:

build_commit,build_number,cuda,vulkan,kompute,metal,sycl,rpc,gpu_blas,blas,cpu_info,gpu_info,model_filename,model_type,model_size,model_n_params,n_batch,n_ubatch,n_threads,cpu_mask,cpu_strict,poll,type_k,type_v,n_gpu_layers,split_mode,main_gpu,no_kv_offload,flash_attn,tensor_split,use_mmap,embeddings,n_prompt,n_gen,test_time,avg_ns,stddev_ns,avg_ts,stddev_ts
"1ee9eea0","4073","0","0","0","0","1","0","1","1","INTEL(R) XEON(R) PLATINUM 8580","Intel(R) Data Center GPU Max 1100","/home/pmdk/bench_workdir/models/Phi-3-mini-4k-instruct-q4.gguf","phi3 3B Q4_K - Medium","2392493568","3821079552","128","512","56","0x0","0","50","f16","f16","99","layer","0","0","0","0.00","1","0","512","0","2025-01-06T16:48:54Z","637117080","26310941","804.719577","33.291820"
"1ee9eea0","4073","0","0","0","0","1","0","1","1","INTEL(R) XEON(R) PLATINUM 8580","Intel(R) Data Center GPU Max 1100","/home/pmdk/bench_workdir/models/Phi-3-mini-4k-instruct-q4.gguf","phi3 3B Q4_K - Medium","2392493568","3821079552","128","512","56","0x0","0","50","f16","f16","99","layer","0","0","0","0.00","1","0","0","128","2025-01-06T16:49:01Z","2043327008","2706129","62.643022","0.082823"
"1ee9eea0","4073","0","0","0","0","1","0","1","1","INTEL(R) XEON(R) PLATINUM 8580","Intel(R) Data Center GPU Max 1100","/home/pmdk/bench_workdir/models/Phi-3-mini-4k-instruct-q4.gguf","phi3 3B Q4_K - Medium","2392493568","3821079552","256","512","56","0x0","0","50","f16","f16","99","layer","0","0","0","0.00","1","0","512","0","2025-01-06T16:49:11Z","577879142","7187363","886.108062","11.020807"
"1ee9eea0","4073","0","0","0","0","1","0","1","1","INTEL(R) XEON(R) PLATINUM 8580","Intel(R) Data Center GPU Max 1100","/home/pmdk/bench_workdir/models/Phi-3-mini-4k-instruct-q4.gguf","phi3 3B Q4_K - Medium","2392493568","3821079552","256","512","56","0x0","0","50","f16","f16","99","layer","0","0","0","0.00","1","0","0","128","2025-01-06T16:49:15Z","2047475824","1946963","62.516046","0.059357"
"1ee9eea0","4073","0","0","0","0","1","0","1","1","INTEL(R) XEON(R) PLATINUM 8580","Intel(R) Data Center GPU Max 1100","/home/pmdk/bench_workdir/models/Phi-3-mini-4k-instruct-q4.gguf","phi3 3B Q4_K - Medium","2392493568","3821079552","512","512","56","0x0","0","50","f16","f16","99","layer","0","0","0","0.00","1","0","512","0","2025-01-06T16:49:25Z","1139419152","7171434","449.366041","2.836553"
"1ee9eea0","4073","0","0","0","0","1","0","1","1","INTEL(R) XEON(R) PLATINUM 8580","Intel(R) Data Center GPU Max 1100","/home/pmdk/bench_workdir/models/Phi-3-mini-4k-instruct-q4.gguf","phi3 3B Q4_K - Medium","2392493568","3821079552","512","512","56","0x0","0","50","f16","f16","99","layer","0","0","0","0.00","1","0","0","128","2025-01-06T16:49:32Z","2042459053","561629","62.669559","0.017204"

llama.cpp Text Generation Batched 512

Environment Variables:

Command:

/home/pmdk/bench_workdir/llamacpp-build/bin/llama-bench --output csv -n 128 -p 512 -b 128,256,512 --numa isolate -t 56 --model /home/pmdk/bench_workdir/models/Phi-3-mini-4k-instruct-q4.gguf

Output:

build_commit,build_number,cuda,vulkan,kompute,metal,sycl,rpc,gpu_blas,blas,cpu_info,gpu_info,model_filename,model_type,model_size,model_n_params,n_batch,n_ubatch,n_threads,cpu_mask,cpu_strict,poll,type_k,type_v,n_gpu_layers,split_mode,main_gpu,no_kv_offload,flash_attn,tensor_split,use_mmap,embeddings,n_prompt,n_gen,test_time,avg_ns,stddev_ns,avg_ts,stddev_ts
"1ee9eea0","4073","0","0","0","0","1","0","1","1","INTEL(R) XEON(R) PLATINUM 8580","Intel(R) Data Center GPU Max 1100","/home/pmdk/bench_workdir/models/Phi-3-mini-4k-instruct-q4.gguf","phi3 3B Q4_K - Medium","2392493568","3821079552","128","512","56","0x0","0","50","f16","f16","99","layer","0","0","0","0.00","1","0","512","0","2025-01-06T16:48:54Z","637117080","26310941","804.719577","33.291820"
"1ee9eea0","4073","0","0","0","0","1","0","1","1","INTEL(R) XEON(R) PLATINUM 8580","Intel(R) Data Center GPU Max 1100","/home/pmdk/bench_workdir/models/Phi-3-mini-4k-instruct-q4.gguf","phi3 3B Q4_K - Medium","2392493568","3821079552","128","512","56","0x0","0","50","f16","f16","99","layer","0","0","0","0.00","1","0","0","128","2025-01-06T16:49:01Z","2043327008","2706129","62.643022","0.082823"
"1ee9eea0","4073","0","0","0","0","1","0","1","1","INTEL(R) XEON(R) PLATINUM 8580","Intel(R) Data Center GPU Max 1100","/home/pmdk/bench_workdir/models/Phi-3-mini-4k-instruct-q4.gguf","phi3 3B Q4_K - Medium","2392493568","3821079552","256","512","56","0x0","0","50","f16","f16","99","layer","0","0","0","0.00","1","0","512","0","2025-01-06T16:49:11Z","577879142","7187363","886.108062","11.020807"
"1ee9eea0","4073","0","0","0","0","1","0","1","1","INTEL(R) XEON(R) PLATINUM 8580","Intel(R) Data Center GPU Max 1100","/home/pmdk/bench_workdir/models/Phi-3-mini-4k-instruct-q4.gguf","phi3 3B Q4_K - Medium","2392493568","3821079552","256","512","56","0x0","0","50","f16","f16","99","layer","0","0","0","0.00","1","0","0","128","2025-01-06T16:49:15Z","2047475824","1946963","62.516046","0.059357"
"1ee9eea0","4073","0","0","0","0","1","0","1","1","INTEL(R) XEON(R) PLATINUM 8580","Intel(R) Data Center GPU Max 1100","/home/pmdk/bench_workdir/models/Phi-3-mini-4k-instruct-q4.gguf","phi3 3B Q4_K - Medium","2392493568","3821079552","512","512","56","0x0","0","50","f16","f16","99","layer","0","0","0","0.00","1","0","512","0","2025-01-06T16:49:25Z","1139419152","7171434","449.366041","2.836553"
"1ee9eea0","4073","0","0","0","0","1","0","1","1","INTEL(R) XEON(R) PLATINUM 8580","Intel(R) Data Center GPU Max 1100","/home/pmdk/bench_workdir/models/Phi-3-mini-4k-instruct-q4.gguf","phi3 3B Q4_K - Medium","2392493568","3821079552","512","512","56","0x0","0","50","f16","f16","99","layer","0","0","0","0.00","1","0","0","128","2025-01-06T16:49:32Z","2042459053","561629","62.669559","0.017204"

alloc/max_allocs:10000/pre_allocs:0/size:4096/iterations:200000/threads:4 glibc

Environment Variables:

Command:

/home/pmdk/ur-actions-runner/_work/unified-runtime/unified-runtime/umf_build/benchmark/umf-benchmark --benchmark_format=csv

Output:

name,iterations,real_time,cpu_time,time_unit,bytes_per_second,items_per_second,label,error_occurred,error_message
"glibc/alloc/max_allocs:10000/pre_allocs:0/size:4096/iterations:200000/threads:4",800000,2622.28,1880.6,ns,,,,,
"glibc/alloc/max_allocs:10000/pre_allocs:0/size:4096/iterations:200000/threads:1",200000,707.07,707.068,ns,,,,,
"glibc/alloc/max_allocs:10000/pre_allocs:100000/size:4096/iterations:200000/threads:4",800000,1408.08,1347.28,ns,,,,,
"glibc/alloc/max_allocs:10000/pre_allocs:100000/size:4096/iterations:200000/threads:1",200000,745.747,745.713,ns,,,,,
"glibc/alloc/max_allocs:10000/pre_allocs:0/min size:8/max size:65536/granularity:8/iterations:200000/threads:4",800000,833.688,807.323,ns,,,,,
"glibc/alloc/max_allocs:10000/pre_allocs:0/min size:8/max size:65536/granularity:8/iterations:200000/threads:1",200000,167.247,167.242,ns,,,,,
"os_provider/alloc/max_allocs:10000/pre_allocs:0/size:4096/iterations:200000/threads:4",800000,2232.82,2232.55,ns,,,,,
"os_provider/alloc/max_allocs:10000/pre_allocs:0/size:4096/iterations:200000/threads:1",200000,185.935,185.901,ns,,,,,
"os_provider/alloc/max_allocs:10000/pre_allocs:100000/size:4096/iterations:200000/threads:4",800000,2018.54,2017.57,ns,,,,,
"os_provider/alloc/max_allocs:10000/pre_allocs:100000/size:4096/iterations:200000/threads:1",200000,188.994,188.989,ns,,,,,
"proxy_pool<os_provider>/alloc/max_allocs:1000/pre_allocs:0/size:4096/iterations:200000/threads:4",800000,3670.1,3665.46,ns,,,,,
"proxy_pool<os_provider>/alloc/max_allocs:1000/pre_allocs:0/size:4096/iterations:200000/threads:1",200000,265.476,265.469,ns,,,,,
"proxy_pool<os_provider>/alloc/max_allocs:1000/pre_allocs:100000/size:4096/iterations:200000/threads:4",800000,3416.31,3410.38,ns,,,,,
"proxy_pool<os_provider>/alloc/max_allocs:1000/pre_allocs:100000/size:4096/iterations:200000/threads:1",200000,286.903,286.899,ns,,,,,
"scalable_pool<os_provider>/alloc/max_allocs:10000/pre_allocs:0/size:4096/iterations:200000/threads:4",800000,308.735,306.918,ns,,,,,
"scalable_pool<os_provider>/alloc/max_allocs:10000/pre_allocs:0/size:4096/iterations:200000/threads:1",200000,216.685,216.681,ns,,,,,
"scalable_pool<os_provider>/alloc/max_allocs:10000/pre_allocs:100000/size:4096/iterations:200000/threads:4",800000,249.939,248.893,ns,,,,,
"scalable_pool<os_provider>/alloc/max_allocs:10000/pre_allocs:100000/size:4096/iterations:200000/threads:1",200000,206.93,206.924,ns,,,,,
"scalable_pool<os_provider>/alloc/max_allocs:10000/pre_allocs:0/min size:8/max size:65536/granularity:8/iterations:200000/threads:4",800000,1154.97,1126.46,ns,,,,,
"scalable_pool<os_provider>/alloc/max_allocs:10000/pre_allocs:0/min size:8/max size:65536/granularity:8/iterations:200000/threads:1",200000,954.757,954.711,ns,,,,,
"glibc/multiple_malloc_free/max_allocs:10000/size:4096/iterations:2000/threads:4",8000,30034.2,28619.9,ns,,,,,
"glibc/multiple_malloc_free/max_allocs:10000/size:4096/iterations:2000/threads:1",2000,4118.68,4118.55,ns,,,,,
"glibc/multiple_malloc_free/max_allocs:10000/min size:8/max size:65536/granularity:8/iterations:2000/threads:4",8000,138832,87602.1,ns,,,,,
"glibc/multiple_malloc_free/max_allocs:10000/min size:8/max size:65536/granularity:8/iterations:2000/threads:1",2000,31016.1,31015.8,ns,,,,,
"proxy_pool<os_provider>/multiple_malloc_free/max_allocs:10000/size:4096/iterations:2000/threads:4",8000,1.10821e+06,1.10783e+06,ns,,,,,
"proxy_pool<os_provider>/multiple_malloc_free/max_allocs:10000/size:4096/iterations:2000/threads:1",2000,153772,153769,ns,,,,,
"os_provider/multiple_malloc_free/max_allocs:10000/size:4096/iterations:2000/threads:4",8000,1.15353e+06,1.15301e+06,ns,,,,,
"os_provider/multiple_malloc_free/max_allocs:10000/size:4096/iterations:2000/threads:1",2000,139405,139401,ns,,,,,
"scalable_pool<os_provider>/multiple_malloc_free/max_allocs:10000/size:4096/iterations:2000/threads:4",8000,40726.5,40256,ns,,,,,
"scalable_pool<os_provider>/multiple_malloc_free/max_allocs:10000/size:4096/iterations:2000/threads:1",2000,15612.8,15612.4,ns,,,,,
"scalable_pool<os_provider>/multiple_malloc_free/max_allocs:10000/min size:8/max size:65536/granularity:8/iterations:2000/threads:4",8000,72745.9,72725.6,ns,,,,,
"scalable_pool<os_provider>/multiple_malloc_free/max_allocs:10000/min size:8/max size:65536/granularity:8/iterations:2000/threads:1",2000,28207.4,28201.6,ns,,,,,

alloc/max_allocs:10000/pre_allocs:0/size:4096/iterations:200000/threads:1 glibc

Environment Variables:

Command:

/home/pmdk/ur-actions-runner/_work/unified-runtime/unified-runtime/umf_build/benchmark/umf-benchmark --benchmark_format=csv

Output:

name,iterations,real_time,cpu_time,time_unit,bytes_per_second,items_per_second,label,error_occurred,error_message
"glibc/alloc/max_allocs:10000/pre_allocs:0/size:4096/iterations:200000/threads:4",800000,2378.91,1839.72,ns,,,,,
"glibc/alloc/max_allocs:10000/pre_allocs:0/size:4096/iterations:200000/threads:1",200000,719.927,719.926,ns,,,,,
"glibc/alloc/max_allocs:10000/pre_allocs:100000/size:4096/iterations:200000/threads:4",800000,1248.85,1179.6,ns,,,,,
"glibc/alloc/max_allocs:10000/pre_allocs:100000/size:4096/iterations:200000/threads:1",200000,761.179,761.026,ns,,,,,
"glibc/alloc/max_allocs:10000/pre_allocs:0/min size:8/max size:65536/granularity:8/iterations:200000/threads:4",800000,890.691,804.558,ns,,,,,
"glibc/alloc/max_allocs:10000/pre_allocs:0/min size:8/max size:65536/granularity:8/iterations:200000/threads:1",200000,176.764,176.555,ns,,,,,
"os_provider/alloc/max_allocs:10000/pre_allocs:0/size:4096/iterations:200000/threads:4",800000,2283.78,2283.44,ns,,,,,
"os_provider/alloc/max_allocs:10000/pre_allocs:0/size:4096/iterations:200000/threads:1",200000,188.257,188.251,ns,,,,,
"os_provider/alloc/max_allocs:10000/pre_allocs:100000/size:4096/iterations:200000/threads:4",800000,2322.23,2321.72,ns,,,,,
"os_provider/alloc/max_allocs:10000/pre_allocs:100000/size:4096/iterations:200000/threads:1",200000,193.398,193.393,ns,,,,,
"proxy_pool<os_provider>/alloc/max_allocs:1000/pre_allocs:0/size:4096/iterations:200000/threads:4",800000,4570.14,4562.81,ns,,,,,
"proxy_pool<os_provider>/alloc/max_allocs:1000/pre_allocs:0/size:4096/iterations:200000/threads:1",200000,251.142,251.136,ns,,,,,
"proxy_pool<os_provider>/alloc/max_allocs:1000/pre_allocs:100000/size:4096/iterations:200000/threads:4",800000,2862.3,2856.53,ns,,,,,
"proxy_pool<os_provider>/alloc/max_allocs:1000/pre_allocs:100000/size:4096/iterations:200000/threads:1",200000,286.407,286.4,ns,,,,,
"scalable_pool<os_provider>/alloc/max_allocs:10000/pre_allocs:0/size:4096/iterations:200000/threads:4",800000,280.133,275.732,ns,,,,,
"scalable_pool<os_provider>/alloc/max_allocs:10000/pre_allocs:0/size:4096/iterations:200000/threads:1",200000,211.33,211.326,ns,,,,,
"scalable_pool<os_provider>/alloc/max_allocs:10000/pre_allocs:100000/size:4096/iterations:200000/threads:4",800000,255.465,254.147,ns,,,,,
"scalable_pool<os_provider>/alloc/max_allocs:10000/pre_allocs:100000/size:4096/iterations:200000/threads:1",200000,236.268,236.261,ns,,,,,
"scalable_pool<os_provider>/alloc/max_allocs:10000/pre_allocs:0/min size:8/max size:65536/granularity:8/iterations:200000/threads:4",800000,923.541,913.159,ns,,,,,
"scalable_pool<os_provider>/alloc/max_allocs:10000/pre_allocs:0/min size:8/max size:65536/granularity:8/iterations:200000/threads:1",200000,968.169,968.156,ns,,,,,
"glibc/multiple_malloc_free/max_allocs:10000/size:4096/iterations:2000/threads:4",8000,33763.5,31527,ns,,,,,
"glibc/multiple_malloc_free/max_allocs:10000/size:4096/iterations:2000/threads:1",2000,4248.98,4248.82,ns,,,,,
"glibc/multiple_malloc_free/max_allocs:10000/min size:8/max size:65536/granularity:8/iterations:2000/threads:4",8000,137715,87126.5,ns,,,,,
"glibc/multiple_malloc_free/max_allocs:10000/min size:8/max size:65536/granularity:8/iterations:2000/threads:1",2000,29982.9,29982.7,ns,,,,,
"proxy_pool<os_provider>/multiple_malloc_free/max_allocs:10000/size:4096/iterations:2000/threads:4",8000,1.42468e+06,1.42402e+06,ns,,,,,
"proxy_pool<os_provider>/multiple_malloc_free/max_allocs:10000/size:4096/iterations:2000/threads:1",2000,161234,161229,ns,,,,,
"os_provider/multiple_malloc_free/max_allocs:10000/size:4096/iterations:2000/threads:4",8000,1.42937e+06,1.42864e+06,ns,,,,,
"os_provider/multiple_malloc_free/max_allocs:10000/size:4096/iterations:2000/threads:1",2000,148795,148794,ns,,,,,
"scalable_pool<os_provider>/multiple_malloc_free/max_allocs:10000/size:4096/iterations:2000/threads:4",8000,41387.7,40707,ns,,,,,
"scalable_pool<os_provider>/multiple_malloc_free/max_allocs:10000/size:4096/iterations:2000/threads:1",2000,14614.4,14614.1,ns,,,,,
"scalable_pool<os_provider>/multiple_malloc_free/max_allocs:10000/min size:8/max size:65536/granularity:8/iterations:2000/threads:4",8000,71084.1,70736.5,ns,,,,,
"scalable_pool<os_provider>/multiple_malloc_free/max_allocs:10000/min size:8/max size:65536/granularity:8/iterations:2000/threads:1",2000,25142,25141.5,ns,,,,,

alloc/max_allocs:10000/pre_allocs:100000/size:4096/iterations:200000/threads:4 glibc

Environment Variables:

Command:

/home/pmdk/ur-actions-runner/_work/unified-runtime/unified-runtime/umf_build/benchmark/umf-benchmark --benchmark_format=csv

Output:

name,iterations,real_time,cpu_time,time_unit,bytes_per_second,items_per_second,label,error_occurred,error_message
"glibc/alloc/max_allocs:10000/pre_allocs:0/size:4096/iterations:200000/threads:4",800000,2622.28,1880.6,ns,,,,,
"glibc/alloc/max_allocs:10000/pre_allocs:0/size:4096/iterations:200000/threads:1",200000,707.07,707.068,ns,,,,,
"glibc/alloc/max_allocs:10000/pre_allocs:100000/size:4096/iterations:200000/threads:4",800000,1408.08,1347.28,ns,,,,,
"glibc/alloc/max_allocs:10000/pre_allocs:100000/size:4096/iterations:200000/threads:1",200000,745.747,745.713,ns,,,,,
"glibc/alloc/max_allocs:10000/pre_allocs:0/min size:8/max size:65536/granularity:8/iterations:200000/threads:4",800000,833.688,807.323,ns,,,,,
"glibc/alloc/max_allocs:10000/pre_allocs:0/min size:8/max size:65536/granularity:8/iterations:200000/threads:1",200000,167.247,167.242,ns,,,,,
"os_provider/alloc/max_allocs:10000/pre_allocs:0/size:4096/iterations:200000/threads:4",800000,2232.82,2232.55,ns,,,,,
"os_provider/alloc/max_allocs:10000/pre_allocs:0/size:4096/iterations:200000/threads:1",200000,185.935,185.901,ns,,,,,
"os_provider/alloc/max_allocs:10000/pre_allocs:100000/size:4096/iterations:200000/threads:4",800000,2018.54,2017.57,ns,,,,,
"os_provider/alloc/max_allocs:10000/pre_allocs:100000/size:4096/iterations:200000/threads:1",200000,188.994,188.989,ns,,,,,
"proxy_pool<os_provider>/alloc/max_allocs:1000/pre_allocs:0/size:4096/iterations:200000/threads:4",800000,3670.1,3665.46,ns,,,,,
"proxy_pool<os_provider>/alloc/max_allocs:1000/pre_allocs:0/size:4096/iterations:200000/threads:1",200000,265.476,265.469,ns,,,,,
"proxy_pool<os_provider>/alloc/max_allocs:1000/pre_allocs:100000/size:4096/iterations:200000/threads:4",800000,3416.31,3410.38,ns,,,,,
"proxy_pool<os_provider>/alloc/max_allocs:1000/pre_allocs:100000/size:4096/iterations:200000/threads:1",200000,286.903,286.899,ns,,,,,
"scalable_pool<os_provider>/alloc/max_allocs:10000/pre_allocs:0/size:4096/iterations:200000/threads:4",800000,308.735,306.918,ns,,,,,
"scalable_pool<os_provider>/alloc/max_allocs:10000/pre_allocs:0/size:4096/iterations:200000/threads:1",200000,216.685,216.681,ns,,,,,
"scalable_pool<os_provider>/alloc/max_allocs:10000/pre_allocs:100000/size:4096/iterations:200000/threads:4",800000,249.939,248.893,ns,,,,,
"scalable_pool<os_provider>/alloc/max_allocs:10000/pre_allocs:100000/size:4096/iterations:200000/threads:1",200000,206.93,206.924,ns,,,,,
"scalable_pool<os_provider>/alloc/max_allocs:10000/pre_allocs:0/min size:8/max size:65536/granularity:8/iterations:200000/threads:4",800000,1154.97,1126.46,ns,,,,,
"scalable_pool<os_provider>/alloc/max_allocs:10000/pre_allocs:0/min size:8/max size:65536/granularity:8/iterations:200000/threads:1",200000,954.757,954.711,ns,,,,,
"glibc/multiple_malloc_free/max_allocs:10000/size:4096/iterations:2000/threads:4",8000,30034.2,28619.9,ns,,,,,
"glibc/multiple_malloc_free/max_allocs:10000/size:4096/iterations:2000/threads:1",2000,4118.68,4118.55,ns,,,,,
"glibc/multiple_malloc_free/max_allocs:10000/min size:8/max size:65536/granularity:8/iterations:2000/threads:4",8000,138832,87602.1,ns,,,,,
"glibc/multiple_malloc_free/max_allocs:10000/min size:8/max size:65536/granularity:8/iterations:2000/threads:1",2000,31016.1,31015.8,ns,,,,,
"proxy_pool<os_provider>/multiple_malloc_free/max_allocs:10000/size:4096/iterations:2000/threads:4",8000,1.10821e+06,1.10783e+06,ns,,,,,
"proxy_pool<os_provider>/multiple_malloc_free/max_allocs:10000/size:4096/iterations:2000/threads:1",2000,153772,153769,ns,,,,,
"os_provider/multiple_malloc_free/max_allocs:10000/size:4096/iterations:2000/threads:4",8000,1.15353e+06,1.15301e+06,ns,,,,,
"os_provider/multiple_malloc_free/max_allocs:10000/size:4096/iterations:2000/threads:1",2000,139405,139401,ns,,,,,
"scalable_pool<os_provider>/multiple_malloc_free/max_allocs:10000/size:4096/iterations:2000/threads:4",8000,40726.5,40256,ns,,,,,
"scalable_pool<os_provider>/multiple_malloc_free/max_allocs:10000/size:4096/iterations:2000/threads:1",2000,15612.8,15612.4,ns,,,,,
"scalable_pool<os_provider>/multiple_malloc_free/max_allocs:10000/min size:8/max size:65536/granularity:8/iterations:2000/threads:4",8000,72745.9,72725.6,ns,,,,,
"scalable_pool<os_provider>/multiple_malloc_free/max_allocs:10000/min size:8/max size:65536/granularity:8/iterations:2000/threads:1",2000,28207.4,28201.6,ns,,,,,

alloc/max_allocs:10000/pre_allocs:100000/size:4096/iterations:200000/threads:1 glibc

Environment Variables:

Command:

/home/pmdk/ur-actions-runner/_work/unified-runtime/unified-runtime/umf_build/benchmark/umf-benchmark --benchmark_format=csv

Output:

name,iterations,real_time,cpu_time,time_unit,bytes_per_second,items_per_second,label,error_occurred,error_message
"glibc/alloc/max_allocs:10000/pre_allocs:0/size:4096/iterations:200000/threads:4",800000,2846.81,1908.99,ns,,,,,
"glibc/alloc/max_allocs:10000/pre_allocs:0/size:4096/iterations:200000/threads:1",200000,720.212,720.208,ns,,,,,
"glibc/alloc/max_allocs:10000/pre_allocs:100000/size:4096/iterations:200000/threads:4",800000,1433.85,1341.02,ns,,,,,
"glibc/alloc/max_allocs:10000/pre_allocs:100000/size:4096/iterations:200000/threads:1",200000,752.119,752.117,ns,,,,,
"glibc/alloc/max_allocs:10000/pre_allocs:0/min size:8/max size:65536/granularity:8/iterations:200000/threads:4",800000,877.239,806.134,ns,,,,,
"glibc/alloc/max_allocs:10000/pre_allocs:0/min size:8/max size:65536/granularity:8/iterations:200000/threads:1",200000,178.285,178.282,ns,,,,,
"os_provider/alloc/max_allocs:10000/pre_allocs:0/size:4096/iterations:200000/threads:4",800000,1936.23,1935.28,ns,,,,,
"os_provider/alloc/max_allocs:10000/pre_allocs:0/size:4096/iterations:200000/threads:1",200000,188.744,188.738,ns,,,,,
"os_provider/alloc/max_allocs:10000/pre_allocs:100000/size:4096/iterations:200000/threads:4",800000,1703.17,1703.01,ns,,,,,
"os_provider/alloc/max_allocs:10000/pre_allocs:100000/size:4096/iterations:200000/threads:1",200000,191.853,191.848,ns,,,,,
"proxy_pool<os_provider>/alloc/max_allocs:1000/pre_allocs:0/size:4096/iterations:200000/threads:4",800000,4213.55,4207.49,ns,,,,,
"proxy_pool<os_provider>/alloc/max_allocs:1000/pre_allocs:0/size:4096/iterations:200000/threads:1",200000,256.591,256.584,ns,,,,,
"proxy_pool<os_provider>/alloc/max_allocs:1000/pre_allocs:100000/size:4096/iterations:200000/threads:4",800000,3739.13,3729.68,ns,,,,,
"proxy_pool<os_provider>/alloc/max_allocs:1000/pre_allocs:100000/size:4096/iterations:200000/threads:1",200000,286.263,286.256,ns,,,,,
"scalable_pool<os_provider>/alloc/max_allocs:10000/pre_allocs:0/size:4096/iterations:200000/threads:4",800000,280.251,277.496,ns,,,,,
"scalable_pool<os_provider>/alloc/max_allocs:10000/pre_allocs:0/size:4096/iterations:200000/threads:1",200000,210.895,210.896,ns,,,,,
"scalable_pool<os_provider>/alloc/max_allocs:10000/pre_allocs:100000/size:4096/iterations:200000/threads:4",800000,275.913,257.098,ns,,,,,
"scalable_pool<os_provider>/alloc/max_allocs:10000/pre_allocs:100000/size:4096/iterations:200000/threads:1",200000,207.764,207.759,ns,,,,,
"scalable_pool<os_provider>/alloc/max_allocs:10000/pre_allocs:0/min size:8/max size:65536/granularity:8/iterations:200000/threads:4",800000,884.12,872.664,ns,,,,,
"scalable_pool<os_provider>/alloc/max_allocs:10000/pre_allocs:0/min size:8/max size:65536/granularity:8/iterations:200000/threads:1",200000,977.932,977.872,ns,,,,,
"glibc/multiple_malloc_free/max_allocs:10000/size:4096/iterations:2000/threads:4",8000,31343.8,29578.5,ns,,,,,
"glibc/multiple_malloc_free/max_allocs:10000/size:4096/iterations:2000/threads:1",2000,4124.62,4124.47,ns,,,,,
"glibc/multiple_malloc_free/max_allocs:10000/min size:8/max size:65536/granularity:8/iterations:2000/threads:4",8000,138250,88499.3,ns,,,,,
"glibc/multiple_malloc_free/max_allocs:10000/min size:8/max size:65536/granularity:8/iterations:2000/threads:1",2000,30428,30427.7,ns,,,,,
"proxy_pool<os_provider>/multiple_malloc_free/max_allocs:10000/size:4096/iterations:2000/threads:4",8000,1.15827e+06,1.15788e+06,ns,,,,,
"proxy_pool<os_provider>/multiple_malloc_free/max_allocs:10000/size:4096/iterations:2000/threads:1",2000,160262,160260,ns,,,,,
"os_provider/multiple_malloc_free/max_allocs:10000/size:4096/iterations:2000/threads:4",8000,1.16914e+06,1.16709e+06,ns,,,,,
"os_provider/multiple_malloc_free/max_allocs:10000/size:4096/iterations:2000/threads:1",2000,141142,141138,ns,,,,,
"scalable_pool<os_provider>/multiple_malloc_free/max_allocs:10000/size:4096/iterations:2000/threads:4",8000,42963.6,41992,ns,,,,,
"scalable_pool<os_provider>/multiple_malloc_free/max_allocs:10000/size:4096/iterations:2000/threads:1",2000,15346.3,15345.9,ns,,,,,
"scalable_pool<os_provider>/multiple_malloc_free/max_allocs:10000/min size:8/max size:65536/granularity:8/iterations:2000/threads:4",8000,70396.8,70376.8,ns,,,,,
"scalable_pool<os_provider>/multiple_malloc_free/max_allocs:10000/min size:8/max size:65536/granularity:8/iterations:2000/threads:1",2000,25490.7,25490.2,ns,,,,,

alloc/max_allocs:10000/pre_allocs:0/min size:8/max size:65536/granularity:8/iterations:200000/threads:4 glibc

Environment Variables:

Command:

/home/pmdk/ur-actions-runner/_work/unified-runtime/unified-runtime/umf_build/benchmark/umf-benchmark --benchmark_format=csv

Output:

name,iterations,real_time,cpu_time,time_unit,bytes_per_second,items_per_second,label,error_occurred,error_message
"glibc/alloc/max_allocs:10000/pre_allocs:0/size:4096/iterations:200000/threads:4",800000,2846.81,1908.99,ns,,,,,
"glibc/alloc/max_allocs:10000/pre_allocs:0/size:4096/iterations:200000/threads:1",200000,720.212,720.208,ns,,,,,
"glibc/alloc/max_allocs:10000/pre_allocs:100000/size:4096/iterations:200000/threads:4",800000,1433.85,1341.02,ns,,,,,
"glibc/alloc/max_allocs:10000/pre_allocs:100000/size:4096/iterations:200000/threads:1",200000,752.119,752.117,ns,,,,,
"glibc/alloc/max_allocs:10000/pre_allocs:0/min size:8/max size:65536/granularity:8/iterations:200000/threads:4",800000,877.239,806.134,ns,,,,,
"glibc/alloc/max_allocs:10000/pre_allocs:0/min size:8/max size:65536/granularity:8/iterations:200000/threads:1",200000,178.285,178.282,ns,,,,,
"os_provider/alloc/max_allocs:10000/pre_allocs:0/size:4096/iterations:200000/threads:4",800000,1936.23,1935.28,ns,,,,,
"os_provider/alloc/max_allocs:10000/pre_allocs:0/size:4096/iterations:200000/threads:1",200000,188.744,188.738,ns,,,,,
"os_provider/alloc/max_allocs:10000/pre_allocs:100000/size:4096/iterations:200000/threads:4",800000,1703.17,1703.01,ns,,,,,
"os_provider/alloc/max_allocs:10000/pre_allocs:100000/size:4096/iterations:200000/threads:1",200000,191.853,191.848,ns,,,,,
"proxy_pool<os_provider>/alloc/max_allocs:1000/pre_allocs:0/size:4096/iterations:200000/threads:4",800000,4213.55,4207.49,ns,,,,,
"proxy_pool<os_provider>/alloc/max_allocs:1000/pre_allocs:0/size:4096/iterations:200000/threads:1",200000,256.591,256.584,ns,,,,,
"proxy_pool<os_provider>/alloc/max_allocs:1000/pre_allocs:100000/size:4096/iterations:200000/threads:4",800000,3739.13,3729.68,ns,,,,,
"proxy_pool<os_provider>/alloc/max_allocs:1000/pre_allocs:100000/size:4096/iterations:200000/threads:1",200000,286.263,286.256,ns,,,,,
"scalable_pool<os_provider>/alloc/max_allocs:10000/pre_allocs:0/size:4096/iterations:200000/threads:4",800000,280.251,277.496,ns,,,,,
"scalable_pool<os_provider>/alloc/max_allocs:10000/pre_allocs:0/size:4096/iterations:200000/threads:1",200000,210.895,210.896,ns,,,,,
"scalable_pool<os_provider>/alloc/max_allocs:10000/pre_allocs:100000/size:4096/iterations:200000/threads:4",800000,275.913,257.098,ns,,,,,
"scalable_pool<os_provider>/alloc/max_allocs:10000/pre_allocs:100000/size:4096/iterations:200000/threads:1",200000,207.764,207.759,ns,,,,,
"scalable_pool<os_provider>/alloc/max_allocs:10000/pre_allocs:0/min size:8/max size:65536/granularity:8/iterations:200000/threads:4",800000,884.12,872.664,ns,,,,,
"scalable_pool<os_provider>/alloc/max_allocs:10000/pre_allocs:0/min size:8/max size:65536/granularity:8/iterations:200000/threads:1",200000,977.932,977.872,ns,,,,,
"glibc/multiple_malloc_free/max_allocs:10000/size:4096/iterations:2000/threads:4",8000,31343.8,29578.5,ns,,,,,
"glibc/multiple_malloc_free/max_allocs:10000/size:4096/iterations:2000/threads:1",2000,4124.62,4124.47,ns,,,,,
"glibc/multiple_malloc_free/max_allocs:10000/min size:8/max size:65536/granularity:8/iterations:2000/threads:4",8000,138250,88499.3,ns,,,,,
"glibc/multiple_malloc_free/max_allocs:10000/min size:8/max size:65536/granularity:8/iterations:2000/threads:1",2000,30428,30427.7,ns,,,,,
"proxy_pool<os_provider>/multiple_malloc_free/max_allocs:10000/size:4096/iterations:2000/threads:4",8000,1.15827e+06,1.15788e+06,ns,,,,,
"proxy_pool<os_provider>/multiple_malloc_free/max_allocs:10000/size:4096/iterations:2000/threads:1",2000,160262,160260,ns,,,,,
"os_provider/multiple_malloc_free/max_allocs:10000/size:4096/iterations:2000/threads:4",8000,1.16914e+06,1.16709e+06,ns,,,,,
"os_provider/multiple_malloc_free/max_allocs:10000/size:4096/iterations:2000/threads:1",2000,141142,141138,ns,,,,,
"scalable_pool<os_provider>/multiple_malloc_free/max_allocs:10000/size:4096/iterations:2000/threads:4",8000,42963.6,41992,ns,,,,,
"scalable_pool<os_provider>/multiple_malloc_free/max_allocs:10000/size:4096/iterations:2000/threads:1",2000,15346.3,15345.9,ns,,,,,
"scalable_pool<os_provider>/multiple_malloc_free/max_allocs:10000/min size:8/max size:65536/granularity:8/iterations:2000/threads:4",8000,70396.8,70376.8,ns,,,,,
"scalable_pool<os_provider>/multiple_malloc_free/max_allocs:10000/min size:8/max size:65536/granularity:8/iterations:2000/threads:1",2000,25490.7,25490.2,ns,,,,,

alloc/max_allocs:10000/pre_allocs:0/min size:8/max size:65536/granularity:8/iterations:200000/threads:1 glibc

Environment Variables:

Command:

/home/pmdk/ur-actions-runner/_work/unified-runtime/unified-runtime/umf_build/benchmark/umf-benchmark --benchmark_format=csv

Output:

name,iterations,real_time,cpu_time,time_unit,bytes_per_second,items_per_second,label,error_occurred,error_message
"glibc/alloc/max_allocs:10000/pre_allocs:0/size:4096/iterations:200000/threads:4",800000,2378.91,1839.72,ns,,,,,
"glibc/alloc/max_allocs:10000/pre_allocs:0/size:4096/iterations:200000/threads:1",200000,719.927,719.926,ns,,,,,
"glibc/alloc/max_allocs:10000/pre_allocs:100000/size:4096/iterations:200000/threads:4",800000,1248.85,1179.6,ns,,,,,
"glibc/alloc/max_allocs:10000/pre_allocs:100000/size:4096/iterations:200000/threads:1",200000,761.179,761.026,ns,,,,,
"glibc/alloc/max_allocs:10000/pre_allocs:0/min size:8/max size:65536/granularity:8/iterations:200000/threads:4",800000,890.691,804.558,ns,,,,,
"glibc/alloc/max_allocs:10000/pre_allocs:0/min size:8/max size:65536/granularity:8/iterations:200000/threads:1",200000,176.764,176.555,ns,,,,,
"os_provider/alloc/max_allocs:10000/pre_allocs:0/size:4096/iterations:200000/threads:4",800000,2283.78,2283.44,ns,,,,,
"os_provider/alloc/max_allocs:10000/pre_allocs:0/size:4096/iterations:200000/threads:1",200000,188.257,188.251,ns,,,,,
"os_provider/alloc/max_allocs:10000/pre_allocs:100000/size:4096/iterations:200000/threads:4",800000,2322.23,2321.72,ns,,,,,
"os_provider/alloc/max_allocs:10000/pre_allocs:100000/size:4096/iterations:200000/threads:1",200000,193.398,193.393,ns,,,,,
"proxy_pool<os_provider>/alloc/max_allocs:1000/pre_allocs:0/size:4096/iterations:200000/threads:4",800000,4570.14,4562.81,ns,,,,,
"proxy_pool<os_provider>/alloc/max_allocs:1000/pre_allocs:0/size:4096/iterations:200000/threads:1",200000,251.142,251.136,ns,,,,,
"proxy_pool<os_provider>/alloc/max_allocs:1000/pre_allocs:100000/size:4096/iterations:200000/threads:4",800000,2862.3,2856.53,ns,,,,,
"proxy_pool<os_provider>/alloc/max_allocs:1000/pre_allocs:100000/size:4096/iterations:200000/threads:1",200000,286.407,286.4,ns,,,,,
"scalable_pool<os_provider>/alloc/max_allocs:10000/pre_allocs:0/size:4096/iterations:200000/threads:4",800000,280.133,275.732,ns,,,,,
"scalable_pool<os_provider>/alloc/max_allocs:10000/pre_allocs:0/size:4096/iterations:200000/threads:1",200000,211.33,211.326,ns,,,,,
"scalable_pool<os_provider>/alloc/max_allocs:10000/pre_allocs:100000/size:4096/iterations:200000/threads:4",800000,255.465,254.147,ns,,,,,
"scalable_pool<os_provider>/alloc/max_allocs:10000/pre_allocs:100000/size:4096/iterations:200000/threads:1",200000,236.268,236.261,ns,,,,,
"scalable_pool<os_provider>/alloc/max_allocs:10000/pre_allocs:0/min size:8/max size:65536/granularity:8/iterations:200000/threads:4",800000,923.541,913.159,ns,,,,,
"scalable_pool<os_provider>/alloc/max_allocs:10000/pre_allocs:0/min size:8/max size:65536/granularity:8/iterations:200000/threads:1",200000,968.169,968.156,ns,,,,,
"glibc/multiple_malloc_free/max_allocs:10000/size:4096/iterations:2000/threads:4",8000,33763.5,31527,ns,,,,,
"glibc/multiple_malloc_free/max_allocs:10000/size:4096/iterations:2000/threads:1",2000,4248.98,4248.82,ns,,,,,
"glibc/multiple_malloc_free/max_allocs:10000/min size:8/max size:65536/granularity:8/iterations:2000/threads:4",8000,137715,87126.5,ns,,,,,
"glibc/multiple_malloc_free/max_allocs:10000/min size:8/max size:65536/granularity:8/iterations:2000/threads:1",2000,29982.9,29982.7,ns,,,,,
"proxy_pool<os_provider>/multiple_malloc_free/max_allocs:10000/size:4096/iterations:2000/threads:4",8000,1.42468e+06,1.42402e+06,ns,,,,,
"proxy_pool<os_provider>/multiple_malloc_free/max_allocs:10000/size:4096/iterations:2000/threads:1",2000,161234,161229,ns,,,,,
"os_provider/multiple_malloc_free/max_allocs:10000/size:4096/iterations:2000/threads:4",8000,1.42937e+06,1.42864e+06,ns,,,,,
"os_provider/multiple_malloc_free/max_allocs:10000/size:4096/iterations:2000/threads:1",2000,148795,148794,ns,,,,,
"scalable_pool<os_provider>/multiple_malloc_free/max_allocs:10000/size:4096/iterations:2000/threads:4",8000,41387.7,40707,ns,,,,,
"scalable_pool<os_provider>/multiple_malloc_free/max_allocs:10000/size:4096/iterations:2000/threads:1",2000,14614.4,14614.1,ns,,,,,
"scalable_pool<os_provider>/multiple_malloc_free/max_allocs:10000/min size:8/max size:65536/granularity:8/iterations:2000/threads:4",8000,71084.1,70736.5,ns,,,,,
"scalable_pool<os_provider>/multiple_malloc_free/max_allocs:10000/min size:8/max size:65536/granularity:8/iterations:2000/threads:1",2000,25142,25141.5,ns,,,,,

alloc/max_allocs:10000/pre_allocs:0/size:4096/iterations:200000/threads:4 os_provider

Environment Variables:

Command:

/home/pmdk/ur-actions-runner/_work/unified-runtime/unified-runtime/umf_build/benchmark/umf-benchmark --benchmark_format=csv

Output:

name,iterations,real_time,cpu_time,time_unit,bytes_per_second,items_per_second,label,error_occurred,error_message
"glibc/alloc/max_allocs:10000/pre_allocs:0/size:4096/iterations:200000/threads:4",800000,2622.28,1880.6,ns,,,,,
"glibc/alloc/max_allocs:10000/pre_allocs:0/size:4096/iterations:200000/threads:1",200000,707.07,707.068,ns,,,,,
"glibc/alloc/max_allocs:10000/pre_allocs:100000/size:4096/iterations:200000/threads:4",800000,1408.08,1347.28,ns,,,,,
"glibc/alloc/max_allocs:10000/pre_allocs:100000/size:4096/iterations:200000/threads:1",200000,745.747,745.713,ns,,,,,
"glibc/alloc/max_allocs:10000/pre_allocs:0/min size:8/max size:65536/granularity:8/iterations:200000/threads:4",800000,833.688,807.323,ns,,,,,
"glibc/alloc/max_allocs:10000/pre_allocs:0/min size:8/max size:65536/granularity:8/iterations:200000/threads:1",200000,167.247,167.242,ns,,,,,
"os_provider/alloc/max_allocs:10000/pre_allocs:0/size:4096/iterations:200000/threads:4",800000,2232.82,2232.55,ns,,,,,
"os_provider/alloc/max_allocs:10000/pre_allocs:0/size:4096/iterations:200000/threads:1",200000,185.935,185.901,ns,,,,,
"os_provider/alloc/max_allocs:10000/pre_allocs:100000/size:4096/iterations:200000/threads:4",800000,2018.54,2017.57,ns,,,,,
"os_provider/alloc/max_allocs:10000/pre_allocs:100000/size:4096/iterations:200000/threads:1",200000,188.994,188.989,ns,,,,,
"proxy_pool<os_provider>/alloc/max_allocs:1000/pre_allocs:0/size:4096/iterations:200000/threads:4",800000,3670.1,3665.46,ns,,,,,
"proxy_pool<os_provider>/alloc/max_allocs:1000/pre_allocs:0/size:4096/iterations:200000/threads:1",200000,265.476,265.469,ns,,,,,
"proxy_pool<os_provider>/alloc/max_allocs:1000/pre_allocs:100000/size:4096/iterations:200000/threads:4",800000,3416.31,3410.38,ns,,,,,
"proxy_pool<os_provider>/alloc/max_allocs:1000/pre_allocs:100000/size:4096/iterations:200000/threads:1",200000,286.903,286.899,ns,,,,,
"scalable_pool<os_provider>/alloc/max_allocs:10000/pre_allocs:0/size:4096/iterations:200000/threads:4",800000,308.735,306.918,ns,,,,,
"scalable_pool<os_provider>/alloc/max_allocs:10000/pre_allocs:0/size:4096/iterations:200000/threads:1",200000,216.685,216.681,ns,,,,,
"scalable_pool<os_provider>/alloc/max_allocs:10000/pre_allocs:100000/size:4096/iterations:200000/threads:4",800000,249.939,248.893,ns,,,,,
"scalable_pool<os_provider>/alloc/max_allocs:10000/pre_allocs:100000/size:4096/iterations:200000/threads:1",200000,206.93,206.924,ns,,,,,
"scalable_pool<os_provider>/alloc/max_allocs:10000/pre_allocs:0/min size:8/max size:65536/granularity:8/iterations:200000/threads:4",800000,1154.97,1126.46,ns,,,,,
"scalable_pool<os_provider>/alloc/max_allocs:10000/pre_allocs:0/min size:8/max size:65536/granularity:8/iterations:200000/threads:1",200000,954.757,954.711,ns,,,,,
"glibc/multiple_malloc_free/max_allocs:10000/size:4096/iterations:2000/threads:4",8000,30034.2,28619.9,ns,,,,,
"glibc/multiple_malloc_free/max_allocs:10000/size:4096/iterations:2000/threads:1",2000,4118.68,4118.55,ns,,,,,
"glibc/multiple_malloc_free/max_allocs:10000/min size:8/max size:65536/granularity:8/iterations:2000/threads:4",8000,138832,87602.1,ns,,,,,
"glibc/multiple_malloc_free/max_allocs:10000/min size:8/max size:65536/granularity:8/iterations:2000/threads:1",2000,31016.1,31015.8,ns,,,,,
"proxy_pool<os_provider>/multiple_malloc_free/max_allocs:10000/size:4096/iterations:2000/threads:4",8000,1.10821e+06,1.10783e+06,ns,,,,,
"proxy_pool<os_provider>/multiple_malloc_free/max_allocs:10000/size:4096/iterations:2000/threads:1",2000,153772,153769,ns,,,,,
"os_provider/multiple_malloc_free/max_allocs:10000/size:4096/iterations:2000/threads:4",8000,1.15353e+06,1.15301e+06,ns,,,,,
"os_provider/multiple_malloc_free/max_allocs:10000/size:4096/iterations:2000/threads:1",2000,139405,139401,ns,,,,,
"scalable_pool<os_provider>/multiple_malloc_free/max_allocs:10000/size:4096/iterations:2000/threads:4",8000,40726.5,40256,ns,,,,,
"scalable_pool<os_provider>/multiple_malloc_free/max_allocs:10000/size:4096/iterations:2000/threads:1",2000,15612.8,15612.4,ns,,,,,
"scalable_pool<os_provider>/multiple_malloc_free/max_allocs:10000/min size:8/max size:65536/granularity:8/iterations:2000/threads:4",8000,72745.9,72725.6,ns,,,,,
"scalable_pool<os_provider>/multiple_malloc_free/max_allocs:10000/min size:8/max size:65536/granularity:8/iterations:2000/threads:1",2000,28207.4,28201.6,ns,,,,,

alloc/max_allocs:10000/pre_allocs:0/size:4096/iterations:200000/threads:1 os_provider

Environment Variables:

Command:

/home/pmdk/ur-actions-runner/_work/unified-runtime/unified-runtime/umf_build/benchmark/umf-benchmark --benchmark_format=csv

Output:

name,iterations,real_time,cpu_time,time_unit,bytes_per_second,items_per_second,label,error_occurred,error_message
"glibc/alloc/max_allocs:10000/pre_allocs:0/size:4096/iterations:200000/threads:4",800000,2378.91,1839.72,ns,,,,,
"glibc/alloc/max_allocs:10000/pre_allocs:0/size:4096/iterations:200000/threads:1",200000,719.927,719.926,ns,,,,,
"glibc/alloc/max_allocs:10000/pre_allocs:100000/size:4096/iterations:200000/threads:4",800000,1248.85,1179.6,ns,,,,,
"glibc/alloc/max_allocs:10000/pre_allocs:100000/size:4096/iterations:200000/threads:1",200000,761.179,761.026,ns,,,,,
"glibc/alloc/max_allocs:10000/pre_allocs:0/min size:8/max size:65536/granularity:8/iterations:200000/threads:4",800000,890.691,804.558,ns,,,,,
"glibc/alloc/max_allocs:10000/pre_allocs:0/min size:8/max size:65536/granularity:8/iterations:200000/threads:1",200000,176.764,176.555,ns,,,,,
"os_provider/alloc/max_allocs:10000/pre_allocs:0/size:4096/iterations:200000/threads:4",800000,2283.78,2283.44,ns,,,,,
"os_provider/alloc/max_allocs:10000/pre_allocs:0/size:4096/iterations:200000/threads:1",200000,188.257,188.251,ns,,,,,
"os_provider/alloc/max_allocs:10000/pre_allocs:100000/size:4096/iterations:200000/threads:4",800000,2322.23,2321.72,ns,,,,,
"os_provider/alloc/max_allocs:10000/pre_allocs:100000/size:4096/iterations:200000/threads:1",200000,193.398,193.393,ns,,,,,
"proxy_pool<os_provider>/alloc/max_allocs:1000/pre_allocs:0/size:4096/iterations:200000/threads:4",800000,4570.14,4562.81,ns,,,,,
"proxy_pool<os_provider>/alloc/max_allocs:1000/pre_allocs:0/size:4096/iterations:200000/threads:1",200000,251.142,251.136,ns,,,,,
"proxy_pool<os_provider>/alloc/max_allocs:1000/pre_allocs:100000/size:4096/iterations:200000/threads:4",800000,2862.3,2856.53,ns,,,,,
"proxy_pool<os_provider>/alloc/max_allocs:1000/pre_allocs:100000/size:4096/iterations:200000/threads:1",200000,286.407,286.4,ns,,,,,
"scalable_pool<os_provider>/alloc/max_allocs:10000/pre_allocs:0/size:4096/iterations:200000/threads:4",800000,280.133,275.732,ns,,,,,
"scalable_pool<os_provider>/alloc/max_allocs:10000/pre_allocs:0/size:4096/iterations:200000/threads:1",200000,211.33,211.326,ns,,,,,
"scalable_pool<os_provider>/alloc/max_allocs:10000/pre_allocs:100000/size:4096/iterations:200000/threads:4",800000,255.465,254.147,ns,,,,,
"scalable_pool<os_provider>/alloc/max_allocs:10000/pre_allocs:100000/size:4096/iterations:200000/threads:1",200000,236.268,236.261,ns,,,,,
"scalable_pool<os_provider>/alloc/max_allocs:10000/pre_allocs:0/min size:8/max size:65536/granularity:8/iterations:200000/threads:4",800000,923.541,913.159,ns,,,,,
"scalable_pool<os_provider>/alloc/max_allocs:10000/pre_allocs:0/min size:8/max size:65536/granularity:8/iterations:200000/threads:1",200000,968.169,968.156,ns,,,,,
"glibc/multiple_malloc_free/max_allocs:10000/size:4096/iterations:2000/threads:4",8000,33763.5,31527,ns,,,,,
"glibc/multiple_malloc_free/max_allocs:10000/size:4096/iterations:2000/threads:1",2000,4248.98,4248.82,ns,,,,,
"glibc/multiple_malloc_free/max_allocs:10000/min size:8/max size:65536/granularity:8/iterations:2000/threads:4",8000,137715,87126.5,ns,,,,,
"glibc/multiple_malloc_free/max_allocs:10000/min size:8/max size:65536/granularity:8/iterations:2000/threads:1",2000,29982.9,29982.7,ns,,,,,
"proxy_pool<os_provider>/multiple_malloc_free/max_allocs:10000/size:4096/iterations:2000/threads:4",8000,1.42468e+06,1.42402e+06,ns,,,,,
"proxy_pool<os_provider>/multiple_malloc_free/max_allocs:10000/size:4096/iterations:2000/threads:1",2000,161234,161229,ns,,,,,
"os_provider/multiple_malloc_free/max_allocs:10000/size:4096/iterations:2000/threads:4",8000,1.42937e+06,1.42864e+06,ns,,,,,
"os_provider/multiple_malloc_free/max_allocs:10000/size:4096/iterations:2000/threads:1",2000,148795,148794,ns,,,,,
"scalable_pool<os_provider>/multiple_malloc_free/max_allocs:10000/size:4096/iterations:2000/threads:4",8000,41387.7,40707,ns,,,,,
"scalable_pool<os_provider>/multiple_malloc_free/max_allocs:10000/size:4096/iterations:2000/threads:1",2000,14614.4,14614.1,ns,,,,,
"scalable_pool<os_provider>/multiple_malloc_free/max_allocs:10000/min size:8/max size:65536/granularity:8/iterations:2000/threads:4",8000,71084.1,70736.5,ns,,,,,
"scalable_pool<os_provider>/multiple_malloc_free/max_allocs:10000/min size:8/max size:65536/granularity:8/iterations:2000/threads:1",2000,25142,25141.5,ns,,,,,

alloc/max_allocs:10000/pre_allocs:100000/size:4096/iterations:200000/threads:4 os_provider

Environment Variables:

Command:

/home/pmdk/ur-actions-runner/_work/unified-runtime/unified-runtime/umf_build/benchmark/umf-benchmark --benchmark_format=csv

Output:

name,iterations,real_time,cpu_time,time_unit,bytes_per_second,items_per_second,label,error_occurred,error_message
"glibc/alloc/max_allocs:10000/pre_allocs:0/size:4096/iterations:200000/threads:4",800000,2622.28,1880.6,ns,,,,,
"glibc/alloc/max_allocs:10000/pre_allocs:0/size:4096/iterations:200000/threads:1",200000,707.07,707.068,ns,,,,,
"glibc/alloc/max_allocs:10000/pre_allocs:100000/size:4096/iterations:200000/threads:4",800000,1408.08,1347.28,ns,,,,,
"glibc/alloc/max_allocs:10000/pre_allocs:100000/size:4096/iterations:200000/threads:1",200000,745.747,745.713,ns,,,,,
"glibc/alloc/max_allocs:10000/pre_allocs:0/min size:8/max size:65536/granularity:8/iterations:200000/threads:4",800000,833.688,807.323,ns,,,,,
"glibc/alloc/max_allocs:10000/pre_allocs:0/min size:8/max size:65536/granularity:8/iterations:200000/threads:1",200000,167.247,167.242,ns,,,,,
"os_provider/alloc/max_allocs:10000/pre_allocs:0/size:4096/iterations:200000/threads:4",800000,2232.82,2232.55,ns,,,,,
"os_provider/alloc/max_allocs:10000/pre_allocs:0/size:4096/iterations:200000/threads:1",200000,185.935,185.901,ns,,,,,
"os_provider/alloc/max_allocs:10000/pre_allocs:100000/size:4096/iterations:200000/threads:4",800000,2018.54,2017.57,ns,,,,,
"os_provider/alloc/max_allocs:10000/pre_allocs:100000/size:4096/iterations:200000/threads:1",200000,188.994,188.989,ns,,,,,
"proxy_pool<os_provider>/alloc/max_allocs:1000/pre_allocs:0/size:4096/iterations:200000/threads:4",800000,3670.1,3665.46,ns,,,,,
"proxy_pool<os_provider>/alloc/max_allocs:1000/pre_allocs:0/size:4096/iterations:200000/threads:1",200000,265.476,265.469,ns,,,,,
"proxy_pool<os_provider>/alloc/max_allocs:1000/pre_allocs:100000/size:4096/iterations:200000/threads:4",800000,3416.31,3410.38,ns,,,,,
"proxy_pool<os_provider>/alloc/max_allocs:1000/pre_allocs:100000/size:4096/iterations:200000/threads:1",200000,286.903,286.899,ns,,,,,
"scalable_pool<os_provider>/alloc/max_allocs:10000/pre_allocs:0/size:4096/iterations:200000/threads:4",800000,308.735,306.918,ns,,,,,
"scalable_pool<os_provider>/alloc/max_allocs:10000/pre_allocs:0/size:4096/iterations:200000/threads:1",200000,216.685,216.681,ns,,,,,
"scalable_pool<os_provider>/alloc/max_allocs:10000/pre_allocs:100000/size:4096/iterations:200000/threads:4",800000,249.939,248.893,ns,,,,,
"scalable_pool<os_provider>/alloc/max_allocs:10000/pre_allocs:100000/size:4096/iterations:200000/threads:1",200000,206.93,206.924,ns,,,,,
"scalable_pool<os_provider>/alloc/max_allocs:10000/pre_allocs:0/min size:8/max size:65536/granularity:8/iterations:200000/threads:4",800000,1154.97,1126.46,ns,,,,,
"scalable_pool<os_provider>/alloc/max_allocs:10000/pre_allocs:0/min size:8/max size:65536/granularity:8/iterations:200000/threads:1",200000,954.757,954.711,ns,,,,,
"glibc/multiple_malloc_free/max_allocs:10000/size:4096/iterations:2000/threads:4",8000,30034.2,28619.9,ns,,,,,
"glibc/multiple_malloc_free/max_allocs:10000/size:4096/iterations:2000/threads:1",2000,4118.68,4118.55,ns,,,,,
"glibc/multiple_malloc_free/max_allocs:10000/min size:8/max size:65536/granularity:8/iterations:2000/threads:4",8000,138832,87602.1,ns,,,,,
"glibc/multiple_malloc_free/max_allocs:10000/min size:8/max size:65536/granularity:8/iterations:2000/threads:1",2000,31016.1,31015.8,ns,,,,,
"proxy_pool<os_provider>/multiple_malloc_free/max_allocs:10000/size:4096/iterations:2000/threads:4",8000,1.10821e+06,1.10783e+06,ns,,,,,
"proxy_pool<os_provider>/multiple_malloc_free/max_allocs:10000/size:4096/iterations:2000/threads:1",2000,153772,153769,ns,,,,,
"os_provider/multiple_malloc_free/max_allocs:10000/size:4096/iterations:2000/threads:4",8000,1.15353e+06,1.15301e+06,ns,,,,,
"os_provider/multiple_malloc_free/max_allocs:10000/size:4096/iterations:2000/threads:1",2000,139405,139401,ns,,,,,
"scalable_pool<os_provider>/multiple_malloc_free/max_allocs:10000/size:4096/iterations:2000/threads:4",8000,40726.5,40256,ns,,,,,
"scalable_pool<os_provider>/multiple_malloc_free/max_allocs:10000/size:4096/iterations:2000/threads:1",2000,15612.8,15612.4,ns,,,,,
"scalable_pool<os_provider>/multiple_malloc_free/max_allocs:10000/min size:8/max size:65536/granularity:8/iterations:2000/threads:4",8000,72745.9,72725.6,ns,,,,,
"scalable_pool<os_provider>/multiple_malloc_free/max_allocs:10000/min size:8/max size:65536/granularity:8/iterations:2000/threads:1",2000,28207.4,28201.6,ns,,,,,

alloc/max_allocs:10000/pre_allocs:100000/size:4096/iterations:200000/threads:1 os_provider

Environment Variables:

Command:

/home/pmdk/ur-actions-runner/_work/unified-runtime/unified-runtime/umf_build/benchmark/umf-benchmark --benchmark_format=csv

Output:

name,iterations,real_time,cpu_time,time_unit,bytes_per_second,items_per_second,label,error_occurred,error_message
"glibc/alloc/max_allocs:10000/pre_allocs:0/size:4096/iterations:200000/threads:4",800000,2846.81,1908.99,ns,,,,,
"glibc/alloc/max_allocs:10000/pre_allocs:0/size:4096/iterations:200000/threads:1",200000,720.212,720.208,ns,,,,,
"glibc/alloc/max_allocs:10000/pre_allocs:100000/size:4096/iterations:200000/threads:4",800000,1433.85,1341.02,ns,,,,,
"glibc/alloc/max_allocs:10000/pre_allocs:100000/size:4096/iterations:200000/threads:1",200000,752.119,752.117,ns,,,,,
"glibc/alloc/max_allocs:10000/pre_allocs:0/min size:8/max size:65536/granularity:8/iterations:200000/threads:4",800000,877.239,806.134,ns,,,,,
"glibc/alloc/max_allocs:10000/pre_allocs:0/min size:8/max size:65536/granularity:8/iterations:200000/threads:1",200000,178.285,178.282,ns,,,,,
"os_provider/alloc/max_allocs:10000/pre_allocs:0/size:4096/iterations:200000/threads:4",800000,1936.23,1935.28,ns,,,,,
"os_provider/alloc/max_allocs:10000/pre_allocs:0/size:4096/iterations:200000/threads:1",200000,188.744,188.738,ns,,,,,
"os_provider/alloc/max_allocs:10000/pre_allocs:100000/size:4096/iterations:200000/threads:4",800000,1703.17,1703.01,ns,,,,,
"os_provider/alloc/max_allocs:10000/pre_allocs:100000/size:4096/iterations:200000/threads:1",200000,191.853,191.848,ns,,,,,
"proxy_pool<os_provider>/alloc/max_allocs:1000/pre_allocs:0/size:4096/iterations:200000/threads:4",800000,4213.55,4207.49,ns,,,,,
"proxy_pool<os_provider>/alloc/max_allocs:1000/pre_allocs:0/size:4096/iterations:200000/threads:1",200000,256.591,256.584,ns,,,,,
"proxy_pool<os_provider>/alloc/max_allocs:1000/pre_allocs:100000/size:4096/iterations:200000/threads:4",800000,3739.13,3729.68,ns,,,,,
"proxy_pool<os_provider>/alloc/max_allocs:1000/pre_allocs:100000/size:4096/iterations:200000/threads:1",200000,286.263,286.256,ns,,,,,
"scalable_pool<os_provider>/alloc/max_allocs:10000/pre_allocs:0/size:4096/iterations:200000/threads:4",800000,280.251,277.496,ns,,,,,
"scalable_pool<os_provider>/alloc/max_allocs:10000/pre_allocs:0/size:4096/iterations:200000/threads:1",200000,210.895,210.896,ns,,,,,
"scalable_pool<os_provider>/alloc/max_allocs:10000/pre_allocs:100000/size:4096/iterations:200000/threads:4",800000,275.913,257.098,ns,,,,,
"scalable_pool<os_provider>/alloc/max_allocs:10000/pre_allocs:100000/size:4096/iterations:200000/threads:1",200000,207.764,207.759,ns,,,,,
"scalable_pool<os_provider>/alloc/max_allocs:10000/pre_allocs:0/min size:8/max size:65536/granularity:8/iterations:200000/threads:4",800000,884.12,872.664,ns,,,,,
"scalable_pool<os_provider>/alloc/max_allocs:10000/pre_allocs:0/min size:8/max size:65536/granularity:8/iterations:200000/threads:1",200000,977.932,977.872,ns,,,,,
"glibc/multiple_malloc_free/max_allocs:10000/size:4096/iterations:2000/threads:4",8000,31343.8,29578.5,ns,,,,,
"glibc/multiple_malloc_free/max_allocs:10000/size:4096/iterations:2000/threads:1",2000,4124.62,4124.47,ns,,,,,
"glibc/multiple_malloc_free/max_allocs:10000/min size:8/max size:65536/granularity:8/iterations:2000/threads:4",8000,138250,88499.3,ns,,,,,
"glibc/multiple_malloc_free/max_allocs:10000/min size:8/max size:65536/granularity:8/iterations:2000/threads:1",2000,30428,30427.7,ns,,,,,
"proxy_pool<os_provider>/multiple_malloc_free/max_allocs:10000/size:4096/iterations:2000/threads:4",8000,1.15827e+06,1.15788e+06,ns,,,,,
"proxy_pool<os_provider>/multiple_malloc_free/max_allocs:10000/size:4096/iterations:2000/threads:1",2000,160262,160260,ns,,,,,
"os_provider/multiple_malloc_free/max_allocs:10000/size:4096/iterations:2000/threads:4",8000,1.16914e+06,1.16709e+06,ns,,,,,
"os_provider/multiple_malloc_free/max_allocs:10000/size:4096/iterations:2000/threads:1",2000,141142,141138,ns,,,,,
"scalable_pool<os_provider>/multiple_malloc_free/max_allocs:10000/size:4096/iterations:2000/threads:4",8000,42963.6,41992,ns,,,,,
"scalable_pool<os_provider>/multiple_malloc_free/max_allocs:10000/size:4096/iterations:2000/threads:1",2000,15346.3,15345.9,ns,,,,,
"scalable_pool<os_provider>/multiple_malloc_free/max_allocs:10000/min size:8/max size:65536/granularity:8/iterations:2000/threads:4",8000,70396.8,70376.8,ns,,,,,
"scalable_pool<os_provider>/multiple_malloc_free/max_allocs:10000/min size:8/max size:65536/granularity:8/iterations:2000/threads:1",2000,25490.7,25490.2,ns,,,,,

alloc/max_allocs:1000/pre_allocs:0/size:4096/iterations:200000/threads:4 proxy_pool

Environment Variables:

Command:

/home/pmdk/ur-actions-runner/_work/unified-runtime/unified-runtime/umf_build/benchmark/umf-benchmark --benchmark_format=csv

Output:

name,iterations,real_time,cpu_time,time_unit,bytes_per_second,items_per_second,label,error_occurred,error_message
"glibc/alloc/max_allocs:10000/pre_allocs:0/size:4096/iterations:200000/threads:4",800000,2846.81,1908.99,ns,,,,,
"glibc/alloc/max_allocs:10000/pre_allocs:0/size:4096/iterations:200000/threads:1",200000,720.212,720.208,ns,,,,,
"glibc/alloc/max_allocs:10000/pre_allocs:100000/size:4096/iterations:200000/threads:4",800000,1433.85,1341.02,ns,,,,,
"glibc/alloc/max_allocs:10000/pre_allocs:100000/size:4096/iterations:200000/threads:1",200000,752.119,752.117,ns,,,,,
"glibc/alloc/max_allocs:10000/pre_allocs:0/min size:8/max size:65536/granularity:8/iterations:200000/threads:4",800000,877.239,806.134,ns,,,,,
"glibc/alloc/max_allocs:10000/pre_allocs:0/min size:8/max size:65536/granularity:8/iterations:200000/threads:1",200000,178.285,178.282,ns,,,,,
"os_provider/alloc/max_allocs:10000/pre_allocs:0/size:4096/iterations:200000/threads:4",800000,1936.23,1935.28,ns,,,,,
"os_provider/alloc/max_allocs:10000/pre_allocs:0/size:4096/iterations:200000/threads:1",200000,188.744,188.738,ns,,,,,
"os_provider/alloc/max_allocs:10000/pre_allocs:100000/size:4096/iterations:200000/threads:4",800000,1703.17,1703.01,ns,,,,,
"os_provider/alloc/max_allocs:10000/pre_allocs:100000/size:4096/iterations:200000/threads:1",200000,191.853,191.848,ns,,,,,
"proxy_pool<os_provider>/alloc/max_allocs:1000/pre_allocs:0/size:4096/iterations:200000/threads:4",800000,4213.55,4207.49,ns,,,,,
"proxy_pool<os_provider>/alloc/max_allocs:1000/pre_allocs:0/size:4096/iterations:200000/threads:1",200000,256.591,256.584,ns,,,,,
"proxy_pool<os_provider>/alloc/max_allocs:1000/pre_allocs:100000/size:4096/iterations:200000/threads:4",800000,3739.13,3729.68,ns,,,,,
"proxy_pool<os_provider>/alloc/max_allocs:1000/pre_allocs:100000/size:4096/iterations:200000/threads:1",200000,286.263,286.256,ns,,,,,
"scalable_pool<os_provider>/alloc/max_allocs:10000/pre_allocs:0/size:4096/iterations:200000/threads:4",800000,280.251,277.496,ns,,,,,
"scalable_pool<os_provider>/alloc/max_allocs:10000/pre_allocs:0/size:4096/iterations:200000/threads:1",200000,210.895,210.896,ns,,,,,
"scalable_pool<os_provider>/alloc/max_allocs:10000/pre_allocs:100000/size:4096/iterations:200000/threads:4",800000,275.913,257.098,ns,,,,,
"scalable_pool<os_provider>/alloc/max_allocs:10000/pre_allocs:100000/size:4096/iterations:200000/threads:1",200000,207.764,207.759,ns,,,,,
"scalable_pool<os_provider>/alloc/max_allocs:10000/pre_allocs:0/min size:8/max size:65536/granularity:8/iterations:200000/threads:4",800000,884.12,872.664,ns,,,,,
"scalable_pool<os_provider>/alloc/max_allocs:10000/pre_allocs:0/min size:8/max size:65536/granularity:8/iterations:200000/threads:1",200000,977.932,977.872,ns,,,,,
"glibc/multiple_malloc_free/max_allocs:10000/size:4096/iterations:2000/threads:4",8000,31343.8,29578.5,ns,,,,,
"glibc/multiple_malloc_free/max_allocs:10000/size:4096/iterations:2000/threads:1",2000,4124.62,4124.47,ns,,,,,
"glibc/multiple_malloc_free/max_allocs:10000/min size:8/max size:65536/granularity:8/iterations:2000/threads:4",8000,138250,88499.3,ns,,,,,
"glibc/multiple_malloc_free/max_allocs:10000/min size:8/max size:65536/granularity:8/iterations:2000/threads:1",2000,30428,30427.7,ns,,,,,
"proxy_pool<os_provider>/multiple_malloc_free/max_allocs:10000/size:4096/iterations:2000/threads:4",8000,1.15827e+06,1.15788e+06,ns,,,,,
"proxy_pool<os_provider>/multiple_malloc_free/max_allocs:10000/size:4096/iterations:2000/threads:1",2000,160262,160260,ns,,,,,
"os_provider/multiple_malloc_free/max_allocs:10000/size:4096/iterations:2000/threads:4",8000,1.16914e+06,1.16709e+06,ns,,,,,
"os_provider/multiple_malloc_free/max_allocs:10000/size:4096/iterations:2000/threads:1",2000,141142,141138,ns,,,,,
"scalable_pool<os_provider>/multiple_malloc_free/max_allocs:10000/size:4096/iterations:2000/threads:4",8000,42963.6,41992,ns,,,,,
"scalable_pool<os_provider>/multiple_malloc_free/max_allocs:10000/size:4096/iterations:2000/threads:1",2000,15346.3,15345.9,ns,,,,,
"scalable_pool<os_provider>/multiple_malloc_free/max_allocs:10000/min size:8/max size:65536/granularity:8/iterations:2000/threads:4",8000,70396.8,70376.8,ns,,,,,
"scalable_pool<os_provider>/multiple_malloc_free/max_allocs:10000/min size:8/max size:65536/granularity:8/iterations:2000/threads:1",2000,25490.7,25490.2,ns,,,,,

alloc/max_allocs:1000/pre_allocs:0/size:4096/iterations:200000/threads:1 proxy_pool

Environment Variables:

Command:

/home/pmdk/ur-actions-runner/_work/unified-runtime/unified-runtime/umf_build/benchmark/umf-benchmark --benchmark_format=csv

Output:

name,iterations,real_time,cpu_time,time_unit,bytes_per_second,items_per_second,label,error_occurred,error_message
"glibc/alloc/max_allocs:10000/pre_allocs:0/size:4096/iterations:200000/threads:4",800000,2846.81,1908.99,ns,,,,,
"glibc/alloc/max_allocs:10000/pre_allocs:0/size:4096/iterations:200000/threads:1",200000,720.212,720.208,ns,,,,,
"glibc/alloc/max_allocs:10000/pre_allocs:100000/size:4096/iterations:200000/threads:4",800000,1433.85,1341.02,ns,,,,,
"glibc/alloc/max_allocs:10000/pre_allocs:100000/size:4096/iterations:200000/threads:1",200000,752.119,752.117,ns,,,,,
"glibc/alloc/max_allocs:10000/pre_allocs:0/min size:8/max size:65536/granularity:8/iterations:200000/threads:4",800000,877.239,806.134,ns,,,,,
"glibc/alloc/max_allocs:10000/pre_allocs:0/min size:8/max size:65536/granularity:8/iterations:200000/threads:1",200000,178.285,178.282,ns,,,,,
"os_provider/alloc/max_allocs:10000/pre_allocs:0/size:4096/iterations:200000/threads:4",800000,1936.23,1935.28,ns,,,,,
"os_provider/alloc/max_allocs:10000/pre_allocs:0/size:4096/iterations:200000/threads:1",200000,188.744,188.738,ns,,,,,
"os_provider/alloc/max_allocs:10000/pre_allocs:100000/size:4096/iterations:200000/threads:4",800000,1703.17,1703.01,ns,,,,,
"os_provider/alloc/max_allocs:10000/pre_allocs:100000/size:4096/iterations:200000/threads:1",200000,191.853,191.848,ns,,,,,
"proxy_pool<os_provider>/alloc/max_allocs:1000/pre_allocs:0/size:4096/iterations:200000/threads:4",800000,4213.55,4207.49,ns,,,,,
"proxy_pool<os_provider>/alloc/max_allocs:1000/pre_allocs:0/size:4096/iterations:200000/threads:1",200000,256.591,256.584,ns,,,,,
"proxy_pool<os_provider>/alloc/max_allocs:1000/pre_allocs:100000/size:4096/iterations:200000/threads:4",800000,3739.13,3729.68,ns,,,,,
"proxy_pool<os_provider>/alloc/max_allocs:1000/pre_allocs:100000/size:4096/iterations:200000/threads:1",200000,286.263,286.256,ns,,,,,
"scalable_pool<os_provider>/alloc/max_allocs:10000/pre_allocs:0/size:4096/iterations:200000/threads:4",800000,280.251,277.496,ns,,,,,
"scalable_pool<os_provider>/alloc/max_allocs:10000/pre_allocs:0/size:4096/iterations:200000/threads:1",200000,210.895,210.896,ns,,,,,
"scalable_pool<os_provider>/alloc/max_allocs:10000/pre_allocs:100000/size:4096/iterations:200000/threads:4",800000,275.913,257.098,ns,,,,,
"scalable_pool<os_provider>/alloc/max_allocs:10000/pre_allocs:100000/size:4096/iterations:200000/threads:1",200000,207.764,207.759,ns,,,,,
"scalable_pool<os_provider>/alloc/max_allocs:10000/pre_allocs:0/min size:8/max size:65536/granularity:8/iterations:200000/threads:4",800000,884.12,872.664,ns,,,,,
"scalable_pool<os_provider>/alloc/max_allocs:10000/pre_allocs:0/min size:8/max size:65536/granularity:8/iterations:200000/threads:1",200000,977.932,977.872,ns,,,,,
"glibc/multiple_malloc_free/max_allocs:10000/size:4096/iterations:2000/threads:4",8000,31343.8,29578.5,ns,,,,,
"glibc/multiple_malloc_free/max_allocs:10000/size:4096/iterations:2000/threads:1",2000,4124.62,4124.47,ns,,,,,
"glibc/multiple_malloc_free/max_allocs:10000/min size:8/max size:65536/granularity:8/iterations:2000/threads:4",8000,138250,88499.3,ns,,,,,
"glibc/multiple_malloc_free/max_allocs:10000/min size:8/max size:65536/granularity:8/iterations:2000/threads:1",2000,30428,30427.7,ns,,,,,
"proxy_pool<os_provider>/multiple_malloc_free/max_allocs:10000/size:4096/iterations:2000/threads:4",8000,1.15827e+06,1.15788e+06,ns,,,,,
"proxy_pool<os_provider>/multiple_malloc_free/max_allocs:10000/size:4096/iterations:2000/threads:1",2000,160262,160260,ns,,,,,
"os_provider/multiple_malloc_free/max_allocs:10000/size:4096/iterations:2000/threads:4",8000,1.16914e+06,1.16709e+06,ns,,,,,
"os_provider/multiple_malloc_free/max_allocs:10000/size:4096/iterations:2000/threads:1",2000,141142,141138,ns,,,,,
"scalable_pool<os_provider>/multiple_malloc_free/max_allocs:10000/size:4096/iterations:2000/threads:4",8000,42963.6,41992,ns,,,,,
"scalable_pool<os_provider>/multiple_malloc_free/max_allocs:10000/size:4096/iterations:2000/threads:1",2000,15346.3,15345.9,ns,,,,,
"scalable_pool<os_provider>/multiple_malloc_free/max_allocs:10000/min size:8/max size:65536/granularity:8/iterations:2000/threads:4",8000,70396.8,70376.8,ns,,,,,
"scalable_pool<os_provider>/multiple_malloc_free/max_allocs:10000/min size:8/max size:65536/granularity:8/iterations:2000/threads:1",2000,25490.7,25490.2,ns,,,,,

alloc/max_allocs:1000/pre_allocs:100000/size:4096/iterations:200000/threads:4 proxy_pool

Environment Variables:

Command:

/home/pmdk/ur-actions-runner/_work/unified-runtime/unified-runtime/umf_build/benchmark/umf-benchmark --benchmark_format=csv

Output:

name,iterations,real_time,cpu_time,time_unit,bytes_per_second,items_per_second,label,error_occurred,error_message
"glibc/alloc/max_allocs:10000/pre_allocs:0/size:4096/iterations:200000/threads:4",800000,2622.28,1880.6,ns,,,,,
"glibc/alloc/max_allocs:10000/pre_allocs:0/size:4096/iterations:200000/threads:1",200000,707.07,707.068,ns,,,,,
"glibc/alloc/max_allocs:10000/pre_allocs:100000/size:4096/iterations:200000/threads:4",800000,1408.08,1347.28,ns,,,,,
"glibc/alloc/max_allocs:10000/pre_allocs:100000/size:4096/iterations:200000/threads:1",200000,745.747,745.713,ns,,,,,
"glibc/alloc/max_allocs:10000/pre_allocs:0/min size:8/max size:65536/granularity:8/iterations:200000/threads:4",800000,833.688,807.323,ns,,,,,
"glibc/alloc/max_allocs:10000/pre_allocs:0/min size:8/max size:65536/granularity:8/iterations:200000/threads:1",200000,167.247,167.242,ns,,,,,
"os_provider/alloc/max_allocs:10000/pre_allocs:0/size:4096/iterations:200000/threads:4",800000,2232.82,2232.55,ns,,,,,
"os_provider/alloc/max_allocs:10000/pre_allocs:0/size:4096/iterations:200000/threads:1",200000,185.935,185.901,ns,,,,,
"os_provider/alloc/max_allocs:10000/pre_allocs:100000/size:4096/iterations:200000/threads:4",800000,2018.54,2017.57,ns,,,,,
"os_provider/alloc/max_allocs:10000/pre_allocs:100000/size:4096/iterations:200000/threads:1",200000,188.994,188.989,ns,,,,,
"proxy_pool<os_provider>/alloc/max_allocs:1000/pre_allocs:0/size:4096/iterations:200000/threads:4",800000,3670.1,3665.46,ns,,,,,
"proxy_pool<os_provider>/alloc/max_allocs:1000/pre_allocs:0/size:4096/iterations:200000/threads:1",200000,265.476,265.469,ns,,,,,
"proxy_pool<os_provider>/alloc/max_allocs:1000/pre_allocs:100000/size:4096/iterations:200000/threads:4",800000,3416.31,3410.38,ns,,,,,
"proxy_pool<os_provider>/alloc/max_allocs:1000/pre_allocs:100000/size:4096/iterations:200000/threads:1",200000,286.903,286.899,ns,,,,,
"scalable_pool<os_provider>/alloc/max_allocs:10000/pre_allocs:0/size:4096/iterations:200000/threads:4",800000,308.735,306.918,ns,,,,,
"scalable_pool<os_provider>/alloc/max_allocs:10000/pre_allocs:0/size:4096/iterations:200000/threads:1",200000,216.685,216.681,ns,,,,,
"scalable_pool<os_provider>/alloc/max_allocs:10000/pre_allocs:100000/size:4096/iterations:200000/threads:4",800000,249.939,248.893,ns,,,,,
"scalable_pool<os_provider>/alloc/max_allocs:10000/pre_allocs:100000/size:4096/iterations:200000/threads:1",200000,206.93,206.924,ns,,,,,
"scalable_pool<os_provider>/alloc/max_allocs:10000/pre_allocs:0/min size:8/max size:65536/granularity:8/iterations:200000/threads:4",800000,1154.97,1126.46,ns,,,,,
"scalable_pool<os_provider>/alloc/max_allocs:10000/pre_allocs:0/min size:8/max size:65536/granularity:8/iterations:200000/threads:1",200000,954.757,954.711,ns,,,,,
"glibc/multiple_malloc_free/max_allocs:10000/size:4096/iterations:2000/threads:4",8000,30034.2,28619.9,ns,,,,,
"glibc/multiple_malloc_free/max_allocs:10000/size:4096/iterations:2000/threads:1",2000,4118.68,4118.55,ns,,,,,
"glibc/multiple_malloc_free/max_allocs:10000/min size:8/max size:65536/granularity:8/iterations:2000/threads:4",8000,138832,87602.1,ns,,,,,
"glibc/multiple_malloc_free/max_allocs:10000/min size:8/max size:65536/granularity:8/iterations:2000/threads:1",2000,31016.1,31015.8,ns,,,,,
"proxy_pool<os_provider>/multiple_malloc_free/max_allocs:10000/size:4096/iterations:2000/threads:4",8000,1.10821e+06,1.10783e+06,ns,,,,,
"proxy_pool<os_provider>/multiple_malloc_free/max_allocs:10000/size:4096/iterations:2000/threads:1",2000,153772,153769,ns,,,,,
"os_provider/multiple_malloc_free/max_allocs:10000/size:4096/iterations:2000/threads:4",8000,1.15353e+06,1.15301e+06,ns,,,,,
"os_provider/multiple_malloc_free/max_allocs:10000/size:4096/iterations:2000/threads:1",2000,139405,139401,ns,,,,,
"scalable_pool<os_provider>/multiple_malloc_free/max_allocs:10000/size:4096/iterations:2000/threads:4",8000,40726.5,40256,ns,,,,,
"scalable_pool<os_provider>/multiple_malloc_free/max_allocs:10000/size:4096/iterations:2000/threads:1",2000,15612.8,15612.4,ns,,,,,
"scalable_pool<os_provider>/multiple_malloc_free/max_allocs:10000/min size:8/max size:65536/granularity:8/iterations:2000/threads:4",8000,72745.9,72725.6,ns,,,,,
"scalable_pool<os_provider>/multiple_malloc_free/max_allocs:10000/min size:8/max size:65536/granularity:8/iterations:2000/threads:1",2000,28207.4,28201.6,ns,,,,,

alloc/max_allocs:1000/pre_allocs:100000/size:4096/iterations:200000/threads:1 proxy_pool

Environment Variables:

Command:

/home/pmdk/ur-actions-runner/_work/unified-runtime/unified-runtime/umf_build/benchmark/umf-benchmark --benchmark_format=csv

Output:

name,iterations,real_time,cpu_time,time_unit,bytes_per_second,items_per_second,label,error_occurred,error_message
"glibc/alloc/max_allocs:10000/pre_allocs:0/size:4096/iterations:200000/threads:4",800000,2378.91,1839.72,ns,,,,,
"glibc/alloc/max_allocs:10000/pre_allocs:0/size:4096/iterations:200000/threads:1",200000,719.927,719.926,ns,,,,,
"glibc/alloc/max_allocs:10000/pre_allocs:100000/size:4096/iterations:200000/threads:4",800000,1248.85,1179.6,ns,,,,,
"glibc/alloc/max_allocs:10000/pre_allocs:100000/size:4096/iterations:200000/threads:1",200000,761.179,761.026,ns,,,,,
"glibc/alloc/max_allocs:10000/pre_allocs:0/min size:8/max size:65536/granularity:8/iterations:200000/threads:4",800000,890.691,804.558,ns,,,,,
"glibc/alloc/max_allocs:10000/pre_allocs:0/min size:8/max size:65536/granularity:8/iterations:200000/threads:1",200000,176.764,176.555,ns,,,,,
"os_provider/alloc/max_allocs:10000/pre_allocs:0/size:4096/iterations:200000/threads:4",800000,2283.78,2283.44,ns,,,,,
"os_provider/alloc/max_allocs:10000/pre_allocs:0/size:4096/iterations:200000/threads:1",200000,188.257,188.251,ns,,,,,
"os_provider/alloc/max_allocs:10000/pre_allocs:100000/size:4096/iterations:200000/threads:4",800000,2322.23,2321.72,ns,,,,,
"os_provider/alloc/max_allocs:10000/pre_allocs:100000/size:4096/iterations:200000/threads:1",200000,193.398,193.393,ns,,,,,
"proxy_pool<os_provider>/alloc/max_allocs:1000/pre_allocs:0/size:4096/iterations:200000/threads:4",800000,4570.14,4562.81,ns,,,,,
"proxy_pool<os_provider>/alloc/max_allocs:1000/pre_allocs:0/size:4096/iterations:200000/threads:1",200000,251.142,251.136,ns,,,,,
"proxy_pool<os_provider>/alloc/max_allocs:1000/pre_allocs:100000/size:4096/iterations:200000/threads:4",800000,2862.3,2856.53,ns,,,,,
"proxy_pool<os_provider>/alloc/max_allocs:1000/pre_allocs:100000/size:4096/iterations:200000/threads:1",200000,286.407,286.4,ns,,,,,
"scalable_pool<os_provider>/alloc/max_allocs:10000/pre_allocs:0/size:4096/iterations:200000/threads:4",800000,280.133,275.732,ns,,,,,
"scalable_pool<os_provider>/alloc/max_allocs:10000/pre_allocs:0/size:4096/iterations:200000/threads:1",200000,211.33,211.326,ns,,,,,
"scalable_pool<os_provider>/alloc/max_allocs:10000/pre_allocs:100000/size:4096/iterations:200000/threads:4",800000,255.465,254.147,ns,,,,,
"scalable_pool<os_provider>/alloc/max_allocs:10000/pre_allocs:100000/size:4096/iterations:200000/threads:1",200000,236.268,236.261,ns,,,,,
"scalable_pool<os_provider>/alloc/max_allocs:10000/pre_allocs:0/min size:8/max size:65536/granularity:8/iterations:200000/threads:4",800000,923.541,913.159,ns,,,,,
"scalable_pool<os_provider>/alloc/max_allocs:10000/pre_allocs:0/min size:8/max size:65536/granularity:8/iterations:200000/threads:1",200000,968.169,968.156,ns,,,,,
"glibc/multiple_malloc_free/max_allocs:10000/size:4096/iterations:2000/threads:4",8000,33763.5,31527,ns,,,,,
"glibc/multiple_malloc_free/max_allocs:10000/size:4096/iterations:2000/threads:1",2000,4248.98,4248.82,ns,,,,,
"glibc/multiple_malloc_free/max_allocs:10000/min size:8/max size:65536/granularity:8/iterations:2000/threads:4",8000,137715,87126.5,ns,,,,,
"glibc/multiple_malloc_free/max_allocs:10000/min size:8/max size:65536/granularity:8/iterations:2000/threads:1",2000,29982.9,29982.7,ns,,,,,
"proxy_pool<os_provider>/multiple_malloc_free/max_allocs:10000/size:4096/iterations:2000/threads:4",8000,1.42468e+06,1.42402e+06,ns,,,,,
"proxy_pool<os_provider>/multiple_malloc_free/max_allocs:10000/size:4096/iterations:2000/threads:1",2000,161234,161229,ns,,,,,
"os_provider/multiple_malloc_free/max_allocs:10000/size:4096/iterations:2000/threads:4",8000,1.42937e+06,1.42864e+06,ns,,,,,
"os_provider/multiple_malloc_free/max_allocs:10000/size:4096/iterations:2000/threads:1",2000,148795,148794,ns,,,,,
"scalable_pool<os_provider>/multiple_malloc_free/max_allocs:10000/size:4096/iterations:2000/threads:4",8000,41387.7,40707,ns,,,,,
"scalable_pool<os_provider>/multiple_malloc_free/max_allocs:10000/size:4096/iterations:2000/threads:1",2000,14614.4,14614.1,ns,,,,,
"scalable_pool<os_provider>/multiple_malloc_free/max_allocs:10000/min size:8/max size:65536/granularity:8/iterations:2000/threads:4",8000,71084.1,70736.5,ns,,,,,
"scalable_pool<os_provider>/multiple_malloc_free/max_allocs:10000/min size:8/max size:65536/granularity:8/iterations:2000/threads:1",2000,25142,25141.5,ns,,,,,

alloc/max_allocs:10000/pre_allocs:0/size:4096/iterations:200000/threads:4 scalable_pool

Environment Variables:

Command:

/home/pmdk/ur-actions-runner/_work/unified-runtime/unified-runtime/umf_build/benchmark/umf-benchmark --benchmark_format=csv

Output:

name,iterations,real_time,cpu_time,time_unit,bytes_per_second,items_per_second,label,error_occurred,error_message
"glibc/alloc/max_allocs:10000/pre_allocs:0/size:4096/iterations:200000/threads:4",800000,2846.81,1908.99,ns,,,,,
"glibc/alloc/max_allocs:10000/pre_allocs:0/size:4096/iterations:200000/threads:1",200000,720.212,720.208,ns,,,,,
"glibc/alloc/max_allocs:10000/pre_allocs:100000/size:4096/iterations:200000/threads:4",800000,1433.85,1341.02,ns,,,,,
"glibc/alloc/max_allocs:10000/pre_allocs:100000/size:4096/iterations:200000/threads:1",200000,752.119,752.117,ns,,,,,
"glibc/alloc/max_allocs:10000/pre_allocs:0/min size:8/max size:65536/granularity:8/iterations:200000/threads:4",800000,877.239,806.134,ns,,,,,
"glibc/alloc/max_allocs:10000/pre_allocs:0/min size:8/max size:65536/granularity:8/iterations:200000/threads:1",200000,178.285,178.282,ns,,,,,
"os_provider/alloc/max_allocs:10000/pre_allocs:0/size:4096/iterations:200000/threads:4",800000,1936.23,1935.28,ns,,,,,
"os_provider/alloc/max_allocs:10000/pre_allocs:0/size:4096/iterations:200000/threads:1",200000,188.744,188.738,ns,,,,,
"os_provider/alloc/max_allocs:10000/pre_allocs:100000/size:4096/iterations:200000/threads:4",800000,1703.17,1703.01,ns,,,,,
"os_provider/alloc/max_allocs:10000/pre_allocs:100000/size:4096/iterations:200000/threads:1",200000,191.853,191.848,ns,,,,,
"proxy_pool<os_provider>/alloc/max_allocs:1000/pre_allocs:0/size:4096/iterations:200000/threads:4",800000,4213.55,4207.49,ns,,,,,
"proxy_pool<os_provider>/alloc/max_allocs:1000/pre_allocs:0/size:4096/iterations:200000/threads:1",200000,256.591,256.584,ns,,,,,
"proxy_pool<os_provider>/alloc/max_allocs:1000/pre_allocs:100000/size:4096/iterations:200000/threads:4",800000,3739.13,3729.68,ns,,,,,
"proxy_pool<os_provider>/alloc/max_allocs:1000/pre_allocs:100000/size:4096/iterations:200000/threads:1",200000,286.263,286.256,ns,,,,,
"scalable_pool<os_provider>/alloc/max_allocs:10000/pre_allocs:0/size:4096/iterations:200000/threads:4",800000,280.251,277.496,ns,,,,,
"scalable_pool<os_provider>/alloc/max_allocs:10000/pre_allocs:0/size:4096/iterations:200000/threads:1",200000,210.895,210.896,ns,,,,,
"scalable_pool<os_provider>/alloc/max_allocs:10000/pre_allocs:100000/size:4096/iterations:200000/threads:4",800000,275.913,257.098,ns,,,,,
"scalable_pool<os_provider>/alloc/max_allocs:10000/pre_allocs:100000/size:4096/iterations:200000/threads:1",200000,207.764,207.759,ns,,,,,
"scalable_pool<os_provider>/alloc/max_allocs:10000/pre_allocs:0/min size:8/max size:65536/granularity:8/iterations:200000/threads:4",800000,884.12,872.664,ns,,,,,
"scalable_pool<os_provider>/alloc/max_allocs:10000/pre_allocs:0/min size:8/max size:65536/granularity:8/iterations:200000/threads:1",200000,977.932,977.872,ns,,,,,
"glibc/multiple_malloc_free/max_allocs:10000/size:4096/iterations:2000/threads:4",8000,31343.8,29578.5,ns,,,,,
"glibc/multiple_malloc_free/max_allocs:10000/size:4096/iterations:2000/threads:1",2000,4124.62,4124.47,ns,,,,,
"glibc/multiple_malloc_free/max_allocs:10000/min size:8/max size:65536/granularity:8/iterations:2000/threads:4",8000,138250,88499.3,ns,,,,,
"glibc/multiple_malloc_free/max_allocs:10000/min size:8/max size:65536/granularity:8/iterations:2000/threads:1",2000,30428,30427.7,ns,,,,,
"proxy_pool<os_provider>/multiple_malloc_free/max_allocs:10000/size:4096/iterations:2000/threads:4",8000,1.15827e+06,1.15788e+06,ns,,,,,
"proxy_pool<os_provider>/multiple_malloc_free/max_allocs:10000/size:4096/iterations:2000/threads:1",2000,160262,160260,ns,,,,,
"os_provider/multiple_malloc_free/max_allocs:10000/size:4096/iterations:2000/threads:4",8000,1.16914e+06,1.16709e+06,ns,,,,,
"os_provider/multiple_malloc_free/max_allocs:10000/size:4096/iterations:2000/threads:1",2000,141142,141138,ns,,,,,
"scalable_pool<os_provider>/multiple_malloc_free/max_allocs:10000/size:4096/iterations:2000/threads:4",8000,42963.6,41992,ns,,,,,
"scalable_pool<os_provider>/multiple_malloc_free/max_allocs:10000/size:4096/iterations:2000/threads:1",2000,15346.3,15345.9,ns,,,,,
"scalable_pool<os_provider>/multiple_malloc_free/max_allocs:10000/min size:8/max size:65536/granularity:8/iterations:2000/threads:4",8000,70396.8,70376.8,ns,,,,,
"scalable_pool<os_provider>/multiple_malloc_free/max_allocs:10000/min size:8/max size:65536/granularity:8/iterations:2000/threads:1",2000,25490.7,25490.2,ns,,,,,

alloc/max_allocs:10000/pre_allocs:0/size:4096/iterations:200000/threads:1 scalable_pool

Environment Variables:

Command:

/home/pmdk/ur-actions-runner/_work/unified-runtime/unified-runtime/umf_build/benchmark/umf-benchmark --benchmark_format=csv

Output:

name,iterations,real_time,cpu_time,time_unit,bytes_per_second,items_per_second,label,error_occurred,error_message
"glibc/alloc/max_allocs:10000/pre_allocs:0/size:4096/iterations:200000/threads:4",800000,2378.91,1839.72,ns,,,,,
"glibc/alloc/max_allocs:10000/pre_allocs:0/size:4096/iterations:200000/threads:1",200000,719.927,719.926,ns,,,,,
"glibc/alloc/max_allocs:10000/pre_allocs:100000/size:4096/iterations:200000/threads:4",800000,1248.85,1179.6,ns,,,,,
"glibc/alloc/max_allocs:10000/pre_allocs:100000/size:4096/iterations:200000/threads:1",200000,761.179,761.026,ns,,,,,
"glibc/alloc/max_allocs:10000/pre_allocs:0/min size:8/max size:65536/granularity:8/iterations:200000/threads:4",800000,890.691,804.558,ns,,,,,
"glibc/alloc/max_allocs:10000/pre_allocs:0/min size:8/max size:65536/granularity:8/iterations:200000/threads:1",200000,176.764,176.555,ns,,,,,
"os_provider/alloc/max_allocs:10000/pre_allocs:0/size:4096/iterations:200000/threads:4",800000,2283.78,2283.44,ns,,,,,
"os_provider/alloc/max_allocs:10000/pre_allocs:0/size:4096/iterations:200000/threads:1",200000,188.257,188.251,ns,,,,,
"os_provider/alloc/max_allocs:10000/pre_allocs:100000/size:4096/iterations:200000/threads:4",800000,2322.23,2321.72,ns,,,,,
"os_provider/alloc/max_allocs:10000/pre_allocs:100000/size:4096/iterations:200000/threads:1",200000,193.398,193.393,ns,,,,,
"proxy_pool<os_provider>/alloc/max_allocs:1000/pre_allocs:0/size:4096/iterations:200000/threads:4",800000,4570.14,4562.81,ns,,,,,
"proxy_pool<os_provider>/alloc/max_allocs:1000/pre_allocs:0/size:4096/iterations:200000/threads:1",200000,251.142,251.136,ns,,,,,
"proxy_pool<os_provider>/alloc/max_allocs:1000/pre_allocs:100000/size:4096/iterations:200000/threads:4",800000,2862.3,2856.53,ns,,,,,
"proxy_pool<os_provider>/alloc/max_allocs:1000/pre_allocs:100000/size:4096/iterations:200000/threads:1",200000,286.407,286.4,ns,,,,,
"scalable_pool<os_provider>/alloc/max_allocs:10000/pre_allocs:0/size:4096/iterations:200000/threads:4",800000,280.133,275.732,ns,,,,,
"scalable_pool<os_provider>/alloc/max_allocs:10000/pre_allocs:0/size:4096/iterations:200000/threads:1",200000,211.33,211.326,ns,,,,,
"scalable_pool<os_provider>/alloc/max_allocs:10000/pre_allocs:100000/size:4096/iterations:200000/threads:4",800000,255.465,254.147,ns,,,,,
"scalable_pool<os_provider>/alloc/max_allocs:10000/pre_allocs:100000/size:4096/iterations:200000/threads:1",200000,236.268,236.261,ns,,,,,
"scalable_pool<os_provider>/alloc/max_allocs:10000/pre_allocs:0/min size:8/max size:65536/granularity:8/iterations:200000/threads:4",800000,923.541,913.159,ns,,,,,
"scalable_pool<os_provider>/alloc/max_allocs:10000/pre_allocs:0/min size:8/max size:65536/granularity:8/iterations:200000/threads:1",200000,968.169,968.156,ns,,,,,
"glibc/multiple_malloc_free/max_allocs:10000/size:4096/iterations:2000/threads:4",8000,33763.5,31527,ns,,,,,
"glibc/multiple_malloc_free/max_allocs:10000/size:4096/iterations:2000/threads:1",2000,4248.98,4248.82,ns,,,,,
"glibc/multiple_malloc_free/max_allocs:10000/min size:8/max size:65536/granularity:8/iterations:2000/threads:4",8000,137715,87126.5,ns,,,,,
"glibc/multiple_malloc_free/max_allocs:10000/min size:8/max size:65536/granularity:8/iterations:2000/threads:1",2000,29982.9,29982.7,ns,,,,,
"proxy_pool<os_provider>/multiple_malloc_free/max_allocs:10000/size:4096/iterations:2000/threads:4",8000,1.42468e+06,1.42402e+06,ns,,,,,
"proxy_pool<os_provider>/multiple_malloc_free/max_allocs:10000/size:4096/iterations:2000/threads:1",2000,161234,161229,ns,,,,,
"os_provider/multiple_malloc_free/max_allocs:10000/size:4096/iterations:2000/threads:4",8000,1.42937e+06,1.42864e+06,ns,,,,,
"os_provider/multiple_malloc_free/max_allocs:10000/size:4096/iterations:2000/threads:1",2000,148795,148794,ns,,,,,
"scalable_pool<os_provider>/multiple_malloc_free/max_allocs:10000/size:4096/iterations:2000/threads:4",8000,41387.7,40707,ns,,,,,
"scalable_pool<os_provider>/multiple_malloc_free/max_allocs:10000/size:4096/iterations:2000/threads:1",2000,14614.4,14614.1,ns,,,,,
"scalable_pool<os_provider>/multiple_malloc_free/max_allocs:10000/min size:8/max size:65536/granularity:8/iterations:2000/threads:4",8000,71084.1,70736.5,ns,,,,,
"scalable_pool<os_provider>/multiple_malloc_free/max_allocs:10000/min size:8/max size:65536/granularity:8/iterations:2000/threads:1",2000,25142,25141.5,ns,,,,,

alloc/max_allocs:10000/pre_allocs:100000/size:4096/iterations:200000/threads:4 scalable_pool

Environment Variables:

Command:

/home/pmdk/ur-actions-runner/_work/unified-runtime/unified-runtime/umf_build/benchmark/umf-benchmark --benchmark_format=csv

Output:

name,iterations,real_time,cpu_time,time_unit,bytes_per_second,items_per_second,label,error_occurred,error_message
"glibc/alloc/max_allocs:10000/pre_allocs:0/size:4096/iterations:200000/threads:4",800000,2378.91,1839.72,ns,,,,,
"glibc/alloc/max_allocs:10000/pre_allocs:0/size:4096/iterations:200000/threads:1",200000,719.927,719.926,ns,,,,,
"glibc/alloc/max_allocs:10000/pre_allocs:100000/size:4096/iterations:200000/threads:4",800000,1248.85,1179.6,ns,,,,,
"glibc/alloc/max_allocs:10000/pre_allocs:100000/size:4096/iterations:200000/threads:1",200000,761.179,761.026,ns,,,,,
"glibc/alloc/max_allocs:10000/pre_allocs:0/min size:8/max size:65536/granularity:8/iterations:200000/threads:4",800000,890.691,804.558,ns,,,,,
"glibc/alloc/max_allocs:10000/pre_allocs:0/min size:8/max size:65536/granularity:8/iterations:200000/threads:1",200000,176.764,176.555,ns,,,,,
"os_provider/alloc/max_allocs:10000/pre_allocs:0/size:4096/iterations:200000/threads:4",800000,2283.78,2283.44,ns,,,,,
"os_provider/alloc/max_allocs:10000/pre_allocs:0/size:4096/iterations:200000/threads:1",200000,188.257,188.251,ns,,,,,
"os_provider/alloc/max_allocs:10000/pre_allocs:100000/size:4096/iterations:200000/threads:4",800000,2322.23,2321.72,ns,,,,,
"os_provider/alloc/max_allocs:10000/pre_allocs:100000/size:4096/iterations:200000/threads:1",200000,193.398,193.393,ns,,,,,
"proxy_pool<os_provider>/alloc/max_allocs:1000/pre_allocs:0/size:4096/iterations:200000/threads:4",800000,4570.14,4562.81,ns,,,,,
"proxy_pool<os_provider>/alloc/max_allocs:1000/pre_allocs:0/size:4096/iterations:200000/threads:1",200000,251.142,251.136,ns,,,,,
"proxy_pool<os_provider>/alloc/max_allocs:1000/pre_allocs:100000/size:4096/iterations:200000/threads:4",800000,2862.3,2856.53,ns,,,,,
"proxy_pool<os_provider>/alloc/max_allocs:1000/pre_allocs:100000/size:4096/iterations:200000/threads:1",200000,286.407,286.4,ns,,,,,
"scalable_pool<os_provider>/alloc/max_allocs:10000/pre_allocs:0/size:4096/iterations:200000/threads:4",800000,280.133,275.732,ns,,,,,
"scalable_pool<os_provider>/alloc/max_allocs:10000/pre_allocs:0/size:4096/iterations:200000/threads:1",200000,211.33,211.326,ns,,,,,
"scalable_pool<os_provider>/alloc/max_allocs:10000/pre_allocs:100000/size:4096/iterations:200000/threads:4",800000,255.465,254.147,ns,,,,,
"scalable_pool<os_provider>/alloc/max_allocs:10000/pre_allocs:100000/size:4096/iterations:200000/threads:1",200000,236.268,236.261,ns,,,,,
"scalable_pool<os_provider>/alloc/max_allocs:10000/pre_allocs:0/min size:8/max size:65536/granularity:8/iterations:200000/threads:4",800000,923.541,913.159,ns,,,,,
"scalable_pool<os_provider>/alloc/max_allocs:10000/pre_allocs:0/min size:8/max size:65536/granularity:8/iterations:200000/threads:1",200000,968.169,968.156,ns,,,,,
"glibc/multiple_malloc_free/max_allocs:10000/size:4096/iterations:2000/threads:4",8000,33763.5,31527,ns,,,,,
"glibc/multiple_malloc_free/max_allocs:10000/size:4096/iterations:2000/threads:1",2000,4248.98,4248.82,ns,,,,,
"glibc/multiple_malloc_free/max_allocs:10000/min size:8/max size:65536/granularity:8/iterations:2000/threads:4",8000,137715,87126.5,ns,,,,,
"glibc/multiple_malloc_free/max_allocs:10000/min size:8/max size:65536/granularity:8/iterations:2000/threads:1",2000,29982.9,29982.7,ns,,,,,
"proxy_pool<os_provider>/multiple_malloc_free/max_allocs:10000/size:4096/iterations:2000/threads:4",8000,1.42468e+06,1.42402e+06,ns,,,,,
"proxy_pool<os_provider>/multiple_malloc_free/max_allocs:10000/size:4096/iterations:2000/threads:1",2000,161234,161229,ns,,,,,
"os_provider/multiple_malloc_free/max_allocs:10000/size:4096/iterations:2000/threads:4",8000,1.42937e+06,1.42864e+06,ns,,,,,
"os_provider/multiple_malloc_free/max_allocs:10000/size:4096/iterations:2000/threads:1",2000,148795,148794,ns,,,,,
"scalable_pool<os_provider>/multiple_malloc_free/max_allocs:10000/size:4096/iterations:2000/threads:4",8000,41387.7,40707,ns,,,,,
"scalable_pool<os_provider>/multiple_malloc_free/max_allocs:10000/size:4096/iterations:2000/threads:1",2000,14614.4,14614.1,ns,,,,,
"scalable_pool<os_provider>/multiple_malloc_free/max_allocs:10000/min size:8/max size:65536/granularity:8/iterations:2000/threads:4",8000,71084.1,70736.5,ns,,,,,
"scalable_pool<os_provider>/multiple_malloc_free/max_allocs:10000/min size:8/max size:65536/granularity:8/iterations:2000/threads:1",2000,25142,25141.5,ns,,,,,

alloc/max_allocs:10000/pre_allocs:100000/size:4096/iterations:200000/threads:1 scalable_pool

Environment Variables:

Command:

/home/pmdk/ur-actions-runner/_work/unified-runtime/unified-runtime/umf_build/benchmark/umf-benchmark --benchmark_format=csv

Output:

name,iterations,real_time,cpu_time,time_unit,bytes_per_second,items_per_second,label,error_occurred,error_message
"glibc/alloc/max_allocs:10000/pre_allocs:0/size:4096/iterations:200000/threads:4",800000,2846.81,1908.99,ns,,,,,
"glibc/alloc/max_allocs:10000/pre_allocs:0/size:4096/iterations:200000/threads:1",200000,720.212,720.208,ns,,,,,
"glibc/alloc/max_allocs:10000/pre_allocs:100000/size:4096/iterations:200000/threads:4",800000,1433.85,1341.02,ns,,,,,
"glibc/alloc/max_allocs:10000/pre_allocs:100000/size:4096/iterations:200000/threads:1",200000,752.119,752.117,ns,,,,,
"glibc/alloc/max_allocs:10000/pre_allocs:0/min size:8/max size:65536/granularity:8/iterations:200000/threads:4",800000,877.239,806.134,ns,,,,,
"glibc/alloc/max_allocs:10000/pre_allocs:0/min size:8/max size:65536/granularity:8/iterations:200000/threads:1",200000,178.285,178.282,ns,,,,,
"os_provider/alloc/max_allocs:10000/pre_allocs:0/size:4096/iterations:200000/threads:4",800000,1936.23,1935.28,ns,,,,,
"os_provider/alloc/max_allocs:10000/pre_allocs:0/size:4096/iterations:200000/threads:1",200000,188.744,188.738,ns,,,,,
"os_provider/alloc/max_allocs:10000/pre_allocs:100000/size:4096/iterations:200000/threads:4",800000,1703.17,1703.01,ns,,,,,
"os_provider/alloc/max_allocs:10000/pre_allocs:100000/size:4096/iterations:200000/threads:1",200000,191.853,191.848,ns,,,,,
"proxy_pool<os_provider>/alloc/max_allocs:1000/pre_allocs:0/size:4096/iterations:200000/threads:4",800000,4213.55,4207.49,ns,,,,,
"proxy_pool<os_provider>/alloc/max_allocs:1000/pre_allocs:0/size:4096/iterations:200000/threads:1",200000,256.591,256.584,ns,,,,,
"proxy_pool<os_provider>/alloc/max_allocs:1000/pre_allocs:100000/size:4096/iterations:200000/threads:4",800000,3739.13,3729.68,ns,,,,,
"proxy_pool<os_provider>/alloc/max_allocs:1000/pre_allocs:100000/size:4096/iterations:200000/threads:1",200000,286.263,286.256,ns,,,,,
"scalable_pool<os_provider>/alloc/max_allocs:10000/pre_allocs:0/size:4096/iterations:200000/threads:4",800000,280.251,277.496,ns,,,,,
"scalable_pool<os_provider>/alloc/max_allocs:10000/pre_allocs:0/size:4096/iterations:200000/threads:1",200000,210.895,210.896,ns,,,,,
"scalable_pool<os_provider>/alloc/max_allocs:10000/pre_allocs:100000/size:4096/iterations:200000/threads:4",800000,275.913,257.098,ns,,,,,
"scalable_pool<os_provider>/alloc/max_allocs:10000/pre_allocs:100000/size:4096/iterations:200000/threads:1",200000,207.764,207.759,ns,,,,,
"scalable_pool<os_provider>/alloc/max_allocs:10000/pre_allocs:0/min size:8/max size:65536/granularity:8/iterations:200000/threads:4",800000,884.12,872.664,ns,,,,,
"scalable_pool<os_provider>/alloc/max_allocs:10000/pre_allocs:0/min size:8/max size:65536/granularity:8/iterations:200000/threads:1",200000,977.932,977.872,ns,,,,,
"glibc/multiple_malloc_free/max_allocs:10000/size:4096/iterations:2000/threads:4",8000,31343.8,29578.5,ns,,,,,
"glibc/multiple_malloc_free/max_allocs:10000/size:4096/iterations:2000/threads:1",2000,4124.62,4124.47,ns,,,,,
"glibc/multiple_malloc_free/max_allocs:10000/min size:8/max size:65536/granularity:8/iterations:2000/threads:4",8000,138250,88499.3,ns,,,,,
"glibc/multiple_malloc_free/max_allocs:10000/min size:8/max size:65536/granularity:8/iterations:2000/threads:1",2000,30428,30427.7,ns,,,,,
"proxy_pool<os_provider>/multiple_malloc_free/max_allocs:10000/size:4096/iterations:2000/threads:4",8000,1.15827e+06,1.15788e+06,ns,,,,,
"proxy_pool<os_provider>/multiple_malloc_free/max_allocs:10000/size:4096/iterations:2000/threads:1",2000,160262,160260,ns,,,,,
"os_provider/multiple_malloc_free/max_allocs:10000/size:4096/iterations:2000/threads:4",8000,1.16914e+06,1.16709e+06,ns,,,,,
"os_provider/multiple_malloc_free/max_allocs:10000/size:4096/iterations:2000/threads:1",2000,141142,141138,ns,,,,,
"scalable_pool<os_provider>/multiple_malloc_free/max_allocs:10000/size:4096/iterations:2000/threads:4",8000,42963.6,41992,ns,,,,,
"scalable_pool<os_provider>/multiple_malloc_free/max_allocs:10000/size:4096/iterations:2000/threads:1",2000,15346.3,15345.9,ns,,,,,
"scalable_pool<os_provider>/multiple_malloc_free/max_allocs:10000/min size:8/max size:65536/granularity:8/iterations:2000/threads:4",8000,70396.8,70376.8,ns,,,,,
"scalable_pool<os_provider>/multiple_malloc_free/max_allocs:10000/min size:8/max size:65536/granularity:8/iterations:2000/threads:1",2000,25490.7,25490.2,ns,,,,,

alloc/max_allocs:10000/pre_allocs:0/min size:8/max size:65536/granularity:8/iterations:200000/threads:4 scalable_pool

Environment Variables:

Command:

/home/pmdk/ur-actions-runner/_work/unified-runtime/unified-runtime/umf_build/benchmark/umf-benchmark --benchmark_format=csv

Output:

name,iterations,real_time,cpu_time,time_unit,bytes_per_second,items_per_second,label,error_occurred,error_message
"glibc/alloc/max_allocs:10000/pre_allocs:0/size:4096/iterations:200000/threads:4",800000,2378.91,1839.72,ns,,,,,
"glibc/alloc/max_allocs:10000/pre_allocs:0/size:4096/iterations:200000/threads:1",200000,719.927,719.926,ns,,,,,
"glibc/alloc/max_allocs:10000/pre_allocs:100000/size:4096/iterations:200000/threads:4",800000,1248.85,1179.6,ns,,,,,
"glibc/alloc/max_allocs:10000/pre_allocs:100000/size:4096/iterations:200000/threads:1",200000,761.179,761.026,ns,,,,,
"glibc/alloc/max_allocs:10000/pre_allocs:0/min size:8/max size:65536/granularity:8/iterations:200000/threads:4",800000,890.691,804.558,ns,,,,,
"glibc/alloc/max_allocs:10000/pre_allocs:0/min size:8/max size:65536/granularity:8/iterations:200000/threads:1",200000,176.764,176.555,ns,,,,,
"os_provider/alloc/max_allocs:10000/pre_allocs:0/size:4096/iterations:200000/threads:4",800000,2283.78,2283.44,ns,,,,,
"os_provider/alloc/max_allocs:10000/pre_allocs:0/size:4096/iterations:200000/threads:1",200000,188.257,188.251,ns,,,,,
"os_provider/alloc/max_allocs:10000/pre_allocs:100000/size:4096/iterations:200000/threads:4",800000,2322.23,2321.72,ns,,,,,
"os_provider/alloc/max_allocs:10000/pre_allocs:100000/size:4096/iterations:200000/threads:1",200000,193.398,193.393,ns,,,,,
"proxy_pool<os_provider>/alloc/max_allocs:1000/pre_allocs:0/size:4096/iterations:200000/threads:4",800000,4570.14,4562.81,ns,,,,,
"proxy_pool<os_provider>/alloc/max_allocs:1000/pre_allocs:0/size:4096/iterations:200000/threads:1",200000,251.142,251.136,ns,,,,,
"proxy_pool<os_provider>/alloc/max_allocs:1000/pre_allocs:100000/size:4096/iterations:200000/threads:4",800000,2862.3,2856.53,ns,,,,,
"proxy_pool<os_provider>/alloc/max_allocs:1000/pre_allocs:100000/size:4096/iterations:200000/threads:1",200000,286.407,286.4,ns,,,,,
"scalable_pool<os_provider>/alloc/max_allocs:10000/pre_allocs:0/size:4096/iterations:200000/threads:4",800000,280.133,275.732,ns,,,,,
"scalable_pool<os_provider>/alloc/max_allocs:10000/pre_allocs:0/size:4096/iterations:200000/threads:1",200000,211.33,211.326,ns,,,,,
"scalable_pool<os_provider>/alloc/max_allocs:10000/pre_allocs:100000/size:4096/iterations:200000/threads:4",800000,255.465,254.147,ns,,,,,
"scalable_pool<os_provider>/alloc/max_allocs:10000/pre_allocs:100000/size:4096/iterations:200000/threads:1",200000,236.268,236.261,ns,,,,,
"scalable_pool<os_provider>/alloc/max_allocs:10000/pre_allocs:0/min size:8/max size:65536/granularity:8/iterations:200000/threads:4",800000,923.541,913.159,ns,,,,,
"scalable_pool<os_provider>/alloc/max_allocs:10000/pre_allocs:0/min size:8/max size:65536/granularity:8/iterations:200000/threads:1",200000,968.169,968.156,ns,,,,,
"glibc/multiple_malloc_free/max_allocs:10000/size:4096/iterations:2000/threads:4",8000,33763.5,31527,ns,,,,,
"glibc/multiple_malloc_free/max_allocs:10000/size:4096/iterations:2000/threads:1",2000,4248.98,4248.82,ns,,,,,
"glibc/multiple_malloc_free/max_allocs:10000/min size:8/max size:65536/granularity:8/iterations:2000/threads:4",8000,137715,87126.5,ns,,,,,
"glibc/multiple_malloc_free/max_allocs:10000/min size:8/max size:65536/granularity:8/iterations:2000/threads:1",2000,29982.9,29982.7,ns,,,,,
"proxy_pool<os_provider>/multiple_malloc_free/max_allocs:10000/size:4096/iterations:2000/threads:4",8000,1.42468e+06,1.42402e+06,ns,,,,,
"proxy_pool<os_provider>/multiple_malloc_free/max_allocs:10000/size:4096/iterations:2000/threads:1",2000,161234,161229,ns,,,,,
"os_provider/multiple_malloc_free/max_allocs:10000/size:4096/iterations:2000/threads:4",8000,1.42937e+06,1.42864e+06,ns,,,,,
"os_provider/multiple_malloc_free/max_allocs:10000/size:4096/iterations:2000/threads:1",2000,148795,148794,ns,,,,,
"scalable_pool<os_provider>/multiple_malloc_free/max_allocs:10000/size:4096/iterations:2000/threads:4",8000,41387.7,40707,ns,,,,,
"scalable_pool<os_provider>/multiple_malloc_free/max_allocs:10000/size:4096/iterations:2000/threads:1",2000,14614.4,14614.1,ns,,,,,
"scalable_pool<os_provider>/multiple_malloc_free/max_allocs:10000/min size:8/max size:65536/granularity:8/iterations:2000/threads:4",8000,71084.1,70736.5,ns,,,,,
"scalable_pool<os_provider>/multiple_malloc_free/max_allocs:10000/min size:8/max size:65536/granularity:8/iterations:2000/threads:1",2000,25142,25141.5,ns,,,,,

alloc/max_allocs:10000/pre_allocs:0/min size:8/max size:65536/granularity:8/iterations:200000/threads:1 scalable_pool

Environment Variables:

Command:

/home/pmdk/ur-actions-runner/_work/unified-runtime/unified-runtime/umf_build/benchmark/umf-benchmark --benchmark_format=csv

Output:

name,iterations,real_time,cpu_time,time_unit,bytes_per_second,items_per_second,label,error_occurred,error_message
"glibc/alloc/max_allocs:10000/pre_allocs:0/size:4096/iterations:200000/threads:4",800000,2378.91,1839.72,ns,,,,,
"glibc/alloc/max_allocs:10000/pre_allocs:0/size:4096/iterations:200000/threads:1",200000,719.927,719.926,ns,,,,,
"glibc/alloc/max_allocs:10000/pre_allocs:100000/size:4096/iterations:200000/threads:4",800000,1248.85,1179.6,ns,,,,,
"glibc/alloc/max_allocs:10000/pre_allocs:100000/size:4096/iterations:200000/threads:1",200000,761.179,761.026,ns,,,,,
"glibc/alloc/max_allocs:10000/pre_allocs:0/min size:8/max size:65536/granularity:8/iterations:200000/threads:4",800000,890.691,804.558,ns,,,,,
"glibc/alloc/max_allocs:10000/pre_allocs:0/min size:8/max size:65536/granularity:8/iterations:200000/threads:1",200000,176.764,176.555,ns,,,,,
"os_provider/alloc/max_allocs:10000/pre_allocs:0/size:4096/iterations:200000/threads:4",800000,2283.78,2283.44,ns,,,,,
"os_provider/alloc/max_allocs:10000/pre_allocs:0/size:4096/iterations:200000/threads:1",200000,188.257,188.251,ns,,,,,
"os_provider/alloc/max_allocs:10000/pre_allocs:100000/size:4096/iterations:200000/threads:4",800000,2322.23,2321.72,ns,,,,,
"os_provider/alloc/max_allocs:10000/pre_allocs:100000/size:4096/iterations:200000/threads:1",200000,193.398,193.393,ns,,,,,
"proxy_pool<os_provider>/alloc/max_allocs:1000/pre_allocs:0/size:4096/iterations:200000/threads:4",800000,4570.14,4562.81,ns,,,,,
"proxy_pool<os_provider>/alloc/max_allocs:1000/pre_allocs:0/size:4096/iterations:200000/threads:1",200000,251.142,251.136,ns,,,,,
"proxy_pool<os_provider>/alloc/max_allocs:1000/pre_allocs:100000/size:4096/iterations:200000/threads:4",800000,2862.3,2856.53,ns,,,,,
"proxy_pool<os_provider>/alloc/max_allocs:1000/pre_allocs:100000/size:4096/iterations:200000/threads:1",200000,286.407,286.4,ns,,,,,
"scalable_pool<os_provider>/alloc/max_allocs:10000/pre_allocs:0/size:4096/iterations:200000/threads:4",800000,280.133,275.732,ns,,,,,
"scalable_pool<os_provider>/alloc/max_allocs:10000/pre_allocs:0/size:4096/iterations:200000/threads:1",200000,211.33,211.326,ns,,,,,
"scalable_pool<os_provider>/alloc/max_allocs:10000/pre_allocs:100000/size:4096/iterations:200000/threads:4",800000,255.465,254.147,ns,,,,,
"scalable_pool<os_provider>/alloc/max_allocs:10000/pre_allocs:100000/size:4096/iterations:200000/threads:1",200000,236.268,236.261,ns,,,,,
"scalable_pool<os_provider>/alloc/max_allocs:10000/pre_allocs:0/min size:8/max size:65536/granularity:8/iterations:200000/threads:4",800000,923.541,913.159,ns,,,,,
"scalable_pool<os_provider>/alloc/max_allocs:10000/pre_allocs:0/min size:8/max size:65536/granularity:8/iterations:200000/threads:1",200000,968.169,968.156,ns,,,,,
"glibc/multiple_malloc_free/max_allocs:10000/size:4096/iterations:2000/threads:4",8000,33763.5,31527,ns,,,,,
"glibc/multiple_malloc_free/max_allocs:10000/size:4096/iterations:2000/threads:1",2000,4248.98,4248.82,ns,,,,,
"glibc/multiple_malloc_free/max_allocs:10000/min size:8/max size:65536/granularity:8/iterations:2000/threads:4",8000,137715,87126.5,ns,,,,,
"glibc/multiple_malloc_free/max_allocs:10000/min size:8/max size:65536/granularity:8/iterations:2000/threads:1",2000,29982.9,29982.7,ns,,,,,
"proxy_pool<os_provider>/multiple_malloc_free/max_allocs:10000/size:4096/iterations:2000/threads:4",8000,1.42468e+06,1.42402e+06,ns,,,,,
"proxy_pool<os_provider>/multiple_malloc_free/max_allocs:10000/size:4096/iterations:2000/threads:1",2000,161234,161229,ns,,,,,
"os_provider/multiple_malloc_free/max_allocs:10000/size:4096/iterations:2000/threads:4",8000,1.42937e+06,1.42864e+06,ns,,,,,
"os_provider/multiple_malloc_free/max_allocs:10000/size:4096/iterations:2000/threads:1",2000,148795,148794,ns,,,,,
"scalable_pool<os_provider>/multiple_malloc_free/max_allocs:10000/size:4096/iterations:2000/threads:4",8000,41387.7,40707,ns,,,,,
"scalable_pool<os_provider>/multiple_malloc_free/max_allocs:10000/size:4096/iterations:2000/threads:1",2000,14614.4,14614.1,ns,,,,,
"scalable_pool<os_provider>/multiple_malloc_free/max_allocs:10000/min size:8/max size:65536/granularity:8/iterations:2000/threads:4",8000,71084.1,70736.5,ns,,,,,
"scalable_pool<os_provider>/multiple_malloc_free/max_allocs:10000/min size:8/max size:65536/granularity:8/iterations:2000/threads:1",2000,25142,25141.5,ns,,,,,

multiple_malloc_free/max_allocs:10000/size:4096/iterations:2000/threads:4 glibc

Environment Variables:

Command:

/home/pmdk/ur-actions-runner/_work/unified-runtime/unified-runtime/umf_build/benchmark/umf-benchmark --benchmark_format=csv

Output:

name,iterations,real_time,cpu_time,time_unit,bytes_per_second,items_per_second,label,error_occurred,error_message
"glibc/alloc/max_allocs:10000/pre_allocs:0/size:4096/iterations:200000/threads:4",800000,2846.81,1908.99,ns,,,,,
"glibc/alloc/max_allocs:10000/pre_allocs:0/size:4096/iterations:200000/threads:1",200000,720.212,720.208,ns,,,,,
"glibc/alloc/max_allocs:10000/pre_allocs:100000/size:4096/iterations:200000/threads:4",800000,1433.85,1341.02,ns,,,,,
"glibc/alloc/max_allocs:10000/pre_allocs:100000/size:4096/iterations:200000/threads:1",200000,752.119,752.117,ns,,,,,
"glibc/alloc/max_allocs:10000/pre_allocs:0/min size:8/max size:65536/granularity:8/iterations:200000/threads:4",800000,877.239,806.134,ns,,,,,
"glibc/alloc/max_allocs:10000/pre_allocs:0/min size:8/max size:65536/granularity:8/iterations:200000/threads:1",200000,178.285,178.282,ns,,,,,
"os_provider/alloc/max_allocs:10000/pre_allocs:0/size:4096/iterations:200000/threads:4",800000,1936.23,1935.28,ns,,,,,
"os_provider/alloc/max_allocs:10000/pre_allocs:0/size:4096/iterations:200000/threads:1",200000,188.744,188.738,ns,,,,,
"os_provider/alloc/max_allocs:10000/pre_allocs:100000/size:4096/iterations:200000/threads:4",800000,1703.17,1703.01,ns,,,,,
"os_provider/alloc/max_allocs:10000/pre_allocs:100000/size:4096/iterations:200000/threads:1",200000,191.853,191.848,ns,,,,,
"proxy_pool<os_provider>/alloc/max_allocs:1000/pre_allocs:0/size:4096/iterations:200000/threads:4",800000,4213.55,4207.49,ns,,,,,
"proxy_pool<os_provider>/alloc/max_allocs:1000/pre_allocs:0/size:4096/iterations:200000/threads:1",200000,256.591,256.584,ns,,,,,
"proxy_pool<os_provider>/alloc/max_allocs:1000/pre_allocs:100000/size:4096/iterations:200000/threads:4",800000,3739.13,3729.68,ns,,,,,
"proxy_pool<os_provider>/alloc/max_allocs:1000/pre_allocs:100000/size:4096/iterations:200000/threads:1",200000,286.263,286.256,ns,,,,,
"scalable_pool<os_provider>/alloc/max_allocs:10000/pre_allocs:0/size:4096/iterations:200000/threads:4",800000,280.251,277.496,ns,,,,,
"scalable_pool<os_provider>/alloc/max_allocs:10000/pre_allocs:0/size:4096/iterations:200000/threads:1",200000,210.895,210.896,ns,,,,,
"scalable_pool<os_provider>/alloc/max_allocs:10000/pre_allocs:100000/size:4096/iterations:200000/threads:4",800000,275.913,257.098,ns,,,,,
"scalable_pool<os_provider>/alloc/max_allocs:10000/pre_allocs:100000/size:4096/iterations:200000/threads:1",200000,207.764,207.759,ns,,,,,
"scalable_pool<os_provider>/alloc/max_allocs:10000/pre_allocs:0/min size:8/max size:65536/granularity:8/iterations:200000/threads:4",800000,884.12,872.664,ns,,,,,
"scalable_pool<os_provider>/alloc/max_allocs:10000/pre_allocs:0/min size:8/max size:65536/granularity:8/iterations:200000/threads:1",200000,977.932,977.872,ns,,,,,
"glibc/multiple_malloc_free/max_allocs:10000/size:4096/iterations:2000/threads:4",8000,31343.8,29578.5,ns,,,,,
"glibc/multiple_malloc_free/max_allocs:10000/size:4096/iterations:2000/threads:1",2000,4124.62,4124.47,ns,,,,,
"glibc/multiple_malloc_free/max_allocs:10000/min size:8/max size:65536/granularity:8/iterations:2000/threads:4",8000,138250,88499.3,ns,,,,,
"glibc/multiple_malloc_free/max_allocs:10000/min size:8/max size:65536/granularity:8/iterations:2000/threads:1",2000,30428,30427.7,ns,,,,,
"proxy_pool<os_provider>/multiple_malloc_free/max_allocs:10000/size:4096/iterations:2000/threads:4",8000,1.15827e+06,1.15788e+06,ns,,,,,
"proxy_pool<os_provider>/multiple_malloc_free/max_allocs:10000/size:4096/iterations:2000/threads:1",2000,160262,160260,ns,,,,,
"os_provider/multiple_malloc_free/max_allocs:10000/size:4096/iterations:2000/threads:4",8000,1.16914e+06,1.16709e+06,ns,,,,,
"os_provider/multiple_malloc_free/max_allocs:10000/size:4096/iterations:2000/threads:1",2000,141142,141138,ns,,,,,
"scalable_pool<os_provider>/multiple_malloc_free/max_allocs:10000/size:4096/iterations:2000/threads:4",8000,42963.6,41992,ns,,,,,
"scalable_pool<os_provider>/multiple_malloc_free/max_allocs:10000/size:4096/iterations:2000/threads:1",2000,15346.3,15345.9,ns,,,,,
"scalable_pool<os_provider>/multiple_malloc_free/max_allocs:10000/min size:8/max size:65536/granularity:8/iterations:2000/threads:4",8000,70396.8,70376.8,ns,,,,,
"scalable_pool<os_provider>/multiple_malloc_free/max_allocs:10000/min size:8/max size:65536/granularity:8/iterations:2000/threads:1",2000,25490.7,25490.2,ns,,,,,

multiple_malloc_free/max_allocs:10000/size:4096/iterations:2000/threads:1 glibc

Environment Variables:

Command:

/home/pmdk/ur-actions-runner/_work/unified-runtime/unified-runtime/umf_build/benchmark/umf-benchmark --benchmark_format=csv

Output:

name,iterations,real_time,cpu_time,time_unit,bytes_per_second,items_per_second,label,error_occurred,error_message
"glibc/alloc/max_allocs:10000/pre_allocs:0/size:4096/iterations:200000/threads:4",800000,2846.81,1908.99,ns,,,,,
"glibc/alloc/max_allocs:10000/pre_allocs:0/size:4096/iterations:200000/threads:1",200000,720.212,720.208,ns,,,,,
"glibc/alloc/max_allocs:10000/pre_allocs:100000/size:4096/iterations:200000/threads:4",800000,1433.85,1341.02,ns,,,,,
"glibc/alloc/max_allocs:10000/pre_allocs:100000/size:4096/iterations:200000/threads:1",200000,752.119,752.117,ns,,,,,
"glibc/alloc/max_allocs:10000/pre_allocs:0/min size:8/max size:65536/granularity:8/iterations:200000/threads:4",800000,877.239,806.134,ns,,,,,
"glibc/alloc/max_allocs:10000/pre_allocs:0/min size:8/max size:65536/granularity:8/iterations:200000/threads:1",200000,178.285,178.282,ns,,,,,
"os_provider/alloc/max_allocs:10000/pre_allocs:0/size:4096/iterations:200000/threads:4",800000,1936.23,1935.28,ns,,,,,
"os_provider/alloc/max_allocs:10000/pre_allocs:0/size:4096/iterations:200000/threads:1",200000,188.744,188.738,ns,,,,,
"os_provider/alloc/max_allocs:10000/pre_allocs:100000/size:4096/iterations:200000/threads:4",800000,1703.17,1703.01,ns,,,,,
"os_provider/alloc/max_allocs:10000/pre_allocs:100000/size:4096/iterations:200000/threads:1",200000,191.853,191.848,ns,,,,,
"proxy_pool<os_provider>/alloc/max_allocs:1000/pre_allocs:0/size:4096/iterations:200000/threads:4",800000,4213.55,4207.49,ns,,,,,
"proxy_pool<os_provider>/alloc/max_allocs:1000/pre_allocs:0/size:4096/iterations:200000/threads:1",200000,256.591,256.584,ns,,,,,
"proxy_pool<os_provider>/alloc/max_allocs:1000/pre_allocs:100000/size:4096/iterations:200000/threads:4",800000,3739.13,3729.68,ns,,,,,
"proxy_pool<os_provider>/alloc/max_allocs:1000/pre_allocs:100000/size:4096/iterations:200000/threads:1",200000,286.263,286.256,ns,,,,,
"scalable_pool<os_provider>/alloc/max_allocs:10000/pre_allocs:0/size:4096/iterations:200000/threads:4",800000,280.251,277.496,ns,,,,,
"scalable_pool<os_provider>/alloc/max_allocs:10000/pre_allocs:0/size:4096/iterations:200000/threads:1",200000,210.895,210.896,ns,,,,,
"scalable_pool<os_provider>/alloc/max_allocs:10000/pre_allocs:100000/size:4096/iterations:200000/threads:4",800000,275.913,257.098,ns,,,,,
"scalable_pool<os_provider>/alloc/max_allocs:10000/pre_allocs:100000/size:4096/iterations:200000/threads:1",200000,207.764,207.759,ns,,,,,
"scalable_pool<os_provider>/alloc/max_allocs:10000/pre_allocs:0/min size:8/max size:65536/granularity:8/iterations:200000/threads:4",800000,884.12,872.664,ns,,,,,
"scalable_pool<os_provider>/alloc/max_allocs:10000/pre_allocs:0/min size:8/max size:65536/granularity:8/iterations:200000/threads:1",200000,977.932,977.872,ns,,,,,
"glibc/multiple_malloc_free/max_allocs:10000/size:4096/iterations:2000/threads:4",8000,31343.8,29578.5,ns,,,,,
"glibc/multiple_malloc_free/max_allocs:10000/size:4096/iterations:2000/threads:1",2000,4124.62,4124.47,ns,,,,,
"glibc/multiple_malloc_free/max_allocs:10000/min size:8/max size:65536/granularity:8/iterations:2000/threads:4",8000,138250,88499.3,ns,,,,,
"glibc/multiple_malloc_free/max_allocs:10000/min size:8/max size:65536/granularity:8/iterations:2000/threads:1",2000,30428,30427.7,ns,,,,,
"proxy_pool<os_provider>/multiple_malloc_free/max_allocs:10000/size:4096/iterations:2000/threads:4",8000,1.15827e+06,1.15788e+06,ns,,,,,
"proxy_pool<os_provider>/multiple_malloc_free/max_allocs:10000/size:4096/iterations:2000/threads:1",2000,160262,160260,ns,,,,,
"os_provider/multiple_malloc_free/max_allocs:10000/size:4096/iterations:2000/threads:4",8000,1.16914e+06,1.16709e+06,ns,,,,,
"os_provider/multiple_malloc_free/max_allocs:10000/size:4096/iterations:2000/threads:1",2000,141142,141138,ns,,,,,
"scalable_pool<os_provider>/multiple_malloc_free/max_allocs:10000/size:4096/iterations:2000/threads:4",8000,42963.6,41992,ns,,,,,
"scalable_pool<os_provider>/multiple_malloc_free/max_allocs:10000/size:4096/iterations:2000/threads:1",2000,15346.3,15345.9,ns,,,,,
"scalable_pool<os_provider>/multiple_malloc_free/max_allocs:10000/min size:8/max size:65536/granularity:8/iterations:2000/threads:4",8000,70396.8,70376.8,ns,,,,,
"scalable_pool<os_provider>/multiple_malloc_free/max_allocs:10000/min size:8/max size:65536/granularity:8/iterations:2000/threads:1",2000,25490.7,25490.2,ns,,,,,

multiple_malloc_free/max_allocs:10000/min size:8/max size:65536/granularity:8/iterations:2000/threads:4 glibc

Environment Variables:

Command:

/home/pmdk/ur-actions-runner/_work/unified-runtime/unified-runtime/umf_build/benchmark/umf-benchmark --benchmark_format=csv

Output:

name,iterations,real_time,cpu_time,time_unit,bytes_per_second,items_per_second,label,error_occurred,error_message
"glibc/alloc/max_allocs:10000/pre_allocs:0/size:4096/iterations:200000/threads:4",800000,2846.81,1908.99,ns,,,,,
"glibc/alloc/max_allocs:10000/pre_allocs:0/size:4096/iterations:200000/threads:1",200000,720.212,720.208,ns,,,,,
"glibc/alloc/max_allocs:10000/pre_allocs:100000/size:4096/iterations:200000/threads:4",800000,1433.85,1341.02,ns,,,,,
"glibc/alloc/max_allocs:10000/pre_allocs:100000/size:4096/iterations:200000/threads:1",200000,752.119,752.117,ns,,,,,
"glibc/alloc/max_allocs:10000/pre_allocs:0/min size:8/max size:65536/granularity:8/iterations:200000/threads:4",800000,877.239,806.134,ns,,,,,
"glibc/alloc/max_allocs:10000/pre_allocs:0/min size:8/max size:65536/granularity:8/iterations:200000/threads:1",200000,178.285,178.282,ns,,,,,
"os_provider/alloc/max_allocs:10000/pre_allocs:0/size:4096/iterations:200000/threads:4",800000,1936.23,1935.28,ns,,,,,
"os_provider/alloc/max_allocs:10000/pre_allocs:0/size:4096/iterations:200000/threads:1",200000,188.744,188.738,ns,,,,,
"os_provider/alloc/max_allocs:10000/pre_allocs:100000/size:4096/iterations:200000/threads:4",800000,1703.17,1703.01,ns,,,,,
"os_provider/alloc/max_allocs:10000/pre_allocs:100000/size:4096/iterations:200000/threads:1",200000,191.853,191.848,ns,,,,,
"proxy_pool<os_provider>/alloc/max_allocs:1000/pre_allocs:0/size:4096/iterations:200000/threads:4",800000,4213.55,4207.49,ns,,,,,
"proxy_pool<os_provider>/alloc/max_allocs:1000/pre_allocs:0/size:4096/iterations:200000/threads:1",200000,256.591,256.584,ns,,,,,
"proxy_pool<os_provider>/alloc/max_allocs:1000/pre_allocs:100000/size:4096/iterations:200000/threads:4",800000,3739.13,3729.68,ns,,,,,
"proxy_pool<os_provider>/alloc/max_allocs:1000/pre_allocs:100000/size:4096/iterations:200000/threads:1",200000,286.263,286.256,ns,,,,,
"scalable_pool<os_provider>/alloc/max_allocs:10000/pre_allocs:0/size:4096/iterations:200000/threads:4",800000,280.251,277.496,ns,,,,,
"scalable_pool<os_provider>/alloc/max_allocs:10000/pre_allocs:0/size:4096/iterations:200000/threads:1",200000,210.895,210.896,ns,,,,,
"scalable_pool<os_provider>/alloc/max_allocs:10000/pre_allocs:100000/size:4096/iterations:200000/threads:4",800000,275.913,257.098,ns,,,,,
"scalable_pool<os_provider>/alloc/max_allocs:10000/pre_allocs:100000/size:4096/iterations:200000/threads:1",200000,207.764,207.759,ns,,,,,
"scalable_pool<os_provider>/alloc/max_allocs:10000/pre_allocs:0/min size:8/max size:65536/granularity:8/iterations:200000/threads:4",800000,884.12,872.664,ns,,,,,
"scalable_pool<os_provider>/alloc/max_allocs:10000/pre_allocs:0/min size:8/max size:65536/granularity:8/iterations:200000/threads:1",200000,977.932,977.872,ns,,,,,
"glibc/multiple_malloc_free/max_allocs:10000/size:4096/iterations:2000/threads:4",8000,31343.8,29578.5,ns,,,,,
"glibc/multiple_malloc_free/max_allocs:10000/size:4096/iterations:2000/threads:1",2000,4124.62,4124.47,ns,,,,,
"glibc/multiple_malloc_free/max_allocs:10000/min size:8/max size:65536/granularity:8/iterations:2000/threads:4",8000,138250,88499.3,ns,,,,,
"glibc/multiple_malloc_free/max_allocs:10000/min size:8/max size:65536/granularity:8/iterations:2000/threads:1",2000,30428,30427.7,ns,,,,,
"proxy_pool<os_provider>/multiple_malloc_free/max_allocs:10000/size:4096/iterations:2000/threads:4",8000,1.15827e+06,1.15788e+06,ns,,,,,
"proxy_pool<os_provider>/multiple_malloc_free/max_allocs:10000/size:4096/iterations:2000/threads:1",2000,160262,160260,ns,,,,,
"os_provider/multiple_malloc_free/max_allocs:10000/size:4096/iterations:2000/threads:4",8000,1.16914e+06,1.16709e+06,ns,,,,,
"os_provider/multiple_malloc_free/max_allocs:10000/size:4096/iterations:2000/threads:1",2000,141142,141138,ns,,,,,
"scalable_pool<os_provider>/multiple_malloc_free/max_allocs:10000/size:4096/iterations:2000/threads:4",8000,42963.6,41992,ns,,,,,
"scalable_pool<os_provider>/multiple_malloc_free/max_allocs:10000/size:4096/iterations:2000/threads:1",2000,15346.3,15345.9,ns,,,,,
"scalable_pool<os_provider>/multiple_malloc_free/max_allocs:10000/min size:8/max size:65536/granularity:8/iterations:2000/threads:4",8000,70396.8,70376.8,ns,,,,,
"scalable_pool<os_provider>/multiple_malloc_free/max_allocs:10000/min size:8/max size:65536/granularity:8/iterations:2000/threads:1",2000,25490.7,25490.2,ns,,,,,

multiple_malloc_free/max_allocs:10000/min size:8/max size:65536/granularity:8/iterations:2000/threads:1 glibc

Environment Variables:

Command:

/home/pmdk/ur-actions-runner/_work/unified-runtime/unified-runtime/umf_build/benchmark/umf-benchmark --benchmark_format=csv

Output:

name,iterations,real_time,cpu_time,time_unit,bytes_per_second,items_per_second,label,error_occurred,error_message
"glibc/alloc/max_allocs:10000/pre_allocs:0/size:4096/iterations:200000/threads:4",800000,2846.81,1908.99,ns,,,,,
"glibc/alloc/max_allocs:10000/pre_allocs:0/size:4096/iterations:200000/threads:1",200000,720.212,720.208,ns,,,,,
"glibc/alloc/max_allocs:10000/pre_allocs:100000/size:4096/iterations:200000/threads:4",800000,1433.85,1341.02,ns,,,,,
"glibc/alloc/max_allocs:10000/pre_allocs:100000/size:4096/iterations:200000/threads:1",200000,752.119,752.117,ns,,,,,
"glibc/alloc/max_allocs:10000/pre_allocs:0/min size:8/max size:65536/granularity:8/iterations:200000/threads:4",800000,877.239,806.134,ns,,,,,
"glibc/alloc/max_allocs:10000/pre_allocs:0/min size:8/max size:65536/granularity:8/iterations:200000/threads:1",200000,178.285,178.282,ns,,,,,
"os_provider/alloc/max_allocs:10000/pre_allocs:0/size:4096/iterations:200000/threads:4",800000,1936.23,1935.28,ns,,,,,
"os_provider/alloc/max_allocs:10000/pre_allocs:0/size:4096/iterations:200000/threads:1",200000,188.744,188.738,ns,,,,,
"os_provider/alloc/max_allocs:10000/pre_allocs:100000/size:4096/iterations:200000/threads:4",800000,1703.17,1703.01,ns,,,,,
"os_provider/alloc/max_allocs:10000/pre_allocs:100000/size:4096/iterations:200000/threads:1",200000,191.853,191.848,ns,,,,,
"proxy_pool<os_provider>/alloc/max_allocs:1000/pre_allocs:0/size:4096/iterations:200000/threads:4",800000,4213.55,4207.49,ns,,,,,
"proxy_pool<os_provider>/alloc/max_allocs:1000/pre_allocs:0/size:4096/iterations:200000/threads:1",200000,256.591,256.584,ns,,,,,
"proxy_pool<os_provider>/alloc/max_allocs:1000/pre_allocs:100000/size:4096/iterations:200000/threads:4",800000,3739.13,3729.68,ns,,,,,
"proxy_pool<os_provider>/alloc/max_allocs:1000/pre_allocs:100000/size:4096/iterations:200000/threads:1",200000,286.263,286.256,ns,,,,,
"scalable_pool<os_provider>/alloc/max_allocs:10000/pre_allocs:0/size:4096/iterations:200000/threads:4",800000,280.251,277.496,ns,,,,,
"scalable_pool<os_provider>/alloc/max_allocs:10000/pre_allocs:0/size:4096/iterations:200000/threads:1",200000,210.895,210.896,ns,,,,,
"scalable_pool<os_provider>/alloc/max_allocs:10000/pre_allocs:100000/size:4096/iterations:200000/threads:4",800000,275.913,257.098,ns,,,,,
"scalable_pool<os_provider>/alloc/max_allocs:10000/pre_allocs:100000/size:4096/iterations:200000/threads:1",200000,207.764,207.759,ns,,,,,
"scalable_pool<os_provider>/alloc/max_allocs:10000/pre_allocs:0/min size:8/max size:65536/granularity:8/iterations:200000/threads:4",800000,884.12,872.664,ns,,,,,
"scalable_pool<os_provider>/alloc/max_allocs:10000/pre_allocs:0/min size:8/max size:65536/granularity:8/iterations:200000/threads:1",200000,977.932,977.872,ns,,,,,
"glibc/multiple_malloc_free/max_allocs:10000/size:4096/iterations:2000/threads:4",8000,31343.8,29578.5,ns,,,,,
"glibc/multiple_malloc_free/max_allocs:10000/size:4096/iterations:2000/threads:1",2000,4124.62,4124.47,ns,,,,,
"glibc/multiple_malloc_free/max_allocs:10000/min size:8/max size:65536/granularity:8/iterations:2000/threads:4",8000,138250,88499.3,ns,,,,,
"glibc/multiple_malloc_free/max_allocs:10000/min size:8/max size:65536/granularity:8/iterations:2000/threads:1",2000,30428,30427.7,ns,,,,,
"proxy_pool<os_provider>/multiple_malloc_free/max_allocs:10000/size:4096/iterations:2000/threads:4",8000,1.15827e+06,1.15788e+06,ns,,,,,
"proxy_pool<os_provider>/multiple_malloc_free/max_allocs:10000/size:4096/iterations:2000/threads:1",2000,160262,160260,ns,,,,,
"os_provider/multiple_malloc_free/max_allocs:10000/size:4096/iterations:2000/threads:4",8000,1.16914e+06,1.16709e+06,ns,,,,,
"os_provider/multiple_malloc_free/max_allocs:10000/size:4096/iterations:2000/threads:1",2000,141142,141138,ns,,,,,
"scalable_pool<os_provider>/multiple_malloc_free/max_allocs:10000/size:4096/iterations:2000/threads:4",8000,42963.6,41992,ns,,,,,
"scalable_pool<os_provider>/multiple_malloc_free/max_allocs:10000/size:4096/iterations:2000/threads:1",2000,15346.3,15345.9,ns,,,,,
"scalable_pool<os_provider>/multiple_malloc_free/max_allocs:10000/min size:8/max size:65536/granularity:8/iterations:2000/threads:4",8000,70396.8,70376.8,ns,,,,,
"scalable_pool<os_provider>/multiple_malloc_free/max_allocs:10000/min size:8/max size:65536/granularity:8/iterations:2000/threads:1",2000,25490.7,25490.2,ns,,,,,

multiple_malloc_free/max_allocs:10000/size:4096/iterations:2000/threads:4 proxy_pool

Environment Variables:

Command:

/home/pmdk/ur-actions-runner/_work/unified-runtime/unified-runtime/umf_build/benchmark/umf-benchmark --benchmark_format=csv

Output:

name,iterations,real_time,cpu_time,time_unit,bytes_per_second,items_per_second,label,error_occurred,error_message
"glibc/alloc/max_allocs:10000/pre_allocs:0/size:4096/iterations:200000/threads:4",800000,2846.81,1908.99,ns,,,,,
"glibc/alloc/max_allocs:10000/pre_allocs:0/size:4096/iterations:200000/threads:1",200000,720.212,720.208,ns,,,,,
"glibc/alloc/max_allocs:10000/pre_allocs:100000/size:4096/iterations:200000/threads:4",800000,1433.85,1341.02,ns,,,,,
"glibc/alloc/max_allocs:10000/pre_allocs:100000/size:4096/iterations:200000/threads:1",200000,752.119,752.117,ns,,,,,
"glibc/alloc/max_allocs:10000/pre_allocs:0/min size:8/max size:65536/granularity:8/iterations:200000/threads:4",800000,877.239,806.134,ns,,,,,
"glibc/alloc/max_allocs:10000/pre_allocs:0/min size:8/max size:65536/granularity:8/iterations:200000/threads:1",200000,178.285,178.282,ns,,,,,
"os_provider/alloc/max_allocs:10000/pre_allocs:0/size:4096/iterations:200000/threads:4",800000,1936.23,1935.28,ns,,,,,
"os_provider/alloc/max_allocs:10000/pre_allocs:0/size:4096/iterations:200000/threads:1",200000,188.744,188.738,ns,,,,,
"os_provider/alloc/max_allocs:10000/pre_allocs:100000/size:4096/iterations:200000/threads:4",800000,1703.17,1703.01,ns,,,,,
"os_provider/alloc/max_allocs:10000/pre_allocs:100000/size:4096/iterations:200000/threads:1",200000,191.853,191.848,ns,,,,,
"proxy_pool<os_provider>/alloc/max_allocs:1000/pre_allocs:0/size:4096/iterations:200000/threads:4",800000,4213.55,4207.49,ns,,,,,
"proxy_pool<os_provider>/alloc/max_allocs:1000/pre_allocs:0/size:4096/iterations:200000/threads:1",200000,256.591,256.584,ns,,,,,
"proxy_pool<os_provider>/alloc/max_allocs:1000/pre_allocs:100000/size:4096/iterations:200000/threads:4",800000,3739.13,3729.68,ns,,,,,
"proxy_pool<os_provider>/alloc/max_allocs:1000/pre_allocs:100000/size:4096/iterations:200000/threads:1",200000,286.263,286.256,ns,,,,,
"scalable_pool<os_provider>/alloc/max_allocs:10000/pre_allocs:0/size:4096/iterations:200000/threads:4",800000,280.251,277.496,ns,,,,,
"scalable_pool<os_provider>/alloc/max_allocs:10000/pre_allocs:0/size:4096/iterations:200000/threads:1",200000,210.895,210.896,ns,,,,,
"scalable_pool<os_provider>/alloc/max_allocs:10000/pre_allocs:100000/size:4096/iterations:200000/threads:4",800000,275.913,257.098,ns,,,,,
"scalable_pool<os_provider>/alloc/max_allocs:10000/pre_allocs:100000/size:4096/iterations:200000/threads:1",200000,207.764,207.759,ns,,,,,
"scalable_pool<os_provider>/alloc/max_allocs:10000/pre_allocs:0/min size:8/max size:65536/granularity:8/iterations:200000/threads:4",800000,884.12,872.664,ns,,,,,
"scalable_pool<os_provider>/alloc/max_allocs:10000/pre_allocs:0/min size:8/max size:65536/granularity:8/iterations:200000/threads:1",200000,977.932,977.872,ns,,,,,
"glibc/multiple_malloc_free/max_allocs:10000/size:4096/iterations:2000/threads:4",8000,31343.8,29578.5,ns,,,,,
"glibc/multiple_malloc_free/max_allocs:10000/size:4096/iterations:2000/threads:1",2000,4124.62,4124.47,ns,,,,,
"glibc/multiple_malloc_free/max_allocs:10000/min size:8/max size:65536/granularity:8/iterations:2000/threads:4",8000,138250,88499.3,ns,,,,,
"glibc/multiple_malloc_free/max_allocs:10000/min size:8/max size:65536/granularity:8/iterations:2000/threads:1",2000,30428,30427.7,ns,,,,,
"proxy_pool<os_provider>/multiple_malloc_free/max_allocs:10000/size:4096/iterations:2000/threads:4",8000,1.15827e+06,1.15788e+06,ns,,,,,
"proxy_pool<os_provider>/multiple_malloc_free/max_allocs:10000/size:4096/iterations:2000/threads:1",2000,160262,160260,ns,,,,,
"os_provider/multiple_malloc_free/max_allocs:10000/size:4096/iterations:2000/threads:4",8000,1.16914e+06,1.16709e+06,ns,,,,,
"os_provider/multiple_malloc_free/max_allocs:10000/size:4096/iterations:2000/threads:1",2000,141142,141138,ns,,,,,
"scalable_pool<os_provider>/multiple_malloc_free/max_allocs:10000/size:4096/iterations:2000/threads:4",8000,42963.6,41992,ns,,,,,
"scalable_pool<os_provider>/multiple_malloc_free/max_allocs:10000/size:4096/iterations:2000/threads:1",2000,15346.3,15345.9,ns,,,,,
"scalable_pool<os_provider>/multiple_malloc_free/max_allocs:10000/min size:8/max size:65536/granularity:8/iterations:2000/threads:4",8000,70396.8,70376.8,ns,,,,,
"scalable_pool<os_provider>/multiple_malloc_free/max_allocs:10000/min size:8/max size:65536/granularity:8/iterations:2000/threads:1",2000,25490.7,25490.2,ns,,,,,

multiple_malloc_free/max_allocs:10000/size:4096/iterations:2000/threads:1 proxy_pool

Environment Variables:

Command:

/home/pmdk/ur-actions-runner/_work/unified-runtime/unified-runtime/umf_build/benchmark/umf-benchmark --benchmark_format=csv

Output:

name,iterations,real_time,cpu_time,time_unit,bytes_per_second,items_per_second,label,error_occurred,error_message
"glibc/alloc/max_allocs:10000/pre_allocs:0/size:4096/iterations:200000/threads:4",800000,2846.81,1908.99,ns,,,,,
"glibc/alloc/max_allocs:10000/pre_allocs:0/size:4096/iterations:200000/threads:1",200000,720.212,720.208,ns,,,,,
"glibc/alloc/max_allocs:10000/pre_allocs:100000/size:4096/iterations:200000/threads:4",800000,1433.85,1341.02,ns,,,,,
"glibc/alloc/max_allocs:10000/pre_allocs:100000/size:4096/iterations:200000/threads:1",200000,752.119,752.117,ns,,,,,
"glibc/alloc/max_allocs:10000/pre_allocs:0/min size:8/max size:65536/granularity:8/iterations:200000/threads:4",800000,877.239,806.134,ns,,,,,
"glibc/alloc/max_allocs:10000/pre_allocs:0/min size:8/max size:65536/granularity:8/iterations:200000/threads:1",200000,178.285,178.282,ns,,,,,
"os_provider/alloc/max_allocs:10000/pre_allocs:0/size:4096/iterations:200000/threads:4",800000,1936.23,1935.28,ns,,,,,
"os_provider/alloc/max_allocs:10000/pre_allocs:0/size:4096/iterations:200000/threads:1",200000,188.744,188.738,ns,,,,,
"os_provider/alloc/max_allocs:10000/pre_allocs:100000/size:4096/iterations:200000/threads:4",800000,1703.17,1703.01,ns,,,,,
"os_provider/alloc/max_allocs:10000/pre_allocs:100000/size:4096/iterations:200000/threads:1",200000,191.853,191.848,ns,,,,,
"proxy_pool<os_provider>/alloc/max_allocs:1000/pre_allocs:0/size:4096/iterations:200000/threads:4",800000,4213.55,4207.49,ns,,,,,
"proxy_pool<os_provider>/alloc/max_allocs:1000/pre_allocs:0/size:4096/iterations:200000/threads:1",200000,256.591,256.584,ns,,,,,
"proxy_pool<os_provider>/alloc/max_allocs:1000/pre_allocs:100000/size:4096/iterations:200000/threads:4",800000,3739.13,3729.68,ns,,,,,
"proxy_pool<os_provider>/alloc/max_allocs:1000/pre_allocs:100000/size:4096/iterations:200000/threads:1",200000,286.263,286.256,ns,,,,,
"scalable_pool<os_provider>/alloc/max_allocs:10000/pre_allocs:0/size:4096/iterations:200000/threads:4",800000,280.251,277.496,ns,,,,,
"scalable_pool<os_provider>/alloc/max_allocs:10000/pre_allocs:0/size:4096/iterations:200000/threads:1",200000,210.895,210.896,ns,,,,,
"scalable_pool<os_provider>/alloc/max_allocs:10000/pre_allocs:100000/size:4096/iterations:200000/threads:4",800000,275.913,257.098,ns,,,,,
"scalable_pool<os_provider>/alloc/max_allocs:10000/pre_allocs:100000/size:4096/iterations:200000/threads:1",200000,207.764,207.759,ns,,,,,
"scalable_pool<os_provider>/alloc/max_allocs:10000/pre_allocs:0/min size:8/max size:65536/granularity:8/iterations:200000/threads:4",800000,884.12,872.664,ns,,,,,
"scalable_pool<os_provider>/alloc/max_allocs:10000/pre_allocs:0/min size:8/max size:65536/granularity:8/iterations:200000/threads:1",200000,977.932,977.872,ns,,,,,
"glibc/multiple_malloc_free/max_allocs:10000/size:4096/iterations:2000/threads:4",8000,31343.8,29578.5,ns,,,,,
"glibc/multiple_malloc_free/max_allocs:10000/size:4096/iterations:2000/threads:1",2000,4124.62,4124.47,ns,,,,,
"glibc/multiple_malloc_free/max_allocs:10000/min size:8/max size:65536/granularity:8/iterations:2000/threads:4",8000,138250,88499.3,ns,,,,,
"glibc/multiple_malloc_free/max_allocs:10000/min size:8/max size:65536/granularity:8/iterations:2000/threads:1",2000,30428,30427.7,ns,,,,,
"proxy_pool<os_provider>/multiple_malloc_free/max_allocs:10000/size:4096/iterations:2000/threads:4",8000,1.15827e+06,1.15788e+06,ns,,,,,
"proxy_pool<os_provider>/multiple_malloc_free/max_allocs:10000/size:4096/iterations:2000/threads:1",2000,160262,160260,ns,,,,,
"os_provider/multiple_malloc_free/max_allocs:10000/size:4096/iterations:2000/threads:4",8000,1.16914e+06,1.16709e+06,ns,,,,,
"os_provider/multiple_malloc_free/max_allocs:10000/size:4096/iterations:2000/threads:1",2000,141142,141138,ns,,,,,
"scalable_pool<os_provider>/multiple_malloc_free/max_allocs:10000/size:4096/iterations:2000/threads:4",8000,42963.6,41992,ns,,,,,
"scalable_pool<os_provider>/multiple_malloc_free/max_allocs:10000/size:4096/iterations:2000/threads:1",2000,15346.3,15345.9,ns,,,,,
"scalable_pool<os_provider>/multiple_malloc_free/max_allocs:10000/min size:8/max size:65536/granularity:8/iterations:2000/threads:4",8000,70396.8,70376.8,ns,,,,,
"scalable_pool<os_provider>/multiple_malloc_free/max_allocs:10000/min size:8/max size:65536/granularity:8/iterations:2000/threads:1",2000,25490.7,25490.2,ns,,,,,

multiple_malloc_free/max_allocs:10000/size:4096/iterations:2000/threads:4 os_provider

Environment Variables:

Command:

/home/pmdk/ur-actions-runner/_work/unified-runtime/unified-runtime/umf_build/benchmark/umf-benchmark --benchmark_format=csv

Output:

name,iterations,real_time,cpu_time,time_unit,bytes_per_second,items_per_second,label,error_occurred,error_message
"glibc/alloc/max_allocs:10000/pre_allocs:0/size:4096/iterations:200000/threads:4",800000,2846.81,1908.99,ns,,,,,
"glibc/alloc/max_allocs:10000/pre_allocs:0/size:4096/iterations:200000/threads:1",200000,720.212,720.208,ns,,,,,
"glibc/alloc/max_allocs:10000/pre_allocs:100000/size:4096/iterations:200000/threads:4",800000,1433.85,1341.02,ns,,,,,
"glibc/alloc/max_allocs:10000/pre_allocs:100000/size:4096/iterations:200000/threads:1",200000,752.119,752.117,ns,,,,,
"glibc/alloc/max_allocs:10000/pre_allocs:0/min size:8/max size:65536/granularity:8/iterations:200000/threads:4",800000,877.239,806.134,ns,,,,,
"glibc/alloc/max_allocs:10000/pre_allocs:0/min size:8/max size:65536/granularity:8/iterations:200000/threads:1",200000,178.285,178.282,ns,,,,,
"os_provider/alloc/max_allocs:10000/pre_allocs:0/size:4096/iterations:200000/threads:4",800000,1936.23,1935.28,ns,,,,,
"os_provider/alloc/max_allocs:10000/pre_allocs:0/size:4096/iterations:200000/threads:1",200000,188.744,188.738,ns,,,,,
"os_provider/alloc/max_allocs:10000/pre_allocs:100000/size:4096/iterations:200000/threads:4",800000,1703.17,1703.01,ns,,,,,
"os_provider/alloc/max_allocs:10000/pre_allocs:100000/size:4096/iterations:200000/threads:1",200000,191.853,191.848,ns,,,,,
"proxy_pool<os_provider>/alloc/max_allocs:1000/pre_allocs:0/size:4096/iterations:200000/threads:4",800000,4213.55,4207.49,ns,,,,,
"proxy_pool<os_provider>/alloc/max_allocs:1000/pre_allocs:0/size:4096/iterations:200000/threads:1",200000,256.591,256.584,ns,,,,,
"proxy_pool<os_provider>/alloc/max_allocs:1000/pre_allocs:100000/size:4096/iterations:200000/threads:4",800000,3739.13,3729.68,ns,,,,,
"proxy_pool<os_provider>/alloc/max_allocs:1000/pre_allocs:100000/size:4096/iterations:200000/threads:1",200000,286.263,286.256,ns,,,,,
"scalable_pool<os_provider>/alloc/max_allocs:10000/pre_allocs:0/size:4096/iterations:200000/threads:4",800000,280.251,277.496,ns,,,,,
"scalable_pool<os_provider>/alloc/max_allocs:10000/pre_allocs:0/size:4096/iterations:200000/threads:1",200000,210.895,210.896,ns,,,,,
"scalable_pool<os_provider>/alloc/max_allocs:10000/pre_allocs:100000/size:4096/iterations:200000/threads:4",800000,275.913,257.098,ns,,,,,
"scalable_pool<os_provider>/alloc/max_allocs:10000/pre_allocs:100000/size:4096/iterations:200000/threads:1",200000,207.764,207.759,ns,,,,,
"scalable_pool<os_provider>/alloc/max_allocs:10000/pre_allocs:0/min size:8/max size:65536/granularity:8/iterations:200000/threads:4",800000,884.12,872.664,ns,,,,,
"scalable_pool<os_provider>/alloc/max_allocs:10000/pre_allocs:0/min size:8/max size:65536/granularity:8/iterations:200000/threads:1",200000,977.932,977.872,ns,,,,,
"glibc/multiple_malloc_free/max_allocs:10000/size:4096/iterations:2000/threads:4",8000,31343.8,29578.5,ns,,,,,
"glibc/multiple_malloc_free/max_allocs:10000/size:4096/iterations:2000/threads:1",2000,4124.62,4124.47,ns,,,,,
"glibc/multiple_malloc_free/max_allocs:10000/min size:8/max size:65536/granularity:8/iterations:2000/threads:4",8000,138250,88499.3,ns,,,,,
"glibc/multiple_malloc_free/max_allocs:10000/min size:8/max size:65536/granularity:8/iterations:2000/threads:1",2000,30428,30427.7,ns,,,,,
"proxy_pool<os_provider>/multiple_malloc_free/max_allocs:10000/size:4096/iterations:2000/threads:4",8000,1.15827e+06,1.15788e+06,ns,,,,,
"proxy_pool<os_provider>/multiple_malloc_free/max_allocs:10000/size:4096/iterations:2000/threads:1",2000,160262,160260,ns,,,,,
"os_provider/multiple_malloc_free/max_allocs:10000/size:4096/iterations:2000/threads:4",8000,1.16914e+06,1.16709e+06,ns,,,,,
"os_provider/multiple_malloc_free/max_allocs:10000/size:4096/iterations:2000/threads:1",2000,141142,141138,ns,,,,,
"scalable_pool<os_provider>/multiple_malloc_free/max_allocs:10000/size:4096/iterations:2000/threads:4",8000,42963.6,41992,ns,,,,,
"scalable_pool<os_provider>/multiple_malloc_free/max_allocs:10000/size:4096/iterations:2000/threads:1",2000,15346.3,15345.9,ns,,,,,
"scalable_pool<os_provider>/multiple_malloc_free/max_allocs:10000/min size:8/max size:65536/granularity:8/iterations:2000/threads:4",8000,70396.8,70376.8,ns,,,,,
"scalable_pool<os_provider>/multiple_malloc_free/max_allocs:10000/min size:8/max size:65536/granularity:8/iterations:2000/threads:1",2000,25490.7,25490.2,ns,,,,,

multiple_malloc_free/max_allocs:10000/size:4096/iterations:2000/threads:1 os_provider

Environment Variables:

Command:

/home/pmdk/ur-actions-runner/_work/unified-runtime/unified-runtime/umf_build/benchmark/umf-benchmark --benchmark_format=csv

Output:

name,iterations,real_time,cpu_time,time_unit,bytes_per_second,items_per_second,label,error_occurred,error_message
"glibc/alloc/max_allocs:10000/pre_allocs:0/size:4096/iterations:200000/threads:4",800000,2846.81,1908.99,ns,,,,,
"glibc/alloc/max_allocs:10000/pre_allocs:0/size:4096/iterations:200000/threads:1",200000,720.212,720.208,ns,,,,,
"glibc/alloc/max_allocs:10000/pre_allocs:100000/size:4096/iterations:200000/threads:4",800000,1433.85,1341.02,ns,,,,,
"glibc/alloc/max_allocs:10000/pre_allocs:100000/size:4096/iterations:200000/threads:1",200000,752.119,752.117,ns,,,,,
"glibc/alloc/max_allocs:10000/pre_allocs:0/min size:8/max size:65536/granularity:8/iterations:200000/threads:4",800000,877.239,806.134,ns,,,,,
"glibc/alloc/max_allocs:10000/pre_allocs:0/min size:8/max size:65536/granularity:8/iterations:200000/threads:1",200000,178.285,178.282,ns,,,,,
"os_provider/alloc/max_allocs:10000/pre_allocs:0/size:4096/iterations:200000/threads:4",800000,1936.23,1935.28,ns,,,,,
"os_provider/alloc/max_allocs:10000/pre_allocs:0/size:4096/iterations:200000/threads:1",200000,188.744,188.738,ns,,,,,
"os_provider/alloc/max_allocs:10000/pre_allocs:100000/size:4096/iterations:200000/threads:4",800000,1703.17,1703.01,ns,,,,,
"os_provider/alloc/max_allocs:10000/pre_allocs:100000/size:4096/iterations:200000/threads:1",200000,191.853,191.848,ns,,,,,
"proxy_pool<os_provider>/alloc/max_allocs:1000/pre_allocs:0/size:4096/iterations:200000/threads:4",800000,4213.55,4207.49,ns,,,,,
"proxy_pool<os_provider>/alloc/max_allocs:1000/pre_allocs:0/size:4096/iterations:200000/threads:1",200000,256.591,256.584,ns,,,,,
"proxy_pool<os_provider>/alloc/max_allocs:1000/pre_allocs:100000/size:4096/iterations:200000/threads:4",800000,3739.13,3729.68,ns,,,,,
"proxy_pool<os_provider>/alloc/max_allocs:1000/pre_allocs:100000/size:4096/iterations:200000/threads:1",200000,286.263,286.256,ns,,,,,
"scalable_pool<os_provider>/alloc/max_allocs:10000/pre_allocs:0/size:4096/iterations:200000/threads:4",800000,280.251,277.496,ns,,,,,
"scalable_pool<os_provider>/alloc/max_allocs:10000/pre_allocs:0/size:4096/iterations:200000/threads:1",200000,210.895,210.896,ns,,,,,
"scalable_pool<os_provider>/alloc/max_allocs:10000/pre_allocs:100000/size:4096/iterations:200000/threads:4",800000,275.913,257.098,ns,,,,,
"scalable_pool<os_provider>/alloc/max_allocs:10000/pre_allocs:100000/size:4096/iterations:200000/threads:1",200000,207.764,207.759,ns,,,,,
"scalable_pool<os_provider>/alloc/max_allocs:10000/pre_allocs:0/min size:8/max size:65536/granularity:8/iterations:200000/threads:4",800000,884.12,872.664,ns,,,,,
"scalable_pool<os_provider>/alloc/max_allocs:10000/pre_allocs:0/min size:8/max size:65536/granularity:8/iterations:200000/threads:1",200000,977.932,977.872,ns,,,,,
"glibc/multiple_malloc_free/max_allocs:10000/size:4096/iterations:2000/threads:4",8000,31343.8,29578.5,ns,,,,,
"glibc/multiple_malloc_free/max_allocs:10000/size:4096/iterations:2000/threads:1",2000,4124.62,4124.47,ns,,,,,
"glibc/multiple_malloc_free/max_allocs:10000/min size:8/max size:65536/granularity:8/iterations:2000/threads:4",8000,138250,88499.3,ns,,,,,
"glibc/multiple_malloc_free/max_allocs:10000/min size:8/max size:65536/granularity:8/iterations:2000/threads:1",2000,30428,30427.7,ns,,,,,
"proxy_pool<os_provider>/multiple_malloc_free/max_allocs:10000/size:4096/iterations:2000/threads:4",8000,1.15827e+06,1.15788e+06,ns,,,,,
"proxy_pool<os_provider>/multiple_malloc_free/max_allocs:10000/size:4096/iterations:2000/threads:1",2000,160262,160260,ns,,,,,
"os_provider/multiple_malloc_free/max_allocs:10000/size:4096/iterations:2000/threads:4",8000,1.16914e+06,1.16709e+06,ns,,,,,
"os_provider/multiple_malloc_free/max_allocs:10000/size:4096/iterations:2000/threads:1",2000,141142,141138,ns,,,,,
"scalable_pool<os_provider>/multiple_malloc_free/max_allocs:10000/size:4096/iterations:2000/threads:4",8000,42963.6,41992,ns,,,,,
"scalable_pool<os_provider>/multiple_malloc_free/max_allocs:10000/size:4096/iterations:2000/threads:1",2000,15346.3,15345.9,ns,,,,,
"scalable_pool<os_provider>/multiple_malloc_free/max_allocs:10000/min size:8/max size:65536/granularity:8/iterations:2000/threads:4",8000,70396.8,70376.8,ns,,,,,
"scalable_pool<os_provider>/multiple_malloc_free/max_allocs:10000/min size:8/max size:65536/granularity:8/iterations:2000/threads:1",2000,25490.7,25490.2,ns,,,,,

multiple_malloc_free/max_allocs:10000/size:4096/iterations:2000/threads:4 scalable_pool

Environment Variables:

Command:

/home/pmdk/ur-actions-runner/_work/unified-runtime/unified-runtime/umf_build/benchmark/umf-benchmark --benchmark_format=csv

Output:

name,iterations,real_time,cpu_time,time_unit,bytes_per_second,items_per_second,label,error_occurred,error_message
"glibc/alloc/max_allocs:10000/pre_allocs:0/size:4096/iterations:200000/threads:4",800000,2378.91,1839.72,ns,,,,,
"glibc/alloc/max_allocs:10000/pre_allocs:0/size:4096/iterations:200000/threads:1",200000,719.927,719.926,ns,,,,,
"glibc/alloc/max_allocs:10000/pre_allocs:100000/size:4096/iterations:200000/threads:4",800000,1248.85,1179.6,ns,,,,,
"glibc/alloc/max_allocs:10000/pre_allocs:100000/size:4096/iterations:200000/threads:1",200000,761.179,761.026,ns,,,,,
"glibc/alloc/max_allocs:10000/pre_allocs:0/min size:8/max size:65536/granularity:8/iterations:200000/threads:4",800000,890.691,804.558,ns,,,,,
"glibc/alloc/max_allocs:10000/pre_allocs:0/min size:8/max size:65536/granularity:8/iterations:200000/threads:1",200000,176.764,176.555,ns,,,,,
"os_provider/alloc/max_allocs:10000/pre_allocs:0/size:4096/iterations:200000/threads:4",800000,2283.78,2283.44,ns,,,,,
"os_provider/alloc/max_allocs:10000/pre_allocs:0/size:4096/iterations:200000/threads:1",200000,188.257,188.251,ns,,,,,
"os_provider/alloc/max_allocs:10000/pre_allocs:100000/size:4096/iterations:200000/threads:4",800000,2322.23,2321.72,ns,,,,,
"os_provider/alloc/max_allocs:10000/pre_allocs:100000/size:4096/iterations:200000/threads:1",200000,193.398,193.393,ns,,,,,
"proxy_pool<os_provider>/alloc/max_allocs:1000/pre_allocs:0/size:4096/iterations:200000/threads:4",800000,4570.14,4562.81,ns,,,,,
"proxy_pool<os_provider>/alloc/max_allocs:1000/pre_allocs:0/size:4096/iterations:200000/threads:1",200000,251.142,251.136,ns,,,,,
"proxy_pool<os_provider>/alloc/max_allocs:1000/pre_allocs:100000/size:4096/iterations:200000/threads:4",800000,2862.3,2856.53,ns,,,,,
"proxy_pool<os_provider>/alloc/max_allocs:1000/pre_allocs:100000/size:4096/iterations:200000/threads:1",200000,286.407,286.4,ns,,,,,
"scalable_pool<os_provider>/alloc/max_allocs:10000/pre_allocs:0/size:4096/iterations:200000/threads:4",800000,280.133,275.732,ns,,,,,
"scalable_pool<os_provider>/alloc/max_allocs:10000/pre_allocs:0/size:4096/iterations:200000/threads:1",200000,211.33,211.326,ns,,,,,
"scalable_pool<os_provider>/alloc/max_allocs:10000/pre_allocs:100000/size:4096/iterations:200000/threads:4",800000,255.465,254.147,ns,,,,,
"scalable_pool<os_provider>/alloc/max_allocs:10000/pre_allocs:100000/size:4096/iterations:200000/threads:1",200000,236.268,236.261,ns,,,,,
"scalable_pool<os_provider>/alloc/max_allocs:10000/pre_allocs:0/min size:8/max size:65536/granularity:8/iterations:200000/threads:4",800000,923.541,913.159,ns,,,,,
"scalable_pool<os_provider>/alloc/max_allocs:10000/pre_allocs:0/min size:8/max size:65536/granularity:8/iterations:200000/threads:1",200000,968.169,968.156,ns,,,,,
"glibc/multiple_malloc_free/max_allocs:10000/size:4096/iterations:2000/threads:4",8000,33763.5,31527,ns,,,,,
"glibc/multiple_malloc_free/max_allocs:10000/size:4096/iterations:2000/threads:1",2000,4248.98,4248.82,ns,,,,,
"glibc/multiple_malloc_free/max_allocs:10000/min size:8/max size:65536/granularity:8/iterations:2000/threads:4",8000,137715,87126.5,ns,,,,,
"glibc/multiple_malloc_free/max_allocs:10000/min size:8/max size:65536/granularity:8/iterations:2000/threads:1",2000,29982.9,29982.7,ns,,,,,
"proxy_pool<os_provider>/multiple_malloc_free/max_allocs:10000/size:4096/iterations:2000/threads:4",8000,1.42468e+06,1.42402e+06,ns,,,,,
"proxy_pool<os_provider>/multiple_malloc_free/max_allocs:10000/size:4096/iterations:2000/threads:1",2000,161234,161229,ns,,,,,
"os_provider/multiple_malloc_free/max_allocs:10000/size:4096/iterations:2000/threads:4",8000,1.42937e+06,1.42864e+06,ns,,,,,
"os_provider/multiple_malloc_free/max_allocs:10000/size:4096/iterations:2000/threads:1",2000,148795,148794,ns,,,,,
"scalable_pool<os_provider>/multiple_malloc_free/max_allocs:10000/size:4096/iterations:2000/threads:4",8000,41387.7,40707,ns,,,,,
"scalable_pool<os_provider>/multiple_malloc_free/max_allocs:10000/size:4096/iterations:2000/threads:1",2000,14614.4,14614.1,ns,,,,,
"scalable_pool<os_provider>/multiple_malloc_free/max_allocs:10000/min size:8/max size:65536/granularity:8/iterations:2000/threads:4",8000,71084.1,70736.5,ns,,,,,
"scalable_pool<os_provider>/multiple_malloc_free/max_allocs:10000/min size:8/max size:65536/granularity:8/iterations:2000/threads:1",2000,25142,25141.5,ns,,,,,

multiple_malloc_free/max_allocs:10000/size:4096/iterations:2000/threads:1 scalable_pool

Environment Variables:

Command:

/home/pmdk/ur-actions-runner/_work/unified-runtime/unified-runtime/umf_build/benchmark/umf-benchmark --benchmark_format=csv

Output:

name,iterations,real_time,cpu_time,time_unit,bytes_per_second,items_per_second,label,error_occurred,error_message
"glibc/alloc/max_allocs:10000/pre_allocs:0/size:4096/iterations:200000/threads:4",800000,2846.81,1908.99,ns,,,,,
"glibc/alloc/max_allocs:10000/pre_allocs:0/size:4096/iterations:200000/threads:1",200000,720.212,720.208,ns,,,,,
"glibc/alloc/max_allocs:10000/pre_allocs:100000/size:4096/iterations:200000/threads:4",800000,1433.85,1341.02,ns,,,,,
"glibc/alloc/max_allocs:10000/pre_allocs:100000/size:4096/iterations:200000/threads:1",200000,752.119,752.117,ns,,,,,
"glibc/alloc/max_allocs:10000/pre_allocs:0/min size:8/max size:65536/granularity:8/iterations:200000/threads:4",800000,877.239,806.134,ns,,,,,
"glibc/alloc/max_allocs:10000/pre_allocs:0/min size:8/max size:65536/granularity:8/iterations:200000/threads:1",200000,178.285,178.282,ns,,,,,
"os_provider/alloc/max_allocs:10000/pre_allocs:0/size:4096/iterations:200000/threads:4",800000,1936.23,1935.28,ns,,,,,
"os_provider/alloc/max_allocs:10000/pre_allocs:0/size:4096/iterations:200000/threads:1",200000,188.744,188.738,ns,,,,,
"os_provider/alloc/max_allocs:10000/pre_allocs:100000/size:4096/iterations:200000/threads:4",800000,1703.17,1703.01,ns,,,,,
"os_provider/alloc/max_allocs:10000/pre_allocs:100000/size:4096/iterations:200000/threads:1",200000,191.853,191.848,ns,,,,,
"proxy_pool<os_provider>/alloc/max_allocs:1000/pre_allocs:0/size:4096/iterations:200000/threads:4",800000,4213.55,4207.49,ns,,,,,
"proxy_pool<os_provider>/alloc/max_allocs:1000/pre_allocs:0/size:4096/iterations:200000/threads:1",200000,256.591,256.584,ns,,,,,
"proxy_pool<os_provider>/alloc/max_allocs:1000/pre_allocs:100000/size:4096/iterations:200000/threads:4",800000,3739.13,3729.68,ns,,,,,
"proxy_pool<os_provider>/alloc/max_allocs:1000/pre_allocs:100000/size:4096/iterations:200000/threads:1",200000,286.263,286.256,ns,,,,,
"scalable_pool<os_provider>/alloc/max_allocs:10000/pre_allocs:0/size:4096/iterations:200000/threads:4",800000,280.251,277.496,ns,,,,,
"scalable_pool<os_provider>/alloc/max_allocs:10000/pre_allocs:0/size:4096/iterations:200000/threads:1",200000,210.895,210.896,ns,,,,,
"scalable_pool<os_provider>/alloc/max_allocs:10000/pre_allocs:100000/size:4096/iterations:200000/threads:4",800000,275.913,257.098,ns,,,,,
"scalable_pool<os_provider>/alloc/max_allocs:10000/pre_allocs:100000/size:4096/iterations:200000/threads:1",200000,207.764,207.759,ns,,,,,
"scalable_pool<os_provider>/alloc/max_allocs:10000/pre_allocs:0/min size:8/max size:65536/granularity:8/iterations:200000/threads:4",800000,884.12,872.664,ns,,,,,
"scalable_pool<os_provider>/alloc/max_allocs:10000/pre_allocs:0/min size:8/max size:65536/granularity:8/iterations:200000/threads:1",200000,977.932,977.872,ns,,,,,
"glibc/multiple_malloc_free/max_allocs:10000/size:4096/iterations:2000/threads:4",8000,31343.8,29578.5,ns,,,,,
"glibc/multiple_malloc_free/max_allocs:10000/size:4096/iterations:2000/threads:1",2000,4124.62,4124.47,ns,,,,,
"glibc/multiple_malloc_free/max_allocs:10000/min size:8/max size:65536/granularity:8/iterations:2000/threads:4",8000,138250,88499.3,ns,,,,,
"glibc/multiple_malloc_free/max_allocs:10000/min size:8/max size:65536/granularity:8/iterations:2000/threads:1",2000,30428,30427.7,ns,,,,,
"proxy_pool<os_provider>/multiple_malloc_free/max_allocs:10000/size:4096/iterations:2000/threads:4",8000,1.15827e+06,1.15788e+06,ns,,,,,
"proxy_pool<os_provider>/multiple_malloc_free/max_allocs:10000/size:4096/iterations:2000/threads:1",2000,160262,160260,ns,,,,,
"os_provider/multiple_malloc_free/max_allocs:10000/size:4096/iterations:2000/threads:4",8000,1.16914e+06,1.16709e+06,ns,,,,,
"os_provider/multiple_malloc_free/max_allocs:10000/size:4096/iterations:2000/threads:1",2000,141142,141138,ns,,,,,
"scalable_pool<os_provider>/multiple_malloc_free/max_allocs:10000/size:4096/iterations:2000/threads:4",8000,42963.6,41992,ns,,,,,
"scalable_pool<os_provider>/multiple_malloc_free/max_allocs:10000/size:4096/iterations:2000/threads:1",2000,15346.3,15345.9,ns,,,,,
"scalable_pool<os_provider>/multiple_malloc_free/max_allocs:10000/min size:8/max size:65536/granularity:8/iterations:2000/threads:4",8000,70396.8,70376.8,ns,,,,,
"scalable_pool<os_provider>/multiple_malloc_free/max_allocs:10000/min size:8/max size:65536/granularity:8/iterations:2000/threads:1",2000,25490.7,25490.2,ns,,,,,

multiple_malloc_free/max_allocs:10000/min size:8/max size:65536/granularity:8/iterations:2000/threads:4 scalable_pool

Environment Variables:

Command:

/home/pmdk/ur-actions-runner/_work/unified-runtime/unified-runtime/umf_build/benchmark/umf-benchmark --benchmark_format=csv

Output:

name,iterations,real_time,cpu_time,time_unit,bytes_per_second,items_per_second,label,error_occurred,error_message
"glibc/alloc/max_allocs:10000/pre_allocs:0/size:4096/iterations:200000/threads:4",800000,2378.91,1839.72,ns,,,,,
"glibc/alloc/max_allocs:10000/pre_allocs:0/size:4096/iterations:200000/threads:1",200000,719.927,719.926,ns,,,,,
"glibc/alloc/max_allocs:10000/pre_allocs:100000/size:4096/iterations:200000/threads:4",800000,1248.85,1179.6,ns,,,,,
"glibc/alloc/max_allocs:10000/pre_allocs:100000/size:4096/iterations:200000/threads:1",200000,761.179,761.026,ns,,,,,
"glibc/alloc/max_allocs:10000/pre_allocs:0/min size:8/max size:65536/granularity:8/iterations:200000/threads:4",800000,890.691,804.558,ns,,,,,
"glibc/alloc/max_allocs:10000/pre_allocs:0/min size:8/max size:65536/granularity:8/iterations:200000/threads:1",200000,176.764,176.555,ns,,,,,
"os_provider/alloc/max_allocs:10000/pre_allocs:0/size:4096/iterations:200000/threads:4",800000,2283.78,2283.44,ns,,,,,
"os_provider/alloc/max_allocs:10000/pre_allocs:0/size:4096/iterations:200000/threads:1",200000,188.257,188.251,ns,,,,,
"os_provider/alloc/max_allocs:10000/pre_allocs:100000/size:4096/iterations:200000/threads:4",800000,2322.23,2321.72,ns,,,,,
"os_provider/alloc/max_allocs:10000/pre_allocs:100000/size:4096/iterations:200000/threads:1",200000,193.398,193.393,ns,,,,,
"proxy_pool<os_provider>/alloc/max_allocs:1000/pre_allocs:0/size:4096/iterations:200000/threads:4",800000,4570.14,4562.81,ns,,,,,
"proxy_pool<os_provider>/alloc/max_allocs:1000/pre_allocs:0/size:4096/iterations:200000/threads:1",200000,251.142,251.136,ns,,,,,
"proxy_pool<os_provider>/alloc/max_allocs:1000/pre_allocs:100000/size:4096/iterations:200000/threads:4",800000,2862.3,2856.53,ns,,,,,
"proxy_pool<os_provider>/alloc/max_allocs:1000/pre_allocs:100000/size:4096/iterations:200000/threads:1",200000,286.407,286.4,ns,,,,,
"scalable_pool<os_provider>/alloc/max_allocs:10000/pre_allocs:0/size:4096/iterations:200000/threads:4",800000,280.133,275.732,ns,,,,,
"scalable_pool<os_provider>/alloc/max_allocs:10000/pre_allocs:0/size:4096/iterations:200000/threads:1",200000,211.33,211.326,ns,,,,,
"scalable_pool<os_provider>/alloc/max_allocs:10000/pre_allocs:100000/size:4096/iterations:200000/threads:4",800000,255.465,254.147,ns,,,,,
"scalable_pool<os_provider>/alloc/max_allocs:10000/pre_allocs:100000/size:4096/iterations:200000/threads:1",200000,236.268,236.261,ns,,,,,
"scalable_pool<os_provider>/alloc/max_allocs:10000/pre_allocs:0/min size:8/max size:65536/granularity:8/iterations:200000/threads:4",800000,923.541,913.159,ns,,,,,
"scalable_pool<os_provider>/alloc/max_allocs:10000/pre_allocs:0/min size:8/max size:65536/granularity:8/iterations:200000/threads:1",200000,968.169,968.156,ns,,,,,
"glibc/multiple_malloc_free/max_allocs:10000/size:4096/iterations:2000/threads:4",8000,33763.5,31527,ns,,,,,
"glibc/multiple_malloc_free/max_allocs:10000/size:4096/iterations:2000/threads:1",2000,4248.98,4248.82,ns,,,,,
"glibc/multiple_malloc_free/max_allocs:10000/min size:8/max size:65536/granularity:8/iterations:2000/threads:4",8000,137715,87126.5,ns,,,,,
"glibc/multiple_malloc_free/max_allocs:10000/min size:8/max size:65536/granularity:8/iterations:2000/threads:1",2000,29982.9,29982.7,ns,,,,,
"proxy_pool<os_provider>/multiple_malloc_free/max_allocs:10000/size:4096/iterations:2000/threads:4",8000,1.42468e+06,1.42402e+06,ns,,,,,
"proxy_pool<os_provider>/multiple_malloc_free/max_allocs:10000/size:4096/iterations:2000/threads:1",2000,161234,161229,ns,,,,,
"os_provider/multiple_malloc_free/max_allocs:10000/size:4096/iterations:2000/threads:4",8000,1.42937e+06,1.42864e+06,ns,,,,,
"os_provider/multiple_malloc_free/max_allocs:10000/size:4096/iterations:2000/threads:1",2000,148795,148794,ns,,,,,
"scalable_pool<os_provider>/multiple_malloc_free/max_allocs:10000/size:4096/iterations:2000/threads:4",8000,41387.7,40707,ns,,,,,
"scalable_pool<os_provider>/multiple_malloc_free/max_allocs:10000/size:4096/iterations:2000/threads:1",2000,14614.4,14614.1,ns,,,,,
"scalable_pool<os_provider>/multiple_malloc_free/max_allocs:10000/min size:8/max size:65536/granularity:8/iterations:2000/threads:4",8000,71084.1,70736.5,ns,,,,,
"scalable_pool<os_provider>/multiple_malloc_free/max_allocs:10000/min size:8/max size:65536/granularity:8/iterations:2000/threads:1",2000,25142,25141.5,ns,,,,,

multiple_malloc_free/max_allocs:10000/min size:8/max size:65536/granularity:8/iterations:2000/threads:1 scalable_pool

Environment Variables:

Command:

/home/pmdk/ur-actions-runner/_work/unified-runtime/unified-runtime/umf_build/benchmark/umf-benchmark --benchmark_format=csv

Output:

name,iterations,real_time,cpu_time,time_unit,bytes_per_second,items_per_second,label,error_occurred,error_message
"glibc/alloc/max_allocs:10000/pre_allocs:0/size:4096/iterations:200000/threads:4",800000,2846.81,1908.99,ns,,,,,
"glibc/alloc/max_allocs:10000/pre_allocs:0/size:4096/iterations:200000/threads:1",200000,720.212,720.208,ns,,,,,
"glibc/alloc/max_allocs:10000/pre_allocs:100000/size:4096/iterations:200000/threads:4",800000,1433.85,1341.02,ns,,,,,
"glibc/alloc/max_allocs:10000/pre_allocs:100000/size:4096/iterations:200000/threads:1",200000,752.119,752.117,ns,,,,,
"glibc/alloc/max_allocs:10000/pre_allocs:0/min size:8/max size:65536/granularity:8/iterations:200000/threads:4",800000,877.239,806.134,ns,,,,,
"glibc/alloc/max_allocs:10000/pre_allocs:0/min size:8/max size:65536/granularity:8/iterations:200000/threads:1",200000,178.285,178.282,ns,,,,,
"os_provider/alloc/max_allocs:10000/pre_allocs:0/size:4096/iterations:200000/threads:4",800000,1936.23,1935.28,ns,,,,,
"os_provider/alloc/max_allocs:10000/pre_allocs:0/size:4096/iterations:200000/threads:1",200000,188.744,188.738,ns,,,,,
"os_provider/alloc/max_allocs:10000/pre_allocs:100000/size:4096/iterations:200000/threads:4",800000,1703.17,1703.01,ns,,,,,
"os_provider/alloc/max_allocs:10000/pre_allocs:100000/size:4096/iterations:200000/threads:1",200000,191.853,191.848,ns,,,,,
"proxy_pool<os_provider>/alloc/max_allocs:1000/pre_allocs:0/size:4096/iterations:200000/threads:4",800000,4213.55,4207.49,ns,,,,,
"proxy_pool<os_provider>/alloc/max_allocs:1000/pre_allocs:0/size:4096/iterations:200000/threads:1",200000,256.591,256.584,ns,,,,,
"proxy_pool<os_provider>/alloc/max_allocs:1000/pre_allocs:100000/size:4096/iterations:200000/threads:4",800000,3739.13,3729.68,ns,,,,,
"proxy_pool<os_provider>/alloc/max_allocs:1000/pre_allocs:100000/size:4096/iterations:200000/threads:1",200000,286.263,286.256,ns,,,,,
"scalable_pool<os_provider>/alloc/max_allocs:10000/pre_allocs:0/size:4096/iterations:200000/threads:4",800000,280.251,277.496,ns,,,,,
"scalable_pool<os_provider>/alloc/max_allocs:10000/pre_allocs:0/size:4096/iterations:200000/threads:1",200000,210.895,210.896,ns,,,,,
"scalable_pool<os_provider>/alloc/max_allocs:10000/pre_allocs:100000/size:4096/iterations:200000/threads:4",800000,275.913,257.098,ns,,,,,
"scalable_pool<os_provider>/alloc/max_allocs:10000/pre_allocs:100000/size:4096/iterations:200000/threads:1",200000,207.764,207.759,ns,,,,,
"scalable_pool<os_provider>/alloc/max_allocs:10000/pre_allocs:0/min size:8/max size:65536/granularity:8/iterations:200000/threads:4",800000,884.12,872.664,ns,,,,,
"scalable_pool<os_provider>/alloc/max_allocs:10000/pre_allocs:0/min size:8/max size:65536/granularity:8/iterations:200000/threads:1",200000,977.932,977.872,ns,,,,,
"glibc/multiple_malloc_free/max_allocs:10000/size:4096/iterations:2000/threads:4",8000,31343.8,29578.5,ns,,,,,
"glibc/multiple_malloc_free/max_allocs:10000/size:4096/iterations:2000/threads:1",2000,4124.62,4124.47,ns,,,,,
"glibc/multiple_malloc_free/max_allocs:10000/min size:8/max size:65536/granularity:8/iterations:2000/threads:4",8000,138250,88499.3,ns,,,,,
"glibc/multiple_malloc_free/max_allocs:10000/min size:8/max size:65536/granularity:8/iterations:2000/threads:1",2000,30428,30427.7,ns,,,,,
"proxy_pool<os_provider>/multiple_malloc_free/max_allocs:10000/size:4096/iterations:2000/threads:4",8000,1.15827e+06,1.15788e+06,ns,,,,,
"proxy_pool<os_provider>/multiple_malloc_free/max_allocs:10000/size:4096/iterations:2000/threads:1",2000,160262,160260,ns,,,,,
"os_provider/multiple_malloc_free/max_allocs:10000/size:4096/iterations:2000/threads:4",8000,1.16914e+06,1.16709e+06,ns,,,,,
"os_provider/multiple_malloc_free/max_allocs:10000/size:4096/iterations:2000/threads:1",2000,141142,141138,ns,,,,,
"scalable_pool<os_provider>/multiple_malloc_free/max_allocs:10000/size:4096/iterations:2000/threads:4",8000,42963.6,41992,ns,,,,,
"scalable_pool<os_provider>/multiple_malloc_free/max_allocs:10000/size:4096/iterations:2000/threads:1",2000,15346.3,15345.9,ns,,,,,
"scalable_pool<os_provider>/multiple_malloc_free/max_allocs:10000/min size:8/max size:65536/granularity:8/iterations:2000/threads:4",8000,70396.8,70376.8,ns,,,,,
"scalable_pool<os_provider>/multiple_malloc_free/max_allocs:10000/min size:8/max size:65536/granularity:8/iterations:2000/threads:1",2000,25490.7,25490.2,ns,,,,,

@RossBrunton RossBrunton marked this pull request as ready for review January 7, 2025 12:08
RossBrunton added a commit to RossBrunton/intel-llvm that referenced this pull request Jan 9, 2025
@RossBrunton RossBrunton added the ready to merge Added to PR's which are ready to merge label Jan 9, 2025
Previously the factories used by ur_ldrddi (used when there are multiple
backends) would add newly created objects to a map, but never release
them. This patch adds reference counting semantics to the allocation,
retention and release methods.

A lot of changes were also made to fix use-after-free issues,
specifically:
* The `release` functions now no longer use the handle after freeing
  it.
* `urDeviceTest` no longer frees devices that it dosen't own.
* Some tests for reference counting now explicitly retain an extra
  copy before releasing them.

No tests were added; this should be covered by tests for urThingRetain.

Closes: oneapi-src#1784 .
RossBrunton added a commit to RossBrunton/intel-llvm that referenced this pull request Jan 9, 2025
@RossBrunton RossBrunton merged commit 7eae5c8 into oneapi-src:main Jan 9, 2025
73 checks passed
@RossBrunton RossBrunton deleted the ross/refc branch January 9, 2025 17:28
RossBrunton added a commit to RossBrunton/intel-llvm that referenced this pull request Jan 9, 2025
sarnex pushed a commit to intel/llvm that referenced this pull request Jan 9, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
ci/cd Continuous integration/devliery common Changes or additions to common utilities conformance Conformance test suite issues. loader Loader related feature/bug ready to merge Added to PR's which are ready to merge
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Consider properly reference counting and managing loader objects
3 participants