Description
Hi Folks,
I tried to compile nws (nano-work-server) for ARM, and this mostly works without a problem. The integrated NEON SIMD of Cortex CPUs with ISA ARMv8-A and above works well with nws and computes work in an appropriate time. In my opinion this is a great way to reduce the electrical energy consumed and wasted on generating PoW.
What puzzles me is that ARM Mali GPUs (T- and G-family) and Qualcomm Adreno GPUs are OpenCL 1.2 and 2.0 compatible, yet they can't compute nws work, even though they are loaded/recognized as OpenCL devices.
As an example, here is a Mali-T830 GPU in Termux, where nws always works with the CPU:
$ clinfo
Number of platforms 1
Platform Name ARM Platform
Platform Vendor ARM
Platform Version OpenCL 1.2 v1.r12p1-04bet0.1bb9662be2ebee934dcbd7265c794a91
Platform Profile FULL_PROFILE
Platform Extensions cl_khr_global_int32_base_atomics cl_khr_global_int32_extended_atomics cl_khr_local_int32_base_atomics cl_khr_local_int32_extended_atomics cl_khr_byte_addressable_store cl_khr_3d_image_writes cl_khr_fp64 cl_khr_int64_base_atomics cl_khr_int64_extended_atomics cl_khr_fp16 cl_khr_gl_sharing cl_khr_icd cl_khr_egl_event cl_khr_egl_image cl_arm_core_id cl_arm_printf cl_arm_thread_limit_hint cl_arm_non_uniform_work_group_size cl_arm_import_memory
Platform Extensions function suffix ARM
Platform Name ARM Platform
Number of devices 1
Device Name Mali-T830
Device Vendor ARM
Device Vendor ID 0x8301000
Device Version OpenCL 1.2 v1.r12p1-04bet0.1bb9662be2ebee934dcbd7265c794a91
Driver Version 1.2
Device OpenCL C Version OpenCL C 1.2 v1.r12p1-04bet0.1bb9662be2ebee934dcbd7265c794a91
Device Type GPU
Device Available Yes
Device Profile FULL_PROFILE
Max compute units 2
Max clock frequency 360MHz
Device Partition (core)
Max number of sub-devices 0
Supported partition types None
Max work item dimensions 3
Max work item sizes 256x256x256
Max work group size 256
Compiler Available Yes
Linker Available Yes
Preferred work group size multiple 4
Preferred / native vector sizes
char 16 / 16
short 8 / 8
int 4 / 4
long 2 / 2
half 8 / 8 (cl_khr_fp16)
float 4 / 4
double 2 / 2 (cl_khr_fp64)
(...and.so.on...)
But when I start nws on the GPU (with small values):
./nano-work-server --gpu 0:0:256 --gpu-local-work-size 16 (in any combination of other flags and values)
it panics with:
thread 'main' panicked at 'Platform::list: Error retrieving platform list: Unable to get platform id list after 10 seconds of waiting.', /data/data/com.termux/files/home/.cargo/registry/src/github.com-1ecc6299db9ec823/ocl-0.19.3/src/standard/platform.rs:50:14
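For anyone reading along, here is a minimal sketch of how I understand the `--gpu 0:0:256` argument above, assuming the format is `PLATFORM:DEVICE[:THREADS]`. The `GpuSpec` type and the `parse_gpu_spec` and `parse_field` helpers are hypothetical names for illustration only, not the actual nws code. The point is that platform index 0 is the only platform `clinfo` reports here, so the argument itself should be valid.

```rust
/// Hypothetical parsed form of one `--gpu` argument (not the real nws type).
#[derive(Debug, PartialEq)]
struct GpuSpec {
    platform: usize,
    device: usize,
    /// `None` when the optional THREADS part is omitted.
    threads: Option<usize>,
}

/// Parse a single numeric field and name it in the error message.
fn parse_field(value: &str, name: &str) -> Result<usize, String> {
    value
        .parse::<usize>()
        .map_err(|e| format!("invalid {} {:?}: {}", name, value, e))
}

/// Parse an argument of the assumed form PLATFORM:DEVICE[:THREADS].
fn parse_gpu_spec(arg: &str) -> Result<GpuSpec, String> {
    let parts: Vec<&str> = arg.split(':').collect();
    if parts.len() < 2 || parts.len() > 3 {
        return Err(format!("expected PLATFORM:DEVICE[:THREADS], got {:?}", arg));
    }
    let platform = parse_field(parts[0], "platform")?;
    let device = parse_field(parts[1], "device")?;
    let threads = match parts.get(2) {
        Some(&t) => Some(parse_field(t, "threads")?),
        None => None,
    };
    Ok(GpuSpec { platform, device, threads })
}

fn main() {
    // The value used in the command above.
    println!("{:?}", parse_gpu_spec("0:0:256"));
}
```

So the panic seems to happen before the platform/device indices are even used, while the platform list is being retrieved.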
To me this looks like ocl-0.19.3 does not support Mali/Adreno GPUs, or others like the Raspberry Pi 3 GPU with VC4CL, despite their OpenCL support.
Is it possible to add/change/modify/motherofgod the ocl-0.19.3 crate (or something like its platform identifier) so that it supports the GPUs above?
Thanks for your help and advice 💯