Perf boost [WIP] #36

Open
wants to merge 8 commits into
base: master
mchandra (Contributor) commented Oct 9, 2017

Transfer data between ArrayFire, NumPy, and PETSc (af<->np<->petsc) without reordering.

Initial testing:

$ AF_OPENCL_DEFAULT_DEVICE_TYPE=CPU python test_af_np_petsc_data_transfer.py 
ArrayFire v3.6.0 (OpenCL, 64-bit Linux, build d9bc8d7)
-0- NVIDIA: Quadro M1000M, 2047 MB
[1] INTEL: Intel(R) Xeon(R) CPU E3-1505M v5 @ 2.80GHz, 31993 MB
---------------------
N_q1 + 2*N_ghost = 70
N_q2 + 2*N_ghost = 134
dof              = 4096
---------------------

af_array_old.shape =  (70, 134, 4096)
af_array_new.shape =  (4096, 70, 134)
 
comm_old =  2.8967 secs/iter
comm_new =  0.8536 secs/iter
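
The shape change in the log above, from (N_q1, N_q2, dof) to (dof, N_q1, N_q2), can be illustrated with a minimal NumPy-only sketch. It assumes, which the PR does not state, that the PETSc-side buffer stores dof as the fastest-varying index (then q1, then q2) and that the ArrayFire array is column-major. The sizes and the helper names `flatten_old` and `flatten_new` are hypothetical and are not taken from the PR's code.

```python
import numpy as np

# Small stand-ins for the sizes in the log (70, 134, 4096).
n_q1, n_q2, dof = 5, 7, 3

# Assumed target buffer: dof varies fastest, then q1, then q2.
target = np.arange(dof * n_q1 * n_q2, dtype=np.float64)

# New layout: a column-major (dof, q1, q2) array is just a view of that buffer.
new = target.reshape((dof, n_q1, n_q2), order="F")

# Old layout: the same values stored column-major as (q1, q2, dof).
old = np.asfortranarray(np.moveaxis(new, 0, -1))


def flatten_old(a):
    """Old path: move dof to the front, then flatten column-major (forces a copy)."""
    return np.moveaxis(a, -1, 0).ravel(order="F")


def flatten_new(a):
    """New path: flatten column-major directly; no axis reordering is needed."""
    return a.ravel(order="F")


# Both paths produce the same flat buffer ...
assert np.array_equal(flatten_old(old), target)
assert np.array_equal(flatten_new(new), target)

# ... but only the new layout does so without copying.
print(np.shares_memory(flatten_new(new), new))  # True
print(np.shares_memory(flatten_old(old), old))  # False
```

Under these assumptions the old layout pays for an axis reorder on every transfer, while the new layout already matches the target memory order.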
 

shyams2 and others added 7 commits September 22, 2017 11:08
…add additional headers in test files for petsc to read command line arguments.
2) test_compute_electrostatic_fields.py is in flux

Run with : python test_compute_electrostatic_fields.py -ksp_monitor
* First attempt: managed to solve x^2 - 2 = 0 over [63, 63] grid using SNES
  * Periodic BCs work once background density has been subtracted correctly
mchandra changed the title from "Perf boost" to "Perf boost [WIP]" on Oct 9, 2017
shyams2 (Contributor) commented Oct 10, 2017

Results when run on Savio:

ArrayFire v3.6.0 (OpenCL, 64-bit Linux, build 4a60571)
[0] NVIDIA: Tesla K80, 11439 MB
-1- INTEL: Intel(R) Xeon(R) CPU E5-2623 v3 @ 3.00GHz, 64449 MB
-2- AMD: Intel(R) Xeon(R) CPU E5-2623 v3 @ 3.00GHz, 64449 MB
---------------------
N_q1 + 2*N_ghost = 70
N_q2 + 2*N_ghost = 134
dof              = 4096
---------------------

af_array_old.shape =  (70, 134, 4096)
af_array_new.shape =  (4096, 70, 134)
 
comm_old =  0.7544 secs/iter
comm_new =  0.2407 secs/iter
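
For a quick comparison of the two runs, the ratio of old to new communication time can be computed directly from the figures reported in the logs. The dictionary keys below are labels chosen for illustration only.

```python
# (comm_old, comm_new) in secs/iter, copied from the two logs in this thread.
timings = {
    "first run (Oct 9)": (2.8967, 0.8536),
    "Savio run (Oct 10)": (0.7544, 0.2407),
}

# Speedup of the new layout over the old one for each run.
speedups = {name: old / new for name, (old, new) in timings.items()}

for name, speedup in speedups.items():
    print(f"{name}: {speedup:.2f}x")
# first run (Oct 9): 3.39x
# Savio run (Oct 10): 3.13x
```

Both runs show a roughly threefold reduction in transfer time per iteration.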

2 participants