EgeeMpiJobs
The current state of MPI support on EGEE is not very good. Even so, we can send jobs to specific sites where it works relatively well, at least within the biomed virtual organization.
You can find more information in the EGEE MPI wiki on the Grid-Ireland web site:
http://www.grid.ie/mpi/wiki/FrontPage
Or on the related mailing list:
project-eu-egee-tcg-mpi@cern.ch
We have developed a script, written in Python, that simplifies the creation of .jdl files and of the shell scripts needed to run jobs that take advantage of the MPI interface.
From my point of view, both developers and users are making some effort so that support for this technology matures within the EGEE infrastructure.
The script is called mpi_jobs_creator and is installed on our machine villon (villon.cnb.uam.es). Before invoking it, it is advisable to add its location to the PATH in our .bashrc file and then reload that file:
export PATH=/opt/xmipp:$PATH
source .bashrc
Once this is done, we can invoke the script directly. Run without arguments, it prints the following output:
mpi_jobs_creator
error: -virtual_organization|-vo parameter is required
error: -executable|-exe parameter is required (absolut path of your MPI program)
error: -number_CPU|-nCPU parameter is required
[-help|-h] help
-virtual_organization|-vo virtual organization
-executable|-exe executable
[-arguments|-args] arguments
-number_CPU|-nCPU number CPU
[-input_data|-id] input data
[-publish|-p] data catalog publication
[-output_data|-od] output data
[-result_rule|-rl]
[-computing_element|-ce] computing element
[-storage_element|-se] storage element
[-catalog_in_path| -caip] catalog input path
[-catalog_out_path| -caop] catalog output path
[-root_name|-o] root name
TODO:
[-launch|-l] launch job generated
retrive output data -> catalog used
[-retrive|-r] retrieve output data
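The usage text distinguishes required options from optional ones (the latter in square brackets). As a rough illustration of that convention, the sketch below parses a subset of the listed options with Python's argparse. It is not the source of mpi_jobs_creator, which is not shown here; the function name build_parser and the dest names are hypothetical.

```python
import argparse


def build_parser():
    """Build a parser for a subset of the options in the usage text.

    Hypothetical sketch: this is not the real mpi_jobs_creator code.
    """
    parser = argparse.ArgumentParser(prog="mpi_jobs_creator")
    # Required options, matching the three "parameter is required" errors.
    parser.add_argument("-virtual_organization", "-vo", dest="vo", required=True)
    parser.add_argument("-executable", "-exe", dest="exe", required=True)
    parser.add_argument("-number_CPU", "-nCPU", dest="ncpu", type=int, required=True)
    # Optional options (shown in square brackets in the usage text).
    parser.add_argument("-arguments", "-args", dest="args", default="")
    parser.add_argument("-output_data", "-od", dest="output_data")
    parser.add_argument("-result_rule", "-rl", dest="result_rule")
    return parser


if __name__ == "__main__":
    options = build_parser().parse_args()
    print(options)
```

When a required option is missing, argparse prints an error and exits with status 2, which plays the same role as the "parameter is required" messages above.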
The script was written taking as its basis a simple example: a program called cpi that computes approximations of the number pi.
We are going to create the required pair of files by running the mpi_jobs_creator script.
First I will run the following command, and afterwards I will explain each of the parameters used.
mpi_jobs_creator -vo biomed -nCPU 8 -exe /home/user/cpi -od test.tgz -rl "test*"
success: the files mpi_job.jdl and mpi_job.sh have been created
If everything went well, mpi_jobs_creator will have generated the files mpi_job.jdl and mpi_job.sh.
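The contents of the generated mpi_job.jdl are not reproduced on this page. To give an idea of what a JDL description of an MPI job can look like, here is a minimal sketch that renders one from the same parameters. The choice of attributes (JobType, NodeNumber, the sandbox lists), the std.out and std.err file names and the function name render_mpi_jdl are assumptions made for illustration; the file actually produced by mpi_jobs_creator may differ.

```python
def render_mpi_jdl(vo, n_cpu, executable, output_archive, wrapper="mpi_job.sh"):
    """Return JDL text for an MPI job.

    Illustrative only: the attribute selection is an assumption, not the
    real output of mpi_jobs_creator.
    """
    # Files shipped to the worker node and files brought back afterwards.
    sandbox_in = ", ".join(f'"{name}"' for name in (wrapper, executable))
    sandbox_out = ", ".join(
        f'"{name}"' for name in ("std.out", "std.err", output_archive)
    )
    lines = [
        'JobType = "MPICH";',
        f"NodeNumber = {n_cpu};",
        f'VirtualOrganisation = "{vo}";',
        f'Executable = "{wrapper}";',
        'StdOutput = "std.out";',
        'StdError = "std.err";',
        f"InputSandbox = {{{sandbox_in}}};",
        f"OutputSandbox = {{{sandbox_out}}};",
    ]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    print(render_mpi_jdl("biomed", 8, "/home/user/cpi", "test.tgz"))
```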
With the -vo parameter, we indicate the virtual organization we belong to, so that data about its resources can be obtained later.
(eg:) -vo biomed
With the -nCPU parameter, we indicate the number of CPUs we want to request.
(eg:) -nCPU 8
With the -exe parameter, we indicate the absolute path of our MPI executable file.
(eg:) -exe /home/user/cpi
With the -od parameter, we indicate the name of the output data file, compressed in tgz format, that will contain the result files of the execution.
(eg:) -od test.tgz
With the -rl parameter, we indicate a rule that the generated shell script uses to add to the tgz output file every file whose name matches the rule.
(eg:) -rl "test*"
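The combined effect of -od and -rl can be pictured with a short sketch: collect every file matching the rule and pack the matches into a gzip-compressed tar archive. The generated mpi_job.sh is a shell script whose contents are not shown here, so this Python version only illustrates the idea; the function name pack_results and the file names in the example are hypothetical.

```python
import glob
import os
import tarfile
import tempfile


def pack_results(rule, archive_name, workdir="."):
    """Pack every file in workdir matching the glob rule into a .tgz archive.

    Returns the sorted list of file names that were added.
    """
    archive_path = os.path.join(workdir, archive_name)
    matches = sorted(glob.glob(os.path.join(workdir, rule)))
    # A rule such as "test*" also matches "test.tgz", so skip the archive
    # itself in case it is left over from a previous run.
    matches = [
        m for m in matches if os.path.abspath(m) != os.path.abspath(archive_path)
    ]
    with tarfile.open(archive_path, "w:gz") as tar:
        for path in matches:
            tar.add(path, arcname=os.path.basename(path))
    return [os.path.basename(m) for m in matches]


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        for name in ("test1.out", "test2.out", "other.txt"):
            with open(os.path.join(tmp, name), "w") as handle:
                handle.write("data\n")
        print(pack_results("test*", "test.tgz", tmp))
```

Note that with the values used on this page the archive name itself matches the rule, which is why the sketch excludes it explicitly.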
Before running the MPI job generated with mpi_jobs_creator, it is good practice to check which sites are best suited to it. The following command returns that list.
(eg:) edg-job-list-match mpi_job.jdl
Some of the sites returned are not suitable for running our MPI jobs, so previous experience with some of them is often valuable.
The following command gives some more information about the sites of our virtual organization. The information returned includes, among other things, the number of free CPUs in the second column.
(eg:) lcg-infosites --vo biomed ce
valor del bdii: lcg-bdii.cern.ch:2170
#CPU Free Total Jobs Running Waiting ComputingElement
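When many sites are listed, the table can be read programmatically to keep only the computing elements with enough free CPUs. The sketch below assumes the whitespace-separated column layout of the header above; the sample rows and the example.org host names are invented for illustration, and the function names are hypothetical.

```python
SAMPLE = """\
valor del bdii: lcg-bdii.cern.ch:2170
#CPU    Free    Total Jobs      Running Waiting ComputingElement
  40      12       30            28      2    ce1.example.org:2119/jobmanager-lcgpbs-biomed
  16       0       20            16      4    ce2.example.org:2119/jobmanager-lcgpbs-biomed
"""


def parse_infosites(text):
    """Parse lcg-infosites style CE output into a list of dicts."""
    rows = []
    for line in text.splitlines():
        fields = line.split()
        # Data rows have six fields and start with an integer CPU count;
        # this skips the bdii line and the column header.
        if len(fields) != 6 or not fields[0].isdigit():
            continue
        cpus, free, total_jobs, running, waiting = (int(f) for f in fields[:5])
        rows.append({
            "cpus": cpus,
            "free": free,
            "total_jobs": total_jobs,
            "running": running,
            "waiting": waiting,
            "ce": fields[5],
        })
    return rows


def sites_with_free_cpus(rows, needed):
    """Return CE names with at least `needed` free CPUs, most free first."""
    suitable = [row for row in rows if row["free"] >= needed]
    suitable.sort(key=lambda row: row["free"], reverse=True)
    return [row["ce"] for row in suitable]


if __name__ == "__main__":
    print(sites_with_free_cpus(parse_infosites(SAMPLE), needed=8))
```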
Combining both results, we can check the state of any of the sites returned by edg-job-list-match using the grep command. See the following example.
lcg-infosites --vo biomed ce | grep site.chosen.com:2119/jobmanager-lcgpbs-biomed
Now we can send the job to the chosen site by running the following command.
edg-job-submit -r site.chosen.com:2119/jobmanager-lcgpbs-biomed mpi_job.jdl
We can check the status of our job with the following command; its argument is the job id returned by the previous command.
edg-job-status id_of_our_job
When our job has finished, we can retrieve the result with the following command.
edg-job-get-output --dir . id_of_our_job
In my case, after inspecting the output files, I obtained something like this.
Modified mpirun: Executing command: /home/bio058/gram_scratch_OzQpkeIMDL/.mpi/https_3a_2f_2fbioinfo02.pcm.uam.es_3a9000_2fK-7Q0pFzumJWYDtYl_5flyig/cpi
Process 0 of 1 on grid46.lal.in2p3.fr
pi is approximately 3.1415926544231341, Error is 0.0000000008333410
wall clock time = 10.004210
Of course, you can use your own tricks or expertise to discover the best ways to run your applications.
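The cpi output shown above can also be checked automatically once it has been retrieved. The sketch below extracts the number of processes, the computed value of pi, the reported error and the wall clock time from that text. The function name parse_cpi_output is hypothetical and the regular expressions assume exactly the message format seen in the sample.

```python
import math
import re

SAMPLE = """\
Process 0 of 1 on grid46.lal.in2p3.fr
pi is approximately 3.1415926544231341, Error is 0.0000000008333410
wall clock time = 10.004210
"""


def parse_cpi_output(text):
    """Extract process count, pi estimate, error and wall time from cpi output."""
    proc = re.search(r"Process \d+ of (\d+) on", text)
    result = re.search(r"pi is approximately ([0-9.]+), Error is ([0-9.]+)", text)
    wall = re.search(r"wall clock time = ([0-9.]+)", text)
    if not (proc and result and wall):
        raise ValueError("text does not look like cpi output")
    return {
        "processes": int(proc.group(1)),
        "pi": float(result.group(1)),
        "error": float(result.group(2)),
        "wall_clock": float(wall.group(1)),
    }


if __name__ == "__main__":
    info = parse_cpi_output(SAMPLE)
    # The reported error should equal the distance to the true value of pi.
    print(info, abs(info["pi"] - math.pi))
```

In the sample, the line "Process 0 of 1" reports a single process, so a check like this is one way to see whether a job really ran on the number of CPUs requested with -nCPU.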
-- Main.GermanCarrera - 11 Apr 2007