JobSplitting Algorithms
Job splitting aims to give each job a length that makes optimal use of resources (~8 hours by default). Various parameters are used to calculate the approximate job length, most importantly TimePerEvent.
The main parameter used for job splitting is "events_per_job" (in the splitting algorithm). It is set in the Spec (EventsPerJob) / Splitting Algorithm. If "events_per_job" is specified, "TimePerEvent" is ignored for job splitting (it is only used for the estimated JobTime). If "events_per_job" is not set, it is calculated from "TimePerEvent":
events_per_job = int((8.0 * 3600.0) / timePerEvent)
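The precedence described above can be sketched as follows. This is a minimal illustration, not the WMCore implementation: the function name `resolve_events_per_job`, the constant `TARGET_JOB_SECONDS`, and the error handling for a missing or non-positive TimePerEvent are assumptions made for the example.

```python
# Hypothetical helper: illustrates the rule in the text, not actual WMCore code.
TARGET_JOB_SECONDS = 8.0 * 3600.0  # ~8 hour default job length from the text


def resolve_events_per_job(events_per_job=None, time_per_event=None):
    """Return the events_per_job value used for splitting.

    An explicitly set events_per_job wins; otherwise the value is derived
    from time_per_event (seconds per event).
    """
    if events_per_job is not None:
        # time_per_event is ignored for splitting in this case
        return int(events_per_job)
    if time_per_event is None or time_per_event <= 0:
        # Assumption: the text does not say what happens here
        raise ValueError("time_per_event must be a positive number")
    return int(TARGET_JOB_SECONDS / time_per_event)


if __name__ == "__main__":
    print(resolve_events_per_job(time_per_event=12.0))
    print(resolve_events_per_job(events_per_job=500, time_per_event=12.0))
```

With a TimePerEvent of 12 seconds, the first call derives the value from the 8 hour target, while the second call returns the explicit setting unchanged.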
EventAwareLumiBased algorithm (code)
- It converts events_per_job to lumis_per_job, then creates jobs by iterating through files in the same location.
- If a job cannot span multiple input files (halt_job_on_file_boundaries == True), then for each file f:
  - if the file has events: f['avgEvtsPerLumi'] = round(float(f['events'])/f['lumiCount']) and lumisPerJob = events_per_job / f['avgEvtsPerLumi']
  - if the file has 0 events: lumisPerJob = f['lumiCount']
- If a job can span multiple input files, more files are added until the number of events in the job reaches events_per_job (events_per_job is converted to lumisPerJob here as well) (code)
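The per-file conversion for the halt_job_on_file_boundaries case can be sketched as below. This is an illustration under stated assumptions, not the EventAwareLumiBased code: the function name `lumis_per_job_for_file` is hypothetical, the file is modelled as a plain dict with the `events` and `lumiCount` keys quoted above, and the fallback used when the rounded average is zero is a guess, since the text does not cover that case. The text also does not say how the ratio is rounded to a whole number of lumis, so the sketch returns it unrounded.

```python
# Hypothetical helper: mirrors the two formulas in the text for a single file.
def lumis_per_job_for_file(f, events_per_job):
    """Return lumisPerJob for file f, storing f['avgEvtsPerLumi'] when defined."""
    if f['events'] == 0:
        # A file with 0 events: take all of its lumis in one job
        return f['lumiCount']

    f['avgEvtsPerLumi'] = round(float(f['events']) / f['lumiCount'])
    if f['avgEvtsPerLumi'] == 0:
        # Assumption: very sparse file (average rounds to 0), avoid dividing by zero
        return f['lumiCount']

    # Unrounded ratio, as written in the text
    return events_per_job / f['avgEvtsPerLumi']


if __name__ == "__main__":
    example = {'events': 1000, 'lumiCount': 10}
    print(lumis_per_job_for_file(example, events_per_job=2400))
    print(example['avgEvtsPerLumi'])
```

In the example, a file with 1000 events in 10 lumis averages 100 events per lumi, so an events_per_job of 2400 corresponds to 24 lumis per job.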
- There is one case where a job is created but made to fail right away: if an input file has only one lumi and its avgEvtsPerLumi (events in the file / lumis in the file) is bigger than max_events_per_lumi (default 20K), the job is failed on creation.
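The fail-on-creation condition can be expressed as a small predicate. This is a sketch only: the function name `should_fail_on_creation` and the dict-based file representation are assumptions for the example, and the default of 20000 is taken from the "20K" quoted above.

```python
# Hypothetical helper: encodes the single-lumi check described in the text.
MAX_EVENTS_PER_LUMI = 20000  # default "20K" from the text


def should_fail_on_creation(f, max_events_per_lumi=MAX_EVENTS_PER_LUMI):
    """Return True if a job for file f should be created and failed right away."""
    if f['lumiCount'] != 1:
        # The rule only applies to files with exactly one lumi
        return False
    avg_evts_per_lumi = f['events'] / f['lumiCount']
    return avg_evts_per_lumi > max_events_per_lumi


if __name__ == "__main__":
    print(should_fail_on_creation({'events': 25000, 'lumiCount': 1}))
    print(should_fail_on_creation({'events': 50000, 'lumiCount': 2}))
```

Note that the comparison is strict, following "bigger than" in the text, so a single-lumi file with exactly max_events_per_lumi events does not trigger the failure in this sketch.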