
JobSplitting Algorithms

ticoann edited this page Apr 12, 2017 · 7 revisions

The goal of job splitting is to make the job length optimal for resource use (~8 hours). Various parameters are used to calculate the approximate job length, most importantly TimePerEvent.

The main parameter used for job splitting is "events_per_job" (in the splitting algorithm). It is set in the Spec (EventsPerJob) / Splitting Algorithm; if this value is not set, it is calculated from "TimePerEvent". If "events_per_job" is specified, "TimePerEvent" is ignored for job splitting (it is only used for the estimated JobTime).

events_per_job = int((8.0 * 3600.0) / timePerEvent)
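A minimal sketch of the precedence described above, assuming a standalone helper (the name `resolve_events_per_job` and its arguments are hypothetical and not part of WMCore): an explicit events_per_job wins, otherwise the value is derived from TimePerEvent and the ~8 hour target.

```python
# Target job length in seconds (~8 hours), as in the formula above.
TARGET_JOB_SECONDS = 8.0 * 3600.0


def resolve_events_per_job(events_per_job=None, time_per_event=None):
    """Hypothetical helper: pick the events_per_job value used for splitting."""
    if events_per_job is not None:
        # An explicit value takes precedence; TimePerEvent is then
        # ignored for splitting purposes.
        return int(events_per_job)
    if time_per_event is None or time_per_event <= 0:
        # Assumption of this sketch: refuse to guess without a usable TimePerEvent.
        raise ValueError("time_per_event must be positive when events_per_job is unset")
    return int(TARGET_JOB_SECONDS / time_per_event)
```

For example, with a TimePerEvent of 12 seconds and no explicit events_per_job, this gives int(28800 / 12) = 2400 events per job.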

EventAwareLumiBased algorithm

  1. It converts events_per_job to lumis_per_job:
     f['avgEvtsPerLumi'] = round(float(f['events'])/f['lumiCount'])
     lumisPerJob = float(avgEventsPerJob) / f['avgEvtsPerLumi']
     For a file with 0 events: lumisPerJob = f['lumiCount']
     For the more complicated case, see https://github.com/dmwm/WMCore/blob/1.1.3.pre2/src/python/WMCore/JobSplitting/EventAwareLumiBased.py#L182
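The conversion above can be sketched as a small function. This is an illustration under stated assumptions, not the WMCore implementation: the function name is hypothetical, `f` is assumed to be a dict with 'events' and 'lumiCount' keys and a positive lumi count, and the fallback for an average that rounds to zero is a guess added only to avoid a division by zero.

```python
def lumis_per_job_for_file(f, avg_events_per_job):
    """Hypothetical helper: convert an events-per-job target into lumis per job."""
    if f['events'] == 0:
        # A file with 0 events: take all of its lumis.
        return f['lumiCount']

    # Average number of events per lumi in this file, rounded as in the text.
    avg_evts_per_lumi = round(float(f['events']) / f['lumiCount'])
    if avg_evts_per_lumi == 0:
        # Assumption of this sketch (not from the source): a very sparse
        # file is treated like the 0-events case.
        return f['lumiCount']

    return float(avg_events_per_job) / avg_evts_per_lumi
```

With 1000 events spread over 10 lumis and a target of 2400 events per job, the average is 100 events per lumi, so the result is 24.0 lumis per job.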

https://github.com/dmwm/WMCore/blob/1.1.3.pre2/src/python/WMCore/JobSplitting/EventAwareLumiBased.py#L238

  1. First, determine how many lumis can be in the job, depending on various conditions:
     max_events_per_lumi (default 20K): if the input file has only one lumi and avgEvtsPerLumi (events in the file / lumis in the file) is bigger than max_events_per_lumi, fail this job (on creation).
     events_per_job