
Job processing (introduction)

David Anderson edited this page Jan 15, 2024 · 10 revisions

Failures and retries

A job can fail on a BOINC worker node for a variety of reasons:

  • The application crashes.
  • The user on that node aborts the job.
  • The job exceeds its memory or disk space limits.
  • The job times out.

In some of these cases the job would succeed on a different node, so BOINC provides a 'retry' mechanism: if a job fails on one node, a second copy (or 'instance') of the job is sent to a different node. This is repeated until either an instance succeeds or a limit on the number of instances is reached. In the latter case the job is marked as failed and no further instances are created.
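The retry logic described above can be sketched in a few lines of Python. This is only an illustration of the idea, not BOINC's scheduler code: the `Job` class, the `max_instances` field, the `process_job` function and the `run_instance` callback are all hypothetical names, and the sketch assumes that running out of nodes also counts as failure.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Optional


@dataclass
class Job:
    name: str
    max_instances: int                 # hypothetical cap on instances created
    instances_created: int = 0
    failed_nodes: List[str] = field(default_factory=list)
    state: str = "pending"             # "pending", "succeeded" or "failed"


def process_job(
    job: Job,
    nodes: List[str],
    run_instance: Callable[[Job, str], bool],
) -> Optional[str]:
    """Send instances to distinct nodes until one succeeds or the cap is hit.

    Returns the node on which an instance succeeded, or None if the job failed.
    """
    for node in nodes:
        if job.instances_created >= job.max_instances:
            break                      # limit reached: create no more instances
        job.instances_created += 1
        if run_instance(job, node):    # True means the instance succeeded
            job.state = "succeeded"
            return node
        job.failed_nodes.append(node)  # the next instance goes to a different node
    job.state = "failed"
    return None


# Example: the job fails on two nodes, then succeeds on the third.
flaky_nodes = {"node-a", "node-b"}
job = Job("job_1", max_instances=3)
winner = process_job(
    job,
    ["node-a", "node-b", "node-c"],
    lambda j, n: n not in flaky_nodes,
)
print(winner, job.state, job.instances_created)
```

With `max_instances=2` the same job would be marked as failed after the second instance, and `node-c` would never be tried.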

Staging input and output files

Batches

Pipeline components

  • work generator
  • validator
  • assimilator

Job submission (and file management)

  • local
  • remote via RPC
  • Python bindings
  • remote via web interface
