Merged
10 changes: 5 additions & 5 deletions docs/aurora/running-jobs-aurora.md
@@ -9,9 +9,9 @@ There are four production queues you can target in your qsub (`-q <queue name>`)
|---------------|----------|----------|----------|----------|------------------------------------------------------------------------------------------------------|
| debug | 1 | 2 | 5 min | 1 hr | 64 nodes (non-exclusive); <br/> Max 1 job running/accruing/queued **per-user** |
| debug-scaling | 2 | 31 | 5 min | 1 hr | Max 1 job running/accruing/queued **per-user**
| prod | 1 | 8211-8576* | 5 min | 24 hrs | Routing queue for tiny, small, medium, and large queues; <br/> **See table below for min/max limits**|
| prod-large | 1920 | 8211-8576* | 5 min | 24 hrs | Routing queue for large jobs;
| legacy | 1 | 2048-2413 | 5 min | 24 hrs | Legacy routing queue with old (prior to 10/13/25) AuroraSDK. Some projects have higher priority.
| prod | 1 | 8498* | 5 min | 24 hrs | Routing queue for tiny, small, medium, and large queues; <br/> **See table below for min/max limits**|
| prod-large | 1920 | 8498* | 5 min | 24 hrs | Routing queue for large jobs;
| legacy | 1 | 2126 | 5 min | 24 hrs | Legacy routing queue with old (prior to 10/13/25) AuroraSDK. Some projects have higher priority.
| visualization | 1 | 32 | 5 min | 8 hrs | ***By request only; non-exclusive nodes*** |


@@ -22,11 +22,11 @@ There are four production queues you can target in your qsub (`-q <queue name>`)
| tiny | 1 | 512 | 5 min | 6 hrs | |
| small | 513 | 1024 | 5 min | 12 hrs | |
| medium | 1025 | 1919 | 5 min | 18 hrs | |
| large | 1920 | 8211-8576* | 5 min | 24 hrs | range for stable max nodecount until new image is deployed across all the nodes
| large | 1920 | 8498* | 5 min | 24 hrs | theoretical max; stable max nodecount may vary
| backfill-tiny | 1 | 512 | 5 min | 6 hrs | Low priority, negative project balance |
| backfill-small | 513 | 1024 | 5 min | 12 hrs | Low priority, negative project balance |
| backfill-medium | 1025 | 1919 | 5 min | 18 hrs | Low priority, negative project balance |
| backfill-large | 1920 | 7300-7500* | 5 min | 24 hrs | Low priority, negative project balance; range for stable max nodecount until new image is deployed across all the nodes |
| backfill-large | 1920 | 8498* | 5 min | 24 hrs | Low priority, negative project balance; theoretical max; stable max nodecount may vary |
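
The node-count ranges in the table above can be sketched as a small lookup. This is only an illustration, not how the scheduler itself routes jobs: it assumes the columns are queue name, min nodes, max nodes, min walltime, and max walltime (the header row is outside this hunk), it uses the new `8498` theoretical maximum for `large`, and the names `EXECUTION_QUEUES` and `pick_execution_queue` are hypothetical. Rejecting a walltime above the matching queue's limit is also an assumption about how such a request would be handled.

```python
# Limits copied from the table above (assumed column order):
# (queue name, min nodes, max nodes, max walltime in hours)
EXECUTION_QUEUES = [
    ("tiny", 1, 512, 6),
    ("small", 513, 1024, 12),
    ("medium", 1025, 1919, 18),
    ("large", 1920, 8498, 24),  # 8498 is the theoretical max; stable max may vary
]

# Every row in the table lists a 5 min minimum walltime.
MIN_WALLTIME_HOURS = 5 / 60


def pick_execution_queue(nodes: int, walltime_hours: float) -> str:
    """Return the queue whose node range contains `nodes` (hypothetical helper).

    Raises ValueError if the request falls outside the limits in the table.
    """
    if walltime_hours < MIN_WALLTIME_HOURS:
        raise ValueError("walltime is below the 5 min minimum")
    for name, min_nodes, max_nodes, max_hours in EXECUTION_QUEUES:
        if min_nodes <= nodes <= max_nodes:
            if walltime_hours > max_hours:
                raise ValueError(
                    f"{name} allows at most {max_hours} hrs, got {walltime_hours}"
                )
            return name
    raise ValueError(f"no queue accepts {nodes} nodes")
```

For example, `pick_execution_queue(600, 10)` falls in the `small` range, while a 600-node request for 20 hours is rejected by this sketch because `small` is capped at 12 hrs.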

!!! warning

8 changes: 8 additions & 0 deletions docs/aurora/system-updates.md
@@ -1,5 +1,13 @@
# Aurora System Updates

## 2025-10-13
The compute image with Intel's User Mode (UMD) and Kernel Mode Drivers (KMD) (Agama 1146.12 / rolling release 2523.12) and oneAPI 2025.2.0, which was previously available in the `next-eval` queue, has been rolled out to the majority of nodes across Aurora. 2,126 nodes still have the old production image and are available in a queue called `legacy`, which is open to all teams that are unable to run against the new image. Some teams will have higher priority to run in the `legacy` queue. Use the aurora-uan-000[7-8] nodes for the `legacy` queue, as they have the same user environment. Users will not be able to log in directly to aurora-uan-000[7-8] and will need to ssh to them after logging in to aurora.alcf.anl.gov.

See https://docs.alcf.anl.gov/aurora/running-jobs-aurora/

See sections ["2025-10-07"](system-updates.md/#2025-10-07) and ["2025-09-08"](system-updates.md/#2025-09-08) below for all the change log details.


## 2025-10-07
The image in `next-eval` queue, and uan-0014, has been updated to AuroraSDK version 25.190.0 RC4, with the following changes:
