-
Notifications
You must be signed in to change notification settings - Fork 707
Sanitize labels in AWS batch #6211
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: master
Are you sure you want to change the base?
Changes from all commits
File filter
Filter by extension
Conversations
Jump to
Diff view
Diff view
There are no files selected for viewing
Original file line number | Diff line number | Diff line change | ||||||||||||||||||
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
|
@@ -574,7 +574,7 @@ class AwsBatchTaskHandler extends TaskHandler implements BatchHandler<String,Job | |||||||||||||||||||
|
||||||||||||||||||||
if( opts.executionRole ) | ||||||||||||||||||||
container.setExecutionRoleArn(opts.executionRole) | ||||||||||||||||||||
|
||||||||||||||||||||
final logsGroup = opts.getLogsGroup() | ||||||||||||||||||||
if( logsGroup ) | ||||||||||||||||||||
container.setLogConfiguration(getLogConfiguration(logsGroup, opts.getRegion())) | ||||||||||||||||||||
|
@@ -623,7 +623,7 @@ class AwsBatchTaskHandler extends TaskHandler implements BatchHandler<String,Job | |||||||||||||||||||
return result | ||||||||||||||||||||
} | ||||||||||||||||||||
|
||||||||||||||||||||
@Memoized | ||||||||||||||||||||
@Memoized | ||||||||||||||||||||
LogConfiguration getLogConfiguration(String name, String region) { | ||||||||||||||||||||
new LogConfiguration() | ||||||||||||||||||||
.withLogDriver('awslogs') | ||||||||||||||||||||
|
@@ -765,7 +765,7 @@ class AwsBatchTaskHandler extends TaskHandler implements BatchHandler<String,Job | |||||||||||||||||||
result.setJobQueue(getJobQueue(task)) | ||||||||||||||||||||
result.setJobDefinition(getJobDefinition(task)) | ||||||||||||||||||||
if( labels ) { | ||||||||||||||||||||
result.setTags(labels) | ||||||||||||||||||||
result.setTags(sanitizeAwsBatchLabels(labels)) | ||||||||||||||||||||
result.setPropagateTags(true) | ||||||||||||||||||||
} | ||||||||||||||||||||
// set the share identifier | ||||||||||||||||||||
|
@@ -849,6 +849,70 @@ class AwsBatchTaskHandler extends TaskHandler implements BatchHandler<String,Job | |||||||||||||||||||
return result | ||||||||||||||||||||
} | ||||||||||||||||||||
|
||||||||||||||||||||
/** | ||||||||||||||||||||
* Sanitize resource labels to comply with AWS Batch tag requirements. | ||||||||||||||||||||
* AWS Batch tags have specific constraints: | ||||||||||||||||||||
* - Keys and values can contain: letters, numbers, spaces, and characters: _ . : / = + - @ | ||||||||||||||||||||
* - Maximum key length: 128 characters | ||||||||||||||||||||
* - Maximum value length: 256 characters | ||||||||||||||||||||
* | ||||||||||||||||||||
* @param labels The original resource labels map | ||||||||||||||||||||
* @return A new map with sanitized labels suitable for AWS Batch tags | ||||||||||||||||||||
*/ | ||||||||||||||||||||
protected Map<String, String> sanitizeAwsBatchLabels(Map<String, String> labels) { | ||||||||||||||||||||
if (!labels) return labels | ||||||||||||||||||||
|
||||||||||||||||||||
final result = new LinkedHashMap<String, String>() | ||||||||||||||||||||
|
||||||||||||||||||||
for (Map.Entry<String, String> entry : labels.entrySet()) { | ||||||||||||||||||||
final key = entry.getKey() | ||||||||||||||||||||
final value = entry.getValue() | ||||||||||||||||||||
|
||||||||||||||||||||
if (key != null && value != null) { | ||||||||||||||||||||
final sanitizedKey = sanitizeAwsBatchLabel(key.toString(), 128) | ||||||||||||||||||||
final sanitizedValue = sanitizeAwsBatchLabel(value.toString(), 256) | ||||||||||||||||||||
|
||||||||||||||||||||
// Only add non-empty keys and values | ||||||||||||||||||||
if (sanitizedKey && sanitizedValue) { | ||||||||||||||||||||
result.put(sanitizedKey, sanitizedValue) | ||||||||||||||||||||
} | ||||||||||||||||||||
Comment on lines
+875
to
+878
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. This will silently drop keys and values that can't be sanitised, which is a bit bad in terms of UX. At a minimum, I would raise a warning here, but I might be tempted to throw an error and tell the user to specify a valid label. |
||||||||||||||||||||
} | ||||||||||||||||||||
} | ||||||||||||||||||||
|
||||||||||||||||||||
return result | ||||||||||||||||||||
} | ||||||||||||||||||||
|
||||||||||||||||||||
/** | ||||||||||||||||||||
* Sanitize a single label key or value for AWS Batch tags. | ||||||||||||||||||||
* Replaces invalid characters with underscores and truncates if necessary. | ||||||||||||||||||||
* | ||||||||||||||||||||
* @param input The input string to sanitize | ||||||||||||||||||||
* @param maxLength The maximum allowed length | ||||||||||||||||||||
* @return The sanitized string | ||||||||||||||||||||
*/ | ||||||||||||||||||||
protected String sanitizeAwsBatchLabel(String input, int maxLength) { | ||||||||||||||||||||
if (!input) return input | ||||||||||||||||||||
|
||||||||||||||||||||
// Replace invalid characters with underscores | ||||||||||||||||||||
// AWS Batch allows: letters, numbers, spaces, and: _ . : / = + - @ | ||||||||||||||||||||
String sanitized = input.replaceAll(/[^a-zA-Z0-9\s_.\:\/=+\-@]/, '_') | ||||||||||||||||||||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Untested, but I think you can simplify this to:
Suggested change
you might also be able to fold the length into a single chain:
Suggested change
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Of course changing the order affects the output - you may want to replaceAll then take based on what you want. Searching through the codebase it tends to be a ternary with substring: nextflow/modules/nextflow/src/main/groovy/nextflow/executor/AbstractGridExecutor.groovy Lines 182 to 183 in 6231d76
|
||||||||||||||||||||
|
||||||||||||||||||||
// Replace multiple consecutive underscores/spaces with single underscore | ||||||||||||||||||||
sanitized = sanitized.replaceAll(/[_\s]{2,}/, '_') | ||||||||||||||||||||
|
||||||||||||||||||||
// Remove leading/trailing underscores and spaces | ||||||||||||||||||||
sanitized = sanitized.replaceAll(/^[_\s]+|[_\s]+$/, '') | ||||||||||||||||||||
|
||||||||||||||||||||
// Truncate if too long | ||||||||||||||||||||
if (sanitized.length() > maxLength) { | ||||||||||||||||||||
sanitized = sanitized.substring(0, maxLength) | ||||||||||||||||||||
// Remove trailing underscore/space after truncation | ||||||||||||||||||||
sanitized = sanitized.replaceAll(/[_\s]+$/, '') | ||||||||||||||||||||
} | ||||||||||||||||||||
Comment on lines
+900
to
+911
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. You could probably simplify this a little bit by trimming first then replacing (see chain example above). There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Do we want to simplify or make it readable? I'm all for clever code, but I want to make it clear what we're doing and trying to overcome here. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Replacing the same variable a bunch of times can be hard to read as well because it's difficult to tell what the variable is at any one time. I wouldn't stack all of these changes into one chain but I would consider simplifying the code a bit. The best idea is to break it down into discrete chunks of obvious code, although that's not always easy to do. |
||||||||||||||||||||
|
||||||||||||||||||||
return sanitized ?: null | ||||||||||||||||||||
} | ||||||||||||||||||||
|
||||||||||||||||||||
/** | ||||||||||||||||||||
* @return The list of environment variables to be defined in the Batch job execution context | ||||||||||||||||||||
*/ | ||||||||||||||||||||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If the value is
null
, this is skipped. Currently, when the item is"item": null
is the aws tag silently dropped?