diff --git a/CHANGELOG.md b/src/CHANGELOG.md
similarity index 100%
rename from CHANGELOG.md
rename to src/CHANGELOG.md
diff --git a/src/README.md b/src/README.md
new file mode 100644
index 0000000..75405bd
--- /dev/null
+++ b/src/README.md
@@ -0,0 +1,281 @@
+---
+tags:
+  - component/datadog-monitor
+  - layer/datadog
+  - provider/aws
+  - provider/datadog
+---
+
+# Component: `datadog`
+
+This component is responsible for provisioning Datadog monitors and assigning Datadog roles to the monitors.
+
+It depends on the `datadog-configuration` component to get the Datadog API keys.
+
+## Usage
+
+**Stack Level**: Regional
+
+Here's an example snippet for how to use this component:
+
+```yaml
+components:
+  terraform:
+    datadog-monitor:
+      settings:
+        spacelift:
+          workspace_enabled: true
+      vars:
+        enabled: true
+        local_datadog_monitors_config_paths:
+          - "catalog/monitors/dev/*.yaml"
+```
+
+## Conventions
+
+- Treat datadog like a separate cloud provider with integrations
+  ([datadog-integration](https://docs.cloudposse.com/components/library/aws/datadog-integration)) into your accounts.
+
+- Use the `catalog` convention to define a set of alerts. You can use ours or define your own.
+  [https://github.com/cloudposse/terraform-datadog-platform/tree/master/catalog/monitors](https://github.com/cloudposse/terraform-datadog-platform/tree/master/catalog/monitors)
+
+- The monitors catalog for the datadog-monitor component supports datadog monitor exports. You can use
+  [the status page of a monitor to export it from 'settings'](https://docs.datadoghq.com/monitors/manage/status/#settings).
+  You can add the export to existing files or make new ones. Because the export is json formatted, it's also yaml
+  compatible. If you prefer, you can convert the export to yaml using your text editor or a cli tool like `yq`.
+
+## Adjust Thresholds per Stack
+
+Since there are so many parameters that may be adjusted for a given monitor, we define all monitors through YAML.
+By convention, we define the **default monitors** that should apply to all environments, and then adjust the thresholds per
+environment. This is accomplished using the `datadog-monitor` component's variable `local_datadog_monitors_config_paths`,
+which defines the paths to the YAML configuration files. By passing a path for `dev` and `prod`, we can define
+configurations that are different per environment.
+
+For example, you might have the following settings defined for `prod` and `dev` stacks that override the defaults.
+
+For the `dev` stack:
+
+```
+components:
+  terraform:
+    datadog-monitor:
+      vars:
+        # Located in the components/terraform/datadog-monitor directory
+        local_datadog_monitors_config_paths:
+          - catalog/monitors/*.yaml
+          - catalog/monitors/dev/*.yaml # note this line
+```
+
+For the `prod` stack:
+
+```
+components:
+  terraform:
+    datadog-monitor:
+      vars:
+        # Located in the components/terraform/datadog-monitor directory
+        local_datadog_monitors_config_paths:
+          - catalog/monitors/*.yaml
+          - catalog/monitors/prod/*.yaml # note this line
+```
+
+Behind the scenes (with `atmos`) we fetch all files from these glob patterns, template them, and merge them by key.
+If we peek into the `*.yaml` and `dev/*.yaml` files above, we would see something like this:
+
+**components/terraform/datadog-monitor/catalog/monitors/elb.yaml**
+
+```
+elb-lb-httpcode-5xx-notify:
+  name: "(ELB) {{ env }} HTTP 5XX client error detected"
+  type: query alert
+  query: |
+    avg(last_15m):max:aws.elb.httpcode_elb_5xx{${context_dd_tags}} by {env,host} > 20
+  message: |
+    [${ dd_env }] [ {{ env }} ] lb:[ {{host}} ]
+    {{#is_warning}}
+    Number of HTTP 5XX client error codes generated by the load balancer > {{warn_threshold}}%
+    {{/is_warning}}
+    {{#is_alert}}
+    Number of HTTP 5XX client error codes generated by the load balancer > {{threshold}}%
+    {{/is_alert}}
+    Check LB
+  escalation_message: ""
+  tags: {}
+  options:
+    renotify_interval: 60
+    notify_audit: false
+    require_full_window: true
+    include_tags: true
+    timeout_h: 0
+    evaluation_delay: 60
+    new_host_delay: 300
+    new_group_delay: 0
+    groupby_simple_monitor: false
+    renotify_occurrences: 0
+    renotify_statuses: []
+    validate: true
+    notify_no_data: false
+    no_data_timeframe: 5
+    priority: 3
+    threshold_windows: {}
+    thresholds:
+      critical: 50
+      warning: 20
+  priority: 3
+  restricted_roles: null
+```
+
+**components/terraform/datadog-monitor/catalog/monitors/dev/elb.yaml**
+
+```
+elb-lb-httpcode-5xx-notify:
+  query: |
+    avg(last_15m):max:aws.elb.httpcode_elb_5xx{${context_dd_tags}} by {env,host} > 30
+  priority: 2
+  options:
+    thresholds:
+      critical: 30
+      warning: 10
+```
+
+## Key Notes
+
+### Inheritance
+
+The important thing to note here is that the default yaml is applied to every stage that it's deployed to. For dev
+specifically, however, we want to override the thresholds and priority for this monitor. This merging is done by the
+monitor's key, in this case `elb-lb-httpcode-5xx-notify`.
+
+### Templating
+
+The second thing to note is `${ dd_env }`. This is **terraform** templating in action.
+While double braces (`{{ env }}`) refer to datadog templating, `${ dd_env }` is a template variable we pass into our
+monitors. In this example we use it to specify a grouping in the message. This value is passed in and can be overridden
+via stacks.
+
+We pass a value via:
+
+```
+components:
+  terraform:
+    datadog-monitor:
+      vars:
+        # Located in the components/terraform/datadog-monitor directory
+        local_datadog_monitors_config_paths:
+          - catalog/monitors/*.yaml
+          - catalog/monitors/dev/*.yaml
+        # templatefile() is used for all yaml config paths with these variables.
+        datadog_monitors_config_parameters:
+          dd_env: "dev"
+```
+
+This allows us to further use inheritance from stack configuration to keep our monitors DRY, but configurable.
+
+Another available option is to use our catalog as base monitors and then override them with your specific fine tuning.
+
+```
+components:
+  terraform:
+    datadog-monitor:
+      vars:
+        local_datadog_monitors_config_paths:
+          - https://raw.githubusercontent.com/cloudposse/terraform-datadog-platform/0.27.0/catalog/monitors/ec2.yaml
+          - catalog/monitors/ec2.yaml
+```
+
+## Other Gotchas
+
+Our integration action that checks for `'source_type_name' equals 'Monitor Alert'` will also be true for synthetics.
+By contrast, a check for `'event_type' equals 'query_alert_monitor'` is only true for monitors, because synthetics
+will only be picked up by an integration action when `event_type` is `synthetics_alert`.
+
+This is important if we need to distinguish between monitors and synthetics in OpsGenie, which is the case when we want
+to ensure clean messaging on OpsGenie incidents in Statuspage.
+
+
+
+## Requirements
+
+| Name | Version |
+|------|---------|
+| [terraform](#requirement\_terraform) | >= 1.0.0 |
+| [aws](#requirement\_aws) | >= 4.9.0 |
+| [datadog](#requirement\_datadog) | >= 3.3.0 |
+
+## Providers
+
+No providers.
+
+## Modules
+
+| Name | Source | Version |
+|------|--------|---------|
+| [datadog\_configuration](#module\_datadog\_configuration) | ../datadog-configuration/modules/datadog_keys | n/a |
+| [datadog\_monitors](#module\_datadog\_monitors) | cloudposse/platform/datadog//modules/monitors | 1.4.1 |
+| [datadog\_monitors\_merge](#module\_datadog\_monitors\_merge) | cloudposse/config/yaml//modules/deepmerge | 1.0.2 |
+| [iam\_roles](#module\_iam\_roles) | ../account-map/modules/iam-roles | n/a |
+| [local\_datadog\_monitors\_yaml\_config](#module\_local\_datadog\_monitors\_yaml\_config) | cloudposse/config/yaml | 1.0.2 |
+| [remote\_datadog\_monitors\_yaml\_config](#module\_remote\_datadog\_monitors\_yaml\_config) | cloudposse/config/yaml | 1.0.2 |
+| [this](#module\_this) | cloudposse/label/null | 0.25.0 |
+
+## Resources
+
+No resources.
+
+## Inputs
+
+| Name | Description | Type | Default | Required |
+|------|-------------|------|---------|:--------:|
+| [additional\_tag\_map](#input\_additional\_tag\_map) | Additional key-value pairs to add to each map in `tags_as_list_of_maps`. Not added to `tags` or `id`.<br/>This is for some rare cases where resources want additional configuration of tags<br/>and therefore take a list of maps with tag key, value, and additional configuration. | `map(string)` | `{}` | no |
+| [alert\_tags](#input\_alert\_tags) | List of alert tags to add to all alert messages, e.g. `["@opsgenie"]` or `["@devops", "@opsgenie"]` | `list(string)` | `null` | no |
+| [alert\_tags\_separator](#input\_alert\_tags\_separator) | Separator for the alert tags. All strings from the `alert_tags` variable will be joined into one string using the separator and then added to the alert message | `string` | `"\n"` | no |
+| [attributes](#input\_attributes) | ID element. Additional attributes (e.g. `workers` or `cluster`) to add to `id`,<br/>in the order they appear in the list. New attributes are appended to the<br/>end of the list. The elements of the list are joined by the `delimiter`<br/>and treated as a single ID element. | `list(string)` | `[]` | no |
+| [context](#input\_context) | Single object for setting entire context at once.<br/>See description of individual variables for details.<br/>Leave string and numeric variables as `null` to use default value.<br/>Individual variable settings (non-null) override settings in context object,<br/>except for attributes, tags, and additional\_tag\_map, which are merged. | `any` | <pre>{<br/>  "additional_tag_map": {},<br/>  "attributes": [],<br/>  "delimiter": null,<br/>  "descriptor_formats": {},<br/>  "enabled": true,<br/>  "environment": null,<br/>  "id_length_limit": null,<br/>  "label_key_case": null,<br/>  "label_order": [],<br/>  "label_value_case": null,<br/>  "labels_as_tags": [<br/>    "unset"<br/>  ],<br/>  "name": null,<br/>  "namespace": null,<br/>  "regex_replace_chars": null,<br/>  "stage": null,<br/>  "tags": {},<br/>  "tenant": null<br/>}</pre> | no |
+| [datadog\_monitor\_context\_tags](#input\_datadog\_monitor\_context\_tags) | List of context tags to add to each monitor | `set(string)` | <pre>[<br/>  "namespace",<br/>  "tenant",<br/>  "environment",<br/>  "stage"<br/>]</pre> | no |
+| [datadog\_monitor\_context\_tags\_enabled](#input\_datadog\_monitor\_context\_tags\_enabled) | Whether to add context tags to each monitor | `bool` | `true` | no |
+| [datadog\_monitor\_globals](#input\_datadog\_monitor\_globals) | Global parameters to add to each monitor | `any` | `{}` | no |
+| [datadog\_monitors\_config\_parameters](#input\_datadog\_monitors\_config\_parameters) | Map of parameters to Datadog monitor configurations | `map(any)` | `{}` | no |
+| [delimiter](#input\_delimiter) | Delimiter to be used between ID elements.<br/>Defaults to `-` (hyphen). Set to `""` to use no delimiter at all. | `string` | `null` | no |
+| [descriptor\_formats](#input\_descriptor\_formats) | Describe additional descriptors to be output in the `descriptors` output map.<br/>Map of maps. Keys are names of descriptors. Values are maps of the form<br/>`{<br/>    format = string<br/>    labels = list(string)<br/>}`<br/>(Type is `any` so the map values can later be enhanced to provide additional options.)<br/>`format` is a Terraform format string to be passed to the `format()` function.<br/>`labels` is a list of labels, in order, to pass to `format()` function.<br/>Label values will be normalized before being passed to `format()` so they will be<br/>identical to how they appear in `id`.<br/>Default is `{}` (`descriptors` output will be empty). | `any` | `{}` | no |
+| [enabled](#input\_enabled) | Set to false to prevent the module from creating any resources | `bool` | `null` | no |
+| [environment](#input\_environment) | ID element. Usually used for region e.g. 'uw2', 'us-west-2', OR role 'prod', 'staging', 'dev', 'UAT' | `string` | `null` | no |
+| [id\_length\_limit](#input\_id\_length\_limit) | Limit `id` to this many characters (minimum 6).<br/>Set to `0` for unlimited length.<br/>Set to `null` to keep the existing setting, which defaults to `0`.<br/>Does not affect `id_full`. | `number` | `null` | no |
+| [label\_key\_case](#input\_label\_key\_case) | Controls the letter case of the `tags` keys (label names) for tags generated by this module.<br/>Does not affect keys of tags passed in via the `tags` input.<br/>Possible values: `lower`, `title`, `upper`.<br/>Default value: `title`. | `string` | `null` | no |
+| [label\_order](#input\_label\_order) | The order in which the labels (ID elements) appear in the `id`.<br/>Defaults to ["namespace", "environment", "stage", "name", "attributes"].<br/>You can omit any of the 6 labels ("tenant" is the 6th), but at least one must be present. | `list(string)` | `null` | no |
+| [label\_value\_case](#input\_label\_value\_case) | Controls the letter case of ID elements (labels) as included in `id`,<br/>set as tag values, and output by this module individually.<br/>Does not affect values of tags passed in via the `tags` input.<br/>Possible values: `lower`, `title`, `upper` and `none` (no transformation).<br/>Set this to `title` and set `delimiter` to `""` to yield Pascal Case IDs.<br/>Default value: `lower`. | `string` | `null` | no |
+| [labels\_as\_tags](#input\_labels\_as\_tags) | Set of labels (ID elements) to include as tags in the `tags` output.<br/>Default is to include all labels.<br/>Tags with empty values will not be included in the `tags` output.<br/>Set to `[]` to suppress all generated tags.<br/>**Notes:**<br/>The value of the `name` tag, if included, will be the `id`, not the `name`.<br/>Unlike other `null-label` inputs, the initial setting of `labels_as_tags` cannot be<br/>changed in later chained modules. Attempts to change it will be silently ignored. | `set(string)` | <pre>[<br/>  "default"<br/>]</pre> | no |
+| [local\_datadog\_monitors\_config\_paths](#input\_local\_datadog\_monitors\_config\_paths) | List of paths to local Datadog monitor configurations | `list(string)` | `[]` | no |
+| [message\_postfix](#input\_message\_postfix) | Additional information to put after each monitor message | `string` | `""` | no |
+| [message\_prefix](#input\_message\_prefix) | Additional information to put before each monitor message | `string` | `""` | no |
+| [name](#input\_name) | ID element. Usually the component or solution name, e.g. 'app' or 'jenkins'.<br/>This is the only ID element not also included as a `tag`.<br/>The "name" tag is set to the full `id` string. There is no tag with the value of the `name` input. | `string` | `null` | no |
+| [namespace](#input\_namespace) | ID element. Usually an abbreviation of your organization name, e.g. 'eg' or 'cp', to help ensure generated IDs are globally unique | `string` | `null` | no |
+| [regex\_replace\_chars](#input\_regex\_replace\_chars) | Terraform regular expression (regex) string.<br/>Characters matching the regex will be removed from the ID elements.<br/>If not set, `"/[^a-zA-Z0-9-]/"` is used to remove all characters other than hyphens, letters and digits. | `string` | `null` | no |
+| [region](#input\_region) | AWS Region | `string` | n/a | yes |
+| [remote\_datadog\_monitors\_base\_path](#input\_remote\_datadog\_monitors\_base\_path) | Base path to remote Datadog monitor configurations | `string` | `""` | no |
+| [remote\_datadog\_monitors\_config\_paths](#input\_remote\_datadog\_monitors\_config\_paths) | List of paths to remote Datadog monitor configurations | `list(string)` | `[]` | no |
+| [stage](#input\_stage) | ID element. Usually used to indicate role, e.g. 'prod', 'staging', 'source', 'build', 'test', 'deploy', 'release' | `string` | `null` | no |
+| [tags](#input\_tags) | Additional tags (e.g. `{'BusinessUnit': 'XYZ'}`).<br/>Neither the tag keys nor the tag values will be modified by this module. | `map(string)` | `{}` | no |
+| [tenant](#input\_tenant) | ID element \_(Rarely used, not included by default)\_. A customer identifier, indicating who this instance of a resource is for | `string` | `null` | no |
+
+## Outputs
+
+| Name | Description |
+|------|-------------|
+| [datadog\_monitor\_names](#output\_datadog\_monitor\_names) | Names of the created Datadog monitors |
+
+
+
+## Related How-to Guides
+
+- [How to Monitor Everything with Datadog](https://docs.cloudposse.com/layers/monitoring/datadog/)
+
+## Component Dependencies
+
+- [datadog-integration](https://docs.cloudposse.com/components/library/aws/datadog-integration/)
+
+## References
+
+- [cloudposse/terraform-aws-components](https://github.com/cloudposse/terraform-aws-components/tree/main/modules/datadog-monitor) -
+  Cloud Posse's upstream component
+
+[](https://cpco.io/homepage?utm_source=github&utm_medium=readme&utm_campaign=cloudposse-terraform-components/aws-datadog-monitor&utm_content=)