@@ -67,6 +67,10 @@ opinions. Please communicate with us on [Slack](https://t.mp/slack) in the `#rub
67
67
- [ Activity Worker Shutdown] ( #activity-worker-shutdown )
68
68
- [ Activity Concurrency and Executors] ( #activity-concurrency-and-executors )
69
69
- [ Activity Testing] ( #activity-testing )
70
+ - [ Telemetry] ( #telemetry )
71
+ - [ Metrics] ( #metrics )
72
+ - [ OpenTelemetry Tracing] ( #opentelemetry-tracing )
73
+ - [ OpenTelemetry Tracing in Workflows] ( #opentelemetry-tracing-in-workflows )
70
74
- [ Ractors] ( #ractors )
71
75
- [ Platform Support] ( #platform-support )
72
76
- [ Development] ( #development )
@@ -1034,6 +1038,122 @@ it will raise the error raised in the activity.
1034
1038
The constructor of the environment has multiple keyword arguments that can be set to affect the activity context for the
1035
1039
activity.
1036
1040
1041
+ ### Telemetry
1042
+
1043
+ #### Metrics
1044
+
1045
+ Metrics can be configured on a ` Temporalio::Runtime ` . Only one runtime is expected to be created for the entire
1046
+ application and it should be created before any clients are created. For example, this configures Prometheus to export
1047
+ metrics at ` http://127.0.0.1:9000/metrics ` :
1048
+
1049
+ ``` ruby
1050
+ require ' temporalio/runtime'
1051
+
1052
+ Temporalio ::Runtime .default = Temporalio ::Runtime .new (
1053
+ telemetry: Temporalio ::Runtime ::TelemetryOptions .new (
1054
+ metrics: Temporalio ::Runtime ::MetricsOptions .new (
1055
+ prometheus: Temporalio ::Runtime ::PrometheusMetricsOptions .new (
1056
+ bind_address: ' 127.0.0.1:9000'
1057
+ )
1058
+ )
1059
+ )
1060
+ )
1061
+ ```
1062
+
1063
+ Now every client created will use this runtime. Setting the default will fail if a runtime has already been requested or
1064
+ a default already set. Technically a runtime can be created without setting the default and be set on each client via
1065
+ the ` runtime ` parameter, but this is discouraged because a runtime represents a heavy internal engine not meant to be
1066
+ created multiple times.
1067
+
1068
+ OpenTelemetry metrics can be configured instead by passing ` Temporalio::Runtime::OpenTelemetryMetricsOptions ` as the
1069
+ ` opentelemetry ` parameter to the metrics options. See API documentation for details.
1070
+
1071
+ #### OpenTelemetry Tracing
1072
+
1073
+ OpenTelemetry tracing for clients, activities, and workflows can be enabled using the
1074
+ ` Temporalio::Contrib::OpenTelemetry::TracingInterceptor ` . Specifically, when creating a client, set the interceptor like
1075
+ so:
1076
+
1077
+ ``` ruby
1078
+ require ' opentelemetry/api'
1079
+ require ' opentelemetry/sdk'
1080
+ require ' temporalio/client'
1081
+ require ' temporalio/contrib/open_telemetry'
1082
+
1083
+ # ... assumes my_otel_tracer_provider is a tracer provider created by the user
1084
+ my_tracer = my_otel_tracer_provider.tracer(' my-otel-tracer' )
1085
+
1086
+ my_client = Temporalio ::Client .connect(
1087
+ ' localhost:7233' , ' my-namespace' ,
1088
+ interceptors: [Temporalio ::Contrib ::OpenTelemetry ::TracingInterceptor .new (my_tracer)]
1089
+ )
1090
+ ```
1091
+
1092
+ Now many high-level client calls and activities/workflows on workers using this client will have spans created on that
1093
+ OpenTelemetry tracer.
1094
+
1095
+ ##### OpenTelemetry Tracing in Workflows
1096
+
1097
+ OpenTelemetry works by creating spans as necessary and in some cases serializing them to Temporal headers to be
1098
+ deserialized by workflows/activities to be set on the context. However, OpenTelemetry requires spans to be finished
1099
+ where they start, so spans cannot be resumed. This is fine for client calls and activity attempts, but Temporal
1100
+ workflows are resumable functions that may start on a different machine than they complete. Due to this, spans created
1101
+ by workflows are immediately closed since there is no way for the span to actually span machines. They are also not
1102
+ created during replay. The spans still become the proper parents of other spans if they are created.
1103
+
1104
+ Custom spans can be created inside of workflows using class methods on the
1105
+ ` Temporalio::Contrib::OpenTelemetry::Workflow ` module. For example:
1106
+
1107
+ ``` ruby
1108
+ class MyWorkflow < Temporalio ::Workflow ::Definition
1109
+ def execute
1110
+ # Sleep for a bit
1111
+ Temporalio ::Workflow .sleep (10 )
1112
+ # Run activity in span
1113
+ Temporalio ::Contrib ::OpenTelemetry ::Workflow .with_completed_span(
1114
+ ' my-span' ,
1115
+ attributes: { ' my-attr' => ' some val' }
1116
+ ) do
1117
+ # Execute an activity
1118
+ Temporalio ::Workflow .execute_activity(MyActivity , start_to_close_timeout: 10 )
1119
+ end
1120
+ end
1121
+ end
1122
+ ```
1123
+
1124
+ If this all executes on one worker (because Temporal has a concept of stickiness that caches instances), the span tree
1125
+ may look like:
1126
+
1127
+ ```
1128
+ StartWorkflow:MyWorkflow <-- created by client outbound
1129
+ RunWorkflow:MyWorkflow <-- created inside workflow on first task
1130
+ my-span <-- created inside workflow by code
1131
+ StartActivity:MyActivity <-- created inside workflow when first called
1132
+ RunActivity:MyActivity <-- created inside activity attempt 1
1133
+ CompleteWorkflow:MyWorkflow <-- created inside workflow on last task
1134
+ ```
1135
+
1136
+ However if, say, the worker crashed during the 10s sleep and the workflow was resumed (i.e. replayed) on another worker,
1137
+ the span tree may look like:
1138
+
1139
+ ```
1140
+ StartWorkflow:MyWorkflow <-- created by client outbound
1141
+ RunWorkflow:MyWorkflow <-- created by workflow inbound on first task
1142
+ my-span <-- created inside the workflow
1143
+ StartActivity:MyActivity <-- created by workflow outbound
1144
+ RunActivity:MyActivity <-- created by activity attempt 1 inbound
1145
+ CompleteWorkflow:MyWorkflow <-- created by workflow inbound on last task
1146
+ ```
1147
+
1148
+ Notice how the spans are no longer under ` RunWorkflow ` . This is because spans inside the workflow are not created on
1149
+ replay, so there is no parent on replay. But there are no orphans because we still have the overarching parent of
1150
+ ` StartWorkflow ` that was created by the client and is serialized into Temporal headers so it can always be the parent.
1151
+
1152
+ And reminder that ` StartWorkflow ` and ` RunActivity ` spans do last the length of their calls (so time to start the
1153
+ workflow and time to run the activity attempt respectively), but the other spans have no measurable time because they
1154
+ are created in workflows and closed immediately since long-lived spans cannot work for durable software that may resume
1155
+ on other machines.
1156
+
1037
1157
### Ractors
1038
1158
1039
1159
It was an original goal to have workflows actually be Ractors for deterministic state isolation and have the library
0 commit comments