-
Notifications
You must be signed in to change notification settings - Fork 0
Workload Configuration
Use this to define properties about source streams. Each message will be generated by sampling the respective distributions for the keys and values.
-
name: used as reference in transformations -
num_keys: number of keys used in this stream -
key_dist: distribution of key usage, eitherUniformIntegerDistributionorZipfDistribution -
key_dist_params: params supplied to the distribution class -
msg_dist: distribution of message length, currently onlyUniformIntegerDistribution -
msg_dist_params: params supplied to the distribution class -
rate: rate of messages produced per second
Each transformation specifies the following information:
-
input: if 1-to-n transformation, the name of the source/transformation that this transformation uses -
inputs: if m-to-n transformation, list of input sources/transformations that this transformation uses -
operator: the type of transformation to run, either aStringor the fully qualified class name of the operator -
params: parameters used by the operator function, described in the section below -
cpu_load: a double value between 0.0 and 1.0 representing how much load to put on the CPU (default 0.0) -
mem_load: need to define... (default 0.0) -
disk_load: need to define... (default 0.0)
1-to-1 transformation to remove messages from incoming stream
Parameters:
-
p: probability of dropping the message (0 <= p < 1)
1-to-n transformation that sends the input stream to n output streams. If ratio is 1 or unspecified, the input stream messages are copied to respective the output streams.
Parameters:
-
n: number of output streams -
ratio: ratio of output message size to input message size
1-to-1 transformation to change the size of individual messages or the rate of messages passing in a stream.
Parameters:
-
size_ratio: ratio of output message size to input message size -
rate_ratio: ratio of messages output per input message (1implies that the rate is the same)
n-to-1 transformation that simply takes messages from all the input streams and sends it to a single output stream.
Parameters: none
need to define...
These transformations use either local (in-memory) store or a remote store to persist information across messages.
Join 2 streams into a single output stream based on key-matches. The output message value is the concatenation of the two messages.
Parameters:
-
ttl: time-to-live for received messages that have not yet joined with another message. Value should be a string formatted as"<integer><unit>"whereunitis one of[ms, s, m]. Examples:"2s","350ms", etc.
Various types of window-based operations Window types:
- tumbling: non-overlapping, fixed size and contiguous intervals
-
session: messages are grouped into a session defined by a
sessionGap. Session is closed and results are emitted if no new messages arrive for the window for the gap duration. - sliding: currently unsupported
Parameters:
-
type: eithertumblingorsession -
duration: the window duration. Value should be a string formatted as"<integer><unit>"whereunitis one of[ms, s, m]. Examples:"2s","350ms", etc. -
is_keyed: boolean, currently defaults totrue -
trigger: currently unsupported -
accumulation: currently unsupported

{
"sources": {
"s1": {
"num_keys": 10,
"key_dist": "org.apache.commons.math4.distribution.ZipfDistribution",
"key_dist_params": {
"exponent": 1.2
},
"msg_dist": "org.apache.commons.math4.distribution.UniformIntegerDistribution",
"msg_dist_params": {
"lower": 100,
"upper": 1000
},
"rate": 10
}
},
"transformations": {
"t1": {
"operator": "filter",
"input": "s1",
"params": {
"p": 0.5
}
},
"t2": {
"operator": "split",
"input": "t1",
"params": {
"n": 2,
"ratio": 0.5
},
"outputs": ["t2__1", "t2__2"]
},
"t3": {
"operator": "filter",
"input": "t2__1",
"params": {
"p": 0.75
}
},
"t4": {
"operator": "modify",
"input": "t2__2",
"params": {
"rate_ratio": 0.5
}
}
},
"sinks": [
"t3", "t4"
]
}