
[smtp_batch] Additional field for grouping #2586

@Lukas-Heindl


Hi,

I have a use case where I'd like to send batched e-mails with collected events to a contact, and I want to use the smtp_batch output bot for this. In this case it would be nice to split the events for one contact further into multiple e-mails. Would it be possible to add an option so that the group-by is done not only by source.abuse_contact but also by a field set in the bot configuration?
Eventually it would also be nice to use this additional group-by field elsewhere, e.g. in the subject, but that's just for the future.


I'm willing to submit this as a PR myself. I just wanted to create a place for input and, most importantly, to check beforehand whether this has any chance of being merged.

So far smtp_batch pushes each event as a value onto a redis list, with bot_id + source.abuse_contact as the key:

for mail in (field if isinstance(field, list) else [field]):
    self.cache.redis.rpush(f"{self.key}{mail}", message.to_json())

It then later uses redis.keys() to retrieve all keys with a certain prefix (bot_id):

for mail_record in self.cache.redis.keys(f"{self.key}*")[slice(self.limit_results)]:

So the most straightforward way would be to simply extend the redis-key with the value of the field that should be included in the logical group-by. But we'd need to be careful: the retrieved redis-key (here mail_record) is still used to obtain the destination e-mail address:

email_to = str(mail_record[len(self.key):], encoding="utf-8")
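To make the problem concrete, here is a minimal sketch of what that line would return if the key were simply extended. The prefix, the address and the extra value are made-up placeholders, not values taken from the bot:

```python
# Hypothetical values, for illustration only.
key_prefix = "smtp-batch-bot:"   # stands in for self.key
mail = "abuse@example.com"       # value of source.abuse_contact
extra = "malware"                # value of the additional group-by field

# Key as it would be stored if the extra value were appended naively.
mail_record = f"{key_prefix}{mail}{extra}".encode("utf-8")

# Same slicing as in the bot: everything after the prefix is treated as the address.
email_to = str(mail_record[len(key_prefix):], encoding="utf-8")
print(email_to)  # abuse@example.commalware
```

The address and the extra value run together, so the recipient can no longer be recovered from the key alone.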

In order to solve this I see two options:

  1. Separate the values used to derive the redis-key with a special character like |. In this case we'd need to make sure this character doesn't occur in any of the values, which means either replacing it (but modifying the values is probably a bad idea) or escaping it (a bit more complex).
  2. Stop using the retrieved redis-key to derive anything. We know all events retrieved under one key must have the same source.abuse_contact, so we could simply take the value from an arbitrary event we retrieved from redis (same thing if we later want access to the other field included in the group-by). Decoupling the redis-key from the rest of the logic should also make it possible to simply hash the redis-key, which avoids the redis-key getting much longer as more fields are involved in the group-by logic. In this case the redis-key would look like this: f"{self.key}{hash(source.abuse_contact, other_field1, other_field2, ...)}"
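A rough sketch of how the 2nd option could look, not a finished implementation. The function names, the `extra_fields` parameter and the use of classification.type as the extra field are assumptions made for illustration. One deliberate change: it uses hashlib.sha256 instead of the builtin hash(), because Python randomizes string hashes per process by default, so the keys would not survive a bot restart.

```python
import hashlib
import json


def group_key(prefix: str, mail: str, event: dict, extra_fields: list[str]) -> str:
    """Build a redis-key from the prefix and a digest of all group-by values.

    `extra_fields` is a hypothetical bot parameter listing the additional
    event fields to group by.
    """
    values = [mail] + [event.get(field) for field in extra_fields]
    # JSON-encoding the list keeps the boundaries between values unambiguous,
    # so ["ab", "c"] and ["a", "bc"] do not end up with the same digest.
    payload = json.dumps(values, separators=(",", ":"))
    digest = hashlib.sha256(payload.encode("utf-8")).hexdigest()
    return f"{prefix}{digest}"


def recipient_from_batch(raw_events: list[str]) -> str:
    """Read the recipient from an arbitrary (here: the first) stored event.

    Assumes source.abuse_contact holds a single address.
    """
    first = json.loads(raw_events[0])
    return first["source.abuse_contact"]


# Usage with two made-up events for the same contact.
e1 = {"source.abuse_contact": "abuse@example.com", "classification.type": "malware"}
e2 = {"source.abuse_contact": "abuse@example.com", "classification.type": "phishing"}
k1 = group_key("bot-id:", "abuse@example.com", e1, ["classification.type"])
k2 = group_key("bot-id:", "abuse@example.com", e2, ["classification.type"])
```

Here k1 and k2 differ, so the two events would end up in separate e-mails, and the key length stays fixed no matter how many fields are added. One thing the sketch leaves open: the existing loop handles the case where the contact field is a list, so if an event carries several addresses, reading the recipient back from the event returns all of them rather than the one address the key was built for. Storing the address next to the events would be one way around that.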

As mentioned above, I've got two questions:

  1. Are you generally open to such additions to the smtp_batch bot?
  2. If so, I'd be happy about any input you have on my sketch/draft of an implementation (I think I'd probably go for the 2nd approach described above).

Labels

feature, help wanted
