generic csv parser: timezone correction #461

Open
sebix opened this issue Mar 5, 2016 · 5 comments
Labels
component: bots feature Indicates new feature requests or new features

Comments

@sebix
Member

sebix commented Mar 5, 2016

Additional option for generic csv parser: timezone correction.

The timezone offset is often not given in the time-column, so it should be defined manually.

Possible configuration format: +10:00, -8 etc. No abbreviations, they are not unique in most cases.
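A minimal sketch of how such a numeric offset could be parsed and attached to a timestamp that carries no offset of its own. The function names `parse_utc_offset` and `apply_source_offset` are hypothetical and not part of intelmq; only the Python standard library is used.

```python
import re
from datetime import datetime, timedelta, timezone

# Accepts "+10:00", "+1000", "-8", "+05:30"; rejects abbreviations such as "CET".
_OFFSET_RE = re.compile(r"^([+-])(\d{1,2})(?::?(\d{2}))?$")


def parse_utc_offset(text):
    """Turn a numeric offset string into a fixed-offset tzinfo object."""
    match = _OFFSET_RE.match(text.strip())
    if match is None:
        raise ValueError(f"not a numeric UTC offset: {text!r}")
    sign, hours, minutes = match.groups()
    delta = timedelta(hours=int(hours), minutes=int(minutes or 0))
    if sign == "-":
        delta = -delta
    return timezone(delta)


def apply_source_offset(value, offset):
    """Attach the configured offset only if the value has no timezone yet."""
    if value.tzinfo is not None:
        return value
    return value.replace(tzinfo=parse_utc_offset(offset))


# Hypothetical usage: a time column without offset, configured as "+10:00".
aware = apply_source_offset(datetime(2016, 3, 5, 12, 0), "+10:00")
as_utc = aware.astimezone(timezone.utc)
```

Values that already carry an offset are left untouched here, so the option would act as a fallback rather than an override; that is a design assumption of this sketch.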

@sebix sebix added feature Indicates new feature requests or new features component: bots labels Mar 5, 2016
@sebix sebix added this to the Release 2 - v1.1 milestone Mar 5, 2016
@ghost ghost modified the milestones: 1.1.0, 1.2.0 Jun 28, 2018
@ghost ghost modified the milestones: 1.2.0, 2.0.0, 2.1.0 Apr 9, 2019
@kalyparker
Contributor

I just discovered that the generic parser converts the date by itself...
My source is in UTC, my server is in local time (+2 or +1, depending on the season...)

Question: Is there a reason why this information is changed? I don't want that. UTC is fine for storage.

And a remark on this topic: a fixed format like +10:00 or -8 does not work in most European countries, where the offset changes between summer and winter time.

@ghost

ghost commented Jul 23, 2019

Yeah, depends on your local time zone. Does setting the env variable TZ=UTC help?
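To illustrate why the TZ variable matters, here is a small sketch (not intelmq's code) of how Python interprets a timestamp without an offset as local time, so the resulting UTC value changes with the process timezone. The helper name `naive_local_to_utc` is hypothetical, `time.tzset()` is only available on Unix, and the POSIX-style TZ string stands in for a Central European server.

```python
import os
import time
from datetime import datetime, timezone


def naive_local_to_utc(naive, tz_env):
    """Interpret `naive` as local time under TZ=tz_env and convert it to UTC."""
    previous = os.environ.get("TZ")
    os.environ["TZ"] = tz_env
    time.tzset()  # Unix only: make the C library re-read TZ
    try:
        # For a naive datetime, astimezone() assumes system local time.
        return naive.astimezone(timezone.utc)
    finally:
        # Restore the previous process-wide setting.
        if previous is None:
            os.environ.pop("TZ", None)
        else:
            os.environ["TZ"] = previous
        time.tzset()


summer_noon = datetime(2019, 7, 1, 12, 0)
with_utc = naive_local_to_utc(summer_noon, "UTC")
# POSIX rule for Central European Time with daylight saving.
with_cet = naive_local_to_utc(summer_noon, "CET-1CEST,M3.5.0,M10.5.0/3")
```

With TZ=UTC the value is passed through unchanged, while under the Central European rule the same input is shifted by the local offset.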

And a remark on this topic: a fixed format like +10:00 or -8 does not work in most European countries, where the offset changes between summer and winter time.

It should also be possible to use names like CET. But if your data's timezone offset depends on the time of year, that's tricky anyway, and IMO that's your data's (or the creator's) fault. What should intelmq guess then? Is 2019-03-31T02:30:00 CET or CEST?
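The guessing problem can be made concrete with a short sketch: the same wall-clock string maps to two different UTC instants depending on which of the two offsets is assumed. `CET` and `CEST` are modelled here as plain fixed offsets (+01:00 and +02:00), and `candidate_utc_instants` is a hypothetical helper, not an intelmq function.

```python
from datetime import datetime, timedelta, timezone

# Fixed-offset stand-ins for the two candidate interpretations.
CET = timezone(timedelta(hours=1), "CET")
CEST = timezone(timedelta(hours=2), "CEST")


def candidate_utc_instants(naive_text):
    """Return the UTC instant for each possible reading of a naive timestamp."""
    naive = datetime.fromisoformat(naive_text)
    return {
        tz.tzname(None): naive.replace(tzinfo=tz).astimezone(timezone.utc)
        for tz in (CET, CEST)
    }


candidates = candidate_utc_instants("2019-03-31T02:30:00")
for name, instant in candidates.items():
    print(name, instant.isoformat())
```

The two results are one hour apart, and nothing in the input string says which one the data creator meant.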

@kalyparker
Contributor

Yes, it is better with the env variable. I realize that all the data collected until now is not in UTC :(
A parameter in the intelmq configuration would be great for avoiding this, don't you think?
Let the user decide ;)

Anyway, if I understand this issue correctly, we simply add another parameter that is compatible with tzinfo.
And just to be sure: the parameter describes the timezone of the source, not the destination, right?
The destination is determined by the env variable.

@kalyparker
Contributor

After some tests, it appears the timezone handling is fine.
My problem comes from this line:

if key in ["time.source", "time.destination"]:
because I had another field of type DateTime in the harmonization file.

I replaced it with

if key.startswith("time."):
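The difference between the two checks can be sketched as follows. The helper names and the extra field `time.custom_seen` are hypothetical, chosen only to stand in for an additional DateTime field defined in a harmonization file.

```python
def is_time_key_exact(key):
    """Original check: only the two hard-coded field names match."""
    return key in ["time.source", "time.destination"]


def is_time_key_prefix(key):
    """Replacement check: every field in the "time." namespace matches."""
    return key.startswith("time.")


# "time.custom_seen" is a made-up extra DateTime field.
for key in ["time.source", "time.custom_seen", "source.ip"]:
    print(key, is_time_key_exact(key), is_time_key_prefix(key))
```

The prefix check assumes that every field under `time.` is a DateTime field; a stricter variant could look up the field type in the harmonization definition instead.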

Back to the subject: I tried to add a parameter using tzstr and/or pytz. It is such a nightmare.
Which one do you want?
And do you want this parameter to apply to all 4 options (timestamp, windows_nt, epoch_millis and None)?
Any idea where to start?

@ghost
Copy link

ghost commented Jul 30, 2019

Yes, timezones are a real nightmare, especially in Python.

Is there any circumstance where input data lacking timezone information is not rejected by intelmq? Actually, it should always be rejected (except for parsers where tz information is given as a fallback value). If yes, could you please describe this?

@ghost ghost modified the milestones: 2.1.0, 2.2.0 Oct 25, 2019
@ghost ghost removed this from the 2.2.0 milestone Jun 17, 2020