Update Output size #363

@DanteNiewenhuis

Description

Currently, the output parquet files can become very large.

This is primarily because of the many string columns used (task_name, host_name, etc.).

This can be fixed by using integers as identifiers instead, together with a mapping file that maps each id to its name.
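
A minimal sketch of the idea, in plain Python and not tied to the project's actual export code: each distinct name is replaced by a small integer id, and the id-to-name mapping is kept separately so it can be written once to a mapping file. The function names `encode_names` and `decode_ids` and the sample host names are hypothetical, used only for illustration.

```python
from typing import Dict, Iterable, List, Tuple


def encode_names(names: Iterable[str]) -> Tuple[List[int], Dict[int, str]]:
    """Replace each name with an integer id and return the id -> name mapping."""
    name_to_id: Dict[str, int] = {}
    ids: List[int] = []
    for name in names:
        # Assign ids in order of first appearance, starting at 0.
        if name not in name_to_id:
            name_to_id[name] = len(name_to_id)
        ids.append(name_to_id[name])
    # Invert the lookup so readers of the output can recover the names.
    id_to_name = {i: n for n, i in name_to_id.items()}
    return ids, id_to_name


def decode_ids(ids: Iterable[int], id_to_name: Dict[int, str]) -> List[str]:
    """Recover the original names from ids using the mapping."""
    return [id_to_name[i] for i in ids]


# Hypothetical sample values for a host_name column.
host_names = ["host-a", "host-b", "host-a", "host-a", "host-c"]
host_ids, host_mapping = encode_names(host_names)
```

With this approach the large output file would store only `host_ids`, while `host_mapping` would be written once to the separate mapping file. Consumers that need the names join the two on the id.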

Metadata

Labels

kind/perf (Performance optimization)
