From cc3161159af40b8a35cb5db1339582a4444aecc7 Mon Sep 17 00:00:00 2001 From: LinZhihao-723 Date: Mon, 17 Feb 2025 02:07:02 -0500 Subject: [PATCH 1/8] Add readme --- README.md | 75 ++++++++++++++++++++++++++++++++++++++++++++++++++++++- test.py | 0 2 files changed, 74 insertions(+), 1 deletion(-) create mode 100644 test.py diff --git a/README.md b/README.md index cccf26e..c675946 100644 --- a/README.md +++ b/README.md @@ -31,7 +31,76 @@ can be installed with `pip`: `python3 -m pip install --upgrade clp-logging` -## Logger handlers +## Serialize key-value pairs as a log event + +`ClpKeyValuePairStreamHandler`, introduced in version `0.0.14`, allows an application to log and +serialize key-value pair objects as log events into CLP key-value pair IR stream format. It works +similar as Python standard `logging.StreamHandler` with the following difference: +- Log events (`logging.LogRecord`) given to the handler should contain the key-value pairs as a + Python dictionary. It will be directly serialized into the kv-pair IR stream without being + formatted as a string. + +### Key-value pairs as Python dictionary + +Key-value pairs in the log event, presented as a Python dictionary, must abide by the following +rules: +- Keys must be of type `str`. +- Values must be one of the following types: + - Primitives: `int`, `float`, `str`, `bool`, or `None`. + - Arrays (`list`), where each array: + - may contain primitive values, dictionaries, or nested arrays. + - can be empty. + - Dictionaries (`dict`), where each dictionary: + - must adhere to the aforementioned rules for keys and values. + - can be empty. + +### Auto-generated kv-pairs vs. User-generated kv-pairs + +As explained [here][clp-ffi-py-kv-pair-ir-stream], each log event contains both auto-generated and +user-generated kv-pairs. In our logging handler, kv-pairs given by applications through logging APIs +will be directly serialized as user-generated kv-pairs. 
The log event metadata, generated by the +Python logging infrastructure or our logging handler itself, will be serialized as auto-generated +kv-pairs. The auto-generated kv-pairs have the following schema structure: +- "timestamp" (`dict`): + - "unix_millisecs" (`int`): Unix timestamp since epoch in milliseconds. + - "utc_offset_secs" (`int`): The number of seconds the timezone is ahead of (positive) or + behind (negative) UTC. +- "level" (`dict`): + - "name" (`str`): The name of the log level. + - "num" (`int`): The numeric value associated with the log level. +- "source_location" (`dict`): + - "path" (`str`): The path of the source file of the target logging statement. + - "line" (`int`): The line number in the source file of the target logging statement. + +### Example: Use `ClpKeyValuePairStreamHandler` to log Python dictionary + +```python +import logging +from pathlib import Path +from clp_logging.handlers import ClpKeyValuePairStreamHandler + +clp_handler = ClpKeyValuePairStreamHandler(open(Path("example.clp.zst"), "wb")) +logger: logging.Logger = logging.getLogger(__name__) +logger.addHandler(clp_handler) + +logger.info({ + "message": "This is an example message", + "machine_info": { + "uid": 12345, + "ip": "127.0.0.1", + }, +}) +``` + +### Read key-value pair IR streams + +We have provided the following options to read/deserialize the generated kv-pair IR streams: +- [clp-ffi-py][9]: This library provides Python APIs [Deserializer][clp-ffi-py-deserializer-doc] to + access a kv-pair IR stream. Check [here][clp-ffi-py-deserializer-example] for an example. +- clp-ffi-js: TODO + + +## Logging handlers ### CLPStreamHandler @@ -267,3 +336,7 @@ word][7]. 
[7]: https://docformatter.readthedocs.io/en/latest/faq.html#interaction-with-black [8]: https://beta.ruff.rs/docs/ [9]: https://github.com/y-scope/clp-ffi-py + +[clp-ffi-py-kv-pair-ir-stream]: https://github.com/y-scope/clp-ffi-py?tab=readme-ov-file#using-key-value-pair-ir-streams +[clp-ffi-py-deserializer-doc]: https://github.com/y-scope/clp-ffi-py?tab=readme-ov-file#example-code-using-deserializer-to-read-keyvaluepairlogevents-from-an-ir-stream +[clp-ffi-py-deserializer-example]: https://github.com/y-scope/clp-ffi-py?tab=readme-ov-file#example-code-using-deserializer-to-read-keyvaluepairlogevents-from-an-ir-stream \ No newline at end of file diff --git a/test.py b/test.py new file mode 100644 index 0000000..e69de29 From d0d035adbf24a2f7d5aec2f1a6cc10953b4cc9af Mon Sep 17 00:00:00 2001 From: LinZhihao-723 Date: Mon, 17 Feb 2025 03:57:31 -0500 Subject: [PATCH 2/8] Update... --- README.md | 74 +++++++++++++++++++++++++++++++++++-------------------- 1 file changed, 47 insertions(+), 27 deletions(-) diff --git a/README.md b/README.md index c675946..98a99ef 100644 --- a/README.md +++ b/README.md @@ -31,19 +31,33 @@ can be installed with `pip`: `python3 -m pip install --upgrade clp-logging` -## Serialize key-value pairs as a log event +## Logging key-value pairs with `ClpKeyValuePairStreamHandler` `ClpKeyValuePairStreamHandler`, introduced in version `0.0.14`, allows an application to log and serialize key-value pair objects as log events into CLP key-value pair IR stream format. It works -similar as Python standard `logging.StreamHandler` with the following difference: -- Log events (`logging.LogRecord`) given to the handler should contain the key-value pairs as a - Python dictionary. It will be directly serialized into the kv-pair IR stream without being - formatted as a string. +similar as Python standard `logging.StreamHandler`. The only difference is that log events +(`logging.LogRecord`) given to this handler will be key-value pairs as a Python dictionary. 
It will +be directly serialized into the kv-pair IR stream without being formatted as a string. + +Introduced in version 0.0.14, `ClpKeyValuePairStreamHandler` enables applications to log and +serialize key-value pair objects directly into the CLP key-value pair IR stream format using +Python's standard logging APIs. It operates similarly to Python's standard `logging.StreamHandler`, +with the following differences: +- Log events (`logging.LogRecord`) should contain the key-value pairs that a user wants to log + as a Python dictionary. + - These key-value pairs are written directly, without being formatted as a string. +- The key-value pairs will be serialized into the CLP key-value pair IR format before writing to + the stream. + +> [!NOTE] +> The current `ClpKeyValuePairStreamHandler` does not support `CLPLogLevelTimeout`. This feature +> will be added in a future release. ### Key-value pairs as Python dictionary -Key-value pairs in the log event, presented as a Python dictionary, must abide by the following +Key-value pairs in the log event, presented as Python dictionaries, must abide by the following rules: + - Keys must be of type `str`. - Values must be one of the following types: - Primitives: `int`, `float`, `str`, `bool`, or `None`. @@ -56,21 +70,25 @@ rules: ### Auto-generated kv-pairs vs. User-generated kv-pairs -As explained [here][clp-ffi-py-kv-pair-ir-stream], each log event contains both auto-generated and -user-generated kv-pairs. In our logging handler, kv-pairs given by applications through logging APIs -will be directly serialized as user-generated kv-pairs. The log event metadata, generated by the -Python logging infrastructure or our logging handler itself, will be serialized as auto-generated -kv-pairs. The auto-generated kv-pairs have the following schema structure: -- "timestamp" (`dict`): - - "unix_millisecs" (`int`): Unix timestamp since epoch in milliseconds. 
- - "utc_offset_secs" (`int`): The number of seconds the timezone is ahead of (positive) or - behind (negative) UTC. -- "level" (`dict`): - - "name" (`str`): The name of the log level. - - "num" (`int`): The numeric value associated with the log level. -- "source_location" (`dict`): - - "path" (`str`): The path of the source file of the target logging statement. - - "line" (`int`): The line number in the source file of the target logging statement. +As detailed [here][clp-ffi-py-kv-pair-ir-stream], each log event contains both auto-generated and +user-generated kv-pairs. + +In this logging handler, we differentiate auto/user-generated kv-pairs as follows: +- **Auto-generated kv-pairs**: The log event metadata generated by the Python logging infrastructure + or logging handler itself, structured with the following schema: + - "timestamp" (`dict`): + - "unix_millisecs" (`int`): Unix timestamp in milliseconds since epoch. + - "utc_offset_secs" (`int`): The number of seconds the timezone is ahead of (positive) or + behind (negative) UTC. + - "level" (`dict`): + - "name" (`str`): Log level name. + - "num" (`int`): The numeric value associated with the log level. + - "source_location" (`dict`): + - "path" (`str`): Source file path of the logging statement. + - "line" (`int`): Line number of the logging statement. +- **User-generated kv-pairs**: Key-value pairs provided by the application via the Python logging + API. These are passed through without modification and serialized as-is. + ### Example: Use `ClpKeyValuePairStreamHandler` to log Python dictionary @@ -94,12 +112,13 @@ logger.info({ ### Read key-value pair IR streams -We have provided the following options to read/deserialize the generated kv-pair IR streams: -- [clp-ffi-py][9]: This library provides Python APIs [Deserializer][clp-ffi-py-deserializer-doc] to - access a kv-pair IR stream. Check [here][clp-ffi-py-deserializer-example] for an example. 
+The following options are available for reading and deserializing kv-pair IR streams generated by +this handler: +- [clp-ffi-py][clp-ffi-py-pypi]: This library provides [Deserializer][clp-ffi-py-deserializer-doc] + to access a kv-pair IR stream in Python. See [this example][clp-ffi-py-deserializer-example] for + usage details. - clp-ffi-js: TODO - ## Logging handlers ### CLPStreamHandler @@ -337,6 +356,7 @@ word][7]. [8]: https://beta.ruff.rs/docs/ [9]: https://github.com/y-scope/clp-ffi-py -[clp-ffi-py-kv-pair-ir-stream]: https://github.com/y-scope/clp-ffi-py?tab=readme-ov-file#using-key-value-pair-ir-streams [clp-ffi-py-deserializer-doc]: https://github.com/y-scope/clp-ffi-py?tab=readme-ov-file#example-code-using-deserializer-to-read-keyvaluepairlogevents-from-an-ir-stream -[clp-ffi-py-deserializer-example]: https://github.com/y-scope/clp-ffi-py?tab=readme-ov-file#example-code-using-deserializer-to-read-keyvaluepairlogevents-from-an-ir-stream \ No newline at end of file +[clp-ffi-py-deserializer-example]: https://github.com/y-scope/clp-ffi-py?tab=readme-ov-file#example-code-using-deserializer-to-read-keyvaluepairlogevents-from-an-ir-stream +[clp-ffi-py-kv-pair-ir-stream]: https://github.com/y-scope/clp-ffi-py?tab=readme-ov-file#using-key-value-pair-ir-streams +[clp-ffi-py-pypi]: https://pypi.org/project/clp-ffi-py/ \ No newline at end of file From 9ece5341c6fb5eca684dc7af9549b91c10cd28e4 Mon Sep 17 00:00:00 2001 From: LinZhihao-723 Date: Mon, 17 Feb 2025 04:00:31 -0500 Subject: [PATCH 3/8] Update... --- README.md | 6 ------ 1 file changed, 6 deletions(-) diff --git a/README.md b/README.md index 98a99ef..4aa1d54 100644 --- a/README.md +++ b/README.md @@ -33,12 +33,6 @@ can be installed with `pip`: ## Logging key-value pairs with `ClpKeyValuePairStreamHandler` -`ClpKeyValuePairStreamHandler`, introduced in version `0.0.14`, allows an application to log and -serialize key-value pair objects as log events into CLP key-value pair IR stream format. 
It works -similar as Python standard `logging.StreamHandler`. The only difference is that log events -(`logging.LogRecord`) given to this handler will be key-value pairs as a Python dictionary. It will -be directly serialized into the kv-pair IR stream without being formatted as a string. - Introduced in version 0.0.14, `ClpKeyValuePairStreamHandler` enables applications to log and serialize key-value pair objects directly into the CLP key-value pair IR stream format using Python's standard logging APIs. It operates similarly to Python's standard `logging.StreamHandler`, From 33dc514bc7920a4db2c9214b4a5f269f331122c3 Mon Sep 17 00:00:00 2001 From: Kirk Rodrigues <2454684+kirkrodrigues@users.noreply.github.com> Date: Wed, 19 Feb 2025 00:07:29 -0500 Subject: [PATCH 4/8] Refactor ClpKeyValuePairStreamHandler docs. --- README.md | 92 +++++++++++++++++++++++++++++-------------------------- 1 file changed, 49 insertions(+), 43 deletions(-) diff --git a/README.md b/README.md index 4aa1d54..38e2fe0 100644 --- a/README.md +++ b/README.md @@ -30,27 +30,31 @@ The package is hosted with pypi (https://pypi.org/project/clp-logging/), so it can be installed with `pip`: `python3 -m pip install --upgrade clp-logging` + +## Logging handlers + +### ClpKeyValuePairStreamHandler -## Logging key-value pairs with `ClpKeyValuePairStreamHandler` +⭐ *New in v0.0.14* -Introduced in version 0.0.14, `ClpKeyValuePairStreamHandler` enables applications to log and -serialize key-value pair objects directly into the CLP key-value pair IR stream format using -Python's standard logging APIs. It operates similarly to Python's standard `logging.StreamHandler`, -with the following differences: -- Log events (`logging.LogRecord`) should contain the key-value pairs that a user wants to log - as a Python dictionary. - - These key-value pairs are written directly, without being formatted as a string. 
-- The key-value pairs will be serialized into the CLP key-value pair IR format before writing to - the stream. +This handler enables applications to write structured log events directly into CLP's key-value pair +(kv-pair) IR stream format. The handler accepts structured log events in the form of Python +dictionaries, where each dictionary entry must abide by the requirements detailed +[below](#key-value-pair-requirements). The handler will also automatically include certain +[metadata](#automatically-generated-kv-pairs), like the log event's level, with each log event. -> [!NOTE] -> The current `ClpKeyValuePairStreamHandler` does not support `CLPLogLevelTimeout`. This feature -> will be added in a future release. +> [!NOTE] +> Since this handler accepts structured log events, it doesn't support setting a +> [Formatter][py-logging-formatter] (because the log events don't need formatting into a string). + +> [!WARNING] +> `ClpKeyValuePairStreamHandler` currently doesn't support +> [CLPLogLevelTimeout](#log-level-timeout-feature-clplogleveltimeout). This feature will be added in +> a future release. -### Key-value pairs as Python dictionary +#### Key-value pair requirements -Key-value pairs in the log event, presented as Python dictionaries, must abide by the following -rules: +`ClpKeyValuePairStreamHandler` requires kv-pairs abide by the following rules: - Keys must be of type `str`. - Values must be one of the following types: @@ -62,29 +66,31 @@ rules: - must adhere to the aforementioned rules for keys and values. - can be empty. -### Auto-generated kv-pairs vs. User-generated kv-pairs +#### Automatically generated kv-pairs -As detailed [here][clp-ffi-py-kv-pair-ir-stream], each log event contains both auto-generated and -user-generated kv-pairs. +In addition to the kv-pairs explicitly logged by the application, the handler will add kv-pairs, +like the log event's level, to each log event. 
We refer to the former as *user-generated* kv-pairs +and the latter as *auto-generated* kv-pairs. -In this logging handler, we differentiate auto/user-generated kv-pairs as follows: -- **Auto-generated kv-pairs**: The log event metadata generated by the Python logging infrastructure - or logging handler itself, structured with the following schema: - - "timestamp" (`dict`): - - "unix_millisecs" (`int`): Unix timestamp in milliseconds since epoch. - - "utc_offset_secs" (`int`): The number of seconds the timezone is ahead of (positive) or - behind (negative) UTC. - - "level" (`dict`): - - "name" (`str`): Log level name. - - "num" (`int`): The numeric value associated with the log level. - - "source_location" (`dict`): - - "path" (`str`): Source file path of the logging statement. - - "line" (`int`): Line number of the logging statement. -- **User-generated kv-pairs**: Key-value pairs provided by the application via the Python logging - API. These are passed through without modification and serialized as-is. +> [!NOTE] +> The kv-pair IR stream format stores auto-generated kv-pairs separately from user-generated +> kv-pairs, so users don't need to worry about key collisions with the auto-generated keys. 
+The handler adds the following auto-generated kv-pairs to each log event: -### Example: Use `ClpKeyValuePairStreamHandler` to log Python dictionary +| Key | Value type | Description | +|---------------------|------------|---------------------------------------------------| +| `timestamp` | `dict` | The log event's timestamp | +| - `unix_millisecs` | `int` | The timestamp as a Unix timestamp in milliseconds | +| - `utc_offset_secs` | `int` | The timestamp's UTC offset in seconds | +| `level` | `dict` | The log event's level | +| - `name` | `str` | The level's name | +| - `num` | `int` | The level's numeric value | +| `source_location` | `dict` | The source location of the logging statement | +| - `path` | `str` | The source location's path | +| - `line` | `int` | The source location's line number | + +### Example: `ClpKeyValuePairStreamHandler` ```python import logging @@ -104,16 +110,15 @@ logger.info({ }) ``` -### Read key-value pair IR streams +### Reading kv-pair IR streams The following options are available for reading and deserializing kv-pair IR streams generated by this handler: -- [clp-ffi-py][clp-ffi-py-pypi]: This library provides [Deserializer][clp-ffi-py-deserializer-doc] - to access a kv-pair IR stream in Python. See [this example][clp-ffi-py-deserializer-example] for - usage details. -- clp-ffi-js: TODO - -## Logging handlers + +- [clp-ffi-py][clp-ffi-py-pypi]: This library provides a [Deserializer][clp-ffi-py-deserializer-doc] + to access a kv-pair IR stream in Python. [This example][clp-ffi-py-deserializer-example] + illustrates its usage. +- [YScope Log Viewer][2]: This UI can be used to view kv-pair IR streams. ### CLPStreamHandler @@ -353,4 +358,5 @@ word][7]. 
[clp-ffi-py-deserializer-doc]: https://github.com/y-scope/clp-ffi-py?tab=readme-ov-file#example-code-using-deserializer-to-read-keyvaluepairlogevents-from-an-ir-stream [clp-ffi-py-deserializer-example]: https://github.com/y-scope/clp-ffi-py?tab=readme-ov-file#example-code-using-deserializer-to-read-keyvaluepairlogevents-from-an-ir-stream [clp-ffi-py-kv-pair-ir-stream]: https://github.com/y-scope/clp-ffi-py?tab=readme-ov-file#using-key-value-pair-ir-streams -[clp-ffi-py-pypi]: https://pypi.org/project/clp-ffi-py/ \ No newline at end of file +[clp-ffi-py-pypi]: https://pypi.org/project/clp-ffi-py/ +[py-logging-formatter]: https://docs.python.org/3/library/logging.html#logging.Formatter From 935202d13bc46948e2daec760ee49281b75fb08a Mon Sep 17 00:00:00 2001 From: Kirk Rodrigues <2454684+kirkrodrigues@users.noreply.github.com> Date: Wed, 19 Feb 2025 00:16:18 -0500 Subject: [PATCH 5/8] Remove unused link. --- README.md | 1 - 1 file changed, 1 deletion(-) diff --git a/README.md b/README.md index 38e2fe0..df42208 100644 --- a/README.md +++ b/README.md @@ -357,6 +357,5 @@ word][7]. 
[clp-ffi-py-deserializer-doc]: https://github.com/y-scope/clp-ffi-py?tab=readme-ov-file#example-code-using-deserializer-to-read-keyvaluepairlogevents-from-an-ir-stream [clp-ffi-py-deserializer-example]: https://github.com/y-scope/clp-ffi-py?tab=readme-ov-file#example-code-using-deserializer-to-read-keyvaluepairlogevents-from-an-ir-stream -[clp-ffi-py-kv-pair-ir-stream]: https://github.com/y-scope/clp-ffi-py?tab=readme-ov-file#using-key-value-pair-ir-streams [clp-ffi-py-pypi]: https://pypi.org/project/clp-ffi-py/ [py-logging-formatter]: https://docs.python.org/3/library/logging.html#logging.Formatter From 2c6d57e694f11431b92f76c61d46e65b15f71b49 Mon Sep 17 00:00:00 2001 From: LinZhihao-723 Date: Wed, 19 Feb 2025 01:00:56 -0500 Subject: [PATCH 6/8] Remove test.py file --- test.py | 0 1 file changed, 0 insertions(+), 0 deletions(-) delete mode 100644 test.py diff --git a/test.py b/test.py deleted file mode 100644 index e69de29..0000000 From 48fc5511482eceefa8140b70ddcf3bc06208b1fc Mon Sep 17 00:00:00 2001 From: kirkrodrigues <2454684+kirkrodrigues@users.noreply.github.com> Date: Wed, 19 Feb 2025 05:11:54 -0500 Subject: [PATCH 7/8] Apply suggestions from code review Co-authored-by: Lin Zhihao <59785146+LinZhihao-723@users.noreply.github.com> --- README.md | 7 ++++--- 1 file changed, 4 insertions(+), 3 deletions(-) diff --git a/README.md b/README.md index df42208..cbb21bf 100644 --- a/README.md +++ b/README.md @@ -41,11 +41,12 @@ This handler enables applications to write structured log events directly into C (kv-pair) IR stream format. The handler accepts structured log events in the form of Python dictionaries, where each dictionary entry must abide by the requirements detailed [below](#key-value-pair-requirements). The handler will also automatically include certain -[metadata](#automatically-generated-kv-pairs), like the log event's level, with each log event. +[metadata](#automatically-generated-kv-pairs) (e.g., the log event's level) with each log event. 
> [!NOTE] > Since this handler accepts structured log events, it doesn't support setting a -> [Formatter][py-logging-formatter] (because the log events don't need formatting into a string). +> [Formatter][py-logging-formatter] (because the log events don't need to be formatted into a +> string). > [!WARNING] > `ClpKeyValuePairStreamHandler` currently doesn't support @@ -81,7 +82,7 @@ The handler adds the following auto-generated kv-pairs to each log event: | Key | Value type | Description | |---------------------|------------|---------------------------------------------------| | `timestamp` | `dict` | The log event's timestamp | -| - `unix_millisecs` | `int` | The timestamp as a Unix timestamp in milliseconds | +| - `unix_millisecs` | `int` | The timestamp in milliseconds since the Unix epoch | | - `utc_offset_secs` | `int` | The timestamp's UTC offset in seconds | | `level` | `dict` | The log event's level | | - `name` | `str` | The level's name | From 083eaec91c9c7ef7ca3c9931d37c3f715bcbc558 Mon Sep 17 00:00:00 2001 From: Kirk Rodrigues <2454684+kirkrodrigues@users.noreply.github.com> Date: Wed, 19 Feb 2025 05:12:44 -0500 Subject: [PATCH 8/8] Equalize table row width. --- README.md | 20 ++++++++++---------- 1 file changed, 10 insertions(+), 10 deletions(-) diff --git a/README.md b/README.md index cbb21bf..9451029 100644 --- a/README.md +++ b/README.md @@ -79,17 +79,17 @@ and the latter as *auto-generated* kv-pairs. 
The handler adds the following auto-generated kv-pairs to each log event: -| Key | Value type | Description | -|---------------------|------------|---------------------------------------------------| -| `timestamp` | `dict` | The log event's timestamp | +| Key | Value type | Description | +|---------------------|------------|----------------------------------------------------| +| `timestamp` | `dict` | The log event's timestamp | | - `unix_millisecs` | `int` | The timestamp in milliseconds since the Unix epoch | -| - `utc_offset_secs` | `int` | The timestamp's UTC offset in seconds | -| `level` | `dict` | The log event's level | -| - `name` | `str` | The level's name | -| - `num` | `int` | The level's numeric value | -| `source_location` | `dict` | The source location of the logging statement | -| - `path` | `str` | The source location's path | -| - `line` | `int` | The source location's line number | +| - `utc_offset_secs` | `int` | The timestamp's UTC offset in seconds | +| `level` | `dict` | The log event's level | +| - `name` | `str` | The level's name | +| - `num` | `int` | The level's numeric value | +| `source_location` | `dict` | The source location of the logging statement | +| - `path` | `str` | The source location's path | +| - `line` | `int` | The source location's line number | ### Example: `ClpKeyValuePairStreamHandler`
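The auto-generated kv-pair schema in the table above can be illustrated with a minimal sketch that derives the same fields from a standard `logging.LogRecord`. This is not the handler's actual implementation: `build_auto_generated_kv_pairs` is a hypothetical helper name, and the mapping from `LogRecord` attributes (`created`, `levelname`, `levelno`, `pathname`, `lineno`) to the table's keys is an assumption based only on the key descriptions in this patch.

```python
import logging
import time
from typing import Any, Dict


def build_auto_generated_kv_pairs(record: logging.LogRecord) -> Dict[str, Any]:
    """Hypothetical helper (not part of clp-logging): build a dict shaped like
    the auto-generated kv-pairs table from a standard LogRecord."""
    # `record.created` is the event time in seconds since the Unix epoch (float).
    # `tm_gmtoff` is the local UTC offset in seconds: positive when the local
    # timezone is ahead of UTC, negative when it is behind.
    utc_offset_secs = time.localtime(record.created).tm_gmtoff
    return {
        "timestamp": {
            "unix_millisecs": int(record.created * 1000),
            "utc_offset_secs": utc_offset_secs,
        },
        "level": {"name": record.levelname, "num": record.levelno},
        "source_location": {"path": record.pathname, "line": record.lineno},
    }


# Construct a record by hand so the sketch runs without any handler installed.
record = logging.LogRecord(
    name="example",
    level=logging.INFO,
    pathname="/tmp/app.py",  # illustrative path, assumed for this example
    lineno=42,
    msg={"message": "This is an example message"},
    args=None,
    exc_info=None,
)
auto_kv_pairs = build_auto_generated_kv_pairs(record)
```

Because the kv-pair IR stream format stores auto-generated kv-pairs separately from user-generated ones, a dict like `auto_kv_pairs` would sit alongside the application's own dictionary rather than being merged into it.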