Commit a26decf

fix(logging): refactor to namespaced stdlib logging for production integration
Refactored logging system from Loguru to Python's standard library logging with a namespaced logger under 'contextgem'. This eliminates global state pollution and enables production-ready integration with host applications.
1 parent 4a42ddd commit a26decf
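
As a rough illustration of what a namespaced logger allows on the host side, the sketch below attaches a host-owned handler to the `contextgem` logger without touching the root logger. It uses only the standard library and does not import ContextGem; the `attach_handler` helper and the plain-text format string are assumptions made for this example, not part of the commit.

```python
import io
import logging


def attach_handler(stream, level=logging.INFO, name="contextgem"):
    """Hypothetical host-side helper: route one logger namespace to a stream."""
    lib_logger = logging.getLogger(name)
    handler = logging.StreamHandler(stream)
    # Illustrative format only; the library's own format is defined in the diff.
    handler.setFormatter(
        logging.Formatter("%(name)s | %(levelname)s | %(message)s")
    )
    lib_logger.addHandler(handler)
    lib_logger.setLevel(level)
    return handler


# Capture records from the "contextgem" namespace in an in-memory buffer.
buffer = io.StringIO()
handler = attach_handler(buffer)
logging.getLogger("contextgem").info("extraction started")
captured = buffer.getvalue()

# Detach the handler again so the demo leaves no state behind.
logging.getLogger("contextgem").removeHandler(handler)
```

Because only the named logger is configured, other libraries and the root logger keep whatever configuration the host application gave them.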

12 files changed, with 527 additions and 434 deletions.

CHANGELOG.md

Lines changed: 4 additions & 0 deletions
@@ -5,6 +5,10 @@ The format is based on [Keep a Changelog](https://keepachangelog.com/en/1.0.0/),
 
 - **Refactor**: Code reorganization that doesn't change functionality but improves structure or maintainability
 
+## [0.19.2](https://github.com/shcherbak-ai/contextgem/releases/tag/v0.19.2) - 2025-09-30
+### Fixed
+- Logging system refactored to use Python's standard library logging with namespaced logger (`contextgem`) for production-ready integration. Eliminates global state pollution, prevents conflicts with host application logging, and enables independent configuration. Replaced Loguru with colorlog for colored output.
+
 ## [0.19.1](https://github.com/shcherbak-ai/contextgem/releases/tag/v0.19.1) - 2025-09-19
 ### Changed
 - Enhanced documentation with pretty URLs (removing `.html` extensions) for improved SEO and user experience

NOTICE

Lines changed: 1 addition & 1 deletion
@@ -25,11 +25,11 @@ This software includes the following third-party components:
 
 Core Dependencies:
 - aiolimiter: Rate limiting for asynchronous operations
+- colorlog: Colored logging formatter
 - fastjsonschema: Fast JSON schema validator
 - genai-prices: LLM pricing data and utilities (by Pydantic) to automatically estimate costs
 - Jinja2: Templating engine
 - litellm: LLM interface library (this software uses only MIT-licensed portions of LiteLLM and does not utilize any components from the enterprise/ directory)
-- loguru: Logging utility
 - lxml: High-performance XML processing library
 - pillow: Image processing library for local model image handling
 - pydantic: Data validation

README.md

Lines changed: 1 addition & 1 deletion
@@ -488,11 +488,11 @@ This project is automatically scanned for security vulnerabilities using multipl
 ContextGem relies on these excellent open-source packages:
 
 - [aiolimiter](https://github.com/mjpieters/aiolimiter): Powerful rate limiting for async operations
+- [colorlog](https://github.com/borntyping/python-colorlog): Colored formatter for Python's logging module
 - [fastjsonschema](https://github.com/horejsek/python-fastjsonschema): Ultra-fast JSON schema validation
 - [genai-prices](https://github.com/pydantic/genai-prices): LLM pricing data and utilities (by Pydantic) to automatically estimate costs
 - [Jinja2](https://github.com/pallets/jinja): Fast, expressive, extensible templating engine used for prompt rendering
 - [litellm](https://github.com/BerriAI/litellm): Unified interface to multiple LLM providers with seamless provider switching
-- [loguru](https://github.com/Delgan/loguru): Simple yet powerful logging that enhances debugging and observability
 - [lxml](https://github.com/lxml/lxml): High-performance XML processing library for parsing DOCX document structure
 - [pillow](https://github.com/python-pillow/Pillow): Image processing library for local model image handling
 - [pydantic](https://github.com/pydantic/pydantic): The gold standard for data validation

contextgem/__init__.py

Lines changed: 1 addition & 1 deletion
@@ -20,7 +20,7 @@
 ContextGem - Effortless LLM extraction from documents
 """
 
-__version__ = "0.19.1"
+__version__ = "0.19.2"
 __author__ = "Shcherbak AI AS"
 
 from contextgem.public import (

contextgem/internal/loggers.py

Lines changed: 188 additions & 66 deletions
@@ -19,61 +19,151 @@
 """
 Module providing a customized logging configuration for the ContextGem framework.
 
-This module configures a Loguru-based logging system with environment variable controls
-for log level and enabling/disabling logging. It includes a dedicated stream wrapper
-for consistent log formatting.
+This module configures a standard library logging system with environment variable controls
+for log level and enabling/disabling logging. It uses a namespaced logger ('contextgem').
 """
 
 from __future__ import annotations
 
+import logging
 import os
 import sys
+import threading
+from typing import Protocol
 
-from loguru import logger
+import colorlog
 
 from contextgem.internal.suppressions import _install_litellm_noise_filters
 
 
+class _LoggerProtocol(Protocol):
+    """
+    Protocol defining the logger interface with custom methods.
+
+    This Protocol is used purely for type checking to inform type checkers
+    (e.g. Pyright) about the custom .trace() and .success() methods that
+    are dynamically added to logging.Logger at runtime.
+    """
+
+    propagate: bool
+    handlers: list[logging.Handler]
+
+    def trace(self, message: str, *args, **kwargs) -> None:
+        """
+        Log with TRACE level (below DEBUG).
+        """
+        ...
+
+    def debug(self, message: str, *args, **kwargs) -> None:
+        """
+        Log with DEBUG level.
+        """
+        ...
+
+    def info(self, message: str, *args, **kwargs) -> None:
+        """
+        Log with INFO level.
+        """
+        ...
+
+    def success(self, message: str, *args, **kwargs) -> None:
+        """
+        Log with SUCCESS level (between INFO and WARNING).
+        """
+        ...
+
+    def warning(self, message: str, *args, **kwargs) -> None:
+        """
+        Log with WARNING level.
+        """
+        ...
+
+    def error(self, message: str, *args, **kwargs) -> None:
+        """
+        Log with ERROR level.
+        """
+        ...
+
+    def critical(self, message: str, *args, **kwargs) -> None:
+        """
+        Log with CRITICAL level.
+        """
+        ...
+
+    def addHandler(self, handler: logging.Handler) -> None:  # noqa: N802
+        """
+        Adds a handler to the logger.
+        """
+        ...
+
+    def removeHandler(self, handler: logging.Handler) -> None:  # noqa: N802
+        """
+        Removes a handler from the logger.
+        """
+        ...
+
+    def setLevel(self, level: int) -> None:  # noqa: N802
+        """
+        Sets the logging level.
+        """
+        ...
+
+
 DEFAULT_LOGGER_LEVEL = "INFO"
 
 # Dynamically control logging state with env vars
 LOGGER_LEVEL_ENV_VAR_NAME = "CONTEXTGEM_LOGGER_LEVEL"
 
+# Add custom levels
+TRACE_LEVEL_NUM = 5  # Below DEBUG (10)
+SUCCESS_LEVEL_NUM = 25  # Between INFO (20) and WARNING (30)
+logging.addLevelName(TRACE_LEVEL_NUM, "TRACE")
+logging.addLevelName(SUCCESS_LEVEL_NUM, "SUCCESS")
+
 
-class _DedicatedStream:
+def _trace(self, message, *args, **kwargs):
     """
-    A dedicated stream wrapper for formatting and directing messages to
-    a base stream.
+    Logs a message with severity 'TRACE' on this logger.
+
+    This is a custom level below DEBUG.
     """
+    if self.isEnabledFor(TRACE_LEVEL_NUM):
+        self._log(TRACE_LEVEL_NUM, message, args, **kwargs)
 
-    def __init__(self, base):
-        self.base = base
 
-    def write(self, message):
-        """
-        Writes a message to the base stream with contextgem prefix.
+def _success(self, message, *args, **kwargs):
+    """
+    Logs a message with severity 'SUCCESS' on this logger.
 
-        :param message: The message to write to the stream.
-        :type message: str
-        """
-        # You can add a prefix or other formatting if you wish
-        self.base.write(f"[contextgem] {message}")
+    This is a custom level between INFO and WARNING.
+    """
+    if self.isEnabledFor(SUCCESS_LEVEL_NUM):
+        self._log(SUCCESS_LEVEL_NUM, message, args, **kwargs)
 
-    def flush(self):
-        """
-        Flushes the base stream to ensure all output is written.
-        """
-        self.base.flush()
 
+# Add custom methods to Logger class
+logging.Logger.trace = _trace  # type: ignore[attr-defined]
+logging.Logger.success = _success  # type: ignore[attr-defined]
+
+# Create a namespaced logger for ContextGem
+logger: _LoggerProtocol = logging.getLogger("contextgem")  # type: ignore[assignment]
+
+# Add NullHandler by default
+logger.addHandler(logging.NullHandler())
 
-dedicated_stream = _DedicatedStream(sys.stdout)
+# Track our handler to avoid duplicates
+_contextgem_handler: logging.Handler | None = None
+_handler_lock = threading.Lock()
 
 
-# Helper to read environment config at import time
 def _read_env_vars() -> tuple[bool, str]:
     """
     Returns the (disabled_status, level) read from environment variables.
+
+    :return: Tuple of (should_disable, level_string)
+    :rtype: tuple[bool, str]
     """
+
     # Default to DEFAULT_LOGGER_LEVEL if no variable is set or invalid
     level_str = os.getenv(LOGGER_LEVEL_ENV_VAR_NAME, DEFAULT_LOGGER_LEVEL).upper()
     valid_levels = [
@@ -94,55 +184,87 @@ def _read_env_vars() -> tuple[bool, str]:
     return disable_logger, level_str
 
 
-def _apply_color_scheme():
+def _get_colored_formatter() -> logging.Formatter:
     """
-    Defines custom colors for each log level (mimicking colorlog style)
+    Creates a colored formatter using colorlog with millisecond precision.
+
+    :return: A logging formatter with color support and milliseconds
+    :rtype: logging.Formatter
     """
-    logger.level("DEBUG", color="<cyan>")
-    logger.level("INFO", color="<blue>")
-    logger.level("SUCCESS", color="<green>")
-    logger.level("WARNING", color="<yellow>")
-    logger.level("ERROR", color="<red>")
-    logger.level("CRITICAL", color="<red><bold>")
+
+    # Use colorlog for colored output with milliseconds
+    return colorlog.ColoredFormatter(
+        "[contextgem] %(log_color)s%(asctime)s.%(msecs)03d%(reset)s | "
+        "%(log_color)s%(levelname)-7s%(reset)s | %(message)s",
+        datefmt="%Y-%m-%d %H:%M:%S",
+        reset=True,
+        log_colors={
+            "TRACE": "dim",
+            "DEBUG": "cyan",
+            "INFO": "blue",
+            "SUCCESS": "green",
+            "WARNING": "yellow",
+            "ERROR": "red",
+            "CRITICAL": "red,bold",
+        },
+        style="%",
+    )
 
 
-# Main configuration function
 def _configure_logger_from_env():
     """
-    Configures the Loguru logger based on environment variables.
+    Configures the contextgem logger based on environment variables.
     This can be called at import time (once) or re-called any time.
 
-    (Loguru does not require `getLogger(name)`; we just import `logger` and use it.)
-    """
-    disable_logger, level_str = _read_env_vars()
-
-    # Remove default handlers
-    logger.remove()
-
-    # If logging is disabled (OFF level), just disable and don't add any handlers
-    if disable_logger:
-        logger.disable("")
-        return
-
-    # Enable logging and add handler
-    logger.enable("")
-
-    # Apply custom level color scheme
-    _apply_color_scheme()
-
-    logger.add(
-        dedicated_stream,
-        level=level_str,
-        colorize=True,
-        format=(
-            "<white>{time:YYYY-MM-DD HH:mm:ss.SSS}</white> | "
-            "<level>{level: <7}</level> | "
-            "{message}"
-        ),
-    )
-
-    # Install filters to suppress noisy third-party dependency logs
-    _install_litellm_noise_filters()
-
+    This function only affects the 'contextgem' logger and is thread-safe.
 
+    :return: None
+    :rtype: None
+    """
+    global _contextgem_handler
+
+    with _handler_lock:
+        disable_logger, level_str = _read_env_vars()
+
+        # Remove our previous handler if it exists
+        if _contextgem_handler is not None:
+            logger.removeHandler(_contextgem_handler)
+            _contextgem_handler = None
+
+        # If logging is disabled (OFF level), remove all handlers except NullHandler
+        if disable_logger:
+            # Remove all non-NullHandler handlers
+            for handler in logger.handlers[:]:
+                if not isinstance(handler, logging.NullHandler):
+                    logger.removeHandler(handler)
+            # Set level high to ensure nothing gets through
+            logger.setLevel(logging.CRITICAL + 1)
+            # Don't propagate to avoid any output
+            logger.propagate = False
+            return
+
+        # Enable logging and add handler
+        # Handle custom levels specially
+        if level_str == "TRACE":
+            logger.setLevel(TRACE_LEVEL_NUM)
+        elif level_str == "SUCCESS":
+            logger.setLevel(SUCCESS_LEVEL_NUM)
+        else:
+            logger.setLevel(getattr(logging, level_str))
+
+        # Don't propagate - we manage our own output
+        # This prevents duplicate logs if the root logger also has handlers
+        logger.propagate = False
+
+        # Create and configure handler
+        handler = logging.StreamHandler(sys.stdout)
+        handler.setFormatter(_get_colored_formatter())
+        logger.addHandler(handler)
+        _contextgem_handler = handler
+
+        # Install filters to suppress noisy third-party dependency logs
+        _install_litellm_noise_filters()
+
+
+# Configure on import
 _configure_logger_from_env()
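
To make the custom level numbers in this diff concrete, here is a minimal standalone sketch of how a level registered at 25 filters records. It reuses the `SUCCESS_LEVEL_NUM` value from the diff, but the logger name `contextgem_demo_levels` and the format string are hypothetical. Unlike the commit, it calls `Logger.log()` directly instead of adding a `.success()` method to `logging.Logger`.

```python
import io
import logging

SUCCESS_LEVEL_NUM = 25  # between INFO (20) and WARNING (30), as in the diff
logging.addLevelName(SUCCESS_LEVEL_NUM, "SUCCESS")

# Hypothetical demo logger, isolated from the root logger.
demo_logger = logging.getLogger("contextgem_demo_levels")
demo_logger.propagate = False
demo_logger.setLevel(SUCCESS_LEVEL_NUM)

stream = io.StringIO()
handler = logging.StreamHandler(stream)
handler.setFormatter(logging.Formatter("%(levelname)s:%(message)s"))
demo_logger.addHandler(handler)

demo_logger.info("dropped: below the threshold")            # 20 < 25
demo_logger.log(SUCCESS_LEVEL_NUM, "kept: at the threshold")  # 25 == 25
demo_logger.warning("kept: above the threshold")            # 30 > 25

lines = stream.getvalue().splitlines()
```

With the threshold at `SUCCESS`, the `INFO` record is discarded before it reaches the handler, while the `SUCCESS` and `WARNING` records are emitted.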

dev/readme.template.md

Lines changed: 1 addition & 1 deletion
@@ -281,11 +281,11 @@ This project is automatically scanned for security vulnerabilities using multipl
 ContextGem relies on these excellent open-source packages:
 
 - [aiolimiter](https://github.com/mjpieters/aiolimiter): Powerful rate limiting for async operations
+- [colorlog](https://github.com/borntyping/python-colorlog): Colored formatter for Python's logging module
 - [fastjsonschema](https://github.com/horejsek/python-fastjsonschema): Ultra-fast JSON schema validation
 - [genai-prices](https://github.com/pydantic/genai-prices): LLM pricing data and utilities (by Pydantic) to automatically estimate costs
 - [Jinja2](https://github.com/pallets/jinja): Fast, expressive, extensible templating engine used for prompt rendering
 - [litellm](https://github.com/BerriAI/litellm): Unified interface to multiple LLM providers with seamless provider switching
-- [loguru](https://github.com/Delgan/loguru): Simple yet powerful logging that enhances debugging and observability
 - [lxml](https://github.com/lxml/lxml): High-performance XML processing library for parsing DOCX document structure
 - [pillow](https://github.com/python-pillow/Pillow): Image processing library for local model image handling
 - [pydantic](https://github.com/pydantic/pydantic): The gold standard for data validation

docs/source/conf.py

Lines changed: 1 addition & 1 deletion
@@ -25,7 +25,7 @@
 project = "ContextGem"
 copyright = "2025, Shcherbak AI AS"
 author = "Sergii Shcherbak"
-release = "0.19.1"
+release = "0.19.2"
 
 
 # Add path to the package

docs/source/llms.txt

Lines changed: 3 additions & 2 deletions
@@ -10408,7 +10408,8 @@ Logging Configuration
 
 ContextGem provides comprehensive logging to help you monitor and
 debug the extraction process. You can control logging behavior using
-environment variables.
+environment variables. ContextGem uses a **namespaced logger** under
+the name "contextgem".
 
 
 ⚙️ Environment Variables
@@ -10510,7 +10511,7 @@ ContextGem logs use the following format:
 
 Each log entry includes:
 
-* Timestamp
+* Timestamp (with milliseconds)
 
 * Log level

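The "timestamp with milliseconds" layout can be approximated with the standard library alone. The sketch below reuses the layout of the format string from `_get_colored_formatter()` in this commit but drops colorlog's `%(log_color)s` and `%(reset)s` fields, so it shows the uncolored shape only. The `LogRecord` values (`demo.py`, the message text) are made up for the example.

```python
import logging
import re

# Same layout as the commit's formatter, minus the colorlog-specific fields.
formatter = logging.Formatter(
    "[contextgem] %(asctime)s.%(msecs)03d | %(levelname)-7s | %(message)s",
    datefmt="%Y-%m-%d %H:%M:%S",
)

# Build a record by hand so no handler or global configuration is needed.
record = logging.LogRecord(
    name="contextgem",
    level=logging.INFO,
    pathname="demo.py",  # hypothetical source file
    lineno=1,
    msg="Document processed",  # hypothetical message
    args=None,
    exc_info=None,
)
formatted = formatter.format(record)

# Expected shape: [contextgem] YYYY-MM-DD HH:MM:SS.mmm | LEVEL   | message
pattern = re.compile(
    r"^\[contextgem\] \d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}\.\d{3} \| INFO    \| Document processed$"
)
matches = bool(pattern.match(formatted))
```

The `%(msecs)03d` field supplies the zero-padded milliseconds that `datefmt` alone cannot express, and `%(levelname)-7s` pads level names to a fixed width so the message column lines up.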