fix: Fix issues related to automatic logging set up in Haystack#11635
fix: Fix issues related to automatic logging set up in Haystack#11635sjrl wants to merge 2 commits into
Conversation
|
The latest updates on your projects. Learn more about Vercel for GitHub. 1 Skipped Deployment
|
Coverage reportClick to see where and how coverage changed
This report was generated by python-coverage-comment-action |
||||||||||||||||||||||||||||||||||||||||||||||||||||||
mpangrazzi
left a comment
There was a problem hiding this comment.
Looks good! I've left two minor comments.
| for h in target_logger.handlers | ||
| if not (isinstance(h, logging.StreamHandler) and h.name == "HaystackLoggingHandler") | ||
| ] | ||
| target_logger.handlers = [handler, *old_handlers] |
There was a problem hiding this comment.
This may duplicate Haystack logs when users opt back into root formatting with configure_logging(logger_name="").
import haystack already installs HaystackLoggingHandler on the haystack namespaces then adds another handler to root but leaves the namespace handlers in place. Since propagation remains enabled, haystack.* records may be emitted twice.
Not 100% sure that this happens, would try to verify with e.g.:
import logging
import haystack
haystack.logging.configure_logging(use_json=True, logger_name="")
logging.getLogger("haystack.demo").warning("dup check")
And maybe extend this test to cover also the logger_name="" case.
| @@ -352,11 +390,16 @@ def configure_logging(use_json: bool | None = None) -> None: | |||
| shared_processors.append(correlate_logs_with_traces) | |||
|
|
|||
| structlog.configure( | |||
There was a problem hiding this comment.
This structlog.configure(...) call is (still?) process-global.
If Haystack is imported before the host app configures structlog, any non-Haystack structlog logger used while Haystack’s config is active will still run through Haystack’s processor chain (regardless of logger name).
Maybe we could move structlog.configure(...) into configure_logging(...) calls.
Related Issues
Proposed Changes:
haystack.logging.configure_loggingnow attaches its formatting handler to Haystack's own logger namespaces(
haystack,haystack_integrationsandhaystack_experimental) instead of the root logger. This makesHaystack a better-behaved library when it runs next to other services in the same process: the log records of the
host application and other libraries are no longer reformatted or duplicated by Haystack. To restore the previous
behavior of formatting every log record in the process, call
configure_logging(logger_name="").haystack.logging.configure_loggingno longer freezes the log level at import time. Previously the level wascaptured once when the function ran (at
import haystack), so a log level set later by the application wasignored for
structlog-native loggers. The effective level is now read on every log call.haystack.utils.requests_utilsbeing named after the module's file path instead ofhaystack.utils.requests_utils, which kept its records outside thehaystacklogger namespace.structlogconfiguration that the host application already set up.The import-time call to
configure_loggingnow skips configuration whenstructlogis already configured.haystack.logging.getLoggeris now idempotent. Previously, calling it more than once for the same logger namewrapped the already-wrapped logger methods again, which caused the log message to be run through
str.formatonce per call. As a result a field value containing
{...}could be re-interpolated and pull in the value ofanother field. Each logger is now patched only once.
findCallerused to determine the source of a log entry no longer prints to stdout and no longermasks errors with a misleading
NameError, matching structlog's ownfindCallerimplementation.features:
haystack.logging.configure_logginggained three parameters:logger_nameto choose which logger(s) theformatting handler is attached to,
propagateto control whether Haystack's loggers propagate their records toancestor loggers, and
forceto control whether an existingstructlogconfiguration is replaced. Setpropagate=Falseto let Haystack fully own the output of its own logs and avoid duplicate log lines when thehost application also configures the root logger.
How did you test it?
Notes for the reviewer
Checklist
fix:,feat:,build:,chore:,ci:,docs:,style:,refactor:,perf:,test:and added!in case the PR includes breaking changes.