-
Notifications
You must be signed in to change notification settings - Fork 186
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
feat(logging): Distinguish logs from different models #1302
feat(logging): Distinguish logs from different models #1302
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Great work @vtaskow !! I've added a small comment, but this should be ready to go. 🚀
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewing after the fact, but leaving a few suggestions that I think are worth discussing.
These are nice, simple changes - nice work @vtaskow 😄
I'm personally a big fan of structured, contextual logging, although I normally prefer it to be explicit--passing a configured logger into the appropriate logic. That's purely personal preference and a thought for offline discussion, rather than a comment on the changes in this PR.
|
||
model_name_var: ContextVar[str] = ContextVar("model_name") | ||
model_version_var: ContextVar[str] = ContextVar("model_version") | ||
from ..context import model_name_var, model_version_var | ||
|
||
SELDON_MODEL_NAME_LABEL = "model_name" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
💭 Given that the name and version are used in both logging and metrics, it'd be nice to have constants (as far as Python allows) to ensure the same labels are available in both contexts. If the context vars are the source of truth, that be var.name
and var.get()
, which I think is pretty straightforward.
A logging formatter that uses context variables to inject | ||
the model name and version in the log message. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I thought the general pattern for Python docstrings is that the summary should be a single line, even for multi-line docstrings. E.g.
"""Log formatter incorporating model details, e.g. name and version."""
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I wasn't aware of this PEP rule. Good spot, I will refactor 👍
if not name: | ||
return "" | ||
model_fmt = f"{name}:{version}" if version else name |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
🙃 Personally I find this slightly hard to read, even if it's concise. Having 3 explicit cases would be much more immediately obvious to me. Assuming Python 3.10, this could use match-case:
def fmt(name, version):
match name, version:
case n, _ if not n:
return ""
case n, v if not v:
return f"[{n}]"
case n, v:
return f"[{n}:{v}]"
This also only needs 1 format string per case--minor performance improvement and slightly easier to understand what's output.
For older Python versions, this could be:
if not name:
return ""
elif not version:
return f"[{name}]"
else:
return f"[{n}:{v}]"
This is arguably the most legible and broad definition due to the concise use of Boolean contexts.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oh, the pattern matching looks very elegant. Thanks, Alex! I will refactor it
|
||
|
||
@pytest.mark.parametrize( | ||
"name, version, expected_fmt", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
🔧 The implementation of the model logger allows the model name to be empty. This case should either be captured by a test case or it shouldn't be permitted because it indicates something has gone wrong (or hasn't been wired up correctly).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good point! I've missed that. Will fix in the incoming PR.
I was just testing using master branch (1.4.0.dev3 or similar) and found that the model name does work in logs for loading but not on inference. This is because the REST and gRPC servers inherit their own loggers but that is not configured using the |
Why
Inspecting logs from different ML models and knowing which log comes from which model is difficult. Currently, all such logs are grouped under the same logger -
mlserver
.How
Generally, there are several ways to accomplish this behaviour(by using built-in adapter, formatter, filter). Using a custom logger formatter with context variables(used in metrics package with contextmanager) is a simple solution to this problem. Furthermore, external libraries or custom config files are not needed.
Outcome
In context scope, where the model name and version are set by the context vars, all logs will contain this format: [mlserver][model_name:model_version] LOGLEVEL MSG. E.g.
2023-07-17 14:23:10,102 [mlserver][mushroom-xgboost:v0.1.0] INFO - Loaded model 'mushroom-xgboost' succesfully.
Or if the model_version is not present:
2023-07-17 14:23:10,102 [mlserver][mushroom-xgboost] INFO - Loaded model 'mushroom-xgboost' succesfully.
Furthermore, enabling the model context for un/re/loading models and predicting
Resolves #602