Initial structured logging changes #5954
Conversation
This is a big PR. I had some questions in several places.
# apply our schema tests to every log event from the previous step
# skips any output that isn't valid json
- uses: actions-rs/cargo@v1
  with:
    command: run
    args: --manifest-path test/interop/log_parsing/Cargo.toml
Are we going to somehow check against our proto definitions here in the future? This action now does the same things as our main workflow for integration tests, except it changes the log format to json without validating it.
I removed the Rust piece because a lot of it would have to be refactored and, in my opinion, it doesn't help that much. If someone disagrees or wants to do that re-implementation work, that's fine. I would personally vote for doing it in Python, so it can more easily be run locally.
The test as it is now does check that everything can be serialized and did catch a case where merging main had removed some updates to a fire_event call.
There's another ticket to implement log_format of protobuf and round trip the formatted logs.
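As an aside, a minimal Python sketch of that kind of check: read each log line, skip anything that isn't valid JSON, and assert some structure on the rest. The file path and the `info.code` field here are assumptions for illustration, not the actual schema.

```python
import json
import sys


def check_log_lines(path: str) -> int:
    """Count lines that parse as JSON log events; skip anything else."""
    checked = 0
    with open(path) as f:
        for line in f:
            line = line.strip()
            if not line:
                continue
            try:
                event = json.loads(line)
            except json.JSONDecodeError:
                # Non-JSON output (e.g. plain text from a subprocess) is skipped.
                continue
            # Hypothetical value-level assertion; the real schema may differ.
            assert isinstance(event.get("info", {}).get("code"), str)
            checked += 1
    return checked


if __name__ == "__main__":
    print(f"validated {check_log_lines(sys.argv[1])} events")
```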
@nathaniel-may do you remember why this was written in Rust?
We're consuming the logs from an "outside source" so it could be written in anything. I chose Rust because you can generate most of the code we'd be using here from a type definition. Plus Python's serialization-deserialization story is pretty abysmal so I didn't consider that a good option.
So the way I see it, we only have two ways to verify that our structured logs conform to anything even remotely reasonable for our users to consume:
- mypy in base_types.py, but mypy is limited in what it can check because everything has to be defined at the type level (e.g. it can't catch positive vs. negative integers). These type hints are composed through inheritance and aren't particularly easy to use as a schema.
- These interop tests written in Rust. There are lots of values we can't check at the type level so I think it's good we check values since it's a programmatic interface. There are certainly alternatives to this approach if we want to pin down these invariants another way.
I will concede that I didn't add very many value checks, so @gshank is correct to call out its limited usefulness in its current state. The original idea was that we would add more to these tests over time, which we haven't really done. If Rust isn't the right choice for the team I'm perfectly fine seeing this rewritten, but I don't think it should be removed without being replaced.
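For illustration, a minimal Python sketch of the kind of value-level invariant that can't be expressed at the type level (the event shape and field names here are assumptions, not the actual schema):

```python
from typing import Dict, List


def validate_event(event: Dict) -> List[str]:
    """Return violations of checks the type system can't express."""
    problems = []
    # An annotation can say `int`, but not "non-negative int".
    if event.get("pid", 0) < 0:
        problems.append("pid must be non-negative")
    # A `str` annotation can't enforce a code format, e.g. one letter plus three digits.
    code = event.get("code", "")
    if not (len(code) == 4 and code[0].isalpha() and code[1:].isdigit()):
        problems.append(f"unexpected code format: {code!r}")
    return problems
```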
Will open a ticket to look at restoring some of this testing.
# =======================================================
# A - Pre-project loading
# =======================================================
❤️
# TODO: can we combine this with ConnectionLeftOpen?
This was originally built with the model of events never being reused. That's why ConnectionClosed and ConnectionClosed2 exist currently. Do we need to rethink this model with proto events?
That's a good point. I do think that in some cases we might have a logically equivalent location that's in different places in the code, but we want to use the same event.
Thinking on this more, when we start reusing the events the value of the codes gets lost. We won't know with certainty that the log that came out for code A1234 was generated in file foo.py on line 127. The value there feels higher to me than not having to maintain a few events that are similar, if not the same.
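A simplified illustration of that "one event per call site" model (the real event classes live in core/dbt/events/types.py and carry more fields; the codes below are made up for the example, not the actual assignments):

```python
class ConnectionClosed:
    code: str = "Z001"  # fired from one specific call site


class ConnectionClosed2:
    code: str = "Z002"  # same shape, distinct code, fired from a different call site


# Because each code appears in exactly one place, "which line produced Z002?"
# has a single answer; reusing one event across call sites would lose that.
```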
core/dbt/events/README.md
Outdated
## Compiling types.proto

In the core/dbt/events directory: ```protoc --python_betterproto_out . types.proto```
Can you add a note about when we need to compile the types.proto?
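For readers following along: the compiler output is ordinary Python dataclasses, and it only needs regenerating when `types.proto` itself changes. A rough, illustrative sketch of what betterproto emits (the class and field names here are assumptions, not the actual generated module):

```python
from dataclasses import dataclass

import betterproto


# Roughly the shape of a generated message class; the real module is produced
# by `protoc --python_betterproto_out . types.proto`.
@dataclass
class MainReportVersion(betterproto.Message):
    version: str = betterproto.string_field(1)
```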
core/setup.py
Outdated
@@ -48,6 +48,7 @@
install_requires=[
    "Jinja2==3.1.2",
    "agate>=1.6,<1.6.4",
    "betterproto>=1.2.5",
Do we have a standard on how we're pinning packages? Most (but not all) the time we seem to either pin to an exact version or a range with lower and upper bounds.
I don't know if there's a standard. Do you think we should put == instead?
Our policies for these things are WIP (here), but basically: IMO we shouldn't tightly pin (==) if we don't need to. (It makes sense to do that in "locked" requirements files that guarantee a working install, but not for setup requirements.) And the reason to add an upper bound is just to avoid breaking changes that could come in v2.0.0. Have we tested with older versions of betterproto v1, before 1.2.5? If it can be as simple as betterproto>1,<2, amazing. If we're using capabilities that are new since v1.0, then betterproto>=1.2.5,<2 should do the trick.
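For concreteness, a sketch of that bounded-range approach in a setup file (the package name and other arguments are illustrative; the PR ultimately went with an exact pin instead):

```python
from setuptools import setup

setup(
    name="example-package",  # illustrative, not the real package metadata
    install_requires=[
        # lower bound for the features in use, upper bound to avoid 2.0 breaking changes
        "betterproto>=1.2.5,<2",
    ],
)
```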
v1.2.5 was released in May 2020. It looks like they've been actively working on a 2.0 beta since then. Since the release is older, I would vote to pin it; it seems unlikely they will release any new minor/patch releases for v1. I can't find any timelines for the beta but have an open question in their Slack about it. @gshank were you able to find anything about timelines?
As I recall, the betterproto 2.0 beta has some great features but definitely some breaking/unexpected behavior. Not locking to the current major version seems risky.
Also, there can be unexpected behavior if we compile with a different betterproto version than we use at runtime.
Pinned to 1.2.5
dev-requirements.txt
Outdated
@@ -1,3 +1,4 @@
betterproto[compiler]
we should lock this version as well
Locked
* use extra dict
* remove commented out code
* fix duplicates
* add test
* consolidate
* working, but needs some cleanup
* convert back to globals
* reset globals in tests
* fix up some imports
* fix text for windows
* modify test for windows
…s/dbt-core into ct-1047-proto_structured_logging
Just a few small things to tweak, overall looks good!
core/dbt/events/README.md
Outdated
Note that no attributes can exist in these event classes except for fields defined in the protobuf definitions, because the betterproto metaclass will throw an error. Betterproto provides a to_dict() method to convert the generated classes to a dictionary and from that to json. However some attributes will successfully convert to dictionaries but not to serialized protobufs, so we need to test both output formats.jjkkkkkkkkkkj
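A minimal sketch of testing both output formats described in that paragraph; the event object and its fields are assumptions to be replaced with a real generated class from core/dbt/events:

```python
def check_both_formats(event) -> None:
    # Dict path, used for the json log format.
    as_dict = event.to_dict()
    assert isinstance(as_dict, dict)

    # Binary protobuf path. Some values survive to_dict() but fail here,
    # which is why both formats need coverage.
    wire = bytes(event)
    round_tripped = type(event)().parse(wire)
    assert round_tripped.to_dict() == as_dict
```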
I'm guessing the jjkkkkkkkkkkj at the end was accidental?
pfft
core/dbt/events/README.md
Outdated
- a string attribute `code`, that's unique across events
- assign a log level by extending `DebugLevel`, `InfoLevel`, `WarnLevel`, or `ErrorLevel`
- a method `code`, that's unique across events
- assign a log level by using the Level misin: `DebugLevel`, `InfoLevel`, `WarnLevel`, or `ErrorLevel`
- assign a log level by using the Level misin: `DebugLevel`, `InfoLevel`, `WarnLevel`, or `ErrorLevel`
- assign a log level by using the Level mixin: `DebugLevel`, `InfoLevel`, `WarnLevel`, or `ErrorLevel`
class AdapterEventInfo(InfoLevel, AdapterEventBase, ShowException):
    code: str = "E002"

    def message(self):
        return f"Running with dbt{self.version}"
Should this include the log version as well? You're passing it in in main.py.
That would change the terminal output, which I don't think is necessary for the log format; I just wanted to put it in a structured entry someplace. We can discuss what to do about the log format.
### Description

Fixes linter errors caused by [upstream changes](dbt-labs/dbt-core#5954).

```
dbt/adapters/databricks/connections.py:440: error: Argument "conn_name" to "ConnectionUsed" has incompatible type "Optional[str]"; expected "str" [arg-type]
dbt/adapters/databricks/connections.py:449: error: Argument "conn_name" to "SQLQuery" has incompatible type "Optional[str]"; expected "str" [arg-type]
dbt/adapters/databricks/connections.py:491: error: Argument "conn_name" to "ConnectionUsed" has incompatible type "Optional[str]"; expected "str" [arg-type]
dbt/adapters/databricks/connections.py:496: error: Argument "conn_name" to "SQLQuery" has incompatible type "Optional[str]"; expected "str" [arg-type]
```
resolves #5662
Description
Initial changes for using proto messages in logging events.
Checklist
- `changie new` to create a changelog entry