Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

sql: event log entries for ALTER/DROP TYPE do not properly include the fully qualified type name #57734

Closed
knz opened this issue Dec 9, 2020 · 3 comments · Fixed by #58257
Closed
Labels
A-logging In and around the logging infrastructure. C-bug Code not up to spec/doc, specs & docs deemed correct. Solution expected to change code/behavior.

Comments

@knz
Copy link
Contributor

knz commented Dec 9, 2020

Compare in system.eventlog:

 drop_table          |       54 |           1 | {"TableName":"defaultdb.public.t","Statement":"DROP TABLE t","User":"demo","CascadeDroppedViews":null}

 drop_type           |       52 |           1 | {"TypeName":"t","Statement":"DROP TYPE defaultdb.public.t","User":"demo"}

 alter_type           |       58 |           1 | {"TypeName":"g","Statement":"ALTER TYPE defaultdb.public.f RENAME TO g","User":"demo"}

The TypeName field is incompletely populated.

All the while, the field for CREATE TYPE is in fact properly populated.

For information, the call to p.ResolveMutableTypeDescriptor() in drop_type.go modifies its name argument in-place to produce the fully qualified name. It is thus possible to use the resulting name as argument in the event log.

However this solution is incomplete, because that only gives the name for the "main" type. The FQ name for the associated array type cannot be obtained this way.

@knz knz added C-bug Code not up to spec/doc, specs & docs deemed correct. Solution expected to change code/behavior. A-logging In and around the logging infrastructure. labels Dec 9, 2020
@knz
Copy link
Contributor Author

knz commented Dec 9, 2020

cc @thtruo - this is needed to properly productionize our DDL logging. Can you chase that up with the SQL teams?

@knz knz changed the title sql: event log entries for DROP type do not properly include the fully qualified type name sql: event log entries for DROP tyPE do not properly include the fully qualified type name Dec 9, 2020
@knz knz changed the title sql: event log entries for DROP tyPE do not properly include the fully qualified type name sql: event log entries for DROP TYPE do not properly include the fully qualified type name Dec 9, 2020
knz added a commit to knz/cockroach that referenced this issue Dec 9, 2020
CockroachDB needs to report structured events for "important" events
on the logical structure of a cluster, including changes to the SQL
logical schema, node activity, privilege changes etc.

Prior to this patch, these events were reported mainly into the
table `system.eventlog`, with a partial copy of the payload into the
debugging external log (`cockroach.log`).

This solution was incomplete and unsatisfactory in many ways:

- the event payloads were not documented.

- the event payloads were not centrally defined, which was preventing
  the generation of automatic documentation.

- the payload types were declared "inline" in the log calls, which
  made it easy for team members to inadvertently change the structure of
  the payload and make them backward-incompatible for users consuming
  this data externally.

- the payload fields were inconsistently named across event types.

- the metadata fields on the payloads were incompletely and
  inconsistently populated:

  - the SQL instance ID was missing in some cases.
  - the descriptor ID of affected descriptor was missing in some
    cases.

- the same event type was mistakenly used for different events (e.g.
  "rename_database" for both RENAME DATABASE and CONVERT TO SCHEMA)

- the same event type was abusingly over-used for multiple separate
  operations, e.g. a single event would be generated for a multi-table,
  multi-user GRANT or REVOKE operation.

- the copy in the external log was not parseable. Generally, the
  logging package was unaware of the internal structure of events
  and would “flatten” them.

- no provision was available to partially redact events. From the
  logging system's perspective, the entire payload is sensitive.

This commit changes the situation as follows:

- it centralizes the payload definitions and standardizes them into a
  new package `eventspb`.
- it enables automatic generation of documentation for events.
- it ensures that field names are consistent across event payloads.
- it ensures that event metadata is consistently populted.
- it decomposes complex GRANT/REVOKE operations into individual
  events.

(FIXME - remaining to be done:)

- it automates the generation of a reference documentation for all
  event types.
- it provide a guardrail against the introduction of new DDL
  statements without a corresponding event log.

The following problems continue to exist and need to be resolved
separately:

- privilege changes that occur as a side effect of certain operations
  do not get events logged:

  cockroachdb#57573
  cockroachdb#57576

- the name fields in certain DDL events is not properly qualified,
  which prevents the determination of the logical schema or database
  where the object was altered:

  cockroachdb#57734
  cockroachdb#57735

Release note (sql change): FIXME
@knz knz changed the title sql: event log entries for DROP TYPE do not properly include the fully qualified type name sql: event log entries for ALTER/DROP TYPE do not properly include the fully qualified type name Dec 9, 2020
knz added a commit to knz/cockroach that referenced this issue Dec 9, 2020
CockroachDB needs to report structured events for "important" events
on the logical structure of a cluster, including changes to the SQL
logical schema, node activity, privilege changes etc.

Prior to this patch, these events were reported mainly into the
table `system.eventlog`, with a partial copy of the payload into the
debugging external log (`cockroach.log`).

This solution was incomplete and unsatisfactory in many ways:

- the event payloads were not documented.

- the event payloads were not centrally defined, which was preventing
  the generation of automatic documentation.

- the payload types were declared "inline" in the log calls, which
  made it easy for team members to inadvertently change the structure of
  the payload and make them backward-incompatible for users consuming
  this data externally.

- the payload fields were inconsistently named across event types.

- the metadata fields on the payloads were incompletely and
  inconsistently populated:

  - the SQL instance ID was missing in some cases.
  - the descriptor ID of affected descriptor was missing in some
    cases.

- the same event type was mistakenly used for different events (e.g.
  "rename_database" for both RENAME DATABASE and CONVERT TO SCHEMA)

- the same event type was abusingly over-used for multiple separate
  operations, e.g. a single event would be generated for a multi-table,
  multi-user GRANT or REVOKE operation.

- the copy in the external log was not parseable. Generally, the
  logging package was unaware of the internal structure of events
  and would “flatten” them.

- no provision was available to partially redact events. From the
  logging system's perspective, the entire payload is sensitive.

This commit changes the situation as follows:

- it centralizes the payload definitions and standardizes them into a
  new package `eventspb`.
- it enables automatic generation of documentation for events.
- it ensures that field names are consistent across event payloads.
- it ensures that event metadata is consistently populted.
- it decomposes complex GRANT/REVOKE operations into individual
  events.

(FIXME - remaining to be done:)

- it automates the generation of a reference documentation for all
  event types.
- it provide a guardrail against the introduction of new DDL
  statements without a corresponding event log.

The following problems continue to exist and need to be resolved
separately:

- privilege changes that occur as a side effect of certain operations
  do not get events logged:

  cockroachdb#57573
  cockroachdb#57576
  cockroachdb#57739
  cockroachdb#57741

- the name fields in certain DDL events is not properly qualified,
  which prevents the determination of the logical schema or database
  where the object was altered:

  cockroachdb#57734
  cockroachdb#57735
  cockroachdb#57738
  cockroachdb#57740

Release note (sql change): FIXME
@thtruo
Copy link
Contributor

thtruo commented Dec 9, 2020

cc @vy-ton putting this on your radar for SQL schema. This is a bug Raphael identified while standardizing system.eventlog

knz added a commit to knz/cockroach that referenced this issue Dec 10, 2020
CockroachDB needs to report structured events for "important" events
on the logical structure of a cluster, including changes to the SQL
logical schema, node activity, privilege changes etc.

Prior to this patch, these events were reported mainly into the
table `system.eventlog`, with a partial copy of the payload into the
debugging external log (`cockroach.log`).

This solution was incomplete and unsatisfactory in many ways:

- the event payloads were not documented.

- the event payloads were not centrally defined, which was preventing
  the generation of automatic documentation.

- the payload types were declared "inline" in the log calls, which
  made it easy for team members to inadvertently change the structure of
  the payload and make them backward-incompatible for users consuming
  this data externally.

- the payload fields were inconsistently named across event types.

- the metadata fields on the payloads were incompletely and
  inconsistently populated:

  - the SQL instance ID was missing in some cases.
  - the descriptor ID of affected descriptor was missing in some
    cases.

- the same event type was mistakenly used for different events (e.g.
  "rename_database" for both RENAME DATABASE and CONVERT TO SCHEMA)

- the same event type was abusingly over-used for multiple separate
  operations, e.g. a single event would be generated for a multi-table,
  multi-user GRANT or REVOKE operation.

- the copy in the external log was not parseable. Generally, the
  logging package was unaware of the internal structure of events
  and would “flatten” them.

- no provision was available to partially redact events. From the
  logging system's perspective, the entire payload is sensitive.

This commit changes the situation as follows:

- it centralizes the payload definitions and standardizes them into a
  new package `eventspb`.
- it enables automatic generation of documentation for events.
- it ensures that field names are consistent across event payloads.
- it ensures that event metadata is consistently populted.
- it decomposes complex GRANT/REVOKE operations into individual
  events.
- it automates the generation of a reference documentation for all
  event types.

(FIXME - remaining to be done:)

- it provide a guardrail against the introduction of new DDL
  statements without a corresponding event log.

The following problems continue to exist and need to be resolved
separately:

- privilege changes that occur as a side effect of certain operations
  do not get events logged:

  cockroachdb#57573
  cockroachdb#57576
  cockroachdb#57739
  cockroachdb#57741

- the name fields in certain DDL events is not properly qualified,
  which prevents the determination of the logical schema or database
  where the object was altered:

  cockroachdb#57734
  cockroachdb#57735
  cockroachdb#57738
  cockroachdb#57740

Release note (sql change): The cluster event logging system has been
modernized. In particular, the schema of the entries for the `info`
column in `system.eventlog` has been stabilized. See the documentation
for details.

Release note (sql change): The `targetID` and `reportingID` columns in
`system.eventlog` are now deprecated. Their values, for relevant event
types, can be found as fields inside the `info` column instead.
knz added a commit to knz/cockroach that referenced this issue Dec 11, 2020
CockroachDB needs to report structured events for
"important" (notable) events on the logical structure of a cluster,
including changes to the SQL logical schema, node activity, privilege
changes etc.

Prior to this patch, these events were reported mainly into the
table `system.eventlog`, with a partial copy of the payload into the
debugging external log (`cockroach.log`).

This solution was incomplete and unsatisfactory in many ways:

- the event payloads were not documented.
- the event payloads were not centrally defined, which was preventing
  the generation of automatic documentation.
- the payload types were declared "inline" in the log calls, which
  made it easy for team members to inadvertently change the structure of
  the payload and make them backward-incompatible for users consuming
  this data externally.
- the payload fields were inconsistently named across event types.
- the metadata fields on the payloads were incompletely and
  inconsistently populated:

  - the SQL instance ID was missing in some cases.
  - the descriptor ID of affected descriptor was missing in some
    cases.

- the same event type was mistakenly used for different events (e.g.
  "rename_database" for both RENAME DATABASE and CONVERT TO SCHEMA)
- the same event type was abusingly over-used for multiple separate
  operations, e.g. a single event would be generated for a multi-table,
  multi-user GRANT or REVOKE operation.
- events could be omitted entirely from external logs, e.g.
  if the system.eventlog table was not available during a node
  restart.
- the copy in the external log was not parseable. Generally, the
  logging package was unaware of the internal structure of events
  and would “flatten” them.
- no provision was available to partially redact events. From the
  logging system's perspective, the entire payload is sensitive.

This commit changes the situation as follows:

- it centralizes the payload definitions and standardizes them into a
  new package `eventspb`.
- it groups events into categories and associates logging
  channels to each event category.
- it enables automatic generation of documentation for events.
- it ensures that field names are consistent across event payloads.
- it ensures that event metadata is consistently populted.
- it decomposes complex GRANT/REVOKE operations into individual
  events.
- it automates the generation of a reference documentation for all
  event types.
- it ensures that events are sent to external logs unconditionally.

The following problems continue to exist and need to be resolved
separately:

- privilege changes that occur as a side effect of certain operations
  do not get events logged:

  cockroachdb#57573
  cockroachdb#57576
  cockroachdb#57739
  cockroachdb#57741

- the name fields in certain DDL events is not properly qualified,
  which prevents the determination of the logical schema or database
  where the object was altered:

  cockroachdb#57734
  cockroachdb#57735
  cockroachdb#57738
  cockroachdb#57740

Release note (sql change): The cluster event logging system has been
modernized. In particular, the schema of the entries for the `info`
column in `system.eventlog` has been stabilized.

Release note (cli change): The events that were previously only stored
in `system.eventlog` are now also directed unconditionally to an
external logging channel using a JSON format.  Refer to the
configuration to see how to customize how events are directed to
external sinks. Note that the exact external output format (and thus
how to detect/parse the events from e.g. log files) is not yet
stabilized and remains subject to change.

Release note (doc change): The cluster event logging system has been
standardized. A reference documentation is now
available (auto-generated from source code); changes to non-reserved
payloads will now be announced at least one release version in
advance. The event types are organized into broad categories: SQL
Logical Schema Changes, SQL Privilege Changes, SQL User Management,
CLuster-level events and SQL Miscellaneous operations.

Release note (backward-incompatible change): The payload fields for
certain event types in `system.eventlog` have been changed and/or
renamed. Note that the payloads in `system.eventlog` were an
undocumented, reserved feature so no guarantee was made about
cross-version compatibility to this point. The list of changes
includes (but is not limited to):

- `TargetID` has been renamed to `NodeID` for `node_join`
- `TargetID` has been renamed to `TargetNodeID` for
  `node_decommissioning` / `node_decommissioned` /
  `node_recommissioned`.
- `NewDatabaseName` has been renamed to `NewDatabaseParent` for
  `convert_to_schema`.
- `grant_privilege` and `revoke_privilege` have been removed;
  they are replaced by `change_database_privilege`,
  `change_schema_privilege`, `change_type_privilege` and
  `change_table_privilege`.
  Each event only reports change for one user/role, so the
  `Grantees` field was renamed to `Grantee`.
- Each `drop_role` event now pertains to a single user/role.

Release note (sql change): The `targetID` and `reportingID` columns in
`system.eventlog` are now deprecated. Their values, for relevant event
types, can be found as fields inside the `info` column instead.
craig bot pushed a commit that referenced this issue Dec 11, 2020
57737: sql,log: productionize the event logging r=itsbilal a=knz

Informs #57629

CockroachDB needs to report structured events for
"important" (notable) events on the logical structure of a cluster,
including changes to the SQL logical schema, node activity, privilege
changes etc.

Prior to this patch, these events were reported mainly into the
table `system.eventlog`, with a partial copy of the payload into the
debugging external log (`cockroach.log`).

This solution was incomplete and unsatisfactory in many ways:

- the event payloads were not documented.
- the event payloads were not centrally defined, which was preventing
  the generation of automatic documentation.
- the payload types were declared "inline" in the log calls, which
  made it easy for team members to inadvertently change the structure of
  the payload and make them backward-incompatible for users consuming
  this data externally.
- the payload fields were inconsistently named across event types.
- the metadata fields on the payloads were incompletely and
  inconsistently populated:

  - the SQL instance ID was missing in some cases.
  - the descriptor ID of affected descriptor was missing in some
    cases.

- the same event type was mistakenly used for different events (e.g.
  "rename_database" for both RENAME DATABASE and CONVERT TO SCHEMA)
- the same event type was abusingly over-used for multiple separate
  operations, e.g. a single event would be generated for a multi-table,
  multi-user GRANT or REVOKE operation.
- events could be omitted entirely from external logs, e.g.
  if the system.eventlog table was not available during a node
  restart.
- the copy in the external log was not parseable. Generally, the
  logging package was unaware of the internal structure of events
  and would “flatten” them.
- no provision was available to partially redact events. From the
  logging system's perspective, the entire payload is sensitive.

This commit changes the situation as follows:

- it centralizes the payload definitions and standardizes them into a
  new package `eventspb`.
- it groups events into categories and associates logging
  channels to each event category.
- it enables automatic generation of documentation for events.
- it ensures that field names are consistent across event payloads.
- it ensures that event metadata is consistently populted.
- it decomposes complex GRANT/REVOKE operations into individual
  events.
- it automates the generation of a reference documentation for all
  event types.
- it ensures that events are sent to external logs unconditionally.

The following problems continue to exist and need to be resolved
separately:

- privilege changes that occur as a side effect of certain operations
  do not get events logged:

  #57573
  #57576
  #57739
  #57741

- the name fields in certain DDL events is not properly qualified,
  which prevents the determination of the logical schema or database
  where the object was altered:

  #57734
  #57735
  #57738
  #57740

Release note (sql change): The cluster event logging system has been
modernized. In particular, the schema of the entries for the `info`
column in `system.eventlog` has been stabilized.

Release note (cli change): The events that were previously only stored
in `system.eventlog` are now also directed unconditionally to an
external logging channel using a JSON format.  Refer to the
configuration to see how to customize how events are directed to
external sinks. Note that the exact external output format (and thus
how to detect/parse the events from e.g. log files) is not yet
stabilized and remains subject to change.

Release note (doc change): The cluster event logging system has been
standardized. A reference documentation is now
available (auto-generated from source code); changes to non-reserved
payloads will now be announced at least one release version in
advance. The event types are organized into broad categories: SQL
Logical Schema Changes, SQL Privilege Changes, SQL User Management,
CLuster-level events and SQL Miscellaneous operations.

Release note (backward-incompatible change): The payload fields for
certain event types in `system.eventlog` have been changed and/or
renamed. Note that the payloads in `system.eventlog` were an
undocumented, reserved feature so no guarantee was made about
cross-version compatibility to this point. The list of changes
includes (but is not limited to):

- `TargetID` has been renamed to `NodeID` for `node_join`
- `TargetID` has been renamed to `TargetNodeID` for
  `node_decommissioning` / `node_decommissioned` /
  `node_recommissioned`.
- `NewDatabaseName` has been renamed to `NewDatabaseParent` for
  `convert_to_schema`.
- `grant_privilege` and `revoke_privilege` have been removed;
  they are replaced by `change_database_privilege`,
  `change_schema_privilege`, `change_type_privilege` and
  `change_table_privilege`.
  Each event only reports change for one user/role, so the
  `Grantees` field was renamed to `Grantee`.
- Each `drop_role` event now pertains to a single user/role.

Release note (sql change): The `targetID` and `reportingID` columns in
`system.eventlog` are now deprecated. Their values, for relevant event
types, can be found as fields inside the `info` column instead.

57752: kvserver: only accept raw engines for WriteInitialClusterVersion r=irfansharif a=irfansharif

We have an invariant to maintain around ensuring that all writes to this
key are durably persisted. Let's update our signature to reflect as
much.

Release note: None

Co-authored-by: Raphael 'kena' Poss <[email protected]>
Co-authored-by: irfan sharif <[email protected]>
@knz
Copy link
Contributor Author

knz commented Dec 22, 2020

Discussed with @aaron-crl : lower priority.

craig bot pushed a commit that referenced this issue Jan 5, 2021
58257: sql: ensure event log entries for ALTER/DROP TYPE are fully qualified r=knz a=rohany

Fixes #57734.

Release note (bug fix): Fully qualified names in the `TypeName` field of
the event log for `ALTER TYPE` and `DROP TYPE` statements.

Co-authored-by: Rohan Yadav <[email protected]>
@craig craig bot closed this as completed in 9db726a Jan 5, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
A-logging In and around the logging infrastructure. C-bug Code not up to spec/doc, specs & docs deemed correct. Solution expected to change code/behavior.
Projects
None yet
Development

Successfully merging a pull request may close this issue.

2 participants