Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

CDRIVER-3775 mongoc_structured_log #1795

Open
wants to merge 125 commits into
base: master
Choose a base branch
from
Open

Conversation

mdbmes
Copy link
Contributor

@mdbmes mdbmes commented Nov 19, 2024

This is a revival of an old pull request by @alcaeus to add structured logging to the C driver. (#684)

Structured logging is a new subsystem, independent from unstructured logging. The original PR introduced structured logging with a global scope, similar in design to unstructured logging. This version redesigns it to be per-client or per-pool. Internally, there is a structured logging "instance" owned by the mongoc_topology_t. A separate public mongoc_structured_log_opts_t object provides a configuration API, and the opts can then be passed on to the client or client pool in order to apply new log settings.

The internal API has been redesigned. The previous PR required each new type of log message to have functions and structures associated with that specific message format. In this redesign, I've split the single callback into a variable length list of callbacks, and added a set of macros to make it straightforward to build these function tables. This has some nice properties:

  • Submitting a log entry still doesn't require any deep copies, json serialization, or any memory allocation, only a stack-allocated table of references.
  • Now the mongoc_structured_log() call site explicitly names the keys that are included in the resulting document.
  • Now it's easy to define ad-hoc log entries or add new values to existing log entries.
  • The list-based approach also makes it easy to define reusable building blocks. This is immediately useful for command redaction and for server descriptions.

Here's a sample invocation:

   mongoc_structured_log (
      client->topology->structured_log,
      MONGOC_STRUCTURED_LOG_LEVEL_DEBUG,
      MONGOC_STRUCTURED_LOG_COMPONENT_COMMAND,
      "Command started",
      int32 ("requestId", client->cluster.request_id),
      server_description (server_stream->sd, SERVER_HOST, SERVER_PORT, SERVER_CONNECTION_ID, SERVICE_ID),
      utf8_n ("databaseName", cursor->ns, cursor->dblen),
      utf8 ("commandName", "getMore"),
      int64 ("operationId", cursor->operation_id),
      bson_as_json ("command", &doc));

The macros are explained by doc comments in mongoc-structured-log-private.h.

This PR includes updated unified tests from the command logging and monitoring spec, which now pass. This required several other changes to the unified test runner.

Contents:

  • Implement and document a structured logging facility
  • Structured logging items for command and reply redaction
  • Unified test runner support for: createEntities, observeLogMessages, waitForEvent, $$matchAsDocument, $$matchAsRoot
  • Add serverDescriptionChangedEvent to the unified test runner and unify its two event serialization systems
  • Sync unified tests from the command-logging-and-monitoring spec
  • Enough command logging to pass the CLAM tests
  • bson-dsl support for oid() values
  • Private utilities for dealing with zero'ed oids
  • Minor drive-by cleanup

CDRIVER-3775 is the epic for structured logging in general. This PR currently covers:

@mdbmes
Copy link
Contributor Author

mdbmes commented Dec 4, 2024

I am slightly in favor of making log handlers non-global.

With some more license to depart from the original PR's design, I think I've got a redesign that makes things both simpler and less surprising. Now there's an explicit structured log "instance", owned by the topology.

Threading is vastly simplified because I can take up the existing common restriction in client_pool where settings can't be changed after the first client is created. Instance now has an immutable copy of the log options, readable by all clients in the pool. There's no need for an awkward shadow table of log levels if we want to optimize should_log, because that transformation can be done when the instance is copying its internal opts from the supplied opts.

The documentation is now very explicit about environment variable capture happening when the opts_t is created, either explicitly via the API or internally when a client or pool is created.

@mdbmes mdbmes requested review from kevinAlbs and jmikola December 5, 2024 18:17
Copy link
Collaborator

@kevinAlbs kevinAlbs left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not yet fully reviewed. I reviewed headers, tests, and bits of the implementation. I like the concise BSON-DSL-like design for mongoc_structured_log.

src/libmongoc/doc/unstructured_log.rst Outdated Show resolved Hide resolved
src/libmongoc/src/mongoc/mongoc-structured-log.c Outdated Show resolved Hide resolved
src/libmongoc/src/mongoc/mongoc-structured-log.c Outdated Show resolved Hide resolved
src/libmongoc/src/mongoc/mongoc-structured-log.c Outdated Show resolved Hide resolved
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok

src/libmongoc/src/mongoc/mongoc-structured-log-private.h Outdated Show resolved Hide resolved
src/libmongoc/tests/unified/entity-map.c Outdated Show resolved Hide resolved
src/libmongoc/tests/unified/entity-map.c Outdated Show resolved Hide resolved
@kevinAlbs kevinAlbs self-requested a review December 11, 2024 14:43
Copy link
Collaborator

@kevinAlbs kevinAlbs left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Great work. LGTM with minor comments addressed. I like the use of atomics to avoid repeated logs from invalid environment variable values.

* and it would be more appropriate for handlers to process their log
* messages concurrently.
*/
pthread_mutex_lock (&handler_mutex);
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I expect this log handler does not require locking. Consider modifying global state to motivate locking. E.g. increment a counter to track calls:

   pthread_mutex_lock (&handler_mutex);
   log_count++;
   pthread_mutex_unlock (&handler_mutex);

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh, thanks for the suggestion. I did want to illustrate locking in the example and having a real motivation for it would help a lot.

mongoc_ssl_opt_t ssl_opts;
bool ssl_opts_set;
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Confirming: was moving fields to reduce the struct size? Was this discovered with a tool, or -Wpadded?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, just moving because order isn't otherwise significant and it bothers me seeing so much padding. I hadn't made any systematic investigation, this is just something I noticed while working on the code.

* and only before the first client is initialized. Structured logging is generally
* expected to warn but not quit when encountering initialization errors. */
if (pool->structured_log_opts_set) {
MONGOC_WARNING ("mongoc_client_pool_set_structured_log_opts can only be called once per pool");
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggest also returning false on failure. Returning false is consistent with mongoc_client_pool_set_apm_callbacks and may help inform callers of mis-use. Same comment applies to mongoc_client_set_structured_log_opts.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sure. I was on the fence about whether these funcs should expose API misuse as a runtime error that would need handling code. (Sometimes a void signature when there are no true runtime errors can be nice.) Happy to make the change for consistency.

@@ -15,12 +15,14 @@
*/


#include "mongoc-apm-private.h"
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Revert changes to this file? Includes appear not needed.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Whoops, will do.

src/libmongoc/src/mongoc/mongoc-structured-log.c Outdated Show resolved Hide resolved
bool all_ok = true;
mongoc_structured_log_level_t level;

// Errors are not fatal by detault; always reported by return value, and reported the first time only via a log
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
// Errors are not fatal by detault; always reported by return value, and reported the first time only via a log
// Errors are not fatal by default; always reported by return value, and reported the first time only via a log

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good idea!

static int err_count_per_component_atomic[STRUCTURED_LOG_COMPONENT_TABLE_SIZE];

if (_mongoc_structured_log_get_log_level_from_env ("MONGODB_LOG_ALL", &level, &err_count_all_atomic)) {
mongoc_structured_log_opts_set_max_level_for_all_components (opts, level);
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
mongoc_structured_log_opts_set_max_level_for_all_components (opts, level);
BSON_ASSERT(mongoc_structured_log_opts_set_max_level_for_all_components (opts, level));

Suggest asserting to document no failure is expected. level was validated in _mongoc_structured_log_get_log_level_from_env.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good idea

Comment on lines +321 to +322
mongoc_structured_log_opts_set_max_level_for_component (
opts, (mongoc_structured_log_component_t) component, level);
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
mongoc_structured_log_opts_set_max_level_for_component (
opts, (mongoc_structured_log_component_t) component, level);
BSON_ASSERT (mongoc_structured_log_opts_set_max_level_for_component (
opts, (mongoc_structured_log_component_t) component, level));

&failure,
MONGOC_ERROR_CONTENT_FLAG_MESSAGE | MONGOC_ERROR_CONTENT_FLAG_CODE |
MONGOC_ERROR_CONTENT_FLAG_DOMAIN);
bson_append_document_end (bson, &failure);
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
bson_append_document_end (bson, &failure);
bson_append_document_end (bson, &failure);
// Include client-side "errorLabels" if present
bson_iter_t iter;
if (bson_iter_init_find (&iter, reply, "errorLabels")) {
bson_append_iter (&failure, bson_iter_key (&iter), bson_iter_key_len (&iter), &iter);
}

Suggest also including errorLabels if set. I expect errorLabels can be included in client errors:

Any C Driver function that has a bson_t out-parameter named reply may include error labels

This example shows a network error resulting in the command failed event with errorLabels. errorLabels might help identify if the error was transient and/or if a retry was expected.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants