Add logging final version #893

Merged

benmalef
Contributor

@benmalef benmalef commented Jul 2, 2024

Fixes #755

Brief Description

This PR fixes the issues from this PR

I have tried to implement this ref

I have NOT implemented the tqdm ref. I will create a separate PR.

Screenshots

This screenshot shows how logging messages are in the file.

image

Proposed Changes

Checklist

  • CONTRIBUTING guide has been followed.
  • PR is based on the current GaNDLF master.
  • Non-breaking change (does not break existing functionality): provide as many details as possible for any breaking change.
  • Function/class source code documentation added/updated (ensure typing is used to provide type hints, including and not limited to using Optional if a variable has a pre-defined value).
  • Code has been blacked for style consistency and linting.
  • If applicable, version information has been updated in GANDLF/version.py.
  • If adding a git submodule, add to list of exceptions for black styling in pyproject.toml file.
  • Usage documentation has been updated, if appropriate.
  • Tests added or modified to cover the changes; if coverage is reduced, please give explanation.
  • If customized dependency installation is required (i.e., a separate pip install step is needed for PR to be functional), please ensure it is reflected in all the files that control the CI, namely: python-test.yml, and all docker files [1,2,3].

Contributor

github-actions bot commented Jul 2, 2024

MLCommons CLA bot All contributors have signed the MLCommons CLA ✍️ ✅

Collaborator

@sarthakpati sarthakpati left a comment

I would recommend adding a Logging section in the documentation for extending GaNDLF detailing this process and how a developer needs to use this correctly. For example, there should be 2 sub-sections to this:

  1. What does someone need to do when they are extending an existing function or class?
  2. What does someone need to do when they are adding a new function or class?
  3. Anything else?

Am I missing something @VukW?

testing/test_full.py (outdated, resolved)
GANDLF/logging_config.yaml (outdated, resolved)
@VukW
Contributor

VukW commented Jul 3, 2024

@sarthakpati I agree it would be nice to mention logging in the extension guide. However, the whole PR is designed so that all configuration is hidden from the developer, who therefore barely needs to think about it. So I don't think there is much need to describe how to create loggers in new or modified code; it would be more useful to describe how logging is currently configured and how to log things sustainably:

### Logging

#### Use loggers instead of print
We use Python's native `logging` library for log management. It is already configured, so if you are extending the code, please use loggers instead of `print` calls.
```python
import logging
import pandas as pd

def my_new_cool_function(df: pd.DataFrame, params: dict):  # `params` is an illustrative argument
    logger = logging.getLogger(__name__)  # any logger name works; __name__ (the module name) is the usual choice
    logger.debug("Message for debug file only")
    logger.info("Hi GaNDLF user, I greet you in the CLI output")
    try:
        df = df.dropna()  # placeholder for the real work
    except Exception as e:
        logger.error(f"A detailed message about any error if needed. Exception: {e}, params: {params}, df shape: {df.shape}")
    # print("Hi GaNDLF user!")  # please don't use print
```

#### What and where is logged

GaNDLF logs are split into multiple parts:
- CLI output: only `info` messages are shown here
- debug file: ...
- errors file: ...
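
One way such a split could be wired up with the standard library is sketched below. Only the "console shows `info` messages only" behaviour comes from the draft above; the contents of the debug and errors files are left unspecified there, so the choices here (everything from `DEBUG` up in one file, `ERROR` and above in the other) are assumptions, as are the handler names, file names, and the `InfoOnly` filter.

```python
import logging
import logging.config
from pathlib import Path


class InfoOnly(logging.Filter):
    """Let through INFO records only (the console behaviour described above)."""

    def filter(self, record):
        return record.levelno == logging.INFO


def configure(log_dir):
    """Route records to console, debug file, and errors file (hypothetical layout)."""
    log_dir = Path(log_dir)
    log_dir.mkdir(parents=True, exist_ok=True)
    config = {
        "version": 1,
        "disable_existing_loggers": False,
        "filters": {"info_only": {"()": InfoOnly}},
        "formatters": {"plain": {"format": "%(levelname)s - %(name)s - %(message)s"}},
        "handlers": {
            "console": {
                "class": "logging.StreamHandler",
                "stream": "ext://sys.stdout",
                "filters": ["info_only"],
                "formatter": "plain",
            },
            "debug_file": {  # assumption: receives every record from DEBUG up
                "class": "logging.FileHandler",
                "filename": str(log_dir / "debug.log"),
                "level": "DEBUG",
                "formatter": "plain",
            },
            "errors_file": {  # assumption: receives ERROR and CRITICAL only
                "class": "logging.FileHandler",
                "filename": str(log_dir / "errors.log"),
                "level": "ERROR",
                "formatter": "plain",
            },
        },
        "root": {"level": "DEBUG", "handlers": ["console", "debug_file", "errors_file"]},
    }
    logging.config.dictConfig(config)
    return log_dir
```

With this layout, a module that calls `logging.getLogger(__name__)` needs no handler setup of its own, because every record propagates to the root logger and is routed from there.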

@sarthakpati
Collaborator

Sounds good, @VukW! Thank you for the explanation. 😄

@benmalef
Contributor Author

benmalef commented Jul 3, 2024

Hi guys @VukW, @sarthakpati,
Thanks for the detailed review.
I agree with it. I will do it.

Co-authored-by: Sarthak Pati <[email protected]>
@benmalef
Contributor Author

benmalef commented Jul 4, 2024

@sarthakpati @VukW
I added a Logging section in the documentation for extending GaNDLF

GANDLF/utils/gandlf_logger.py (outdated, resolved)
GANDLF/utils/gandlf_logger.py (outdated, resolved)
Comment on lines 36 to 43
output_dir = Path(log_dir)
Path(output_dir).mkdir(parents=True, exist_ok=True)
with resources.open_text("GANDLF", config_path) as file:
    config_dict = yaml.safe_load(file)
    config_dict["handlers"]["rotatingFileHandler"]["filename"] = str(
        Path.joinpath(output_dir, "gandlf.log")
    )
    logging.config.dictConfig(config_dict)
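
The pattern in the lines above (load a packaged config, patch the file handler's `filename`, then apply it) can be tried out with the standard library alone. In the sketch below, `BASE_CONFIG` is a made-up stand-in for `logging_config.yaml`, whose real contents are not shown in this thread; only the `rotatingFileHandler` key and the `gandlf.log` file name are taken from the diff, and `configure_with_log_dir` is a hypothetical name.

```python
import copy
import logging
import logging.config
from pathlib import Path

# Hypothetical stand-in for the packaged YAML config (real contents not shown here).
BASE_CONFIG = {
    "version": 1,
    "disable_existing_loggers": False,
    "formatters": {"plain": {"format": "%(asctime)s - %(levelname)s - %(message)s"}},
    "handlers": {
        "rotatingFileHandler": {
            "class": "logging.handlers.RotatingFileHandler",
            "formatter": "plain",
            "level": "DEBUG",
            "filename": "placeholder.log",  # overwritten before the config is applied
            "maxBytes": 1_000_000,
            "backupCount": 2,
        }
    },
    "root": {"level": "DEBUG", "handlers": ["rotatingFileHandler"]},
}


def configure_with_log_dir(log_dir):
    """Point the rotating file handler at <log_dir>/gandlf.log and apply the config."""
    output_dir = Path(log_dir)
    output_dir.mkdir(parents=True, exist_ok=True)
    config = copy.deepcopy(BASE_CONFIG)  # leave the template untouched
    log_path = output_dir / "gandlf.log"
    config["handlers"]["rotatingFileHandler"]["filename"] = str(log_path)
    logging.config.dictConfig(config)
    return log_path
```

Copying the template before patching it keeps repeated calls (for example from tests) from leaking one run's path into the next.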
Collaborator

Here, I will suggest the following (pseudo-code):

try:
    write a single line to the `log_file` (something like `"Starting GaNDLF logging session"`).
except:
    # this means that the user does not have write access to the location given by `log_file`, so give that error, and tell the user that we are falling back to the default of flushing output to console
    call logging setup again but with `log_file` as `None`
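
A runnable reading of this pseudo-code, as a minimal sketch: the function name, return value, and message wording are assumptions, and plain `logging` handlers stand in for whatever the project's config file sets up. Instead of calling the setup a second time with `None`, the sketch attaches a console handler directly in the `except` branch, which has the same effect here.

```python
import logging
import sys
from pathlib import Path


def setup_logging_with_fallback(log_file):
    """Log to `log_file` if it is writable, otherwise to the console.

    Returns True when the file is used and False on fallback.
    """
    root = logging.getLogger()
    root.setLevel(logging.DEBUG)
    formatter = logging.Formatter("%(asctime)s - %(levelname)s - %(message)s")
    problem = None
    try:
        path = Path(log_file)  # raises TypeError when log_file is None
        path.parent.mkdir(parents=True, exist_ok=True)
        with path.open("a", encoding="utf-8") as fh:  # the write probe
            fh.write("Starting GaNDLF logging session\n")
        handler = logging.FileHandler(path, encoding="utf-8")
    except (OSError, TypeError) as exc:
        # No write access (or no path at all): remember why, then use the console.
        problem = str(exc)
        handler = logging.StreamHandler(sys.stderr)
    handler.setFormatter(formatter)
    root.addHandler(handler)
    if problem is not None:
        root.error("Cannot write to log file %r: %s", log_file, problem)
        root.warning("The logs will be flushed to the console")
    return problem is None
```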

Collaborator

@sarthakpati sarthakpati left a comment

Minor comments, and we should be ready to merge! Thanks a ton for this, @benmalef!

GANDLF/utils/gandlf_logger.py (outdated, resolved)
Comment on lines 60 to 71
try:
    if log_file is None:  # create tmp file
        log_tmp_file = _create_tmp_log_file()
        _save_logs_in_file(log_tmp_file, config_path)
        logging.info(f"The logs are saved in {log_tmp_file}")
    else:  # create the log file
        _create_log_file(log_file)
        _save_logs_in_file(log_file, config_path)
except Exception as e:
    _flush_to_console()
    logging.error(f"log_file:{e}")
    logging.warning("The logs will be flushed to console")
Collaborator

I would suggest the following execution order for clarity (also, try is no longer needed since the temp file should always have user-level write access):

Suggested change
-try:
-    if log_file is None:  # create tmp file
-        log_tmp_file = _create_tmp_log_file()
-        _save_logs_in_file(log_tmp_file, config_path)
-        logging.info(f"The logs are saved in {log_tmp_file}")
-    else:  # create the log file
-        _create_log_file(log_file)
-        _save_logs_in_file(log_file, config_path)
-except Exception as e:
-    _flush_to_console()
-    logging.error(f"log_file:{e}")
-    logging.warning("The logs will be flushed to console")
+log_tmp_file = log_file
+if log_file is None:  # create tmp file
+    log_tmp_file = _create_tmp_log_file()
+    logging.info(f"The logs are saved in {log_tmp_file}")
+_create_log_file(log_tmp_file)
+_save_logs_in_file(log_tmp_file, config_path)
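
For readers who want to run this flow end to end, here is a self-contained sketch under stated assumptions: the helper bodies are simplified guesses at what the private functions in this thread do, `time.time_ns()` stands in for the project's `get_unique_timestamp()`, a plain `FileHandler` replaces the YAML-driven configuration, and the name `logger_setup` is hypothetical. One deliberate difference from the suggestion: the "logs are saved in" message is emitted after logging is configured, so that it actually lands in the file.

```python
import logging
import tempfile
import time
from pathlib import Path


def create_tmp_log_file():
    """Return a fresh log path under <system temp dir>/.gandlf."""
    log_dir = Path(tempfile.gettempdir()) / ".gandlf"
    log_dir.mkdir(parents=True, exist_ok=True)
    # time.time_ns() is a stand-in for the project's get_unique_timestamp()
    return log_dir / f"{time.time_ns()}.log"


def create_log_file(log_file):
    """Create the file (and its parent directory) and write a first line."""
    log_file = Path(log_file)
    log_file.parent.mkdir(parents=True, exist_ok=True)
    log_file.write_text("Starting GaNDLF logging session \n")


def configure_logging_with_logfile(log_file):
    """Simplified stand-in: attach one file handler to the root logger."""
    handler = logging.FileHandler(log_file)  # opens in append mode by default
    handler.setFormatter(logging.Formatter("%(asctime)s - %(levelname)s - %(message)s"))
    root = logging.getLogger()
    root.setLevel(logging.DEBUG)
    root.addHandler(handler)


def logger_setup(log_file=None):
    """Use `log_file` if given, otherwise fall back to a temp file."""
    log_tmp_file = log_file
    if log_file is None:  # create tmp file
        log_tmp_file = create_tmp_log_file()
    create_log_file(log_tmp_file)
    configure_logging_with_logfile(log_tmp_file)
    logging.info(f"The logs are saved in {log_tmp_file}")
    return Path(log_tmp_file)
```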

Contributor Author

Yes, this code is cleaner. Thanks for the refactor!

Comment on lines 10 to 25
def _flush_to_console():
    formatter = colorlog.ColoredFormatter(
        "%(log_color)s%(asctime)s - %(levelname)s - %(message)s",
        datefmt="%Y-%m-%d %H:%M:%S",
        log_colors={
            "DEBUG": "blue",
            "INFO": "green",
            "WARNING": "yellow",
            "ERROR": "red",
            "CRITICAL": "bold_red",
        },
    )
    console_handler = logging.StreamHandler()
    console_handler.setFormatter(formatter)
    logging.root.setLevel(logging.DEBUG)
    logging.root.addHandler(console_handler)
Collaborator

If the try block is removed from the module below, then this function is no longer needed, right?

Suggested change
-def _flush_to_console():
-    formatter = colorlog.ColoredFormatter(
-        "%(log_color)s%(asctime)s - %(levelname)s - %(message)s",
-        datefmt="%Y-%m-%d %H:%M:%S",
-        log_colors={
-            "DEBUG": "blue",
-            "INFO": "green",
-            "WARNING": "yellow",
-            "ERROR": "red",
-            "CRITICAL": "bold_red",
-        },
-    )
-    console_handler = logging.StreamHandler()
-    console_handler.setFormatter(formatter)
-    logging.root.setLevel(logging.DEBUG)
-    logging.root.addHandler(console_handler)

Contributor Author

Yes, if we don't want to flush to the console, the function is no longer needed.

@benmalef
Contributor Author

@sarthakpati I made the proposed changes. Thanks a lot for the suggestions!

Collaborator

@sarthakpati sarthakpati left a comment

Minor semantic change. This PR should be good to merge after this.

GANDLF/utils/gandlf_logger.py (outdated, resolved)
GANDLF/utils/gandlf_logger.py (outdated, resolved)

codecov bot commented Jul 20, 2024

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 94.46%. Comparing base (b6dfe2d) to head (f425289).

Additional details and impacted files
@@                   Coverage Diff                   @@
##           new-apis_v0.1.0-dev     #893      +/-   ##
=======================================================
+ Coverage                94.41%   94.46%   +0.04%     
=======================================================
  Files                      159      160       +1     
  Lines                     9387     9482      +95     
=======================================================
+ Hits                      8863     8957      +94     
- Misses                     524      525       +1     
| Flag | Coverage Δ |
|---|---|
| unittests | 94.46% <100.00%> (+0.04%) ⬆️ |


@@ -24,7 +24,8 @@ def gandlf(ctx, loglevel):
     """GANDLF command-line tool."""
     ctx.ensure_object(dict)
     ctx.obj["LOGLEVEL"] = loglevel
-    setup_logging(loglevel)
+    # setup_logging(loglevel)
Contributor

Let's remove the redundant old loglevel code here.

tmp_dir = Path(tempfile.gettempdir())
log_dir = Path.joinpath(tmp_dir, ".gandlf")
log_dir.mkdir(parents=True, exist_ok=True)
log_file = Path.joinpath(log_dir, get_unique_timestamp() + ".log")
Contributor

👍

log_file.write_text("Starting GaNDLF logging session \n")


def _save_logs_in_file(log_file, config_path):
Contributor

Just a nitpick: this function does not save any logs; it only configures logging. Maybe rename it to something more relevant? For example:

Suggested change
-def _save_logs_in_file(log_file, config_path):
+def _configure_logging_with_logfile(log_file, config_path):

Contributor

@VukW VukW left a comment

I checked the overall code and found no significant issues. Looks good to me! Thanks, @benmalef!

@sarthakpati
Collaborator

The recent changes are good for me to merge. @VukW, if you are okay as well, let's merge this in and start the process of migrating the current master to an old_api branch and move on?

Contributor

@VukW VukW left a comment

@sarthakpati Agreed, looks good to me, let's merge it. (Just a reminder: I believe @benmalef cannot merge it while your earlier "Request changes" review is still active.)

@sarthakpati sarthakpati merged commit e36f274 into mlcommons:new-apis_v0.1.0-dev Jul 24, 2024
19 checks passed
@github-actions github-actions bot locked and limited conversation to collaborators Jul 24, 2024
@benmalef benmalef deleted the add_logging_final_version branch September 13, 2024 06:27