Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Reduce log spam #1252

Closed
humphd opened this issue Oct 29, 2020 · 1 comment · Fixed by #1782
Closed

Reduce log spam #1252

humphd opened this issue Oct 29, 2020 · 1 comment · Fixed by #1782
Assignees
Labels
Milestone

Comments

@humphd
Copy link
Contributor

humphd commented Oct 29, 2020

Our logging is great, but too noisy by default. It's nice to log all kinds of things re: feed parsing and network errors, but they fill the log with spam that doesn't matter when you're developing or trying to debug a problem on production.

Our logger, Pino, has these levels:

  trace: 'DEBUG',
  debug: 'DEBUG',
  info: 'INFO',
  warn: 'WARNING',
  error: 'ERROR',
  fatal: 'CRITICAL',

For production, we probably only care about error/critical, and for normal development, probably only info and up.

Let's audit our logging calls and change the level we use. Especially for the places where we do feed logging, let's make that debug or something vs. info.

@humphd humphd added the type: enhancement New feature or request label Oct 29, 2020
@strawberries73 strawberries73 self-assigned this Nov 25, 2020
@humphd
Copy link
Contributor Author

humphd commented Nov 25, 2020

@Jenn, working on this PR involves the following:

  1. Read the docs for https://getpino.io/ which is the module we use to do our logging.
  2. Try adding pino to your link checker so you get the idea how it works. Once you're comfortable with the concept, move to 3.
  3. Learn about the intention of different log levels in an application. From https://dzone.com/articles/logging-levels-what-they-are-and-how-they-help-you:

FATAL

Fatal represents truly catastrophic situations, as far as your application is concerned. Your application is about to abort to prevent some kind of corruption or serious problem, if possible. This entry in the log should probably result in someone getting a 3 AM phone call.

ERROR

An error is a serious issue and represents the failure of something important going on in your application. Unlike FATAL, the application itself isn't going down the tubes. Here you've got something like dropped database connections or the inability to access a file or service. This will require someone's attention probably sooner than later, but the application can limp along.

WARN

Now we're getting into the grayer area of hypotheticals. You use the WARN log level to indicate that you might have a problem and that you've detected an unusual situation. Maybe you were trying to invoke a service and it failed a couple of times before connecting on an automatic retry. It's unexpected and unusual, but no real harm done, and it's not known whether the issue will persist or recur. Someone should investigate warnings.

INFO

Finally, we can dial down the stress level. INFO messages correspond to normal application behavior and milestones. You probably won't care too much about these entries during normal operations, but they provide the skeleton of what happened. A service started or stopped. You added a new user to the database. That sort of thing.

DEBUG

With DEBUG, you start to include more granular, diagnostic information. Here, you're probably getting into "noisy" territory and furnishing more information than you'd want in normal production situations. You're providing detailed diagnostic information for fellow developers, sysadmins, etc.

TRACE

This is really fine-grained information-finer even than DEBUG. When you're at this level, you're basically looking to capture every detail you possibly can about the application's behavior. This is likely to swamp your resources in production and is seriously diagnostic.

ALL

This is just what it sounds like. Log absolutely everything, including any custom logging levels that someone has defined.

OFF

This is also just what it sounds like. Don't log anything at all.

  1. Learn how to run the Telescope backend using different log levels on your local machine so you can test things as you work. See https://github.com/Seneca-CDOT/telescope/blob/master/docs/logging.md.
  2. Audit all of our logger.<level>(...) calls throughout the code and evaluate them as follows:
    1. Start with all logger.error() and figure out which need to be made into logger.critical(). If we're going to crash, or shutdown due to this error, it's fatal not error.
    2. Next do logger.error() to see what we could downgrade to logger.warn(). A warn means something happened, but the app can recover and keep going. These are unexpected things that happened that we need to know about.
    3. Next do logger.info() to make sure these are all normal conditions for the app vs. `logger.debug(), which should be things only a developer cares about to debug the code.

Start with this and let's discuss as you go. You can ask questions here or in Slack about particular bits of code. There's no "right" way to do this (some judgement is involved), but our overall goal is to reduce the amount of noise we make in production, where we probably only care about warn, error, and critical.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging a pull request may close this issue.

5 participants