Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Feat/logger and api logger #69

Merged
merged 10 commits into from
Nov 20, 2020
Merged

Feat/logger and api logger #69

merged 10 commits into from
Nov 20, 2020

Conversation

kwajiehao
Copy link
Contributor

Overview

This PR goes some way to addressing the logging needs of the project. It accomplishes the following:

  1. Add a CloudWatchLogger class which uses the Winston package to log directly to CloudWatch
  2. Add an apiLogger middleware which logs all API requests to our server
  3. Retrieve a user's GitHub user ID during the authentication phase and attaches it to their JWT

These changes help with log reliability, auditing, and log identifiability.

  • Reliability: elastic beanstalk instances batch log statements before sending them to CloudWatch with its own daemon process. If the instance dies, then we might lose logs which have been batched up but not sent. Using an external logger such as Winston allows us to log directly to CloudWatch instantaneously instead of being afraid of losing logs when an instance dies.
  • Audit: by logging every API request made, we can maintain an audit log of what every user has done, and when. This is important since the government often conducts audits and needs audit logs for administrative purposes.
  • Identifiability: currently, we are unable to identify users from their API requests, since the only information we have about them is their access token, which is a random jumble of characters. By retrieving their user id, we can reconcile actions taken with the identity of the user taking that action, which helps with the audit process.

Copy link
Contributor

@alexanderleegs alexanderleegs left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Left comments. Also, regarding the information that we log in the API logger - does the request body have any important information we should be logging? It's mostly made of encrypted content, which may not be very useful in debugging - let me know what you think about this

middleware/auth.js Outdated Show resolved Hide resolved
Jie Hao Kwa added 9 commits November 20, 2020 10:08
This commit installs the necessary dependencies for the logger
component. The reason we need a separate logger, which logs to
CloudWatch directly, is because the EC2 instance that our server
is running on might batches log statements before sending them to
CloudWatch. Logs might fail to get sent if the EC2 cloudwatch daemon
is corrupted or fails in some way. This external logger will send
logs instantly to CloudWatch so we have a fallback source of logs.

We chose to use the Winston logger as we have had success using it
for the homer project. this commit also installs the aws sdk as we
need to retrieve the EC2 instance id to log to the correct
cloudwatch log group.
This commit adds a CloudWatchLogger class which uses the singleton
pattern. Since a logger does not hold any state, it makes sense to
use the singleton pattern.

This logger attempts to connect to and log directly to cloudwatch
if in a production environment (NODE_ENV = prod or staging) and
logs directly to console if in dev mode. In other words, in a prod
environment we log both directly to cloudwatch and to console, so we
have two sets of logs as a fallback mechanism. In particular, our
cloudwatch only logs provide a clean application log which make it
easier to debug our own operations.
This commit removes try-catch statements that we were supposed to
remove in PR #63 but which we forgot to do so. These blocks are
no longer needed since we have a central error handler which deals
with errors.
This commit replaces existing usage of console.log with our
new logger class' `logger.info` method
This commit retrieves a user's GitHub id during the login
authentication stage for auditing purposes so that we can
log user actions when they perform actions on the CMS.
This commit extracts the newly added user_id attribute from the JWT
during the request authentication stage and attaches it to the request
object. This allows us to identify user logs by their GitHub user id
instead of an unidentifiable access token string.
Previously, we erroneously assigned the logout endpoint the `noVerify`
auth function. However, the logout endpoint should not be accesible
to users who have not logged in, therefore it should be a protected
endpoint.
This commit adds an api logger middleware which logs all api requests
that have been submitted to the server for auditing purposes. Note that
this middleware is added AFTER the auth middleware instead of before -
that's because we're reliant on the auth middleware to add the user id
to the request object which will allow our api logs to be identifiable.
This commit modifies the API logger so that we log the request body.
The request body usually contains encrypted post content, and the relevant
file sha, which isn't important to us when we're sifting the logs.
Furthermore, we can identify the files being changes simply by looking
at the request path and date-time of the request - from these, we would
be able to identify the file being changed and also identify the file
version from the date-time of teh request.
that useful to us when we
@kwajiehao
Copy link
Contributor Author

Left comments. Also, regarding the information that we log in the API logger - does the request body have any important information we should be logging? It's mostly made of encrypted content, which may not be very useful in debugging - let me know what you think about this

Removed logging of request body in 71f2ffc

Currently, the logout endpoint is protected by the `verifyJwt` auth
function. However, this prevents auto-logout functionality for when
a user's token expires, leaving them stuck in the UI with a useless
JWT which is incapable of making successful API calls.

This commit reverts the endpoint to be unprotected to fix this problem.
Copy link
Contributor

@alexanderleegs alexanderleegs left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lgtm

@kwajiehao kwajiehao merged commit 006f0fd into staging Nov 20, 2020
@kwajiehao kwajiehao deleted the feat/api-logger branch November 20, 2020 04:45
harishv7 pushed a commit that referenced this pull request Feb 17, 2023
* chore: install dependencies needed for logger component

This commit installs the necessary dependencies for the logger
component. The reason we need a separate logger, which logs to
CloudWatch directly, is because the EC2 instance that our server
is running on might batches log statements before sending them to
CloudWatch. Logs might fail to get sent if the EC2 cloudwatch daemon
is corrupted or fails in some way. This external logger will send
logs instantly to CloudWatch so we have a fallback source of logs.

We chose to use the Winston logger as we have had success using it
for the homer project. this commit also installs the aws sdk as we
need to retrieve the EC2 instance id to log to the correct
cloudwatch log group.

* feat: add singleton CloudWatchLogger class

This commit adds a CloudWatchLogger class which uses the singleton
pattern. Since a logger does not hold any state, it makes sense to
use the singleton pattern.

This logger attempts to connect to and log directly to cloudwatch
if in a production environment (NODE_ENV = prod or staging) and
logs directly to console if in dev mode. In other words, in a prod
environment we log both directly to cloudwatch and to console, so we
have two sets of logs as a fallback mechanism. In particular, our
cloudwatch only logs provide a clean application log which make it
easier to debug our own operations.

* chore: remove try-catch statements

This commit removes try-catch statements that we were supposed to
remove in PR #63 but which we forgot to do so. These blocks are
no longer needed since we have a central error handler which deals
with errors.

* feat: replace existing usage of console.log

This commit replaces existing usage of console.log with our
new logger class' `logger.info` method

* feat: retrieve user id for logging purposes

This commit retrieves a user's GitHub id during the login
authentication stage for auditing purposes so that we can
log user actions when they perform actions on the CMS.

* feat: add userId attribute to request object in auth middleware

This commit extracts the newly added user_id attribute from the JWT
during the request authentication stage and attaches it to the request
object. This allows us to identify user logs by their GitHub user id
instead of an unidentifiable access token string.

* fix: assign verifyJwt auth function to logout endpoint

Previously, we erroneously assigned the logout endpoint the `noVerify`
auth function. However, the logout endpoint should not be accesible
to users who have not logged in, therefore it should be a protected
endpoint.

* feat: add api-logging middleware

This commit adds an api logger middleware which logs all api requests
that have been submitted to the server for auditing purposes. Note that
this middleware is added AFTER the auth middleware instead of before -
that's because we're reliant on the auth middleware to add the user id
to the request object which will allow our api logs to be identifiable.

* fix: remove logging of request body

This commit modifies the API logger so that we log the request body.
The request body usually contains encrypted post content, and the relevant
file sha, which isn't important to us when we're sifting the logs.
Furthermore, we can identify the files being changes simply by looking
at the request path and date-time of the request - from these, we would
be able to identify the file being changed and also identify the file
version from the date-time of teh request.
that useful to us when we

* fix: unprotect logout endpoint

Currently, the logout endpoint is protected by the `verifyJwt` auth
function. However, this prevents auto-logout functionality for when
a user's token expires, leaving them stuck in the UI with a useless
JWT which is incapable of making successful API calls.

This commit reverts the endpoint to be unprotected to fix this problem.

Co-authored-by: Jie Hao Kwa <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants