Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add access log gRPC interceptor & Update observability features #852

Closed
wants to merge 15 commits into from

Conversation

rinx
Copy link
Contributor

@rinx rinx commented Nov 19, 2020

Description:

Changes

  • Add access log gRPC interceptor
    • Add internal/net/grpc/interceptor/server/logging package
    • By appending "AccessLogInterceptor" to server_config.servers.grpc.server.grpc.interceptors field in config.yaml, it will be enabled.
{"level":"info","ts":1606366367.1742995,"msg":"rpc completed","details":{"grpc":{"kind":"unary","service":"core.Agent","method":"IndexInfo"},"startTime":1606366367.1742105,"latency":0.0000113,"traceID":"bd2375a0b3aa4a2607d042416c56420d"}}
  • Add zap logger

    • it is better JSON formatting than the existing logging library.
  • Update Grafana dashboards

  • Add values-observability-demo.yaml to charts/vald

Known problem

  • when logger=glg and format=raw, it cannot print a valid structured log. please use zap instead or disable AccessLogInterceptor.
2020-11-26 04:50:33     [INFO]: rpc completed &{0xc00024cc30 1.6063662337968597e+09 1.45e-05 86c42bd1164bf434f95b3f51f9d5b755 <nil>}

Related Issue:

nothing

How Has This Been Tested?:

nothing

Environment:

  • Go Version: 1.15.2
  • Docker Version: 19.03.8
  • Kubernetes Version: 1.18.2
  • NGT Version: 1.12.1

Types of changes:

  • Bug fix [type/bug]
  • New feature [type/feature]
  • Add tests [type/test]
  • Security related changes [type/security]
  • Add documents [type/documentation]
  • Refactoring [type/refactoring]
  • Update dependencies [type/dependency]
  • Update benchmarks and performances [type/bench]
  • Update CI [type/ci]

Changes to Core Features:

  • Have you added an explanation of what your changes do and why you'd like us to include them?
  • Have you written new tests for your core changes, as applicable?
  • Have you successfully ran tests with your changes locally?

Checklist:

  • I have read the CONTRIBUTING document.
  • I have checked open Pull Requests for the similar feature or fixes?
  • I have added tests and benchmarks to cover my changes.
  • I have ensured all new and existing tests passed.
  • I have commented my code, particularly in hard-to-understand areas
  • I have updated the documentation accordingly.

@vdaas-ci
Copy link
Collaborator

[CHATOPS:HELP] ChatOps commands.

  • 🙆‍♀️ /approve - approve
  • 💌 /changelog - add changelog comment
  • 🍱 /format - format codes and add licenses
  • /gen-test - generate test codes
  • 🏷️ /label - add labels
  • /rebase - rebase master

@codecov
Copy link

codecov bot commented Nov 19, 2020

Codecov Report

Merging #852 (3e558e5) into master (7a78959) will increase coverage by 0.11%.
The diff coverage is 77.57%.

Impacted file tree graph

@@            Coverage Diff             @@
##           master     #852      +/-   ##
==========================================
+ Coverage   15.93%   16.05%   +0.11%     
==========================================
  Files         474      475       +1     
  Lines       23693    23786      +93     
==========================================
+ Hits         3776     3818      +42     
- Misses      19680    19722      +42     
- Partials      237      246       +9     
Impacted Files Coverage Δ
internal/log/format/format.go 100.00% <ø> (ø)
internal/log/logger/type.go 100.00% <ø> (ø)
...ternal/observability/metrics/agent/core/ngt/ngt.go 0.00% <0.00%> (ø)
internal/log/zap/zap.go 94.20% <94.20%> (ø)
internal/log/zap/option.go 100.00% <100.00%> (ø)
internal/servers/server/option.go 95.55% <100.00%> (+0.20%) ⬆️
internal/db/storage/blob/s3/reader/reader.go 63.04% <0.00%> (-31.53%) ⬇️
internal/worker/queue.go 95.00% <0.00%> (-3.34%) ⬇️
internal/log/option.go
... and 2 more

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update e1941bd...3e558e5. Read the comment docs.

@rinx rinx force-pushed the feature/internal/add-access-log-grpc-interceptor branch from 1663540 to f7dcc29 Compare November 25, 2020 01:13
@github-actions github-actions bot added team/set SET team size/XL and removed size/L labels Nov 25, 2020
internal/log/zap/zap.go Show resolved Hide resolved
internal/log/zap/zap.go Show resolved Hide resolved
@rinx rinx force-pushed the feature/internal/add-access-log-grpc-interceptor branch from f7dcc29 to a2329f0 Compare November 25, 2020 01:23
@rinx rinx force-pushed the feature/internal/add-access-log-grpc-interceptor branch from a2329f0 to 55b731a Compare November 25, 2020 08:35
@github-actions github-actions bot added size/XXL and removed size/XL labels Nov 25, 2020
@rinx rinx force-pushed the feature/internal/add-access-log-grpc-interceptor branch 3 times, most recently from 13f0884 to eb7a89f Compare November 26, 2020 02:47
@rinx rinx force-pushed the feature/internal/add-access-log-grpc-interceptor branch 2 times, most recently from 89771b9 to 1444329 Compare November 26, 2020 06:16
import (
"github.com/vdaas/vald/internal/log/format"
"github.com/vdaas/vald/internal/log/level"

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[golangci] reported by reviewdog 🐶
File is not gci-ed (gci)

type Option func(l *logger)

var (
defaultOpts = []Option{
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[golangci] reported by reviewdog 🐶
defaultOpts is a global variable (gochecknoglobals)

)

var (
zapcore_NewConsoleEncoder = zapcore.NewConsoleEncoder
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[golangci] reported by reviewdog 🐶
zapcore_NewConsoleEncoder is a global variable (gochecknoglobals)


var (
zapcore_NewConsoleEncoder = zapcore.NewConsoleEncoder
zapcore_NewJSONEncoder = zapcore.NewJSONEncoder
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[golangci] reported by reviewdog 🐶
zapcore_NewJSONEncoder is a global variable (gochecknoglobals)

@@ -0,0 +1,56 @@
//
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[golangci] reported by reviewdog 🐶
package comment should be of the form "Package zap ..." (golint)

@@ -0,0 +1,189 @@
//
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[golangci] reported by reviewdog 🐶
package comment should be of the form "Package zap ..." (golint)

}

// New returns a new logger instance.
func New(opts ...Option) (*logger, error) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[golangci] reported by reviewdog 🐶
exported func New returns unexported type *github.com/vdaas/vald/internal/log/zap.logger, which can be annoying to use (golint)

// New returns a new logger instance.
func New(opts ...Option) (*logger, error) {
l := new(logger)
for _, opt := range append(defaultOpts, opts...) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[golangci] reported by reviewdog 🐶
ranges should only be cuddled with assignments used in the iteration (wsl)

// See the License for the specific language governing permissions and
// limitations under the License.
//
package zap
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[golangci] reported by reviewdog 🐶
package should be zap_test instead of zap (testpackage)

// See the License for the specific language governing permissions and
// limitations under the License.
//
package zap
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[golangci] reported by reviewdog 🐶
package should be zap_test instead of zap (testpackage)

@rinx rinx force-pushed the feature/internal/add-access-log-grpc-interceptor branch from 0402bb6 to 1f66615 Compare December 8, 2020 08:31
@rinx rinx marked this pull request as draft January 12, 2021 07:11
@rinx rinx force-pushed the feature/internal/add-access-log-grpc-interceptor branch from f699079 to ee598eb Compare January 20, 2021 04:29
@rinx rinx marked this pull request as ready for review January 20, 2021 04:39
@rinx rinx requested review from hlts2 and kpango January 20, 2021 04:39
@rinx rinx force-pushed the feature/internal/add-access-log-grpc-interceptor branch from ad99d8b to df36f12 Compare January 20, 2021 06:31
Comment on lines +88 to +90
l.logger = zap.New(core, opts...)

l.sugar = l.logger.Sugar()
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we should be define finelizer for logger & sugar.
zap has internal buffering we should call .Sync() method when application exits

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

rinx added 8 commits January 20, 2021 16:12
Signed-off-by: Rintaro Okamura <[email protected]>

:white_check_mark: fix tests for logger type, format

Signed-off-by: Rintaro Okamura <[email protected]>
Signed-off-by: Rintaro Okamura <[email protected]>
Signed-off-by: Rintaro Okamura <[email protected]>

resp, err = handler(ctx, req)

latency := float64(time.Since(start)) / float64(time.Second)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it would be better to store time as unix nano second scale.
then we can calculate it on the monitoring query

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

okay i'll revise it.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

thank you!

rinx added 5 commits January 20, 2021 16:18
Signed-off-by: Rintaro Okamura <[email protected]>

:wrench: update Grafana dashboards & Prometheus config

Signed-off-by: Rintaro Okamura <[email protected]>

:pencil2: Use same rule for task naming

Signed-off-by: Rintaro Okamura <[email protected]>
Signed-off-by: Rintaro Okamura <[email protected]>
Signed-off-by: Rintaro Okamura <[email protected]>
Signed-off-by: Rintaro Okamura <[email protected]>
Copy link
Collaborator

@kpango kpango left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Others LGTM

internal/net/grpc/interceptor/server/logging/accesslog.go Outdated Show resolved Hide resolved
entity.Error = err
log.Error("rpc completed", entity)
} else {
log.Info("rpc completed", entity)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think debug is better

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

AccessLogInterceptor is disabled by default, so i think it is okay to use Info level here.
I think it is too many logs when turning on Debug logs (such as Agent's CreateIndex operation log). it is better to put access logs in different level compared to them.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If there is a lot of access, it will be difficult to get other Info logs according to the number of accesses, so I think this Interceptor is used for Debug, I think Debug is fine.

The Info log should contain more important information, and I think it is too much to output every successful access.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The users of this AccessLogInterceptor will use Loki or something to query the aggregated logs. so there's no problem if there is a lot of accesses.
If access logs is Debug level, it is bigger problem that meaningless Debug level logs require large storage capacity to store them.

Also I think it is correct according to contributing guide. https://github.com/vdaas/vald/blob/master/docs/contributing/coding-style.md#logging

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

hmm, but I'd like to divide trace log and Info log, so how about make new log level called TRACE?


if err != nil {
entity.Error = err
log.Error("rpc completed", entity)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

may be you want print entity on error log right?
so, entity print should be %#v print pragma because of entity is pointer type , error log will show thier address only...

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We should use log.Error instead of log.Errorf here because Zap logger prints the second argument as the "details" field.
https://pkg.go.dev/go.uber.org/zap#Logger.Info
Please see the description of this PR. There's an example.

So you know, there's a known problem (as described in the description) when using Glg logger.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since the log.Error interface is independent of zap, it should not be implemented to work only with zap.
Even if we specify a standard Go logger here, if the group of arguments is the same interface, the address will be displayed as in Glg.
I think it would be more appropriate to consider a more explicit Print method, but I'm not sure.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

hmm...
I don't know how to print correctly structured JSON logs with that format without using log.Error.
Okay, I'll close this PR.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

wait, you don't need to close this pr.
let's discuss about it.

this pr feature is really effective.

@rinx rinx force-pushed the feature/internal/add-access-log-grpc-interceptor branch from df36f12 to 3e558e5 Compare January 20, 2021 07:25
@rinx rinx closed this Jan 20, 2021
@kpango kpango deleted the feature/internal/add-access-log-grpc-interceptor branch February 23, 2021 08:59
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants