Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat: development of Loggly logging plugin #6113

Merged
merged 14 commits into from
Jan 26, 2022

Conversation

bisakhmondal
Copy link
Member

@bisakhmondal bisakhmondal commented Jan 14, 2022

Signed-off-by: Bisakh Mondal [email protected]

What this PR does / why we need it:

#6112

Pre-submission checklist:

  • Did you explain what problem does this PR solve? Or what new features have been added?
  • Have you added corresponding test cases?
  • Have you modified the corresponding document?
  • Is this PR backward compatible? If it is not backward compatible, please discuss on the mailing list first

@bisakhmondal bisakhmondal added the enhancement New feature or request label Jan 14, 2022
@bisakhmondal
Copy link
Member Author

The current logger implementation
image

@membphis membphis requested a review from shuaijinchao January 15, 2022 09:24
local _M = { version = 0.1 }
local _M = {
version = 0.1,
severity = Severity
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why we need to update this code?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

one PR for one thing. so I think we need to revert this PR

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi, thanks for the code review and the concern. The current PR need to access the rfc5424 severity map to decode alert status into integer numbers.

local message = {
-- facility LOG_USER - random user level message
"<".. tostring(8 + severity[conf.severity]) .. ">1", -- 1
timestamp, -- timestamp
ctx.var.host or "-", -- hostname
"apisix", -- appname
ctx.var.pid, -- proc-id
"-", -- msgid
"[" .. conf.customer_token .. "@41058 " .. tab_concat(taglist, " ") .. "]",
json_str
}

I have made the field exported here from that module. Okay, reverting the changes back with a local copy of that severity map inside the plugin. Thank you

@juzhiyuan juzhiyuan requested a review from totemofwolf January 16, 2022 09:12
Signed-off-by: Bisakh Mondal <[email protected]>
Signed-off-by: Bisakh Mondal <[email protected]>
Signed-off-by: Bisakh Mondal <[email protected]>
Signed-off-by: Bisakh Mondal <[email protected]>
@bisakhmondal bisakhmondal marked this pull request as ready for review January 19, 2022 06:56
Signed-off-by: Bisakh Mondal <[email protected]>
Copy link
Member

@membphis membphis left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

need to update the README.md

image

apisix/plugins/loggly.lua Outdated Show resolved Hide resolved
apisix/plugins/loggly.lua Outdated Show resolved Hide resolved
"debug", "info", "notice", "warning", "err", "crit", "alert", "emegr"},
description = "base severity log level",
},
include_req_body = {type = "boolean", default = false},
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could we refactor it to avoid adding the same group of fields to each logger?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

And you miss to call log_util.collect_body(conf, ctx).

Copy link
Member Author

@bisakhmondal bisakhmondal Jan 20, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could we refactor it to avoid adding the same group of fields to each logger?

Yes, similar to what we did to batchprocessor. Let's handle that in the next PR.

RE collect_body: Good catch. Adding, Thanks

entry = log_util.get_full_log(ngx, conf)
end

if conf.prefer_name then
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It seems we don't need this as we support customizing format?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ack.

docs/en/latest/plugins/loggly.md Outdated Show resolved Hide resolved
apisix/plugins/loggly.lua Outdated Show resolved Hide resolved
local metadata = plugin.plugin_metadata(plugin_name)
local entry

if metadata and metadata.value.log_format
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's refactor it in the next PR

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yes

@bisakhmondal
Copy link
Member Author

bisakhmondal commented Jan 20, 2022

need to update the README.md

Yes definitely. Do you want this to be handled in this PR or as usual a small PR with a readme update?
Thanks.

local function handle_log(entries)
local ok, err
for i = 1, #entries do
ok, err = send_data_over_udp(entries[i])
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

do we need to check the return value?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, if in case, the batch processor fails to process the entries (network error or something like that) the batch will be retried with a retry policy.
ref:

function execute_func(premature, self, batch)
if premature then
return
end
local ok, err = self.func(batch.entries, self.batch_max_size)
if not ok then
core.log.error("Batch Processor[", self.name,
"] failed to process entries: ", err)
batch.retry_count = batch.retry_count + 1
if batch.retry_count <= self.max_retry_count then
schedule_func_exec(self, self.retry_delay,
batch)
else
core.log.error("Batch Processor[", self.name,"] exceeded ",
"the max_retry_count[", batch.retry_count,
"] dropping the entries")
end
return
end
core.log.debug("Batch Processor[", self.name,
"] successfully processed the entries")
end

However, seeing your comment make me think that we can migrate to a fail-fast approach. It has other benefits.

    for i = 1, #entries do
        local ok, err = send_data_over_udp(entries[i])
        if not ok then
            return false, err
        end
    end
    return true

Thank you.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Look like we need to remove the sent entries? Otherwise, they will be retried.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, that's an issue here. The way batchprocessor is written, it is meant for consuming each batch in one go (transactional). But in the case of loggly, only bulk events can be sent through HTTP endpoints. So I can think of three potential resolution

  1. if err and i==1 -> then return false else true | if some entries has been sent that's ok. we are loosing the rest of the entries. [some entries will be dropped]
  2. use the current behaviour. where some entries might be retried again [duplocation].
  3. Update batch processor to have this fine-grained control
    function execute_func(premature, self, batch)
    if premature then
    return
    end
    local ok, err = self.func(batch.entries, self.batch_max_size)
    if not ok then
    core.log.error("Batch Processor[", self.name,
    "] failed to process entries: ", err)
    batch.retry_count = batch.retry_count + 1
    if batch.retry_count <= self.max_retry_count then
    schedule_func_exec(self, self.retry_delay,
    batch)
    else
    core.log.error("Batch Processor[", self.name,"] exceeded ",
    "the max_retry_count[", batch.retry_count,
    "] dropping the entries")
    end
    return
    end
    core.log.debug("Batch Processor[", self.name,
    "] successfully processed the entries")
    end

update

local ok, err = self.func(batch.entries, self.batch_max_size)

-- to this
local ok, err, n = self.func(batch.entries, self.batch_max_size)
 -- shorten self.entries based on n

which one do you think would be good? cc @spacewander

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

here n denotes the number of processed entries, and the rest of the entries should be retried.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@spacewander
Copy link
Member

need to update the README.md

Yes definitely. Do you want this to be handled in this PR or as usual a small PR with a readme update? Thanks.

I suggest doing it in the next PR.

--- grep_error_log eval
qr/message received: [ -~]+/
--- grep_error_log_out eval
qr/message received: <10>1 [\d\-T:.]+Z [\d.]+ apisix [\d]+ - \[token-1\@41058 ] \{"apisix_latency":[\d.]*,"client_ip":"127\.0\.0\.1","latency":[\d.]*,"request":\{"headers":\{"content-type":"application\/x-www-form-urlencoded","host":"127\.0\.0\.1:1984","user-agent":"lua-resty-http\/[\d.]* \(Lua\) ngx_lua\/[\d]*"\},"method":"GET","querystring":\{\},"size":[\d]+,"uri":"\/opentracing","url":"http:\/\/127\.0\.0\.1:1984\/opentracing"\},"response":\{"headers":\{"connection":"close","content-type":"text\/plain","server":"APISIX\/[\d.]+","transfer-encoding":"chunked"\},"size":[\d]*,"status":200\},"route_id":"1","server":\{"hostname":"[ -~]*","version":"[\d.]+"\},"service_id":"","start_time":[\d]*,"upstream":"127\.0\.0\.1:1982","upstream_latency":[\d]*\}/
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We need to configure include_resp_body_expr & include_resp_body to get the body.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done. Also added case for metadata.log_format

@spacewander spacewander merged commit 53420f6 into apache:master Jan 26, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New feature or request
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants