feat(plugins/proxy-cache): add wildcard and parameter match support for content_type #10055

vm-001 · 2023-01-04T08:51:00Z

Summary

Add wildcard and parameter match support for content_type

Checklist

The Pull Request has tests
There's an entry in the CHANGELOG
There is a user-facing docs PR against https://github.com/Kong/docs.konghq.com - PUT DOCS PR HERE

Issue reference

FTI-1131

vm-001 · 2023-01-04T08:53:03Z

sister PR: #10030

windmgc · 2023-01-16T07:04:22Z

LGTM on the code wise, this needs to get merged after #10030

build/openresty/patches/ngx_lua-0.10.21_02-dyn_upstream_keepalive.patch

hbagdi · 2023-02-22T00:18:04Z

@vm-001 Could you please rebase?

kong/plugins/proxy-cache/handler.lua

hanshuebner · 2023-03-02T06:08:08Z

Does this PR overlap with #10374, @samugi ?

samugi · 2023-03-02T07:32:17Z

Does this PR overlap with #10374, @samugi ?

it will not create conflicts @hanshuebner because this PR is creating a new dedicated kong.tools.mime_type module and is only using it in the proxy cache plugin, it does not affect what we currently have in the pdk (moved in utils with #10374). However... we will want to unify things and only do this one way, it should be simple once we have this mime_type module to point to it from whatever function needs it in utils etc. Perhaps that should be a separate PR once these 2 are merged. What do you think?

EDIT: tracked in KAG-789

hanshuebner · 2023-03-02T07:34:31Z

Unification would be good. We don't want to have to implementations of basically the same thing. Please create a KAG ticket for visibility.

It has been confirmed that the lua-protobuf under version 0.4.2 causes the Segmentation fault(core dump) while used with lpeg. The issue was found in another PR(#10055).

Co-authored-by: Enrique García Cota <[email protected]>

…or content_type

fffonion · 2023-03-07T04:40:22Z

put a block on this PR as the original perf (#10030 (comment)) doesn't seem to prove that lpeg is faster than ngx.re

vm-001 · 2023-03-07T05:23:55Z

put a block on this PR as the original perf (#10030 (comment)) doesn't seem to prove that lpeg is faster than ngx.re

@fffonion The log of that test is reversed(due to the typo), I've just corrected that.

Here is the output from my computer:

case ngxreg: time used  : 519.99998092651 ms
case lpeg: time used  : 273.00000190735 ms

hanshuebner · 2023-03-07T06:35:47Z

put a block on this PR as the original perf (#10030 (comment)) doesn't seem to prove that lpeg is faster than ngx.re

@fffonion The log of that test is reversed(due to the typo), I've just corrected that.

Here is the output from my computer:
case ngxreg: time used  : 519.99998092651 ms
case lpeg: time used  : 273.00000190735 ms

The reason why lpeg is faster in this example program is that the regular expression to parse the parameters is evaluated multiple times, once for each parameter provided. If the whole matching is done by one regular expression evaluation, it is faster than lpeg. Let's not try to be too cheap when making arguments about performance.

The regular expression based version of the code also is easier to understand and reason about than the lpeg based version because it uses less named local state. Also, regular expressions are much more widely known than lpeg, which is a plus in terms of review- and maintainability.

vm-001 · 2023-03-07T07:00:13Z

put a block on this PR as the original perf (#10030 (comment)) doesn't seem to prove that lpeg is faster than ngx.re

@fffonion The log of that test is reversed(due to the typo), I've just corrected that.
Here is the output from my computer:
case ngxreg: time used  : 519.99998092651 ms
case lpeg: time used  : 273.00000190735 ms
The reason why lpeg is faster in this example program is that the regular expression to parse the parameters is evaluated multiple times, once for each parameter provided. If the whole matching is done by one regular expression evaluation, it is faster than lpeg. Let's not try to be too cheap when making arguments about performance.

The regular expression based version of the code also is easier to understand and reason about than the lpeg based version because it uses less named local state. Also, regular expressions are much more widely known than lpeg, which is a plus in terms of review- and maintainability.

Writing a single static regular expression able to match a content type(with multiple parameters) is almost impossible. If an example content type with parameters here is to be considered cheap. Then Feel free to remove (parameters/merge_params) and do another test.

- local media_type = (types/format_types) * (parameters/merge_params) * P(-1)
+ local media_type = (types/format_types) * P(-1)

The Regular expression is indeed easy to maintain compared to Lpeg, but the media type is something RFC and hasn't been changed for a decade. That's why I think using Lpeg for this scenario is acceptable.

hanshuebner · 2023-03-07T07:09:01Z

Writing a single static regular expression able to match a content type(with multiple parameters) is almost impossible.

I'm not sure what you mean by that. How about

(?<type>.+) \/ (?<subtype>[^ ;]+) (?:;\s*([^= ]+)=([^; ]+))*?

The performance benefit of the lpeg version is too small to warrant an implementation that is opaque and can't be reviewed unless one learns a non-standard parsing library. There is much more to be gained spending time learning how to use regular expressions.

vm-001 · 2023-03-07T07:35:38Z

Writing a single static regular expression able to match a content type(with multiple parameters) is almost impossible.

I'm not sure what you mean by that. How about

(?<type>.+) \/ (?<subtype>[^ ;]+) (?:;\s*([^= ]+)=([^; ]+))*?

The performance benefit of the lpeg version is too small to warrant an implementation that is opaque and can't be reviewed unless one learns a non-standard parsing library. There is much more to be gained spending time learning how to use regular expressions.

It's not working. Missing "charset = UTF-8"

local inspect = require("inspect")
local r = ngx.re.match("application/json;charset=UTF-8;Foo=Bar", [[(?<type>.+)\/(?<subtype>[^ ;]+)(?:;\s*([^= ]+)=([^; ]+))*]])
print(inspect(r))

--[[
{ "application", "json", "Foo", "Bar",
  [0] = "application/json;charset=UTF-8;Foo=Bar",
  subtype = "json",
  type = "application"
}
--]]

hanshuebner · 2023-03-07T07:41:11Z

That's true, and you're probably right in claiming that it is difficult to make it work in one regular expression. However, the overhead lies completely in the invocation of the regular expression parser, so reducing the number of regular expression evaluations is the key to improving performance. Regular expressions are easier to review and more universal than lpeg, and they're also faster to write, so the use of an obscure parsing library does not seem to be warranted, in particular given that the performance argument can be made both ways.

pull-request-size bot added the size/L label Jan 4, 2023

vm-001 added this to the 3.2 milestone Jan 4, 2023

github-actions bot added the plugins/proxy-cache label Jan 4, 2023

vm-001 mentioned this pull request Jan 4, 2023

feat(utils): add mime-type parsing utils #10030

Merged

3 tasks

vm-001 marked this pull request as ready for review January 4, 2023 08:55

vm-001 changed the base branch from master to feat/mime-type January 13, 2023 09:04

pull-request-size bot added size/XXL and removed size/L labels Jan 13, 2023

vm-001 changed the base branch from feat/mime-type to master January 13, 2023 09:05

pull-request-size bot added size/L and removed size/XXL labels Jan 13, 2023

windmgc changed the base branch from master to feat/mime-type January 13, 2023 10:01

pull-request-size bot added size/XXL and removed size/L labels Jan 13, 2023

windmgc force-pushed the feat/proxy-caching-content-type-wildcard branch from 47f3f08 to 4683619 Compare January 13, 2023 10:04

github-actions bot added changelog chore Not part of the core functionality of kong, but still needed core/pdk core/templates plugins/acme labels Jan 13, 2023

windmgc force-pushed the feat/proxy-caching-content-type-wildcard branch from 4683619 to 0042da5 Compare January 16, 2023 03:53

pull-request-size bot added size/L and removed size/XXL labels Jan 16, 2023

github-actions bot removed chore Not part of the core functionality of kong, but still needed plugins/acme changelog core/pdk core/templates labels Jan 16, 2023

vm-001 force-pushed the feat/proxy-caching-content-type-wildcard branch from 1ec0b59 to 39b66ad Compare February 2, 2023 11:43

locao reviewed Feb 2, 2023

View reviewed changes

build/openresty/patches/ngx_lua-0.10.21_02-dyn_upstream_keepalive.patch Outdated Show resolved Hide resolved

github-actions bot removed the build/bazel label Feb 2, 2023

vm-001 force-pushed the feat/proxy-caching-content-type-wildcard branch from 5b1f384 to 0883866 Compare February 6, 2023 06:04

vm-001 removed this from the 3.2 milestone Feb 6, 2023

vm-001 force-pushed the feat/proxy-caching-content-type-wildcard branch from 0883866 to 0fe260e Compare February 23, 2023 07:40

ghost reviewed Feb 23, 2023

View reviewed changes

kong/plugins/proxy-cache/handler.lua Show resolved Hide resolved

vm-001 mentioned this pull request Mar 2, 2023

chore(deps): bump lua-protobuf to 0.4.2 #10413

Merged

vm-001 and others added 2 commits March 2, 2023 16:50

feat(utils): add mime-type parsing utils (#10030)

d661579

Co-authored-by: Enrique García Cota <[email protected]>

feat(plugins/proxy-cache): add wildcard and parameter match support f…

298e5b0

…or content_type

vm-001 force-pushed the feat/proxy-caching-content-type-wildcard branch from 0fe260e to 298e5b0 Compare March 2, 2023 08:50

vm-001 added this to the 3.3 milestone Mar 2, 2023

vm-001 added 2 commits March 3, 2023 11:15

add lpeg dependency

f4207e8

Merge branch 'master' into feat/proxy-caching-content-type-wildcard

d95f4b4

fffonion added the pr/do not merge label Mar 7, 2023

fffonion removed the pr/do not merge label Mar 7, 2023

fffonion merged commit 2504033 into master Mar 7, 2023

fffonion deleted the feat/proxy-caching-content-type-wildcard branch March 7, 2023 05:30

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

feat(plugins/proxy-cache): add wildcard and parameter match support for content_type #10055

feat(plugins/proxy-cache): add wildcard and parameter match support for content_type #10055

vm-001 commented Jan 4, 2023 •

edited by jira bot

Loading

vm-001 commented Jan 4, 2023

windmgc commented Jan 16, 2023

hbagdi commented Feb 22, 2023

hanshuebner commented Mar 2, 2023

samugi commented Mar 2, 2023 •

edited

Loading

hanshuebner commented Mar 2, 2023

fffonion commented Mar 7, 2023

vm-001 commented Mar 7, 2023 •

edited

Loading

hanshuebner commented Mar 7, 2023

vm-001 commented Mar 7, 2023 •

edited

Loading

hanshuebner commented Mar 7, 2023 •

edited

Loading

vm-001 commented Mar 7, 2023

hanshuebner commented Mar 7, 2023

feat(plugins/proxy-cache): add wildcard and parameter match support for content_type #10055

feat(plugins/proxy-cache): add wildcard and parameter match support for content_type #10055

Conversation

vm-001 commented Jan 4, 2023 • edited by jira bot Loading

Summary

Checklist

Issue reference

vm-001 commented Jan 4, 2023

windmgc commented Jan 16, 2023

hbagdi commented Feb 22, 2023

hanshuebner commented Mar 2, 2023

samugi commented Mar 2, 2023 • edited Loading

hanshuebner commented Mar 2, 2023

fffonion commented Mar 7, 2023

vm-001 commented Mar 7, 2023 • edited Loading

hanshuebner commented Mar 7, 2023

vm-001 commented Mar 7, 2023 • edited Loading

hanshuebner commented Mar 7, 2023 • edited Loading

vm-001 commented Mar 7, 2023

hanshuebner commented Mar 7, 2023

vm-001 commented Jan 4, 2023 •

edited by jira bot

Loading

samugi commented Mar 2, 2023 •

edited

Loading

vm-001 commented Mar 7, 2023 •

edited

Loading

vm-001 commented Mar 7, 2023 •

edited

Loading

hanshuebner commented Mar 7, 2023 •

edited

Loading