[APPSEC-10967] ASM parse response body #3153

Merged

GustavoCaso merged 9 commits into master from asm-response-body-waf-address

Oct 23, 2023

Member

GustavoCaso commented Sep 25, 2023

What does this PR do?

Parses the response body and passes it to the WAF as server.response.body.

We only parse bodies that are either an Array or a Rack::BodyProxy. Since we might not be the last middleware in the customer application, we cannot consume the response body directly by calling each. To work around that, we make a copy of the body.
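A minimal sketch of the Array case described above, assuming every part of the body must be a String before it is joined. The method name `collect_body` is hypothetical and is not taken from the PR; the Rack::BodyProxy case is left out because it depends on the wrapped object.

```ruby
# Hypothetical helper, not the PR's implementation.
# Returns the joined body for an Array of Strings, or nil for anything else.
# Array#join builds a new String, so the original body is left untouched
# for any middleware that runs after us.
def collect_body(body)
  return unless body.instance_of?(Array)
  return unless body.all? { |part| part.is_a?(String) }

  body.join
end

body = ['{"user":', '"alice"}']
joined = collect_body(body)
```

Returning nil for unsupported bodies keeps the caller's decision simple: no value, nothing to pass to the WAF.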

Parsing the response body could affect performance for our customers. Since we do not yet know how large that impact would be, I added a configuration entry that lets them skip response body parsing altogether.

This option would remain undocumented, and we would only mention it to customers if they experience performance degradation.

Motivation:

Additional Notes:

How to test the change?

For Datadog employees:

If this PR touches code that signs or publishes builds or packages, or handles
credentials of any kind, I've requested a review from @DataDog/security-design-and-guidance.
This PR doesn't touch any of that.

Unsure? Have a question? Request a review!

github-actions bot added appsec integrations labels

codecov-commenter commented Sep 25, 2023

Codecov Report

Merging #3153 (22106fc) into master (1511f99) will increase coverage by 0.00%.
Report is 44 commits behind head on master.
The diff coverage is 100.00%.

@@           Coverage Diff           @@
##           master    #3153   +/-   ##
=======================================
  Coverage   98.16%   98.16%           
=======================================
  Files        1283     1283           
  Lines       73915    74000   +85     
  Branches     3425     3433    +8     
=======================================
+ Hits        72559    72645   +86     
+ Misses       1356     1355    -1

Files Changed	Coverage Δ
...ib/datadog/appsec/contrib/rack/gateway/response.rb	`100.00% <100.00%> (ø)`
...b/datadog/appsec/contrib/rack/reactive/response.rb	`100.00% <100.00%> (ø)`
...tadog/appsec/contrib/rack/gateway/response_spec.rb	`100.00% <100.00%> (ø)`
...adog/appsec/contrib/rack/reactive/response_spec.rb	`100.00% <100.00%> (ø)`

... and 1 file with indirect coverage changes

GustavoCaso force-pushed the asm-response-body-waf-address branch from 22106fc to 9ed1001 Compare

September 25, 2023 13:10

GustavoCaso marked this pull request as ready for review

September 25, 2023 13:31

GustavoCaso requested a review from a team as a code owner

September 25, 2023 13:31

GustavoCaso changed the title ~~Asm response body waf address~~ [APPSEC-10967] ASM parse response body

lloeki approved these changes

Member

lloeki left a comment

Overall ok, but I'm concerned about the fuzziness around content type conditions.

lib/datadog/appsec/configuration/settings.rb Outdated

Comment on lines 192 to 196

+                            option :parse_response_body do |o|
+                              o.type :bool
+                              o.default true
+                            end

Member

lloeki Sep 25, 2023

Very good move! I was going to suggest doing that but you outran me!

lib/datadog/appsec/contrib/rack/gateway/response.rb Outdated

Comment on lines 26 to 28

+                              if result.timeout
+                                Datadog.logger.debug do
+                                  "Unable to parse response body because of unsupported body type: #{body.class}"
+                                end
+                              end

Member

lloeki Sep 25, 2023

Hmm, that doesn't seem like the intended log message there.

lib/datadog/appsec/contrib/rack/gateway/response.rb Outdated

+                            end
+                            return unless supported_response_type
+                            body_dup = body.dup # avoid interating over the body. This is just in case code.

Member

lloeki Sep 25, 2023

For Array this is not needed, so is that for Rack::BodyProxy?

Member Author

GustavoCaso Sep 25, 2023

Yes.

Since Rack::BodyProxy acts as a proxy, calling each directly might iterate over the underlying body and consume it.

https://github.com/rack/rack/blob/main/lib/rack/body_proxy.rb#L45-L58

If we call to_ary, we also close the body, and we do not want that, since we might not be the last middleware. That is why I thought using dup would save us here.

Member

lloeki Sep 26, 2023

But then you're duping the Rack::BodyProxy instance, which would still point to the same underlying wrapped object, which we don't know the type of, and may not be traversable twice.
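The concern can be illustrated with a small sketch. `OneShotBody` and `SketchProxy` are hypothetical stand-ins written for this example; they are not Rack classes and only mimic the relevant behaviour (a proxy forwarding each to a wrapped body that can be traversed once).

```ruby
# Hypothetical body that can only be traversed once.
class OneShotBody
  def initialize(chunks)
    @chunks = chunks
  end

  # Yields each chunk and removes it; a second traversal yields nothing.
  def each
    while (chunk = @chunks.shift)
      yield chunk
    end
  end
end

# Hypothetical thin proxy that forwards #each to the wrapped body.
class SketchProxy
  def initialize(body)
    @body = body
  end

  def each(&block)
    @body.each(&block)
  end
end

proxy = SketchProxy.new(OneShotBody.new(['{"a":', '1}']))
copy = proxy.dup # shallow copy: both proxies reference the same @body

seen_by_us = []
copy.each { |chunk| seen_by_us << chunk }

seen_downstream = []
proxy.each { |chunk| seen_downstream << chunk }
```

Iterating the copy drains the shared underlying body, so the original proxy has nothing left to yield to later middleware.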

lib/datadog/appsec/contrib/rack/gateway/response.rb Outdated

+                          end
+                          def json?
+                            headers['content-type'].include?('json')

Member

lloeki Sep 25, 2023

I'd rather be strict and explicitly list supported content types. I think those would be:

application/json (the official one registered at IANA)
text/json because someone from the team added it? I don't think I've seen it in the (Ruby) wild, ever.
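One way to express the strict check is an allow-list compared against the media type only. This is a sketch under assumptions: the constant and method names are hypothetical, and stripping parameters such as charset before comparing is a choice made for this example, not something the review specifies.

```ruby
# Hypothetical allow-list holding the two types named in the review.
SUPPORTED_JSON_TYPES = ['application/json', 'text/json'].freeze

# Returns true only when the media type, with parameters such as
# "; charset=utf-8" removed, exactly matches an allow-listed entry.
def json_content_type?(headers)
  content_type = headers['content-type']
  return false unless content_type.is_a?(String)

  media_type = content_type.split(';', 2).first.to_s.strip.downcase
  SUPPORTED_JSON_TYPES.include?(media_type)
end
```

Unlike a substring test with include?('json'), this rejects look-alike values such as application/jsonp and also handles a missing header without raising.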

lib/datadog/appsec/contrib/rack/gateway/response.rb Outdated

+                          end
+                          def text?
+                            headers['content-type'].include?('text')

Member

lloeki Sep 25, 2023

I'd rather be strict and explicitly list supported content types.

Indeed, as is, this would match at least text/plain and text/html, but is there any point in passing those raw to libddwaf in the non-raw address, which aims to carry parsed, structured data?

Is it for text/xml, about which RFC 3023 says:

If an XML document -- that is, the unprocessed, source XML document
-- is readable by casual users, text/xml is preferable to
application/xml. MIME user agents (and web user agents) that do not
have explicit support for text/xml will treat it as text/plain, for
example, by displaying the XML MIME entity as plain text.
Application/xml is preferable when the XML MIME entity is unreadable
by casual users.

Member Author

GustavoCaso Sep 25, 2023

You are right: since we do not parse it, there is no schema information to extract from it.

I will remove support for text/* content types entirely.

lib/datadog/appsec/contrib/rack/gateway/response.rb

+                            return unless all_body_parts_are_string
+                            if json?
+                              JSON.parse(result)

Member

lloeki Sep 25, 2023

That json? only tests that the response advertises its body as JSON; if the content is broken JSON, JSON.parse would blow up with a JSON::ParserError.

We should guard against that and return.
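A minimal sketch of such a guard, assuming that returning nil is an acceptable way to signal "nothing to pass to the WAF". The method name `parse_json_body` is hypothetical.

```ruby
require 'json'

# Hypothetical helper: parses a JSON string and returns nil instead of
# raising when the content is not valid JSON.
def parse_json_body(raw)
  JSON.parse(raw)
rescue JSON::ParserError
  nil
end

valid = parse_json_body('{"a": 1}')
broken = parse_json_body('{"a": ')
```

Only JSON::ParserError is rescued, so unrelated failures still surface instead of being silently swallowed.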

lib/datadog/appsec/contrib/rack/gateway/response.rb Outdated

+                            if json?
+                              JSON.parse(result)
+                            else
+                              result

Member

lloeki Sep 25, 2023

Then it's not really parsed, is it? Either it's raw (server.response.body.raw) or it's parsed (server.response.body).

Since the goal is to parse it - notably to perform schema extraction - it seems to me there's not much point doing that.

GustavoCaso commented

lib/datadog/appsec/contrib/rack/gateway/response.rb Outdated

+                            return unless Datadog.configuration.appsec.parse_response_body
+                            unless body.instance_of?(Array) || body.instance_of?(::Rack::BodyProxy)
+                              if result.timeout

Member Author

GustavoCaso Sep 25, 2023

I have no idea how that came to be there :weird:

GustavoCaso force-pushed the asm-response-body-waf-address branch 2 times, most recently from 62d7be1 to 0a2d379 Compare

September 27, 2023 13:26

GustavoCaso added 6 commits

October 19, 2023 15:17


          Parse response body information and propagate it to the WAF.

039f14f


          Add support for Rack::BodyProxy

6ef1c90


          enable a way to disable parse response body

0b74e23


          only supoprt content-type */josn when parsing response body

4b3eaf1


          Make sure to call :to_ary on the response body before iterating over it

b195274

using :each


          Make sure to only replace the response body if is not empty

439a736

GustavoCaso force-pushed the asm-response-body-waf-address branch 2 times, most recently from bcd6e90 to 2a36a14 Compare

October 23, 2023 10:31

GustavoCaso added 3 commits

October 23, 2023 13:49


          make sure that parsing the response body is disable by default

0a85874


          Add the ASM API security scenarios to the benchmark pipeline

7c86a43


          fix typo

72ace59

GustavoCaso force-pushed the asm-response-body-waf-address branch from 2a36a14 to 72ace59 Compare

October 23, 2023 11:50

GustavoCaso merged commit f725c43 into master

216 of 217 checks passed

GustavoCaso deleted the asm-response-body-waf-address branch

October 23, 2023 12:39

GustavoCaso mentioned this pull request

Backport ASM API security parse response body #3224

Closed

2 tasks

marcotc added this to the 1.16.0 milestone

GustavoCaso restored the asm-response-body-waf-address branch

November 10, 2023 14:22

GustavoCaso added a commit that referenced this pull request


          revert #3153

e7cd977

GustavoCaso mentioned this pull request

revert https://github.com/DataDog/dd-trace-rb/pull/3153 #3252

Merged

2 tasks

GustavoCaso added a commit that referenced this pull request


          revert #3153

93957a0

ekump pushed a commit that referenced this pull request


          revert #3153 (#3252)

92e379a

Labels

appsec integrations