Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[SPARK-26118][Web UI] Introducing spark.ui.requestHeaderSize for setting HTTP requestHeaderSize #23090

Closed
wants to merge 3 commits into from

Conversation

attilapiros
Copy link
Contributor

@attilapiros attilapiros commented Nov 19, 2018

What changes were proposed in this pull request?

Introducing spark.ui.requestHeaderSize for configuring Jetty's HTTP requestHeaderSize.
This way long authorization field does not lead to HTTP 431.

How was this patch tested?

Manually with curl (which version must be at least 7.55).

With the original default value (8k limit):

# Starting history server with default requestHeaderSize
$ ./sbin/start-history-server.sh
starting org.apache.spark.deploy.history.HistoryServer, logging to /Users/attilapiros/github/spark/logs/spark-attilapiros-org.apache.spark.deploy.history.HistoryServer-1-apiros-MBP.lan.out

# Creating huge header
$ echo -n "X-Custom-Header: " > cookie
$ printf 'A%.0s' {1..9500} >> cookie

# HTTP GET with huge header fails with 431 
$ curl  -H @cookie http://[email protected]:18080/
<h1>Bad Message 431</h1><pre>reason: Request Header Fields Too Large</pre> 

# The log contains the error
$ tail -1 /Users/attilapiros/github/spark/logs/spark-attilapiros-org.apache.spark.deploy.history.HistoryServer-1-apiros-MBP.lan.out
18/11/19 21:24:28 WARN HttpParser: Header is too large 8193>8192

After:

# Creating the history properties file with the increased requestHeaderSize
$ echo spark.ui.requestHeaderSize=10000 > history.properties

# Starting Spark History Server with the settings
$ ./sbin/start-history-server.sh --properties-file history.properties
starting org.apache.spark.deploy.history.HistoryServer, logging to /Users/attilapiros/github/spark/logs/spark-attilapiros-org.apache.spark.deploy.history.HistoryServer-1-apiros-MBP.lan.out

# HTTP GET with huge header gives back HTML5 (I have added here only just a part of the response)
$ curl  -H @cookie http://[email protected]:18080/
<!DOCTYPE html><html>
      <head>...
         <link rel="shortcut icon" href="/static/spark-logo-77x50px-hd.png"></link>
        <title>History Server</title>
      </head>
      <body>
...

@attilapiros attilapiros changed the title [SPARK-26118][UI] Introducing spark.ui.requestHeaderSize for setting HTTP requestHeaderSize [SPARK-26118][Web UII] Introducing spark.ui.requestHeaderSize for setting HTTP requestHeaderSize Nov 19, 2018
@attilapiros attilapiros changed the title [SPARK-26118][Web UII] Introducing spark.ui.requestHeaderSize for setting HTTP requestHeaderSize [SPARK-26118][Web UI] Introducing spark.ui.requestHeaderSize for setting HTTP requestHeaderSize Nov 19, 2018
Copy link
Contributor

@squito squito left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lgtm, just a small comment on the doc

docs/configuration.md Outdated Show resolved Hide resolved
@SparkQA
Copy link

SparkQA commented Nov 20, 2018

Test build #99018 has finished for PR 23090 at commit c6278be.

  • This patch fails Spark unit tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@SparkQA
Copy link

SparkQA commented Nov 20, 2018

Test build #99019 has finished for PR 23090 at commit 81334f4.

  • This patch fails Spark unit tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

Copy link
Contributor

@HeartSaVioR HeartSaVioR left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

+1 Surprisingly I crafted quite similar patch in couple of days ago (not submitted yet on Apache side). LGTM.

@SparkQA
Copy link

SparkQA commented Nov 20, 2018

Test build #99023 has finished for PR 23090 at commit a343f93.

  • This patch fails Spark unit tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

Copy link
Member

@gengliangwang gengliangwang left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@SparkQA
Copy link

SparkQA commented Nov 20, 2018

Test build #4434 has finished for PR 23090 at commit a343f93.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

Copy link
Contributor

@gaborgsomogyi gaborgsomogyi left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

asfgit pushed a commit that referenced this pull request Nov 20, 2018
…ing HTTP requestHeaderSize

## What changes were proposed in this pull request?

Introducing spark.ui.requestHeaderSize for configuring Jetty's HTTP requestHeaderSize.
This way long authorization field does not lead to HTTP 413.

## How was this patch tested?

Manually with curl (which version must be at least 7.55).

With the original default value (8k limit):

```bash
# Starting history server with default requestHeaderSize
$ ./sbin/start-history-server.sh
starting org.apache.spark.deploy.history.HistoryServer, logging to /Users/attilapiros/github/spark/logs/spark-attilapiros-org.apache.spark.deploy.history.HistoryServer-1-apiros-MBP.lan.out

# Creating huge header
$ echo -n "X-Custom-Header: " > cookie
$ printf 'A%.0s' {1..9500} >> cookie

# HTTP GET with huge header fails with 431
$ curl  -H cookie http://458apiros-MBP.lan:18080/
<h1>Bad Message 431</h1><pre>reason: Request Header Fields Too Large</pre>

# The log contains the error
$ tail -1 /Users/attilapiros/github/spark/logs/spark-attilapiros-org.apache.spark.deploy.history.HistoryServer-1-apiros-MBP.lan.out
18/11/19 21:24:28 WARN HttpParser: Header is too large 8193>8192
```

After:

```bash
# Creating the history properties file with the increased requestHeaderSize
$ echo spark.ui.requestHeaderSize=10000 > history.properties

# Starting Spark History Server with the settings
$ ./sbin/start-history-server.sh --properties-file history.properties
starting org.apache.spark.deploy.history.HistoryServer, logging to /Users/attilapiros/github/spark/logs/spark-attilapiros-org.apache.spark.deploy.history.HistoryServer-1-apiros-MBP.lan.out

# HTTP GET with huge header gives back HTML5 (I have added here only just a part of the response)
$ curl  -H cookie http://458apiros-MBP.lan:18080/
<!DOCTYPE html><html>
      <head>...
         <link rel="shortcut icon" href="/static/spark-logo-77x50px-hd.png"></link>
        <title>History Server</title>
      </head>
      <body>
...
```

Closes #23090 from attilapiros/JettyHeaderSize.

Authored-by: “attilapiros” <[email protected]>
Signed-off-by: Imran Rashid <[email protected]>
(cherry picked from commit ab61ddb)
Signed-off-by: Imran Rashid <[email protected]>
@squito
Copy link
Contributor

squito commented Nov 20, 2018

merged to master / 2.4

@asfgit asfgit closed this in ab61ddb Nov 20, 2018
@dongjoon-hyun
Copy link
Member

Hi, @squito .
SPARK-26118 is marked as 'Improvement', we don't backport 'Improvement'.

@attilapiros
Copy link
Contributor Author

@dongjoon-hyun, @squito it is my bad, this problem is present in earlier releases as well. If I can change the issue type I modify it to Bug.

@dongjoon-hyun
Copy link
Member

dongjoon-hyun commented Nov 20, 2018

I understand the situation. However, according to the content of the patch, it's an improvement to add a missing configuration, @attilapiros .

httpConfig.setRequestHeaderSize(conf.get(UI_REQUEST_HEADER_SIZE).toInt)

@rxin , @gatorsmile , @srowen . How do you think about this? If PMC allows this, I'd like to have this in the older branches, too. Otherwise, we need to revert this from branch-2.4.

@srowen
Copy link
Member

srowen commented Nov 20, 2018

Why not just hard-code a much higher limit? what's the consequence? in a generic public web server I can see that this defends against malicious or malformed requests, but that's much less an issue for an internal Spark server.

That is, would a user really need to vary this up and down and know enough to find this config?

@squito
Copy link
Contributor

squito commented Nov 20, 2018

I think the default is pretty reasonable in most cases, in that this is the first time we've heard of someone hitting this limit. I'm not sure how high we would make it to get around this problem in general. And from the jetty docs: https://www.eclipse.org/jetty/javadoc/current/org/eclipse/jetty/server/HttpConfiguration.html#setRequestHeaderSize-int-

Larger headers will allow for more and/or larger cookies plus larger form content encoded in a URL. However, larger headers consume more memory and can make a server more vulnerable to denial of service attacks.

@attilapiros
Copy link
Contributor Author

I would prefer to override this 8k limit when it is really necessary (and only with the extent which justified by the production system).

@squito
Copy link
Contributor

squito commented Nov 20, 2018

btw I agree this was bad judgement on my part to only backport to 2.4, sorry abotu that and thanks for catching @dongjoon-hyun . I do think this fixes a bug (the ability to use the UI when you are a member of many user groups) and so should be backported further, though I would really be fine either way.

@dongjoon-hyun
Copy link
Member

Thank you, @squito , @attilapiros , @srowen .

Then, @attilapiros , could you send backporting PRs against branch-2.3 and branch-2.2 please?

@attilapiros
Copy link
Contributor Author

Thanks @dongjoon-hyun, of course I will create those PRs.

attilapiros added a commit to attilapiros/spark that referenced this pull request Nov 22, 2018
…ing HTTP requestHeaderSize

Introducing spark.ui.requestHeaderSize for configuring Jetty's HTTP requestHeaderSize.
This way long authorization field does not lead to HTTP 413.

Manually with curl (which version must be at least 7.55).

With the original default value (8k limit):

```bash
$ ./sbin/start-history-server.sh
starting org.apache.spark.deploy.history.HistoryServer, logging to /Users/attilapiros/github/spark/logs/spark-attilapiros-org.apache.spark.deploy.history.HistoryServer-1-apiros-MBP.lan.out

$ echo -n "X-Custom-Header: " > cookie
$ printf 'A%.0s' {1..9500} >> cookie

$ curl  -H cookie http://458apiros-MBP.lan:18080/
<h1>Bad Message 431</h1><pre>reason: Request Header Fields Too Large</pre>

$ tail -1 /Users/attilapiros/github/spark/logs/spark-attilapiros-org.apache.spark.deploy.history.HistoryServer-1-apiros-MBP.lan.out
18/11/19 21:24:28 WARN HttpParser: Header is too large 8193>8192
```

After:

```bash
$ echo spark.ui.requestHeaderSize=10000 > history.properties

$ ./sbin/start-history-server.sh --properties-file history.properties
starting org.apache.spark.deploy.history.HistoryServer, logging to /Users/attilapiros/github/spark/logs/spark-attilapiros-org.apache.spark.deploy.history.HistoryServer-1-apiros-MBP.lan.out

$ curl  -H cookie http://458apiros-MBP.lan:18080/
<!DOCTYPE html><html>
      <head>...
         <link rel="shortcut icon" href="/static/spark-logo-77x50px-hd.png"></link>
        <title>History Server</title>
      </head>
      <body>
...
```

Closes apache#23090 from attilapiros/JettyHeaderSize.

Authored-by: “attilapiros” <[email protected]>
Signed-off-by: Imran Rashid <[email protected]>
(cherry picked from commit ab61ddb)
@attilapiros
Copy link
Contributor Author

Backport PR for 2.3: #23114

@attilapiros
Copy link
Contributor Author

Backport PR for 2.2: #23115

Both 2.2 and 2.3 are retested.

jackylee-ch pushed a commit to jackylee-ch/spark that referenced this pull request Feb 18, 2019
…ing HTTP requestHeaderSize

## What changes were proposed in this pull request?

Introducing spark.ui.requestHeaderSize for configuring Jetty's HTTP requestHeaderSize.
This way long authorization field does not lead to HTTP 413.

## How was this patch tested?

Manually with curl (which version must be at least 7.55).

With the original default value (8k limit):

```bash
# Starting history server with default requestHeaderSize
$ ./sbin/start-history-server.sh
starting org.apache.spark.deploy.history.HistoryServer, logging to /Users/attilapiros/github/spark/logs/spark-attilapiros-org.apache.spark.deploy.history.HistoryServer-1-apiros-MBP.lan.out

# Creating huge header
$ echo -n "X-Custom-Header: " > cookie
$ printf 'A%.0s' {1..9500} >> cookie

# HTTP GET with huge header fails with 431
$ curl  -H cookie http://458apiros-MBP.lan:18080/
<h1>Bad Message 431</h1><pre>reason: Request Header Fields Too Large</pre>

# The log contains the error
$ tail -1 /Users/attilapiros/github/spark/logs/spark-attilapiros-org.apache.spark.deploy.history.HistoryServer-1-apiros-MBP.lan.out
18/11/19 21:24:28 WARN HttpParser: Header is too large 8193>8192
```

After:

```bash
# Creating the history properties file with the increased requestHeaderSize
$ echo spark.ui.requestHeaderSize=10000 > history.properties

# Starting Spark History Server with the settings
$ ./sbin/start-history-server.sh --properties-file history.properties
starting org.apache.spark.deploy.history.HistoryServer, logging to /Users/attilapiros/github/spark/logs/spark-attilapiros-org.apache.spark.deploy.history.HistoryServer-1-apiros-MBP.lan.out

# HTTP GET with huge header gives back HTML5 (I have added here only just a part of the response)
$ curl  -H cookie http://458apiros-MBP.lan:18080/
<!DOCTYPE html><html>
      <head>...
         <link rel="shortcut icon" href="/static/spark-logo-77x50px-hd.png"></link>
        <title>History Server</title>
      </head>
      <body>
...
```

Closes apache#23090 from attilapiros/JettyHeaderSize.

Authored-by: “attilapiros” <[email protected]>
Signed-off-by: Imran Rashid <[email protected]>
dongjoon-hyun pushed a commit that referenced this pull request Jul 3, 2019
## What changes were proposed in this pull request?

`requestHeaderSize` is added in #23090 and applies to Spark + History server UI as well. Without debug log it's hard to find out on which side what configuration is used.

In this PR I've added a log message which prints out the value.

## How was this patch tested?

Manually checked log files.

Closes #25045 from gaborgsomogyi/SPARK-26118.

Authored-by: Gabor Somogyi <[email protected]>
Signed-off-by: Dongjoon Hyun <[email protected]>
(cherry picked from commit 0b6c2c2)
Signed-off-by: Dongjoon Hyun <[email protected]>
dongjoon-hyun pushed a commit that referenced this pull request Jul 3, 2019
## What changes were proposed in this pull request?

`requestHeaderSize` is added in #23090 and applies to Spark + History server UI as well. Without debug log it's hard to find out on which side what configuration is used.

In this PR I've added a log message which prints out the value.

## How was this patch tested?

Manually checked log files.

Closes #25045 from gaborgsomogyi/SPARK-26118.

Authored-by: Gabor Somogyi <[email protected]>
Signed-off-by: Dongjoon Hyun <[email protected]>
(cherry picked from commit 0b6c2c2)
Signed-off-by: Dongjoon Hyun <[email protected]>
kai-chi pushed a commit to kai-chi/spark that referenced this pull request Jul 23, 2019
…ing HTTP requestHeaderSize

## What changes were proposed in this pull request?

Introducing spark.ui.requestHeaderSize for configuring Jetty's HTTP requestHeaderSize.
This way long authorization field does not lead to HTTP 413.

## How was this patch tested?

Manually with curl (which version must be at least 7.55).

With the original default value (8k limit):

```bash
# Starting history server with default requestHeaderSize
$ ./sbin/start-history-server.sh
starting org.apache.spark.deploy.history.HistoryServer, logging to /Users/attilapiros/github/spark/logs/spark-attilapiros-org.apache.spark.deploy.history.HistoryServer-1-apiros-MBP.lan.out

# Creating huge header
$ echo -n "X-Custom-Header: " > cookie
$ printf 'A%.0s' {1..9500} >> cookie

# HTTP GET with huge header fails with 431
$ curl  -H cookie http://458apiros-MBP.lan:18080/
<h1>Bad Message 431</h1><pre>reason: Request Header Fields Too Large</pre>

# The log contains the error
$ tail -1 /Users/attilapiros/github/spark/logs/spark-attilapiros-org.apache.spark.deploy.history.HistoryServer-1-apiros-MBP.lan.out
18/11/19 21:24:28 WARN HttpParser: Header is too large 8193>8192
```

After:

```bash
# Creating the history properties file with the increased requestHeaderSize
$ echo spark.ui.requestHeaderSize=10000 > history.properties

# Starting Spark History Server with the settings
$ ./sbin/start-history-server.sh --properties-file history.properties
starting org.apache.spark.deploy.history.HistoryServer, logging to /Users/attilapiros/github/spark/logs/spark-attilapiros-org.apache.spark.deploy.history.HistoryServer-1-apiros-MBP.lan.out

# HTTP GET with huge header gives back HTML5 (I have added here only just a part of the response)
$ curl  -H cookie http://458apiros-MBP.lan:18080/
<!DOCTYPE html><html>
      <head>...
         <link rel="shortcut icon" href="/static/spark-logo-77x50px-hd.png"></link>
        <title>History Server</title>
      </head>
      <body>
...
```

Closes apache#23090 from attilapiros/JettyHeaderSize.

Authored-by: “attilapiros” <[email protected]>
Signed-off-by: Imran Rashid <[email protected]>
(cherry picked from commit ab61ddb)
Signed-off-by: Imran Rashid <[email protected]>
kai-chi pushed a commit to kai-chi/spark that referenced this pull request Aug 1, 2019
…ing HTTP requestHeaderSize

## What changes were proposed in this pull request?

Introducing spark.ui.requestHeaderSize for configuring Jetty's HTTP requestHeaderSize.
This way long authorization field does not lead to HTTP 413.

## How was this patch tested?

Manually with curl (which version must be at least 7.55).

With the original default value (8k limit):

```bash
# Starting history server with default requestHeaderSize
$ ./sbin/start-history-server.sh
starting org.apache.spark.deploy.history.HistoryServer, logging to /Users/attilapiros/github/spark/logs/spark-attilapiros-org.apache.spark.deploy.history.HistoryServer-1-apiros-MBP.lan.out

# Creating huge header
$ echo -n "X-Custom-Header: " > cookie
$ printf 'A%.0s' {1..9500} >> cookie

# HTTP GET with huge header fails with 431
$ curl  -H cookie http://458apiros-MBP.lan:18080/
<h1>Bad Message 431</h1><pre>reason: Request Header Fields Too Large</pre>

# The log contains the error
$ tail -1 /Users/attilapiros/github/spark/logs/spark-attilapiros-org.apache.spark.deploy.history.HistoryServer-1-apiros-MBP.lan.out
18/11/19 21:24:28 WARN HttpParser: Header is too large 8193>8192
```

After:

```bash
# Creating the history properties file with the increased requestHeaderSize
$ echo spark.ui.requestHeaderSize=10000 > history.properties

# Starting Spark History Server with the settings
$ ./sbin/start-history-server.sh --properties-file history.properties
starting org.apache.spark.deploy.history.HistoryServer, logging to /Users/attilapiros/github/spark/logs/spark-attilapiros-org.apache.spark.deploy.history.HistoryServer-1-apiros-MBP.lan.out

# HTTP GET with huge header gives back HTML5 (I have added here only just a part of the response)
$ curl  -H cookie http://458apiros-MBP.lan:18080/
<!DOCTYPE html><html>
      <head>...
         <link rel="shortcut icon" href="/static/spark-logo-77x50px-hd.png"></link>
        <title>History Server</title>
      </head>
      <body>
...
```

Closes apache#23090 from attilapiros/JettyHeaderSize.

Authored-by: “attilapiros” <[email protected]>
Signed-off-by: Imran Rashid <[email protected]>
(cherry picked from commit ab61ddb)
Signed-off-by: Imran Rashid <[email protected]>
rluta pushed a commit to rluta/spark that referenced this pull request Sep 17, 2019
## What changes were proposed in this pull request?

`requestHeaderSize` is added in apache#23090 and applies to Spark + History server UI as well. Without debug log it's hard to find out on which side what configuration is used.

In this PR I've added a log message which prints out the value.

## How was this patch tested?

Manually checked log files.

Closes apache#25045 from gaborgsomogyi/SPARK-26118.

Authored-by: Gabor Somogyi <[email protected]>
Signed-off-by: Dongjoon Hyun <[email protected]>
(cherry picked from commit 0b6c2c2)
Signed-off-by: Dongjoon Hyun <[email protected]>
kai-chi pushed a commit to kai-chi/spark that referenced this pull request Sep 26, 2019
## What changes were proposed in this pull request?

`requestHeaderSize` is added in apache#23090 and applies to Spark + History server UI as well. Without debug log it's hard to find out on which side what configuration is used.

In this PR I've added a log message which prints out the value.

## How was this patch tested?

Manually checked log files.

Closes apache#25045 from gaborgsomogyi/SPARK-26118.

Authored-by: Gabor Somogyi <[email protected]>
Signed-off-by: Dongjoon Hyun <[email protected]>
(cherry picked from commit 0b6c2c2)
Signed-off-by: Dongjoon Hyun <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

8 participants