
[SPARK-13184][SQL] Add a datasource-specific option minPartitions in HadoopFsRelation#options #13320

Closed
wants to merge 4 commits

Conversation

maropu
Member

maropu commented May 26, 2016

What changes were proposed in this pull request?

This PR adds a new option, `minPartitions`, accepted in `HadoopFsRelation#options`.
It is a datasource-specific setting for the minimum number of splits for the input data.

How was this patch tested?

Added unit tests in FileSourceStrategySuite.
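(Not from the patch itself — a minimal sketch of how the proposed option could be passed through the generic `DataFrameReader#option` API. The option name `minPartitions` is the one proposed in this PR; the app name, master URL, and input path are placeholders.)

```scala
import org.apache.spark.sql.SparkSession

// Sketch only: assumes this PR's proposed datasource-specific option.
val spark = SparkSession.builder()
  .master("local[*]")           // placeholder master
  .appName("minPartitionsSketch")
  .getOrCreate()

val df = spark.read
  .option("minPartitions", "4") // request at least 4 input splits for this read
  .csv("/path/to/data.csv")     // hypothetical input path
```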

@rxin
Contributor

rxin commented May 26, 2016

I was actually thinking we'd just use the same config key name and override it. Can you read my comment on the JIRA ticket and let me know if that's clear?

@maropu
Member Author

maropu commented May 26, 2016

oh, sorry. I'll check and fix it.

@SparkQA

SparkQA commented May 26, 2016

Test build #59353 has finished for PR 13320 at commit dbee9b3.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@maropu
Member Author

maropu commented May 26, 2016

@rxin Could you check this again to see whether it matches your intention?
Also, I'll update the description later.

@SparkQA

SparkQA commented May 26, 2016

Test build #59388 has finished for PR 13320 at commit e703f48.

  • This patch fails Spark unit tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@rxin
Contributor

rxin commented May 28, 2016

I'm going to mark this for 2.1 given we are too close to 2.0 to modify behaviors like this.

@maropu
Member Author

maropu commented May 29, 2016

Jenkins, retest this please.

@SparkQA

SparkQA commented May 29, 2016

Test build #59586 has finished for PR 13320 at commit e703f48.

  • This patch fails Spark unit tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@SparkQA

SparkQA commented Jun 30, 2016

Test build #61555 has finished for PR 13320 at commit e28586b.

  • This patch fails Spark unit tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@@ -67,9 +67,6 @@ private[sql] abstract class BaseWriterContainer(
// This is only used on driver side.
@transient private val jobContext: JobContext = job

private val speculationEnabled: Boolean =
relation.sparkSession.sparkContext.conf.getBoolean("spark.speculation", defaultValue = false)

Member Author

Deleted because this value is not used.

@SparkQA

SparkQA commented Jun 30, 2016

Test build #61566 has finished for PR 13320 at commit c77bd36.

  • This patch fails Spark unit tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@SparkQA

SparkQA commented Jul 1, 2016

Test build #61577 has finished for PR 13320 at commit abc0054.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@SparkQA

SparkQA commented Jul 18, 2016

Test build #62473 has finished for PR 13320 at commit abc0054.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@SparkQA

SparkQA commented Aug 19, 2016

Test build #64052 has finished for PR 13320 at commit 6440370.

  • This patch fails Spark unit tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@SparkQA

SparkQA commented Aug 19, 2016

Test build #64077 has finished for PR 13320 at commit ef06b82.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@maropu
Member Author

maropu commented Aug 22, 2016

@rxin Could you give me some comments?

@SparkQA

SparkQA commented Nov 18, 2016

Test build #68863 has finished for PR 13320 at commit ba3f2a0.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@maropu
Member Author

maropu commented Mar 3, 2017

@gatorsmile Could you check this and give me comments, too?

@SparkQA

SparkQA commented Jun 12, 2017

Test build #77932 has finished for PR 13320 at commit ba3f2a0.

  • This patch fails Spark unit tests.
  • This patch does not merge cleanly.
  • This patch adds no public classes.

// Returns all options set in the `SparkConf`, the `SQLConf`, and a given data source `options`.
// If the same keys exist, they are overridden by the ones in `options`.
private def optionsOverriddenWith(options: Map[String, String]): Map[String, String] = {
  sparkSession.sparkContext.conf.getAllAsMap ++ sparkSession.conf.getAll ++ options
}
Member

Here, we just cover the code path for DataFrameReader APIs.
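(Illustration only, not from the patch: the precedence implied by the `++` chain above can be shown with plain Scala maps. Later operands of `++` win on duplicate keys, so data source `options` override `SQLConf`, which overrides `SparkConf`. The config key below is just an example.)

```scala
// Three layers of configuration with the same example key set at each level.
val sparkConfMap = Map("spark.sql.files.minPartitionNum" -> "2") // SparkConf layer
val sqlConfMap   = Map("spark.sql.files.minPartitionNum" -> "4") // SQLConf layer
val options      = Map("spark.sql.files.minPartitionNum" -> "8") // data source options

// For Scala Map, `a ++ b` keeps b's value on key collisions,
// so the rightmost operand (the data source options) wins.
val merged = sparkConfMap ++ sqlConfMap ++ options

println(merged("spark.sql.files.minPartitionNum")) // prints "8"
```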

@gatorsmile
Member

To fundamentally resolve this issue, we need to consider all the read/write paths; there is a lot of work to do to achieve that. Do you want to continue the work, or just close it?

@maropu
Member Author

maropu commented Jun 13, 2017

Ah, ok. I'll check again.

@HyukjinKwon
Member

@maropu, I just wonder whether this is still in progress in any way.

@maropu
Member Author

maropu commented Jul 24, 2017

No; I'll close this for now, and we can move this discussion to JIRA.

@maropu maropu closed this Jul 24, 2017
dongjoon-hyun pushed a commit that referenced this pull request Jun 28, 2023
### What changes were proposed in this pull request?
This PR aims to upgrade Netty from 4.1.92 to 4.1.93.

### Why are the changes needed?
1. v4.1.92 vs. v4.1.93
netty/netty@netty-4.1.92.Final...netty-4.1.93.Final

2. The new version brings some bug fixes, e.g.:
- Reset byte buffer in loop for AbstractDiskHttpData.setContent ([#13320](netty/netty#13320))
- OpenSSL MAX_CERTIFICATE_LIST_BYTES option supported ([#13365](netty/netty#13365))
- Adapt to DirectByteBuffer constructor in Java 21 ([#13366](netty/netty#13366))
- HTTP/2 encoder: allow HEADER_TABLE_SIZE greater than Integer.MAX_VALUE ([#13368](netty/netty#13368))
- Upgrade to latest netty-tcnative to fix memory leak ([#13375](netty/netty#13375))
- H2/H2C server stream channels deactivated while write still in progress ([#13388](netty/netty#13388))
- Channel#bytesBefore(un)writable off by 1 ([#13389](netty/netty#13389))
- HTTP/2 should forward shutdown user events to active streams ([#13394](netty/netty#13394))
- Respect the number of bytes read per datagram when using recvmmsg ([#13399](netty/netty#13399))

3. The release notes are as follows:
- https://netty.io/news/2023/05/25/4-1-93-Final.html

4. Why not upgrade to the `4.1.94.Final` version?
Because the return type of the `threadCache()` method (from `PoolThreadCache` to `PoolArenasCache`) of the Netty internal class used by `arrow-memory-netty` version `12.0.1` has changed, which is a breaking change. Let's wait for the upgrade of `arrow-memory-netty` before moving to the `4.1.94.Final` version.

The reference is as follows:
https://github.com/apache/arrow/blob/6af660f48472b8b45a5e01b7136b9b040b185eb1/java/memory/memory-netty/src/main/java/io/netty/buffer/PooledByteBufAllocatorL.java#L164
https://github.com/netty/netty/blob/da1a448d5bc4f36cc1744db93fcaf64e198db2bd/buffer/src/main/java/io/netty/buffer/PooledByteBufAllocator.java#L732-L736

### Does this PR introduce _any_ user-facing change?
No

### How was this patch tested?
Pass GA.

Closes #41681 from panbingkun/upgrade_netty.

Authored-by: panbingkun <[email protected]>
Signed-off-by: Dongjoon Hyun <[email protected]>