Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[SPARK-23809][SQL][backport] Active SparkSession should be set by getOrCreate #20971

Closed
wants to merge 3 commits into from

Conversation

ericl
Copy link
Contributor

@ericl ericl commented Apr 4, 2018

This backports #20927 to branch-2.3

What changes were proposed in this pull request?

Currently, the active spark session is set inconsistently (e.g., in createDataFrame, prior to query execution). Many places in spark also incorrectly query active session when they should be calling activeSession.getOrElse(defaultSession) and so might get None even if a Spark session exists.

The semantics here can be cleaned up if we also set the active session when the default session is set.

Related: https://github.com/apache/spark/pull/20926/files

How was this patch tested?

Unit test, existing test. Note that if #20926 merges first we should also update the tests there.

## What changes were proposed in this pull request?

Currently, the active spark session is set inconsistently (e.g., in createDataFrame, prior to query execution). Many places in spark also incorrectly query active session when they should be calling activeSession.getOrElse(defaultSession) and so might get None even if a Spark session exists.

The semantics here can be cleaned up if we also set the active session when the default session is set.

Related: https://github.com/apache/spark/pull/20926/files

## How was this patch tested?

Unit test, existing test. Note that if apache#20926 merges first we should also update the tests there.

Author: Eric Liang <[email protected]>

Closes apache#20927 from ericl/active-session-cleanup.
@ericl
Copy link
Contributor Author

ericl commented Apr 4, 2018

@gatorsmile here's the patch

Copy link
Member

@gatorsmile gatorsmile left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM pending Jenkins

@SparkQA
Copy link

SparkQA commented Apr 4, 2018

Test build #88865 has finished for PR 20971 at commit e429af1.

  • This patch fails to build.
  • This patch merges cleanly.
  • This patch adds no public classes.

@SparkQA
Copy link

SparkQA commented Apr 4, 2018

Test build #88867 has finished for PR 20971 at commit 36fa1bd.

  • This patch fails Spark unit tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@gatorsmile
Copy link
Member

retest this please

@SparkQA
Copy link

SparkQA commented Apr 4, 2018

Test build #88870 has finished for PR 20971 at commit 36fa1bd.

  • This patch fails Spark unit tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@ericl
Copy link
Contributor Author

ericl commented Apr 5, 2018

retest this please

1 similar comment
@gatorsmile
Copy link
Member

retest this please

@gatorsmile
Copy link
Member

ok to test

@SparkQA
Copy link

SparkQA commented Apr 6, 2018

Test build #88961 has finished for PR 20971 at commit 36fa1bd.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@gatorsmile
Copy link
Member

Thanks! Merged to 2.3

asfgit pushed a commit that referenced this pull request Apr 8, 2018
…OrCreate

This backports #20927 to branch-2.3

## What changes were proposed in this pull request?

Currently, the active spark session is set inconsistently (e.g., in createDataFrame, prior to query execution). Many places in spark also incorrectly query active session when they should be calling activeSession.getOrElse(defaultSession) and so might get None even if a Spark session exists.

The semantics here can be cleaned up if we also set the active session when the default session is set.

Related: https://github.com/apache/spark/pull/20926/files

## How was this patch tested?

Unit test, existing test. Note that if #20926 merges first we should also update the tests there.

Author: Eric Liang <[email protected]>

Closes #20971 from ericl/backport-23809.
@ericl ericl closed this Apr 9, 2018
peter-toth pushed a commit to peter-toth/spark that referenced this pull request Oct 6, 2018
…OrCreate

This backports apache#20927 to branch-2.3

Currently, the active spark session is set inconsistently (e.g., in createDataFrame, prior to query execution). Many places in spark also incorrectly query active session when they should be calling activeSession.getOrElse(defaultSession) and so might get None even if a Spark session exists.

The semantics here can be cleaned up if we also set the active session when the default session is set.

Related: https://github.com/apache/spark/pull/20926/files

Unit test, existing test. Note that if apache#20926 merges first we should also update the tests there.

Author: Eric Liang <[email protected]>

Closes apache#20971 from ericl/backport-23809.

Change-Id: Iee11470899bc67d0ad9f096d37f8705174593f3e
otterc pushed a commit to linkedin/spark that referenced this pull request Mar 22, 2023
1) Use host/domain from dfsCluster
2) Use semaphores to remove timing based flakeyness
3) Ensure spark context is closed
4) Ignore ParquetAvroCompatibilitySuite temporarily
5) Adaptation of SPARK-28247 to 2.3
6) Disable codegen tests - these are fixed in 2.4 and 3.x, but require sql backports to 2.3
7) PR apache#21197 to handle test failures due to leaking spark session
8) PR#20926: Set default Spark session in test-only spark sessions
9) PR apache#20971: Active SparkSession should be set by getOrCreate
10) PR apache#21446: Random.nextString is not safe for directory namePrefix
11) Disabling QueryStageSuite#'adaptive skewed join'
12) SPARK-24318: Fix flakey SortShuffleSuite
13) Fix ForeachSinkSuite test failure
14) Diabling StreamingQuerySuite.'status, lastProgress, and recentProgress'
15) Disabling ResolvedDataSourceSuite.'avro: show deploy guide for loading the external avro module'
16) PR# 23405: Avoid to use Random.nextString in StreamingInnerJoinSuite
17) Adaptation of PR#25849, add LinkedIn repo as default to spark.sql.maven.additionalRemoteRepositories
18) Disabling HiveExternalCatalogVersionsSuite - only 2.4.6 and 3.0.0 are available from apache mirrors
19) Disabling INFER_AND_SAVE test in HiveSchemaInferenceSuite

- Address Erik's review comments
otterc pushed a commit to linkedin/spark that referenced this pull request Mar 22, 2023
Currently, the active spark session is set inconsistently (e.g., in createDataFrame, prior to query execution). Many places in spark also incorrectly query active session when they should be calling activeSession.getOrElse(defaultSession) and so might get None even if a Spark session exists.

The semantics here can be cleaned up if we also set the active session when the default session is set.

Related: https://github.com/apache/spark/pull/20926/files

Unit test, existing test. Note that if apache#20926 merges first we should also update the tests there.

Author: Eric Liang <[email protected]>

Closes apache#20927 from ericl/active-session-cleanup.

(cherry picked from commit 359375e)

NOTE: This cherry-pick includes only some of the original changes, because LIHADOOP-54684 already made some identical
changes as part of test resolution. This now ensures the entirety of the original SPARK-23809 code is backported.
Also note that when SPARK-23809 was originally backported to branch-2.3 in PR apache#20971, the new `active` API was _not_
included since new APIs shouldn't generally be added in patch releases. That new API is exactly what is needed,
so we backport directly from the original commit.

RB=2575134
BUG=BDP-6088
G=spark-reviewers
R=mmuralid,wyzhang,smahadik
A=mmuralid,wyzhang
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants