Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Expose SparkListeners and relevant classes as DeveloperApi #648

Closed
wants to merge 2 commits into from

Conversation

andrewor14
Copy link
Contributor

Hopefully this can go into 1.0, as a few people on the user list have asked for this.

* Identifies a particular Block of data, usually associated with a single file.
* A Block can be uniquely identified by its filename, but each type of Block has a different
* set of keys which produce its unique name.
*
* If your BlockId should be serializable, be sure to add it to the BlockId.apply() method.
*/
private[spark] sealed abstract class BlockId {
@DeveloperApi
sealed abstract class BlockId {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we just expose BlockId's as strings instead of revealing this class?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do you mean exposing name and toString only?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I just mean we should pass these around as String's in the events. Would that be a big change?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe it makes sense to give users some more semantics about this... it might be good to leave it as is

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It would be a fairly big change, since we technically don't need the BlockId classes anymore, as the strings are supposedly globally unique. I'm inclined to just leave it.

@AmplabJenkins
Copy link

Merged build triggered.

@AmplabJenkins
Copy link

Merged build started.

@AmplabJenkins
Copy link

Merged build finished. All automated tests passed.

@AmplabJenkins
Copy link

All automated tests passed.
Refer to this link for build results: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/14670/

@pwendell
Copy link
Contributor

pwendell commented May 6, 2014

Thanks, I merged this.

@asfgit asfgit closed this in ea10b31 May 6, 2014
asfgit pushed a commit that referenced this pull request May 6, 2014
Hopefully this can go into 1.0, as a few people on the user list have asked for this.

Author: Andrew Or <[email protected]>

Closes #648 from andrewor14/expose-listeners and squashes the following commits:

e45e1ef [Andrew Or] Add missing colons (minor)
350d643 [Andrew Or] Expose SparkListeners and relevant classes as DeveloperApi
(cherry picked from commit ea10b31)

Signed-off-by: Patrick Wendell <[email protected]>
@andrewor14 andrewor14 deleted the expose-listeners branch May 6, 2014 17:10
pdeyhim pushed a commit to pdeyhim/spark-1 that referenced this pull request Jun 25, 2014
Hopefully this can go into 1.0, as a few people on the user list have asked for this.

Author: Andrew Or <[email protected]>

Closes apache#648 from andrewor14/expose-listeners and squashes the following commits:

e45e1ef [Andrew Or] Add missing colons (minor)
350d643 [Andrew Or] Expose SparkListeners and relevant classes as DeveloperApi
vanzin pushed a commit to vanzin/spark that referenced this pull request Apr 25, 2017
These listeners weren't really meant for external consumption, but they're
public and marked with @DeveloperAPI. Adding the deprecated tag warns people
that they may soon go away (as they will as part of the work for SPARK-18085.

Note that not all types made public by apache#648
are being deprecated. Some remaining types are still exposed through the
SparkListener API.

Also note the text for StorageStatus is a tiny bit different, since I'm not
so sure I'll be able to remove it. But the effect for the users should be the
same (they should stop trying to use it).
asfgit pushed a commit that referenced this pull request Apr 27, 2017
These listeners weren't really meant for external consumption, but they're
public and marked with DeveloperApi. Adding the deprecated tag warns people
that they may soon go away (as they will as part of the work for SPARK-18085).

Note that not all types made public by #648
are being deprecated. Some remaining types are still exposed through the
SparkListener API.

Also note the text for StorageStatus is a tiny bit different, since I'm not
so sure I'll be able to remove it. But the effect for the users should be the
same (they should stop trying to use it).

Author: Marcelo Vanzin <[email protected]>

Closes #17766 from vanzin/SPARK-20421.

(cherry picked from commit 561e9cc)
Signed-off-by: Marcelo Vanzin <[email protected]>
ghost pushed a commit to dbtsai/spark that referenced this pull request Apr 27, 2017
These listeners weren't really meant for external consumption, but they're
public and marked with DeveloperApi. Adding the deprecated tag warns people
that they may soon go away (as they will as part of the work for SPARK-18085).

Note that not all types made public by apache#648
are being deprecated. Some remaining types are still exposed through the
SparkListener API.

Also note the text for StorageStatus is a tiny bit different, since I'm not
so sure I'll be able to remove it. But the effect for the users should be the
same (they should stop trying to use it).

Author: Marcelo Vanzin <[email protected]>

Closes apache#17766 from vanzin/SPARK-20421.
RolatZhang pushed a commit to RolatZhang/spark that referenced this pull request Aug 18, 2023
…Utils.unpack (apache#643) (apache#648)

### What changes were proposed in this pull request?

This PR proposes to use `FileUtil.unTarUsingJava` that is a Java implementation for un-tar `.tar` files. `unTarUsingJava` is not public but it exists in all Hadoop versions from 2.1+, see HADOOP-9264.

The security issue reproduction requires a non-Windows platform, and a non-gzipped TAR archive file name (contents don't matter).

### Why are the changes needed?

There is a risk for arbitrary shell command injection via `Utils.unpack` when the filename is controlled by a malicious user. This is due to an issue in Hadoop's `unTar`, that is not properly escaping the filename before passing to a shell command:https://github.com/apache/hadoop/blob/trunk/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/fs/FileUtil.java#L904

### Does this PR introduce _any_ user-facing change?

Yes, it prevents a security issue that, previously, allowed users to execute arbitrary shall command.

### How was this patch tested?

Manually tested in local, and existing test cases should cover.

Closes apache#35946 from HyukjinKwon/SPARK-38631.

Authored-by: Hyukjin Kwon <[email protected]>

(cherry picked from commit 057c051)

Signed-off-by: Hyukjin Kwon <[email protected]>
Co-authored-by: Hyukjin Kwon <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants