
[SPARK-10618] [Mesos] Refactoring scheduling condition and adding test for same #10326

Closed

wants to merge 1,172 commits into from

Conversation

SleepyThread
Contributor

This PR deals with scheduling condition changes. Refer to the discussion in #8771.

@SleepyThread
Contributor Author

@andrewor14 @tnachen @dragos please do another pass on this PR.

@dragos
Contributor

dragos commented Dec 16, 2015

I'll review this one, but personally I see no point in this PR unless it improves something user-facing. That means the TODO comment should be implemented (fine-grained logging of which condition actually failed).

@@ -308,6 +303,23 @@ private[spark] class CoarseMesosSchedulerBackend(
}
}

// ToDo: Abstract out each condition and log them.
def isOfferSatisfiesRequirements(slaveId: String, mem: Double, cpusOffered: Int,

The name sounds a bit funny to me. Maybe just offerSatisfiesRequirements?

Also, this might be private, or private[spark] to allow testing.

Ideally the TODO would be implemented as well.


Also, a short scaladoc explaining the difference between meetsConstraints and satisfiesRequirements would help. To the casual reader they look the same.

@andrewor14
Contributor

ok to test (@dragos @tnachen you can also say this by the way, just gave you permissions)

@tnachen
Contributor

tnachen commented Dec 16, 2015

I don't have anything else to add besides what @dragos said, but it seems like it takes a while to get this updated. I vote for trying to merge this first since it adds more tests 👍
We can add finer-grained logging later.

@SparkQA

SparkQA commented Dec 16, 2015

Test build #47850 has finished for PR 10326 at commit f1bb5b1.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@@ -308,6 +303,23 @@ private[spark] class CoarseMesosSchedulerBackend(
}
}

// ToDo: Abstract out each condition and log them.
def isOfferSatisfiesRequirements(slaveId: String, mem: Double, cpusOffered: Int,
sc: SparkContext): Boolean = {

style:

def offerSatisfiesRequirements(
    slaveId: String,
    mem: Double,
    cpusOffered: Int,
    sc: SparkContext): Boolean = {
  ...
}


also, the method signature is shared across both fine-grained and coarse-grained modes, so I would put this in the helper trait MesosSchedulerUtils. Then this can be protected or private[mesos] or something.
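A rough sketch of how these review suggestions might fit together once the helper lives in the shared trait; the method name, the explicit requirement parameters, and the log messages are illustrative assumptions, not the code in this PR:

```
import org.apache.spark.Logging

private[spark] trait MesosSchedulerUtils extends Logging {

  /**
   * Unlike meetsConstraints, which checks the user-supplied attribute
   * constraints against an offer, this checks the resource-level
   * requirements: enough memory and CPUs to launch one more executor.
   */
  protected def offerSatisfiesRequirements(
      slaveId: String,
      mem: Double,
      cpusOffered: Int,
      requiredMem: Double,
      requiredCpus: Int): Boolean = {
    val enoughMem = mem >= requiredMem
    val enoughCpus = cpusOffered >= requiredCpus
    if (!enoughMem) {
      logDebug(s"Declining offer from $slaveId: offered $mem MB, need $requiredMem MB")
    }
    if (!enoughCpus) {
      logDebug(s"Declining offer from $slaveId: offered $cpusOffered cores, need $requiredCpus")
    }
    enoughMem && enoughCpus
  }
}
```

Splitting each check into its own named condition is what makes the per-condition logging dragos asked for straightforward to add.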

@andrewor14
Contributor

@SleepyThread thanks for cleaning up the code and adding tests. I think the patch in its current state is only halfway there, however. There's more opportunity to abstract the duplicate code. Also there are still a lot of style violations, and it no longer merges into master. Once you address my and @dragos' comments, we'll take a closer look.

@dragos
Contributor

dragos commented Feb 16, 2016

@SleepyThread, username checks out! :) Any progress?

zjffdu and others added 10 commits February 23, 2016 02:49
…for registerFunction [Python]

Straightforward change to the Python doc

Author: Jeff Zhang <[email protected]>

Closes #9901 from zjffdu/SPARK-11860.
…ted with a Stage

This issue was addressed in #5494, but the fix in that PR, while safe in the sense that it will prevent the SparkContext from shutting down, misses the actual bug.  The intent of `submitMissingTasks` should be understood as "submit the Tasks that are missing for the Stage, and run them as part of the ActiveJob identified by jobId".  Because of a long-standing bug, the `jobId` parameter was never being used.  Instead, we were trying to use the jobId with which the Stage was created -- which may no longer exist as an ActiveJob, hence the crash reported in SPARK-6880.

The correct fix is to use the ActiveJob specified by the supplied jobId parameter, which is guaranteed to exist at the call sites of submitMissingTasks.

This fix should be applied to all maintenance branches, since it has existed since 1.0.

kayousterhout pankajarora12

Author: Mark Hamstra <[email protected]>
Author: Imran Rashid <[email protected]>

Closes #6291 from markhamstra/SPARK-6880.
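As a toy illustration of the SPARK-6880 fix described above (not the real DAGScheduler code), the key point is resolving the ActiveJob from the jobId passed into submitMissingTasks rather than from the job the stage was originally created for; the class and field names below are illustrative:

```
import java.util.Properties
import scala.collection.mutable

case class ActiveJob(jobId: Int, properties: Properties)

class ToyScheduler {
  private val jobIdToActiveJob = mutable.HashMap[Int, ActiveJob]()

  def registerJob(job: ActiveJob): Unit = {
    jobIdToActiveJob(job.jobId) = job
  }

  def jobFinished(jobId: Int): Unit = {
    jobIdToActiveJob -= jobId
  }

  // The stage's original job may have finished already; the supplied jobId
  // is guaranteed to be active at the call sites of submitMissingTasks.
  def submitMissingTasks(stageFirstJobId: Int, jobId: Int): Properties = {
    jobIdToActiveJob(jobId).properties // not jobIdToActiveJob(stageFirstJobId)
  }
}
```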
- NettyRpcEnv::openStream() now correctly propagates errors to
  the read side of the pipe.
- NettyStreamManager now throws if the file being transferred does
  not exist.
- The network library now correctly handles zero-sized streams.

Author: Marcelo Vanzin <[email protected]>

Closes #9941 from vanzin/SPARK-11956.
…ython

Author: felixcheung <[email protected]>

Closes #9967 from felixcheung/pypivotdoc.
…VM exits

deleting the temp dir like that

```

scala> import scala.collection.mutable
import scala.collection.mutable

scala> val a = mutable.Set(1,2,3,4,7,0,8,98,9)
a: scala.collection.mutable.Set[Int] = Set(0, 9, 1, 2, 3, 7, 4, 8, 98)

scala> a.foreach(x => {a.remove(x) })

scala> a.foreach(println(_))
98
```

You may not modify a collection while traversing or iterating over it. This cannot delete all elements of the collection.

Author: Zhongshuai Pei <[email protected]>

Closes #9951 from DoingDone9/Bug_RemainDir.
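For reference, a small generic sketch of how removal during traversal is usually made safe in Scala (iterate over a snapshot, or simply clear the set); it illustrates the pitfall above rather than the exact change in this commit:

```
import scala.collection.mutable

object SafeRemoval {
  def main(args: Array[String]): Unit = {
    val a = mutable.Set(1, 2, 3, 4, 7, 0, 8, 98, 9)

    // Iterate over an immutable snapshot so the underlying set can be mutated safely.
    a.toList.foreach(x => a.remove(x))
    assert(a.isEmpty)

    // Or, if every element should go, simply: a.clear()
  }
}
```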
Currently we do not have visualization for SQL queries run from Python; this PR fixes that.

cc zsxwing

Author: Davies Liu <[email protected]>

Closes #9949 from davies/pyspark_sql_ui.
Author: Yu ISHIKAWA <[email protected]>

Closes #9960 from yu-iskw/minor-remove-spaces.
Author: Jeff Zhang <[email protected]>

Closes #9956 from zjffdu/dev_typo.
Currently the Web UI navbar has a minimum width of 1200px, so if a window is resized smaller than that, the app name goes off screen. The 1200px width seems to have been chosen because it fits the longest example app name without wrapping.

To work with smaller window widths I made the tabs wrap, since that looked better than wrapping the app name. This is a distinct change in how the navbar looks, and I'm not sure if it's what we actually want to do.

Other notes:
- min-width set to 600px to keep the tabs from wrapping individually (will need to be adjusted if tabs are added)
- app name will also wrap (making three levels) if a really really long app name is used

Author: Alex Bozarth <[email protected]>

Closes #9874 from ajbozarth/spark10864.
…rk-env.cmd from wrong directory

* On Windows, `bin/load-spark-env.cmd` tries to load `spark-env.cmd` from `%~dp0..\..\conf`, but `%~dp0` points to `bin` and `conf` is only one level up.
* Updated `bin/load-spark-env.cmd` to load `spark-env.cmd` from `%~dp0..\conf` instead of `%~dp0..\..\conf`.

Author: wangt <[email protected]>

Closes #9863 from toddwan/master.
HyukjinKwon and others added 27 commits February 23, 2016 02:49
…ly pushed down

https://issues.apache.org/jira/browse/SPARK-12236
Currently JDBC filters are not tested properly. All the tests pass even if the filters are not pushed down due to Spark-side filtering.

In this PR,
Firstly, I corrected the tests to properly check the pushed down filters by removing Spark-side filtering.
Also, `!=` was being tested, which is actually not pushed down, so I removed those tests.
Lastly, I moved the `stripSparkFilter()` function to `SQLTestUtils`, as this function will be shared by all tests for pushed-down filters. It will also be shared with the ORC datasource, since its filters are not being tested properly either.

Author: hyukjinkwon <[email protected]>

Closes #10221 from HyukjinKwon/SPARK-12236.
Author: Jean-Baptiste Onofré <[email protected]>

Closes #10130 from jbonofre/SPARK-12105.
…ling setConf

This is a continuation of SPARK-12056, where the change is applied to SqlNewHadoopRDD.scala

andrewor14
FYI

Author: tedyu <[email protected]>

Closes #10164 from tedyu/master.
…n ExternalShuffleBlockResolver

Replacing shuffleManagerClassName with shortShuffleMgrName reduces the time spent comparing strings, and the sort comparison is moved to the front. cc JoshRosen andrewor14

Author: Lianhui Wang <[email protected]>

Closes #10131 from lianhuiwang/spark-12130.
…sos cluster mode.

Adding more documentation about submitting jobs with Mesos cluster mode.

Author: Timothy Chen <[email protected]>

Closes #10086 from tnachen/mesos_supervise_docs.
https://issues.apache.org/jira/browse/SPARK-9516

- [x] new look of Thread Dump Page

- [x] click column title to sort

- [x] grep

- [x] search as you type

squito JoshRosen It's ready for review now

Author: CodingCat <[email protected]>

Closes #7910 from CodingCat/SPARK-9516.
…d AsyncRDDActions to support non-blocking operation

These changes rework the implementations of `SimpleFutureAction`, `ComplexFutureAction`, `JobWaiter`, and `AsyncRDDActions` such that asynchronous callbacks on the generated `Futures` NEVER block waiting for a job to complete. A small amount of mutex synchronization is necessary to protect the internal fields that manage cancellation, but these locks are only held very briefly and in practice should almost never cause any blocking to occur. The existing blocking APIs of these classes are retained, but they simply delegate to the underlying non-blocking API and `Await` the results with indefinite timeouts.

Associated JIRA ticket: https://issues.apache.org/jira/browse/SPARK-9026
Also fixes: https://issues.apache.org/jira/browse/SPARK-4514

This pull request contains all my own original work, which I release to the Spark project under its open source license.

Author: Richard W. Eggert II <[email protected]>

Closes #9264 from reggert/fix-futureaction.
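A minimal sketch of the delegation pattern described in that change, assuming a Promise-backed future with the legacy blocking call layered on top via Await; the class and method names here are illustrative, not the actual FutureAction classes:

```
import scala.concurrent.{Await, Future, Promise}
import scala.concurrent.duration.Duration
import scala.util.Try

// Callbacks registered on `future` fire when the scheduler completes the
// promise, so they never block a thread waiting for the job to finish.
class ToyFutureAction[T] {
  private val promise = Promise[T]()

  def future: Future[T] = promise.future

  // Called from the scheduler's event loop when the job finishes.
  def jobCompleted(result: Try[T]): Unit = promise.complete(result)

  // Legacy blocking API: delegate to the non-blocking future and wait.
  def get(): T = Await.result(future, Duration.Inf)
}
```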
Please help to review, thanks a lot.

Author: jerryshao <[email protected]>

Closes #10195 from jerryshao/SPARK-10123.
ExternalBlockStore.scala

Author: Naveen <[email protected]>

Closes #10313 from naveenminchu/branch-fix-SPARK-9886.
… completes

This change builds the event history of completed apps asynchronously so the RPC thread will not be blocked, allowing new workers to register or be removed even if the event log history is very large and takes a long time to rebuild.

Author: Bryan Cutler <[email protected]>

Closes #10284 from BryanCutler/async-MasterUI-SPARK-12062.
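A rough sketch of that offloading pattern, assuming a dedicated single-threaded executor so the RPC handler returns immediately; the names and structure are illustrative, not the Master's actual code:

```
import java.util.concurrent.Executors
import scala.concurrent.{ExecutionContext, Future}

object AsyncHistoryRebuild {
  // Dedicated single-threaded pool for rebuilding application UIs from event
  // logs, so the RPC thread handling worker messages never blocks on it.
  private val rebuildContext =
    ExecutionContext.fromExecutorService(Executors.newSingleThreadExecutor())

  def onApplicationCompleted(appId: String): Unit = {
    Future {
      rebuildUIFromEventLogs(appId) // potentially slow for very large event logs
    }(rebuildContext)
  }

  private def rebuildUIFromEventLogs(appId: String): Unit = {
    // replay the event log and attach the rebuilt UI (omitted)
  }
}
```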
Spark on Yarn: handle the AM being told a command from the RM

When the RM throws ApplicationAttemptNotFoundException for an allocate
invocation, make the ApplicationMaster finish immediately without any
retries.

Author: Devaraj K <[email protected]>

Closes #10129 from devaraj-kavali/SPARK-4117.
…endly Receiver graphs

Currently, the Spark Streaming web UI uses the same maxY when displaying 'Input Rate Times & Histograms' and 'Per-Receiver Times & Histograms'.

This may lead to somewhat unfriendly graphs: once we have tens of Receivers or more, every 'Per-Receiver Times' line almost hits the ground.

This issue proposes calculating a new maxY, derived from the original one, that is shared among all the 'Per-Receiver Times & Histograms' graphs.

Before:
![before-5](https://cloud.githubusercontent.com/assets/15843379/11761362/d790c356-a0fa-11e5-860e-4b834603de1d.png)

After:
![after-5](https://cloud.githubusercontent.com/assets/15843379/11761361/cfabf692-a0fa-11e5-97d0-4ad124aaca2a.png)

Author: proflin <[email protected]>

Closes #10318 from proflin/SPARK-12304.
…own.

https://issues.apache.org/jira/browse/SPARK-12249
Currently the `!=` operator is not pushed down correctly.
I simply added a case for it.

Author: hyukjinkwon <[email protected]>

Closes #10233 from HyukjinKwon/SPARK-12249.
https://issues.apache.org/jira/browse/SPARK-12314
`IsNull` filter is not being pushed down for JDBC datasource.

It looks like this is part of the SQL standard according to [SQL-92](http://www.contrib.andrew.cmu.edu/~shadow/sql/sql1992.txt), SQL:1999, [SQL:2003](http://www.wiscorp.com/sql_2003_standard.zip) and [SQL:201x](http://www.wiscorp.com/sql20nn.zip), and I believe most databases support it.

In this PR, I simply added the case for `IsNull` filter to produce a proper filter string.

Author: hyukjinkwon <[email protected]>

This patch had conflicts when merged, resolved by
Committer: Reynold Xin <[email protected]>

Closes #10286 from HyukjinKwon/SPARK-12314.
…urce.

https://issues.apache.org/jira/browse/SPARK-12315
`IsNotNull` filter is not being pushed down for JDBC datasource.

It looks like this is part of the SQL standard according to [SQL-92](http://www.contrib.andrew.cmu.edu/~shadow/sql/sql1992.txt), SQL:1999, [SQL:2003](http://www.wiscorp.com/sql_2003_standard.zip) and [SQL:201x](http://www.wiscorp.com/sql20nn.zip), and I believe most databases support it.

In this PR, I simply added the case for `IsNotNull` filter to produce a proper filter string.

Author: hyukjinkwon <[email protected]>

This patch had conflicts when merged, resolved by
Committer: Reynold Xin <[email protected]>

Closes #10287 from HyukjinKwon/SPARK-12315.
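Taken together with the `!=` change above, the JDBC pushdown in these commits amounts to translating each data source Filter into a SQL predicate string. A simplified sketch along those lines (no value quoting or escaping rules; unsupported filters are left to Spark-side filtering) — this is not the actual JDBCRDD code:

```
import org.apache.spark.sql.sources._

object FilterToSql {
  // Translate a pushed-down data source filter into a SQL predicate string;
  // None means the filter is not supported and stays on the Spark side.
  def compileFilter(f: Filter): Option[String] = f match {
    case EqualTo(attr, value)      => Some(s"$attr = '$value'")
    case Not(EqualTo(attr, value)) => Some(s"$attr != '$value'")
    case IsNull(attr)              => Some(s"$attr IS NULL")
    case IsNotNull(attr)           => Some(s"$attr IS NOT NULL")
    case _                         => None
  }
}
```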
…nto scheduling-refactor

Conflicts:
	core/src/main/scala/org/apache/spark/scheduler/cluster/mesos/CoarseMesosSchedulerBackend.scala
@SparkQA

SparkQA commented Feb 23, 2016

Test build #51710 has finished for PR 10326 at commit 412f56d.

  • This patch passes all tests.
  • This patch does not merge cleanly.
  • This patch adds no public classes.

@rxin
Contributor

rxin commented Jun 15, 2016

Thanks for the pull request. I'm going through a list of pull requests to cut them down since the sheer number is breaking some of the tooling we have. Due to lack of activity on this pull request, I'm going to push a commit to close it. Feel free to reopen it or create a new one. We can also continue the discussion on the JIRA ticket.

asfgit closed this in 1a33f2e on Jun 15, 2016