getRackForHost returns None if host is unknown by driver #17238
Conversation
Can one of the admins verify this patch?
Please change the title per http://spark.apache.org/contributing.html
The description of the problem makes sense for how it would affect task locality. I will need to look more closely at some YARN bits to make sure it's the right change, but it looks reasonable.
This is something to be configured appropriately at the lower level (namely topology config), and should be handled there. Special-casing this in Spark is not right.
@mridulm you know the YARN bits involved better than I do. It sounds like you are saying this is the wrong change, and instead it's just a misconfiguration in YARN -- I'll buy that argument. Essentially you are saying that "DEFAULT_RACK" may be used legitimately for one of the racks. Even with this misconfiguration, the incorrect locality assignments don't seem horrible, except that for very short tasks it's exacerbated by https://issues.apache.org/jira/browse/SPARK-18886
@squito I looked at the YARN bits about 3-4 years back, so I am sure you and @tgravescs know better :-) I am not sure what the impact of SPARK-18886 might be here ...
Sorry if I'm missing something here, but I don't see why this is a problem. If you have YARN misconfigured or not configured, everything is going to default to DEFAULT_RACK. If you want nodes to be on different racks, then fix the YARN configuration. The topology mapping in Hadoop is a plugin, so someone could write a plugin that just returns DEFAULT_RACK for anything; if you don't want that behavior, you can write a plugin that does something different. @morenn520 I'm assuming in this case the tasks had some host locality level before looking at the rack level? I don't believe Tez/MR do anything special with respect to DEFAULT_RACK, but I'll double check.
I don't think you're missing anything @tgravescs, it sounds like this is just a misconfiguration and we shouldn't be doing anything special for it (since it could hurt correct configurations). I wanted to see if this was a particularly common / easy misconfiguration in YARN, but you both have convinced me it's not. My point about SPARK-18886 was to think through how bad the effect from the misconfiguration is. In the originally described scenario, if a few nodes are accidentally labelled as belonging to their own rack, then tasks will sometimes get preferentially assigned to this false rack at first. If tasks are long, it's not that big a deal -- after a short delay, you'll then assign the tasks to the rest of the cluster. But if tasks are short, because of SPARK-18886, you may just keep assigning to the nodes in your false rack and leave the rest of the cluster idle. Nonetheless, I think it's still just a misconfiguration. Sounds like this issue is a "won't fix", unless @morenn520 makes the case otherwise.
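(As an aside, one way to limit the damage from such a misconfiguration is to shorten the rack-level delay-scheduling wait so the scheduler falls back to other nodes sooner. This is a hedged mitigation sketch, not something proposed in this PR; `spark.locality.wait` and `spark.locality.wait.rack` are standard Spark settings.)

```scala
import org.apache.spark.SparkConf

// Hedged mitigation sketch, not part of this patch: if short tasks keep being
// funneled to a mislabelled "rack", lowering or disabling the rack-level locality
// wait lets the scheduler fall back to other nodes sooner instead of idling them.
val conf = new SparkConf()
  .set("spark.locality.wait", "3s")       // overall delay-scheduling wait (3s is the default)
  .set("spark.locality.wait.rack", "0s")  // don't wait for rack-local slots at all
```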
@squito Actually it's a misconfiguration in our YARN. There are thousands of nodes in our production YARN cluster, and nodes are racked/unracked quite frequently, so it is easy for some rack information to go missing. It would be better if Spark did something to tolerate this kind of misconfiguration.
If you aren't adding machines to a rack and configuring YARN properly before adding them to your cluster, that is a process issue you should fix on your end. I would assume unracking/racking a node means putting in a new node? If that is the case, you have to install Hadoop and the Hadoop configuration on that node. I would expect you to fix the configuration or have a generic rack-aware script/Java class that would be able to just figure it out, but that is going to be pretty configuration specific. I would also assume that if you have that configuration wrong, then your HDFS is also not optimal, as it could get the replication wrong. You can specify your own class/script to do the rack resolution, so you could change that to handle this case: see https://hadoop.apache.org/docs/r2.7.2/hadoop-project-dist/hadoop-common/RackAwareness.html I'll try to check on Tez/MR today and get back to you (was too busy yesterday). I know Tez didn't have any explicit references to DEFAULT_RACK in the code, but I want to look a bit more.
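(To make the plugin option above concrete, here is a hedged Scala sketch of a custom `DNSToSwitchMapping` written against the Hadoop 2.7 interface. `DynamicRackMapping` and `lookupRackFromInventory` are hypothetical names for whatever central inventory a site maintains, not an existing implementation; the class would be wired in via Hadoop's `net.topology.node.switch.mapping.impl` setting.)

```scala
import java.util.{List => JList}
import scala.collection.JavaConverters._
import org.apache.hadoop.net.DNSToSwitchMapping

// Hypothetical sketch of a dynamic rack-resolution plugin. lookupRackFromInventory
// is a placeholder for a site-specific call (CMDB, REST service, etc.); the point is
// that the mapping is recomputed on every resolve, so a driver that outlives a
// re-racking still sees the current topology.
class DynamicRackMapping extends DNSToSwitchMapping {

  override def resolve(names: JList[String]): JList[String] =
    names.asScala
      .map(host => lookupRackFromInventory(host).getOrElse("/default-rack"))
      .asJava

  override def reloadCachedMappings(): Unit = ()                       // nothing cached locally
  override def reloadCachedMappings(names: JList[String]): Unit = ()

  // Placeholder: query whatever central inventory knows the real rack for a host.
  private def lookupRackFromInventory(host: String): Option[String] = ???
}
```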
OK, I checked Tez and MR and they don't do this.
Actually, to play devil's advocate, the problem @morenn520 is describing is a little more involved. You have a driver running, which has its own view of what the cluster topology is, and then the cluster topology changes underneath it. Deploying a new configuration on the new nodes being added does not fix the driver, unless your "topology discovery script" is fully dynamic and always goes to a central location to figure out what's the current topology. So, basically, even if the new nodes know about the updated topology, the existing driver instance might not, and there's no easy way to fix that I can see. (That being said, I haven't looked at the changes here.)
Thanks @tgravescs, we have used a dynamic topology discovery script to avoid this problem. Since Tez and MR don't do this, maybe Spark doesn't need to fix this problem.
Maybe I'm misunderstanding what you are saying, but the only way the AM gets a bad topology is if it's wrong in the first place. Or are you just saying the app starts with a host in one rack, then the host gets moved to another rack and brought back up? I guess that is possible, but I'm not really sure that applies to this case here anyway with default_rack. Any existing executors would have gone away when the host was moved, so YARN should re-resolve when it gets a new container anyway. If your script isn't dynamic enough to handle that, then it's also a configuration issue to update all the other hosts, and you should do that before bringing the host up. Again, unless you aren't using HDFS, the rack resolution affects more than just Spark on YARN here. It's going to affect HDFS block placements and other types of apps.
This PR proposes to close stale PRs. Currently, we have 400+ open PRs, and there are some stale PRs whose JIRA tickets have already been closed or whose JIRA tickets do not exist (and they do not seem to be minor issues).

// Open PRs whose JIRA tickets have already been closed
Closes apache#11785 Closes apache#13027 Closes apache#13614 Closes apache#13761 Closes apache#15197 Closes apache#14006 Closes apache#12576 Closes apache#15447 Closes apache#13259 Closes apache#15616 Closes apache#14473 Closes apache#16638 Closes apache#16146 Closes apache#17269 Closes apache#17313 Closes apache#17418 Closes apache#17485 Closes apache#17551 Closes apache#17463 Closes apache#17625

// Open PRs whose JIRA tickets do not exist and are not minor issues
Closes apache#10739 Closes apache#15193 Closes apache#15344 Closes apache#14804 Closes apache#16993 Closes apache#17040 Closes apache#15180 Closes apache#17238

N/A

Author: Takeshi Yamamuro <[email protected]>

Closes apache#17734 from maropu/resolved_pr.

Change-Id: Id2e590aa7283fe5ac01424d30a40df06da6098b5
What changes were proposed in this pull request?
https://issues.apache.org/jira/browse/SPARK-19894
How was this patch tested?
It was tested on our production cluster (YARN) in yarn-cluster mode; applying this patch resolves the rack-local problem described below.
Problem:
In our production cluster (YARN), one node (call it the missing-rack-info node) is missing rack information for some other nodes. A Spark Streaming program (data source: Kafka, mode: yarn-cluster) runs its driver on this missing-rack-info node.
The nodes whose hosts are missing from the driver node's topology, and the Kafka broker nodes whose hosts are also unknown to YARN, are all resolved to "/default-rack" by the YARN scheduler, so all tasks end up being assigned to those nodes at RACK_LOCAL level.
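For readers following along, below is a minimal sketch of the behavior the PR title describes. It is modeled on the driver-side rack lookup that Spark's YARN scheduler performs via Hadoop's `RackResolver`; the method shape is an assumption for illustration, not code copied from the patch.

```scala
import org.apache.hadoop.conf.Configuration
import org.apache.hadoop.net.NetworkTopology
import org.apache.hadoop.yarn.util.RackResolver

// Sketch only: mirrors the shape of the YARN scheduler's getRackForHost override,
// not the exact patch. Hosts that resolve to Hadoop's default rack are reported as
// having an unknown rack (None) instead of being lumped together in "/default-rack",
// which is what causes the spurious RACK_LOCAL assignments described above.
def getRackForHost(hostPort: String, hadoopConf: Configuration): Option[String] = {
  val host = hostPort.split(":")(0)  // simplified host/port parsing
  val rack = RackResolver.resolve(hadoopConf, host).getNetworkLocation
  if (rack == null || rack == NetworkTopology.DEFAULT_RACK) None else Some(rack)
}
```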