THDFSFile fixes #94
Is this change necessary? Main reasons to ask are |
Well, it isn't strictly necessary. Without this change, the user has to specify an absolute path by adding one more / in front of it, e.g. hdfs://host:port//user/username/dir/file (note the extra slash after the host). I personally think that's kind of weird. Anyway, the problem is that there is no way (to my knowledge) to point to a relative path in Hadoop using the hdfs:// prefix, so Hadoop uses a single slash to denote an absolute URL. Such a URL is parsed incorrectly by the current THDFSFile, which assumes that the path is relative to the user directory. Absolute URLs with extra leading slashes are, however, parsed correctly by Hadoop, so there is compatibility to some extent. |
@smithdh Would you be able to review (and rebase) this PR? |
Can one of the admins verify this patch? |
@smithdh ping... |
Hello @evgeny-boger Could you select "allow edits from maintainers" for this PR (I think this may not be set at the moment), so that I can rebase? |
@smithdh, done |
compatible with libhdfs3 (native HDFS client library) which only implements the latter.
…ch Hadoop conventions. fixes HDFS path handling in various methods to make them actually work.
by calling hdfsRead several times.
4a00643 to 41488bd
@phsft-bot build! @smithdh and @evgeny-boger thanks for rebasing! Could you fix the clang-format issues? |
Starting build on |
hi @vgvassilev Yes, certainly. |
@phsft-bot build! |
Starting build on |
This set of patches makes THDFSFile work again.
It also enables the CMake build and allows linking against libhdfs3 (an experimental native HDFS client implementation).
A fairly major change: HDFS URLs are now absolute instead of relative, as they were before. I.e. one has to use the "hdfs:///user/username/dir1/file2.root" notation to access a file in the home directory.
This makes HDFS URLs somewhat standard, in the sense that they can be used interchangeably between ROOT, the Hadoop API, and the Hadoop command-line utilities.
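
To illustrate the difference between the two URL conventions discussed in this PR, here is a minimal sketch. It is not the actual THDFSFile code: `HdfsPathFromUrl`, its `absolute` flag, the default home directory `/user/username`, and the port used in the usage note are all hypothetical and exist only for illustration.

```cpp
#include <string>

// Hypothetical helper (NOT the THDFSFile implementation): extracts the HDFS
// path from an hdfs:// URL under one of two conventions.
//   absolute == true  : the path starts at the first '/' after host:port
//                       (the Hadoop convention adopted by this PR).
//   absolute == false : the part after that '/' is taken relative to homeDir,
//                       unless it starts with '/' itself (the "extra slash" form).
// homeDir is an assumed placeholder value.
std::string HdfsPathFromUrl(const std::string &url, bool absolute,
                            const std::string &homeDir = "/user/username")
{
   const std::string scheme = "hdfs://";
   if (url.compare(0, scheme.size(), scheme) != 0)
      return url; // not an hdfs:// URL, leave it untouched

   // Skip the (possibly empty) authority part, i.e. "host:port".
   const std::string::size_type slash = url.find('/', scheme.size());
   if (slash == std::string::npos)
      return absolute ? std::string("/") : homeDir;

   if (absolute)
      return url.substr(slash); // keep the leading '/'

   const std::string rest = url.substr(slash + 1);
   if (rest.empty())
      return homeDir;
   if (rest[0] == '/')
      return rest; // extra slash: the path is already absolute
   return homeDir + "/" + rest;
}
```

Under these assumptions, `HdfsPathFromUrl("hdfs:///user/username/dir1/file2.root", true)` yields `/user/username/dir1/file2.root`, while with `absolute == false` the URL `hdfs://host:9000/dir/file` resolves to `/user/username/dir/file` and needs the extra slash (`hdfs://host:9000//user/username/dir/file`) to be treated as absolute.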