
problem in connecting to hdfs #367

Open
ebrahim-abbasi opened this issue Feb 9, 2020 · 5 comments

@ebrahim-abbasi

Hello,
I am using pydoop in combination with pyftpdlib to provide an FTP server for HDFS. I followed the installation instructions to set up Hadoop. I have a Hadoop client connecting to a remote HDFS, and from pydoop I connect through that Hadoop client.
Running the 'hadoop classpath --glob' command directly works fine. But in the pydoop/hadoop_utils.py file, at the line 'cp = subprocess.check_output("hadoop classpath --glob", shell=True, universal_newlines=True).strip()', I am getting this error:

subprocess.CalledProcessError: Command 'hadoop classpath --glob' returned non-zero exit status 127.

Could you please let me know how I can fix this issue?
Best
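A minimal way to reproduce the check that fails here, with a diagnostic for the 127 case (a sketch; the function name and messages are made up, only the subprocess call is taken from pydoop/hadoop_utils.py as quoted above):

```python
import shutil
import subprocess


def hadoop_classpath():
    """Return the Hadoop classpath string, or None with a diagnostic.

    Exit status 127 means the shell could not find the command, so we
    check the PATH explicitly before invoking it.
    """
    if shutil.which("hadoop") is None:
        print("hadoop executable not found in PATH")
        return None
    try:
        return subprocess.check_output(
            "hadoop classpath --glob",
            shell=True, universal_newlines=True,
        ).strip()
    except subprocess.CalledProcessError as e:
        print("hadoop classpath failed with exit status", e.returncode)
        return None


print(hadoop_classpath())
```

If this prints "hadoop executable not found in PATH", the PATH that the Python process sees (which may differ from the interactive shell's) is the problem.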

@simleo
Member

simleo commented Feb 10, 2020

You usually get exit status 127 when bash does not find the command you're trying to run. Make sure the hadoop executable is in the PATH. See Environment Setup in the docs.
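This behavior is easy to confirm: running any nonexistent command through the shell yields exit status 127 (the command name below is deliberately made up):

```python
import subprocess

# The shell returns 127 when it cannot find the command to run,
# which is exactly the status reported in the traceback above.
result = subprocess.run(
    "definitely-not-a-real-command", shell=True, capture_output=True
)
print(result.returncode)  # 127 when the shell cannot find the command
```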

@ebrahim-abbasi
Author

ebrahim-abbasi commented Feb 12, 2020

@simleo Thanks for your reply.
Here is the content of my /etc/environment:

PATH="/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin:/usr/games:/usr/local/games:/home/abbasi/software/hadoop-2.10.0:/home/abbasi/software/hadoop-2.10.0/bin:/home/abbasi/software/hadoop-2.10.0/sbin"
HADOOP_HOME="/home/abbasi/software/hadoop-2.10.0"
HADOOP_INSTALL="/home/abbasi/software/hadoop-2.10.0"
HADOOP_MAPRED_HOME="/home/abbasi/software/hadoop-2.10.0"
HADOOP_COMMON_HOME="/home/abbasi/software/hadoop-2.10.0"
HADOOP_HDFS_HOME="/home/abbasi/software/hadoop-2.10.0"
YARN_HOME="/home/abbasi/software/hadoop-2.10.0"
HADOOP_COMMON_LIB_NATIVE_DIR="/home/abbasi/software/hadoop-2.10.0/lib/native"
HADOOP_OPTS="-Djava.library.path=/home/abbasi/software/hadoop-2.10.0/lib/native"
HADOOP_CONF_DIR="/home/abbasi/software/hadoop-2.10.0/etc/hadoop"
CLASSPATH="/home/abbasi/software/hadoop-2.10.0/etc/hadoop:/home/abbasi/software/hadoop-2.10.0/share/hadoop/common/lib/:/home/abbasi/software/hadoop-2.10.0/share/hadoop/common/:/home/abbasi/software/hadoop-2.10.0/share/hadoop/hdfs:/home/abbasi/software/hadoop-2.10.0/share/hadoop/hdfs/lib/:/home/abbasi/software/hadoop-2.10.0/share/hadoop/hdfs/:/home/abbasi/software/hadoop-2.10.0/share/hadoop/yarn:/home/abbasi/software/hadoop-2.10.0/share/hadoop/yarn/lib/:/home/abbasi/software/hadoop-2.10.0/share/hadoop/yarn/:/home/abbasi/software/hadoop-2.10.0/share/hadoop/mapreduce/lib/:/home/abbasi/software/hadoop-2.10.0/share/hadoop/mapreduce/:/home/abbasi/software/hadoop-2.10.0/contrib/capacity-scheduler/*.jar"

Should I add something more?

@ilveroluca
Member

You need to make sure the hadoop executable is in one of the directories listed in the PATH environment variable. If it isn't, fix the PATH.

@ebrahim-abbasi
Author

ebrahim-abbasi commented Feb 12, 2020

Is it enough to put the jar files returned by 'hadoop classpath --glob' in the PATH variable?

My current PATH variable is:

/home/abbasi/pyftpdlib_venv/bin:/home/abbasi/software/hadoop-2.10.0/etc/hadoop:/home/abbasi/software/hadoop-2.10.0:/home/abbasi/software/hadoop-2.10.0/bin:/home/abbasi/software/hadoop-2.10.0/sbin:/usr/lib/jvm/java-8-openjdk-amd64/bin:/home/abbasi/.sdkman/candidates/scala/current/bin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin:/snap/bin:

@simleo
Member

simleo commented Feb 13, 2020

Don't worry about adding the jar files, Pydoop handles that automatically. It only needs the hadoop command to be in the PATH. Try something like:

export PATH="/home/abbasi/software/hadoop-2.10.0/bin:/home/abbasi/software/hadoop-2.10.0/sbin:${PATH}"

And please double check that the hadoop executable script is indeed in /home/abbasi/software/hadoop-2.10.0/bin.
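That last check can be scripted too. This is a hypothetical helper (the name is made up; the directory is the one given in the thread) that verifies the launcher script exists and is executable:

```python
import os


def hadoop_launcher_ok(bin_dir):
    """Check that the hadoop launcher script exists and is executable
    in the given directory. Helper name and logic are illustrative."""
    path = os.path.join(bin_dir, "hadoop")
    return os.path.isfile(path) and os.access(path, os.X_OK)


# Directory taken from the PATH posted earlier in this thread.
print(hadoop_launcher_ok("/home/abbasi/software/hadoop-2.10.0/bin"))
```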
