Not able to read HDFS in Map of RMR2 #229

Open
sureshappana opened this issue Dec 4, 2015 · 2 comments

Comments

@sureshappana

Hi,
I am trying to access an HDFS file in the map function of rmr2. (The file is in cdf format.) I am using the following approach, but it does not work.

Normal approach in R (without MapReduce):
d <- open.ncdf("file.cdf")

This refers to a local file.

Approach I am trying in rmr2:

x <- hdfs.file("file.cdf")
d <- open.ncdf(x)  # this call goes inside the map function

Error: No file found with the specified name. (I also tried giving an absolute path.)

I am replacing the local file reference with an HDFS reference. (I can't use hdfs.read.text.file because my file is not in text format.)

Could anyone tell me whether there is a way to refer to an HDFS file that is not a text file?

(P.S.: I also can't use from.dfs in my map function because the file is ~70 MB.)

Environment:
R Version - 3.2.2
rmr2 Version - 3.3.1
Cloudera Quickstart VM 5.5.0

Please let me know if any further information is required.

Thanks

@RavikiranCK

RavikiranCK commented Mar 16, 2017

Hi Suresh,

Were you able to find a solution to the problem you mentioned?
I am facing the same problem.

Thank you,
Ravikiran C K

@juagarmar

juagarmar commented Apr 1, 2017

Hi Suresh,

Check this example; it may help you.

# Set up the environment
Sys.setenv(HADOOP_CMD='/usr/bin/hadoop')
Sys.setenv(HADOOP_HOME='/usr/lib/hadoop-0.20-mapreduce')
Sys.setenv(HADOOP_STREAMING='/usr/lib/hadoop-0.20-mapreduce/contrib/streaming/hadoop-streaming-2.6.0-mr1-cdh5.7.1.jar')
library(rJava)
library(rmr2)
library(rhdfs)
hdfs.init()
# Load the CASP data set, convert it to a 10-column numeric matrix, and write it to HDFS
# (the variable is named casp so that it does not mask base R's table() function)
casp <- read.csv('http://archive.ics.uci.edu/ml/machine-learning-databases/00265/CASP.csv', sep=",")
casp <- as.numeric(unlist(casp))
casp <- matrix(casp, ncol=10)
X1 <- to.dfs(casp)
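
For the binary .cdf file in the original question, one possible workaround is to copy the file out of HDFS to a local temporary file inside the map function and then open that local copy with open.ncdf as usual. This is only a sketch and has not been tested on a cluster. The helper names hdfs_fetch_args and fetch_from_hdfs and the path /user/cloudera/file.cdf are hypothetical. It also assumes that HADOOP_CMD is set on the task nodes, that it points at a working hadoop executable there, and that the ncdf package is installed on them.

```r
# Hypothetical helper: build the arguments for `hadoop fs -copyToLocal`.
hdfs_fetch_args <- function(hdfs_path, local_path) {
  c("fs", "-copyToLocal", hdfs_path, local_path)
}

# Hypothetical helper: copy an HDFS file to a local temp file and return its path.
# Assumes HADOOP_CMD names the hadoop executable on the node running the task.
fetch_from_hdfs <- function(hdfs_path, hadoop_cmd = Sys.getenv("HADOOP_CMD")) {
  local_path <- tempfile(fileext = ".cdf")
  status <- system2(hadoop_cmd, hdfs_fetch_args(hdfs_path, local_path))
  if (status != 0) stop("could not copy ", hdfs_path, " from HDFS")
  local_path
}

# Sketch of a map function: stage the file locally, then open it as usual.
# "/user/cloudera/file.cdf" is a placeholder path, not taken from the question.
map_fn <- function(k, v) {
  local_file <- fetch_from_hdfs("/user/cloudera/file.cdf")
  on.exit(unlink(local_file))        # remove the temporary copy when done
  d <- ncdf::open.ncdf(local_file)   # open.ncdf() only sees local paths
  # ... read the variables you need from d here ...
  ncdf::close.ncdf(d)
  keyval(k, v)                       # keyval() comes from rmr2
}
```

Because every map task makes its own copy, a ~70 MB file is downloaded once per task, so this only makes sense when the number of tasks is small.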

Good luck.

Regards

Juan
