How to perform a mapreduce task using hadoop-streaming?
I created a cluster in Google Cloud.
This is how I am trying to run a MapReduce job with hadoop-streaming, and the error I get:
[email protected]:~$ $HADOOP_HOME/bin/hadoop jar /usr/lib/hadoop-mapreduce/hadoop-streaming.jar \
> -D mapred.map.tasks=1 \
> -D mapred.reduce.tasks=1 \
> -input /movies.csv \
> -output /result \
> -file ~/mapreduce_hadoop/homework5/mapreduce_hadoop/mapper.py ~/mapreduce_hadoop/homework5/mapreduce_hadoop/reducer.py \
> -mapper "python mapper.py" -reducer "python reducer.py"
JAR does not exist or is not a normal file: /usr/lib/hadoop-mapreduce/hadoop-streaming.jar
[email protected]:~$ hdfs dfs -ls /
Found 4 items
-rw-r--r-- 2 denislysenko0001 hadoop 484688 2021-11-27 18:58 /movies.csv
drwxrwxrwt - hdfs hadoop 0 2021-11-26 20:30 /tmp
drwxrwxrwt - hdfs hadoop 0 2021-11-27 15:15 /user
drwx-wx-wx - hive hadoop 0 2021-11-26 20:30 /var
[email protected]:~$
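The error says there is no JAR at /usr/lib/hadoop-mapreduce/hadoop-streaming.jar, so the first thing to check is probably where the streaming JAR actually lives on this cluster. A minimal sketch in Python, assuming the file is named something like hadoop-streaming*.jar and sits under $HADOOP_HOME, /usr/lib, /usr/local or /opt (these search roots and the helper name find_streaming_jars are my own guesses, not something confirmed for this cluster):

```python
import fnmatch
import os


def find_streaming_jars(roots):
    """Return the sorted paths of files named hadoop-streaming*.jar under the given roots."""
    matches = []
    for root in roots:
        # os.walk yields nothing for a root that does not exist or cannot be read
        for dirpath, _dirnames, filenames in os.walk(root):
            for name in fnmatch.filter(filenames, "hadoop-streaming*.jar"):
                matches.append(os.path.join(dirpath, name))
    return sorted(matches)


if __name__ == "__main__":
    # Candidate roots are assumptions; HADOOP_HOME is used only if it is set.
    candidates = [os.environ.get("HADOOP_HOME", ""), "/usr/lib", "/usr/local", "/opt"]
    for path in find_streaming_jars([c for c in candidates if c]):
        print(path)
```

If this prints a path, that path would go after `hadoop jar` in the command above in place of /usr/lib/hadoop-mapreduce/hadoop-streaming.jar.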