
I am new to Java and MapReduce. I have written a MapReduce program to perform a word count, and I am facing the error below.

Exception in thread "main" java.lang.ArrayIndexOutOfBoundsException: 0
    at mapreduce.mrunit.Wordcount.main(Wordcount.java:63)

and line 63 is:

FileInputFormat.setInputPaths(job, new Path(args[0]));

Below is the code I have written:

package mapreduce.mrunit;
import java.util.StringTokenizer;
import java.io.IOException;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.Reducer;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

public class Wordcount {
    public static class Map extends
            Mapper<LongWritable, Text, Text, IntWritable> {
        private final static IntWritable one = new IntWritable(1);
        private Text word = new Text();

        public void map(LongWritable key, Text value, Context context)
                throws IOException, InterruptedException {
            String line = value.toString();
            StringTokenizer tokenizer = new StringTokenizer(line);
            while (tokenizer.hasMoreTokens()) {
                word.set(tokenizer.nextToken());
                context.write(word, one);
            }
        }
    }
    public static class Reduce extends
            Reducer<Text, IntWritable, Text, IntWritable> {
        public void reduce(Text key, Iterable<IntWritable> values,
                Context context) throws IOException, InterruptedException {
            int sum = 0;
            for (IntWritable val : values) {
                sum += val.get();
            }
            context.write(key, new IntWritable(sum));
        }
    }
    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();

        @SuppressWarnings("deprecation")
        Job job = new Job(conf, "wordcount");

        job.setOutputKeyClass(Text.class);
        job.setOutputValueClass(IntWritable.class);

        job.setMapperClass(Map.class);
        job.setReducerClass(Reduce.class);

        // job.setInputFormatClass(TextInputFormat.class);
        // job.setOutputFormatClass(TextOutputFormat.class);

        FileInputFormat.setInputPaths(job, new Path(args[0]));
        FileOutputFormat.setOutputPath(job, new Path(args[1]));

        job.waitForCompletion(true);
    }
}

I am unable to fix the error. Please help me fix it.

  • What is args[0]? Commented Mar 15, 2018 at 14:22
  • I have added details around why you are getting this error and how you should proceed. Commented Mar 15, 2018 at 16:03

2 Answers


The error is at the following line in the main() method:

FileInputFormat.setInputPaths(job, new Path(args[0]));

Per the Javadoc, this exception is:

Thrown to indicate that an array has been accessed with an illegal index. The index is either negative or greater than or equal to the size of the array.
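The exception is easy to reproduce in isolation: accessing index 0 of an empty array, which is exactly what args is when the jar is launched with no arguments, throws it immediately. A minimal sketch (the class and helper names are illustrative, not from the question's code):

```java
public class EmptyArgsDemo {
    // Returns the first element, or a fallback when the array is empty --
    // the defensive version of what the question's main() does unguarded.
    static String firstOrDefault(String[] args, String fallback) {
        return args.length > 0 ? args[0] : fallback;
    }

    public static void main(String[] mainArgs) {
        String[] empty = new String[0]; // like running the jar with no arguments
        try {
            String s = empty[0];        // throws ArrayIndexOutOfBoundsException
            System.out.println(s);
        } catch (ArrayIndexOutOfBoundsException e) {
            System.out.println("caught: " + e);
        }
        System.out.println(firstOrDefault(empty, "<missing>"));
    }
}
```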

This means the args array passed to main() has fewer elements than the program expects.

Your program expects it to contain 2 elements, where:

First element args[0] is the input path.

Second element args[1] is the output path.

Please create an input directory and put a text file with some lines in it. Note that you should not create the output directory (you may create directories up to its parent); MapReduce creates it automatically.

So, assuming your paths to be

inputPath = /user/cloudera/wordcount/input
outputPath = /user/cloudera/wordcount

Then execute the program like

hadoop jar wordcount.jar mapreduce.mrunit.Wordcount /user/cloudera/wordcount/input /user/cloudera/wordcount/output

Note that I have added an output folder as the 2nd argument to honor the restriction that the output path must not already exist; the program creates it at runtime.
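To fail fast with a readable message instead of a raw stack trace, the driver can also validate args at the top of main() before touching it. A sketch (the class name and usage string are illustrative):

```java
public class ArgsCheck {
    // Returns a usage message when too few arguments were supplied,
    // or null when the argument count is fine.
    static String validate(String[] args, int expected, String usage) {
        if (args.length < expected) {
            return "Usage: " + usage;
        }
        return null;
    }

    public static void main(String[] args) {
        String error = validate(args, 2,
                "hadoop jar wordcount.jar mapreduce.mrunit.Wordcount <input path> <output path>");
        if (error != null) {
            System.err.println(error);
            System.exit(-1); // exit before args[0] / args[1] are ever read
        }
        // ... safe to use args[0] and args[1] from here on
    }
}
```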

Finally, I suggest following this tutorial, which has step-by-step instructions for running the WordCount program.




How did you run it? The error shows that you did not pass arguments when you ran the job. You have to provide both the input and output paths as arguments, like below:

hadoop jar MyProgram.jar /path/to/input /path/to/output

