
I am new to Java and MapReduce. I have written a MapReduce program to perform a word count, and I am facing the error below.

Exception in thread "main" java.lang.ArrayIndexOutOfBoundsException: 0
    at mapreduce.mrunit.Wordcount.main(Wordcount.java:63)

and line 63 is:

FileInputFormat.setInputPaths(job, new Path(args[0]));

Below is the code I have written:

package mapreduce.mrunit;
import java.util.StringTokenizer;
import java.io.IOException;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.Reducer;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

public class Wordcount {
    public static class Map extends
            Mapper<LongWritable, Text, Text, IntWritable> {
        private final static IntWritable one = new IntWritable(1);
        private Text word = new Text();

        public void map(LongWritable key, Text value, Context context)
                throws IOException, InterruptedException {
            String line = value.toString();
            StringTokenizer tokenizer = new StringTokenizer(line);
            while (tokenizer.hasMoreTokens()) {
                word.set(tokenizer.nextToken());
                context.write(word, one);
            }
        }
    }
    public static class Reduce extends
            Reducer<Text, IntWritable, Text, IntWritable> {
        public void reduce(Text key, Iterable<IntWritable> values,
                Context context) throws IOException, InterruptedException {
            int sum = 0;
            for (IntWritable val : values) {
                sum += val.get();
            }
            context.write(key, new IntWritable(sum));
        }
    }
    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();

        @SuppressWarnings("deprecation")
        Job job = new Job(conf, "wordcount");

        job.setOutputKeyClass(Text.class);
        job.setOutputValueClass(IntWritable.class);

        job.setMapperClass(Map.class);
        job.setReducerClass(Reduce.class);

        // job.setInputFormatClass(TextInputFormat.class);
        // job.setOutputFormatClass(TextOutputFormat.class);

        FileInputFormat.setInputPaths(job, new Path(args[0]));
        FileOutputFormat.setOutputPath(job, new Path(args[1]));

        job.waitForCompletion(true);
    }
}

I am unable to fix the error. Please help me fix it.

  • What is args[0]? Commented Mar 15, 2018 at 14:22
  • I have added details around why you are getting this error and how you should proceed. Commented Mar 15, 2018 at 16:03

2 Answers


The error is at the following line in the main() method:

FileInputFormat.setInputPaths(job, new Path(args[0]));

Per the Javadoc, this exception is:

Thrown to indicate that an array has been accessed with an illegal index. The index is either negative or greater than or equal to the size of the array.
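The exception is easy to reproduce in isolation: accessing index 0 of an empty array, which is exactly what args is when the jar is launched with no arguments, throws it immediately. A minimal sketch (the class and helper names are illustrative, not from the question's code):

```java
public class EmptyArgsDemo {
    // Returns the first element, or a fallback when the array is empty --
    // the defensive version of what the question's main() does unguarded.
    static String firstOrDefault(String[] args, String fallback) {
        return args.length > 0 ? args[0] : fallback;
    }

    public static void main(String[] mainArgs) {
        String[] empty = new String[0]; // like running the jar with no arguments
        try {
            String s = empty[0];        // throws ArrayIndexOutOfBoundsException
            System.out.println(s);
        } catch (ArrayIndexOutOfBoundsException e) {
            System.out.println("caught: " + e);
        }
        System.out.println(firstOrDefault(empty, "<missing>"));
    }
}
```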

This means the args array passed to main() has fewer elements than the program expects.

Your program expects it to contain 2 elements, where:

First element args[0] is the input path.

Second element args[1] is the output path.

Please create an input directory and put a text file with some lines in it. Note that you should not create the output directory (you may create directories up to its parent); MapReduce creates it automatically.

So, assuming your paths to be

inputPath = /user/cloudera/wordcount/input
outputPath = /user/cloudera/wordcount

Then execute the program like

hadoop jar wordcount.jar mapreduce.mrunit.Wordcount /user/cloudera/wordcount/input /user/cloudera/wordcount/output

Note that I have added an output folder as the 2nd argument to honor the restriction that the output path must not already exist; the program creates it at runtime.
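To fail fast with a readable message instead of a raw stack trace, the driver can also validate args at the top of main() before touching it. A sketch (the class name and usage string are illustrative):

```java
public class ArgsCheck {
    // Returns a usage message when too few arguments were supplied,
    // or null when the argument count is fine.
    static String validate(String[] args, int expected, String usage) {
        if (args.length < expected) {
            return "Usage: " + usage;
        }
        return null;
    }

    public static void main(String[] args) {
        String error = validate(args, 2,
                "hadoop jar wordcount.jar mapreduce.mrunit.Wordcount <input path> <output path>");
        if (error != null) {
            System.err.println(error);
            System.exit(-1); // exit before args[0] / args[1] are ever read
        }
        // ... safe to use args[0] and args[1] from here on
    }
}
```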

Finally, I suggest following this tutorial, which has step-by-step instructions for running the WordCount program.




How did you run it? The error shows that you did not pass arguments when you ran the job. You have to provide both the input and output paths as arguments, like below:

hadoop jar MyProgram.jar /path/to/input /path/to/output

