Error when running a basic Hadoop code

Question

I am running a Hadoop job that uses a custom partitioner class. When I run the command

hadoop jar Sort.jar SecondarySort inputdir outputdir

I am getting a runtime error that says

class KeyPartitioner not org.apache.hadoop.mapred.Partitioner.

I have made sure that the KeyPartitioner class extends the Partitioner class, so why am I getting this error?

Here is the driver code:

JobConf conf = new JobConf(getConf(), SecondarySort.class);
conf.setJobName(SecondarySort.class.getName());

conf.setJarByClass(SecondarySort.class);

conf.setInputFormat(TextInputFormat.class);
conf.setOutputFormat(TextOutputFormat.class);

conf.setMapOutputKeyClass(StockKey.class);
conf.setMapOutputValueClass(Text.class);

conf.setPartitionerClass((Class<? extends Partitioner<StockKey, DoubleWritable>>) KeyPartitioner.class);

conf.setMapperClass((Class<? extends Mapper<LongWritable, Text, StockKey, DoubleWritable>>) StockMapper.class);
conf.setReducerClass((Class<? extends Reducer<StockKey, DoubleWritable, Text, Text>>) StockReducer.class);

and here is the code of the partitioner class:

public class KeyPartitioner extends Partitioner<StockKey, Text> {

    @Override
    public int getPartition(StockKey arg0, Text arg1, int arg2) {

        int partition = arg0.name.hashCode() % arg2;

        return partition;
    }
}

Paste the code; without it, any answer would only be a wild guess.
– Praveen Sripati, Commented Jul 8, 2012 at 14:13
Thanks. I changed the import statements according to @Tudor's answer below. A new error now appears: output directory not set in JobConf.
– London guy, Commented Jul 8, 2012 at 14:37

Tudor · Accepted Answer · 2012-07-08 15:11:35Z

Note that Hadoop has two partitioner types:

org.apache.hadoop.mapreduce.Partitioner
org.apache.hadoop.mapred.Partitioner

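Make sure your KeyPartitioner class implements the second one, which is an interface in the old mapred API that JobConf belongs to, and does not extend the first one, which is an abstract class in the new mapreduce API.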
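
The message in the question appears to come from an assignability check made when the partitioner class is registered: the class has to be a subtype of org.apache.hadoop.mapred.Partitioner, and a class that extends org.apache.hadoop.mapreduce.Partitioner is not one. The sketch below reproduces that kind of check in plain Java. OldApiPartitioner, NewApiPartitioner, and requireSubtype are hypothetical stand-ins so that the example runs without Hadoop on the classpath; they are not Hadoop's own types or methods.

```java
// Stand-in types only: they mimic the shape of the two Hadoop partitioner
// types so the example compiles without Hadoop. They are not the real classes.
final class PartitionerTypeCheck {

    // Stand-in for org.apache.hadoop.mapred.Partitioner (old API, an interface).
    interface OldApiPartitioner<K, V> {
        int getPartition(K key, V value, int numPartitions);
    }

    // Stand-in for org.apache.hadoop.mapreduce.Partitioner (new API, an abstract class).
    abstract static class NewApiPartitioner<K, V> {
        public abstract int getPartition(K key, V value, int numPartitions);
    }

    // What the question's class does: it extends the new-API abstract class.
    static class WrongPartitioner extends NewApiPartitioner<String, String> {
        @Override
        public int getPartition(String key, String value, int numPartitions) {
            return 0;
        }
    }

    // What a JobConf-based driver needs: a class implementing the old-API interface.
    static class RightPartitioner implements OldApiPartitioner<String, String> {
        @Override
        public int getPartition(String key, String value, int numPartitions) {
            return 0;
        }
    }

    // Hypothetical helper: rejects a class that is not a subtype of the expected type.
    static void requireSubtype(Class<?> candidate, Class<?> expected) {
        if (!expected.isAssignableFrom(candidate)) {
            throw new RuntimeException(candidate + " not " + expected.getName());
        }
    }

    public static void main(String[] args) {
        // Accepted: RightPartitioner is a subtype of the old-API stand-in.
        requireSubtype(RightPartitioner.class, OldApiPartitioner.class);
        try {
            // Rejected: WrongPartitioner only extends the new-API stand-in.
            requireSubtype(WrongPartitioner.class, OldApiPartitioner.class);
        } catch (RuntimeException e) {
            // The message has the same "class X not Y" shape as the error in the question.
            System.out.println(e.getMessage());
        }
    }
}
```

The cast in the driver only silences the compiler; it does not change what the class actually is, so a check of this kind still fails at run time.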

Edit: To fix the "output directory not set in JobConf" error, you have to set the input and output folders:

FileInputFormat.addInputPath(conf, new Path(args[0]));
FileOutputFormat.setOutputPath(conf, new Path(args[1]));
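
One more thing worth checking in the question's getPartition: String.hashCode() can be negative, and Java's % operator keeps the sign of the dividend, so hashCode() % numPartitions can produce a negative index, which is not a valid partition number. A minimal sketch of the usual fix is to clear the sign bit before taking the remainder, which is also how Hadoop's default HashPartitioner computes its result. SafePartition and partitionFor are hypothetical names used only for illustration.

```java
final class SafePartition {

    // Hypothetical helper: maps any hash code to an index in [0, numPartitions).
    static int partitionFor(int hash, int numPartitions) {
        // Clearing the sign bit makes the value non-negative before the remainder.
        return (hash & Integer.MAX_VALUE) % numPartitions;
    }

    public static void main(String[] args) {
        int hash = -7; // any negative hash code shows the problem
        System.out.println(hash % 3);              // a negative remainder
        System.out.println(partitionFor(hash, 3)); // a value within 0..2
    }
}
```

In the question's partitioner this would mean returning (arg0.name.hashCode() & Integer.MAX_VALUE) % arg2 instead of the plain remainder.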

edited Jul 8, 2012 at 15:11

answered Jul 8, 2012 at 14:19


2 Comments

London guy Over a year ago

Thanks! I changed the import statements accordingly. A new error appears now: output directory not set in JobConf.

Praveen Sripati Over a year ago

The new edition of Hadoop: The Definitive Guide has an entire section on the new API, and the examples in the book are also based on the new API. I would suggest getting the book.
