I am trying to write a MapReduce job in Java. Here are my files.
Mapper class (bmapper):

public class bmapper extends Mapper<LongWritable, Text, Text, NullWritable> {
    private String txt = new String();

    public void mapper(LongWritable key, Text value, Context context)
            throws IOException, InterruptedException {
        String str = value.toString();
        int index1 = str.indexOf("TABLE OF CONTENTS");
        int index2 = str.indexOf("</table>");
        int index3 = str.indexOf("MANAGEMENT'S DISCUSSION AND ANALYSIS");
        if (index1 == -1) {
            txt = "nil";
        } else {
            if (index1 < index3 && index2 > index3) {
                int index4 = index3 + 109;
                int pageno = str.charAt(index4);
                String[] pages = str.split("<page>");
                txt = pages[pageno + 1];
            } else {
                txt = "nil";
            }
        }
        context.write(new Text(txt), NullWritable.get());
    }
}
Reducer class (breducer):

public class breducer extends Reducer<Text, NullWritable, Text, NullWritable> {
    public void reducer(Text key, NullWritable value, Context context)
            throws IOException, InterruptedException {
        context.write(key, value);
    }
}
Driver class (bdriver):

public class bdriver {
    public static void main(String[] args)
            throws IOException, InterruptedException, ClassNotFoundException {
        Configuration conf = new Configuration();
        Job job = new Job(conf);
        job.setJobName("black coffer");
        job.setJarByClass(bdriver.class);
        job.setMapOutputKeyClass(Text.class);
        job.setMapOutputValueClass(NullWritable.class);
        job.setOutputKeyClass(Text.class);
        job.setOutputValueClass(NullWritable.class);
        job.setReducerClass(breducer.class);
        job.setMapperClass(bmapper.class);
        job.setInputFormatClass(TextInputFormat.class);
        job.setOutputFormatClass(TextOutputFormat.class);
        FileInputFormat.setInputPaths(job, new Path[]{new Path(args[0])});
        FileOutputFormat.setOutputPath(job, new Path(args[1]));
        job.waitForCompletion(true);
    }
}
I am getting the following error:
[training@localhost ~]$ hadoop jar blackcoffer.jar com.test.bdriver /page1.txt /MROUT4
18/03/16 04:38:56 WARN mapred.JobClient: Use GenericOptionsParser for parsing the arguments. Applications should implement Tool for the same.
18/03/16 04:38:57 INFO input.FileInputFormat: Total input paths to process : 1
18/03/16 04:38:57 WARN snappy.LoadSnappy: Snappy native library is available
18/03/16 04:38:57 INFO util.NativeCodeLoader: Loaded the native-hadoop library
18/03/16 04:38:57 INFO snappy.LoadSnappy: Snappy native library loaded
18/03/16 04:38:57 INFO mapred.JobClient: Running job: job_201803151041_0007
18/03/16 04:38:58 INFO mapred.JobClient: map 0% reduce 0%
18/03/16 04:39:03 INFO mapred.JobClient: Task Id : attempt_201803151041_0007_m_000000_0, Status : FAILED
java.io.IOException: Type mismatch in key from map: expected org.apache.hadoop.io.Text, recieved org.apache.hadoop.io.LongWritable
at org.apache.hadoop.mapred.MapTask$MapOutputBuffer.collect(MapTask.java:871)
at org.apache.hadoop.mapred.MapTask$NewOutputCollector.write(MapTask.java:574)
at org.apache.hadoop.mapreduce.TaskInputOutputContext.write(TaskInputOutputContext.java:80)
at org.apache.hadoop.mapreduce.Mapper.map(Mapper.java:124)
at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:144)
at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:647)
at org.apache.hadoop.mapred.MapTask.run(MapTask.java:323)
at org.apache.hadoop.mapred.Child$4.run(Child.java:270)
at java.security.AccessController.doPrivileged(Native Method)
I think the job is not picking up my Mapper and Reducer classes, even though I have set them in the driver, and is running the default Mapper and Reducer instead. What am I doing wrong?
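My understanding is that if a subclass method does not actually override the framework's hook method, the base class's default (identity) implementation runs instead. A minimal, self-contained plain-Java sketch of that mechanism (the class names here are mine, not Hadoop's):

```java
// Hypothetical stand-in for the framework's base class.
class BaseMapper {
    // The hook the framework calls; the default is the identity function.
    protected String map(String value) {
        return value;
    }

    // The framework's driver loop calls map() for each record.
    public String run(String value) {
        return map(value);
    }
}

class MyMapper extends BaseMapper {
    // Named "mapper", so it does NOT override map(); it is just an
    // unrelated method that the framework never calls.
    protected String mapper(String value) {
        return "processed:" + value;
    }
}

public class OverrideDemo {
    public static void main(String[] args) {
        // Prints "input" (the base-class identity), not "processed:input".
        System.out.println(new MyMapper().run("input"));
    }
}
```

Annotating the intended override with @Override would turn this kind of mismatch into a compile-time error instead of silent fallback behavior.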