Assume Employee is a Java class.
I have a JavaRDD<Employee[]> arrayOfEmpList, i.e., each element of the RDD is an array of employees.
From this, I want to create a single flat collection of employees, something like
JavaRDD<Employee>
This is what I tried:
I created a List<Employee> empList = new ArrayList<Employee>();
and then, for each Employee[] in the RDD:
arrayOfEmpList.foreach(new VoidFunction<Employee[]>() {
    public void call(Employee[] arg0) {
        empList.addAll(Arrays.asList(arg0));
        System.out.println(empList.size()); // prints correct values incrementally
    }
});
System.out.println(empList.size()); // prints 0
Outside the foreach call, the list is empty and its size is 0.
Is there some other way to achieve this?
P.S.: I want to have all employee records in one separate RDD. For example, the 1st employee array may contain 10 records, the 2nd may contain 100 records, and the 3rd may contain 200 records. I want a final list of 310 records, which I can then parallelize and perform actions on.
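To make the expected result concrete, here is a minimal plain-Java sketch (no Spark involved) of the flattening I am after. The class name FlattenSketch, the helper batch, and the stripped-down Employee are hypothetical and only there for illustration. I assume something like JavaRDD.flatMap would be the Spark-side analogue, but the exact function signature it expects depends on the Spark version.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

public class FlattenSketch {
    // Hypothetical stand-in for the real Employee class.
    static class Employee {
        final int id;
        Employee(int id) { this.id = id; }
    }

    // Flattens a list of Employee[] into one List<Employee>, preserving order.
    static List<Employee> flatten(List<Employee[]> batches) {
        return batches.stream()
                .flatMap(Arrays::stream)
                .collect(Collectors.toList());
    }

    // Builds an array of n placeholder employees (illustration only).
    static Employee[] batch(int n) {
        Employee[] out = new Employee[n];
        for (int i = 0; i < n; i++) {
            out[i] = new Employee(i);
        }
        return out;
    }

    public static void main(String[] args) {
        List<Employee[]> batches = new ArrayList<>();
        batches.add(batch(10));
        batches.add(batch(100));
        batches.add(batch(200));
        // 10 + 100 + 200 records end up in one flat list.
        System.out.println(flatten(batches).size()); // prints 310
    }
}
```

This only shows the shape of the result on a local list; it does not address how to do the same thing on the distributed RDD.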