0

I have a CSV file that contains almost 10,000 lines of data. I want to split it into 10 CSV files based on the total line count, so that each file contains 1,000 lines in order: the first file should have lines 1-1000, the second file lines 1001-2000, and so on. Each of the 10 files should contain only the data from the first column of the parent CSV file. The code I wrote writes the same data (i.e. lines 1-1000) to all 10 CSV files, and I cannot figure out what the mistake is.

for (int j = 1; j <= files; j++) {

    String inputfile = "C:/Users/Downloads/File.csv";
    BufferedReader br = new BufferedReader(new FileReader(inputfile));
    FileWriter fstream1 = new FileWriter("C:/Users/Downloads/FileNumber_" + j + ".csv");
    BufferedWriter out = new BufferedWriter(fstream1);

    String strLine = null;

    // i is declared as static int i = 0; lines and files are calculated in another part of the code
    for (i = i + 1; i <= (j * lines); i++) {
        strLine = br.readLine();
        if (strLine != null) {
            String strar[] = strLine.split(",");
            out.write(strar[0]);
            if (i != (j * lines)) {
                out.newLine();
            }
        }
    }

    out.close();
}
3
  • Try to reduce your code to a MVCE that replicates the problem. Solve small, then implement big. It's hard to answer your question by looking at a couple of pages of badly formatted code. On that note, format your code nicely in your editor, then add it to your question. It's much easier to understand when you use proper indentation and whitespace. Commented Jul 11, 2017 at 5:46
  • Thank you Matt for your suggestion. This is my very first question on the forum, so I was unaware of that. I will keep it in mind from next time onwards. Commented Jul 11, 2017 at 5:57
  • @New2Java Please try my code, replacing variables such as lines and the file path with your own values. Commented Jul 11, 2017 at 7:04

4 Answers

2
Use this code 

import java.io.*;
import java.util.Scanner;

public class csvfilesplit {
    public static void main(String[] args) throws IOException {
        Scanner reader = new Scanner(System.in);  // Reading from System.in
        System.out.println("\n Enter The count to split each file :-----------");
        int s = reader.nextInt();                 // number of parts each source file is split into
        File folder = new File("file/");          //*** Location of your source files
        String path = folder.getAbsolutePath();
        File[] listOfFiles = folder.listFiles();

        int filecount = 0;
        for (File fo : listOfFiles) {
            if (fo.isFile()) {
                filecount++;
            }
        }
        //*** Total number of original files in the folder
        System.out.println("Total source file count is :-----------------------    " + filecount + "\n");

        for (int l = 0; l < listOfFiles.length; l++) {
            if (!listOfFiles[l].isFile()) {
                continue;
            }
            String name = listOfFiles[l].getName();
            System.out.println("File name Is :--------------------------   " + name);  //*** File name

            // First pass: count the rows in the source file
            int count = 0;
            try (BufferedReader counter = new BufferedReader(new FileReader(path + "/" + name))) {
                while (counter.readLine() != null) {
                    count++;
                }
            }
            System.out.println("File total rows count is :--------------   " + count);

            int n = count / s;  // rows per split file; remainder rows (count % s) are not written
            if (n == 0) {
                // The file has fewer than s rows, so it cannot be split
                System.out.println("\n******************************* Mentioned this file have below - " + s
                        + " rows   ******************************   " + name + " \n");
                continue;
            }
            System.out.println("Each splitted file line count is :------   " + n + "\n");

            // Second pass: the reader is opened once, so each part continues where the previous one stopped
            try (BufferedReader br = new BufferedReader(new FileReader(path + "/" + name))) {
                for (int j = 1; j <= s; j++) {
                    File dir = new File(path + "/" + "CSV_DATA_" + j);
                    dir.mkdir();
                    //******** Destination file location and format (.txt/.csv ...) ******
                    String target = dir.getAbsolutePath() + "/" + name + ".csv";
                    try (BufferedWriter out = new BufferedWriter(new FileWriter(target))) {
                        for (int i = 1; i <= n; i++) {
                            String strLine = br.readLine();
                            if (strLine != null) {
                                out.write(strLine);
                                if (i != n) {
                                    out.newLine();
                                }
                            }
                        }
                    }
                }
            }
        }
        System.out.println("\n Splitted_CSV_files stored location is :     " + path);
    }
}

1

Each of the 10 CSV files gets the same lines because of the following line in the method myFunction:

BufferedReader br = new BufferedReader(new FileReader(inputfile));

The logic using the variables i, j and lines works perfectly. But every time myFunction is called, br (the BufferedReader for the input file) is initialized again.

So br.readLine() starts reading from the beginning of the file again, which is why each of the 10 CSV files contains the same 1000 lines.
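
As a minimal sketch of that fix, the reader can be opened once before the loop over the output files. The class name SingleReaderSplit, the method split and its parameters are hypothetical names chosen for illustration, and the sketch assumes UTF-8 input without quoted commas:

```java
import java.io.BufferedReader;
import java.io.BufferedWriter;
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

public class SingleReaderSplit {

    // Hypothetical helper: writes the first column of `input` into `files`
    // output files of up to `lines` lines each, inside `outputDir`.
    public static void split(Path input, Path outputDir, int lines, int files) throws IOException {
        // The reader is created once, BEFORE the loop, so each output file
        // continues from the position where the previous one stopped.
        try (BufferedReader br = Files.newBufferedReader(input)) {
            for (int j = 1; j <= files; j++) {
                Path part = outputDir.resolve("FileNumber_" + j + ".csv");
                try (BufferedWriter out = Files.newBufferedWriter(part)) {
                    for (int i = 0; i < lines; i++) {
                        String line = br.readLine();
                        if (line == null) {
                            break; // input exhausted
                        }
                        // limit 2 keeps a leading empty column, e.g. for the line ","
                        out.write(line.split(",", 2)[0]);
                        out.newLine();
                    }
                }
            }
        }
    }

    public static void main(String[] args) throws IOException {
        if (args.length < 4) {
            System.out.println("usage: SingleReaderSplit <input.csv> <outputDir> <lines> <files>");
            return;
        }
        split(Paths.get(args[0]), Paths.get(args[1]),
              Integer.parseInt(args[2]), Integer.parseInt(args[3]));
    }
}
```

For the file in the question this would be called with 1000 lines and 10 files. Moving the reader back inside the loop over j reproduces the original symptom, because every new reader starts at line 1.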

Hope it helps!

2 Comments

Can you suggest an alternative? What should I use instead of BufferedReader?
BufferedReader itself is not the problem. You only need to initialize the BufferedReader for the input file outside the loop for (int j=1;j<=files;j++), before entering it.
0

Please find the code below:

// Requires: import java.io.*;
public static void main(String[] args) throws IOException {
    // open the parent file once
    String inputfile = "C:/Users/bohrahas/Desktop/SampleCSVFile.csv";
    BufferedReader br = new BufferedReader(new FileReader(inputfile));
    // create the first file, which will hold the first 1000 lines
    File file = new File("C:/Users/bohrahas/Desktop/FileNumber_" + 1 + ".csv");
    BufferedWriter out = new BufferedWriter(new FileWriter(file));
    String line = "";
    // number of lines written so far
    int count = 0;
    int file_Number = 2;
    while ((line = br.readLine()) != null) {
        // after every 1000 lines, close the current file and start the next one
        if (count > 0 && count % 1000 == 0) {
            out.close();  // flushes buffered data; without this the files can end up empty
            File newFile = new File("C:/Users/bohrahas/Desktop/FileNumber_" + file_Number + ".csv");
            out = new BufferedWriter(new FileWriter(newFile));
            file_Number++;
        }
        // keep only the first column: everything before the first ","
        if (line.indexOf(",") != -1)
            line = line.substring(0, line.indexOf(","));
        out.write(line);
        out.newLine();
        count++;
    }
    out.close();
    br.close();
}

Logic:

  1. You don't have to read the parent file on every loop iteration. Open it once, i.e. create the reader object once, and then process the parent file.

  2. While reading each line of the parent file, take the whole line and remove every column except the first.

  3. Everything up to the first "," is always the first column, so remove the rest of the string from the first "," onwards.
  4. When the count of lines traversed is exactly divisible by 1000 (i.e. 1000, 2000, 3000, etc.), create a new file and a new BufferedWriter for it.
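
The four steps can also be sketched so that an output file is only created when there is a line to put in it, which means an empty parent file produces no output files at all. This is an illustrative variant, not the answer's own code: the names RollingCsvSplit, firstColumn and split are hypothetical, and it assumes UTF-8 input without quoted commas.

```java
import java.io.BufferedReader;
import java.io.BufferedWriter;
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;

public class RollingCsvSplit {

    // Step 3: everything before the first "," is the first column.
    public static String firstColumn(String line) {
        int comma = line.indexOf(',');
        return comma == -1 ? line : line.substring(0, comma);
    }

    // Writes the first column of every line of `input` into files of at most
    // `linesPerFile` lines each and returns the number of files created.
    public static int split(Path input, Path outputDir, int linesPerFile) throws IOException {
        int fileNumber = 0;
        int count = 0;               // lines written so far
        BufferedWriter out = null;   // opened lazily, on the first line of each chunk
        // Step 1: the parent file is opened exactly once.
        try (BufferedReader br = Files.newBufferedReader(input)) {
            String line;
            while ((line = br.readLine()) != null) {
                // Step 4: at every multiple of linesPerFile, roll over to a new file.
                if (count % linesPerFile == 0) {
                    if (out != null) {
                        out.close(); // flush the finished chunk
                    }
                    fileNumber++;
                    out = Files.newBufferedWriter(
                            outputDir.resolve("FileNumber_" + fileNumber + ".csv"));
                }
                // Step 2: keep only the first column of the line.
                out.write(firstColumn(line));
                out.newLine();
                count++;
            }
        } finally {
            if (out != null) {
                out.close();
            }
        }
        return fileNumber;
    }
}
```

A call such as RollingCsvSplit.split(Paths.get("SampleCSVFile.csv"), Paths.get("."), 1000) would cover the case in the question. Because the writer is created inside the loop, the number of output files follows from the data and does not need to be calculated beforehand.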

6 Comments

Hi @Hasan Ali Bohra, thank you for taking the time to develop this code. I tried it. It counts the exact number of lines and files, but it gives me empty CSV files. Second, in my case I want to extract only the first-column data from the parent file into the child files, but your code seems to read the entire line without restricting it to a specific column.
@New2Java Thanks for your feedback. Please accept my answer.
Thank you, brother. The edited code is working fine. I wanted to learn a few things though. 1. What is the thought process behind creating the first file beforehand, outside the loop, rather than creating it within the loop? 2. Can you please explain the while part of the code? I am finding it difficult to understand what is happening inside that while loop.
Please see the edited description and accept the answer.
Hi @Hasan Ali Bohra, just to test the code and the logic of creating the first file outside the loop, I tried the code with an empty CSV file, and it creates a file, i.e. the first file. I think it should not create a file in such a case, should it?
-2

Your logic is very bad here. I rewrote the whole code for you:

import java.io.*;
import java.util.Scanner;

public class FileSplit {

    public static void myFunction(int lines, int files) throws FileNotFoundException, IOException {
        String inputfile = "file.csv";
        // reader for the input file is initialized only once, outside the loop
        try (BufferedReader br = new BufferedReader(new FileReader(inputfile))) {
            String strLine = null;
            for (int i = 1; i <= files; i++) {
                // create a new file writer for each output file
                try (BufferedWriter out = new BufferedWriter(new FileWriter("FileNumber_" + i + ".csv"))) {
                    // read the next `lines` lines; the reader continues where the previous file stopped
                    for (int j = 0; j < lines; j++) {
                        strLine = br.readLine();
                        if (strLine != null) {
                            String strar[] = strLine.split(",");
                            out.write(strar[0]);   // take the first column
                            out.newLine();
                        }
                    }
                }
            }
        }
    }

    public static void main(String args[]) {
        try {
            int lines = 2;  // set this to the number of lines you need in each file
            int count = 0;
            String inputfile = "file.csv";
            File file = new File(inputfile);
            try (Scanner scanner = new Scanner(file)) {
                while (scanner.hasNextLine()) {  // count the lines in the input file
                    scanner.nextLine();
                    count++;
                }
            }
            System.out.println(count);
            int files = 0;
            if ((count % lines) == 0) {
                files = (count / lines);
            } else {
                files = (count / lines) + 1;
            }
            System.out.println(files);  // number of files that will be created

            myFunction(lines, files);
        } catch (FileNotFoundException e) {
            e.printStackTrace();
        } catch (IOException e) {
            e.printStackTrace();
        }
    }
}
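
The if/else in main that derives files from count and lines is a round-up division. As a small aside, it can be written as a single integer expression. The class and method names below (FileCount, filesNeeded) are hypothetical, and the sketch assumes lines is positive and count is non-negative:

```java
public class FileCount {

    // Number of output files needed for `count` input lines at `lines` lines
    // per file, rounded up so that a partial last file is included.
    public static int filesNeeded(int count, int lines) {
        return (count + lines - 1) / lines;
    }

    public static void main(String[] args) {
        System.out.println(filesNeeded(10000, 1000)); // 10 full files
        System.out.println(filesNeeded(9500, 1000));  // 10 files, the last one partial
        System.out.println(filesNeeded(0, 1000));     // 0 files for an empty input
    }
}
```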

12 Comments

Hi joey, I implemented your suggestion and changed the loop, but there is no improvement in the output. It is again writing the first 1000 lines of data to each of the 10 files.
See my edit. Please appreciate the effort by accepting the solution.
Your code is a mess. Correct the indentation, and explain what your code does - as it is, it should not get accepted
Hey @Joey Pinto, thanks a lot. I checked the code and it is working fine. I see where I was making the mistake. Your edited code is perfectly fine.
@ScaryWombat I was doing just that when you scolded me :p