What you appear to have is a CSV-style text file in which each row holds comma-delimited columns of data. A CSV file usually (though not always) starts with a header line naming the columns. You don't appear to need one, so we can ignore that part.
I think an ideal situation in this particular case would be a method that reads the text file and can retrieve either all or specific columns of data from each row as that file is read. The retrieved data is then written to a supplied output file.
A slight problem, though, is that some of the columnar data is enclosed in quotation marks ("...") and some of those values contain the very same delimiter that separates the columns within a record row. Care must be taken to handle this when retrieving data, otherwise incomplete data could be acquired, written to the desired output file, and returned within the 2D String Array.
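To make the problem concrete, here is a minimal sketch comparing a plain split on commas with a split that skips commas inside quotation marks. The class name QuotedSplitDemo and its two method names exist only for this illustration. The quote-aware version relies on a lookahead: a comma counts as a delimiter only when an even number of quotation marks follows it on the line. It assumes the quotation marks in each line are balanced.

```java
import java.util.Arrays;

public class QuotedSplitDemo {

    // Splits on every comma, including those inside quotation marks.
    public static String[] naiveSplit(String line) {
        return line.split(",");
    }

    // Splits only on commas that are followed by an even number of
    // quotation marks, i.e. commas that are not inside a quoted value.
    // White-space around the delimiter is dropped as well.
    public static String[] quoteAwareSplit(String line) {
        return line.split("\\s*,\\s*(?=(?:[^\"]*\"[^\"]*\")*[^\"]*$)", -1);
    }

    public static void main(String[] args) {
        String line = "\"Doe, John\", 62, \"6558 Cook Road, Atlanta, Georgia\", 30336";
        System.out.println(naiveSplit(line).length);       // 7 (quoted values are broken up)
        System.out.println(quoteAwareSplit(line).length);  // 4 (quoted values stay whole)
        System.out.println(Arrays.toString(quoteAwareSplit(line)));
    }
}
```

Note that the quotation marks themselves remain part of each value; strip them with replace("\"", "") if they are not wanted.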
The code example I provide below does all this within a single method. It is relatively basic, so it is up to you to deal with any specific enhancements if required. The method takes three parameters: two of type String and one optional int varargs parameter (int...). It returns a two-dimensional String array containing the retrieved data. If you don't want the method to return anything then the code can be somewhat reduced.
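Here is the getFromCSV() method. It is well commented and needs the following imports: java.io.File, java.io.FileNotFoundException, java.io.FileWriter, java.io.IOException, java.io.PrintWriter, java.util.ArrayList and java.util.Scanner: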
/**
 * This is a very basic parsing type method.<br><br>
 *
 * Usage: {@code String[][] data = getFromCSV("Data.txt", "DataOutput.txt", 13, 16, 17, 28, 29); }
 *
 * @param csvFilePath (String) The full path and file name of the Data file.<br>
 *
 * @param destinationPath (String) The full path and file name of the desired output file.
 * The retrieved data will be stored there.<br>
 *
 * @param desiredLiteralColumns (Optional - Integer Args or int[] Array) The literal
 * data columns to acquire row data from. The arguments can be provided in any desired
 * order. The returned Array will hold the required data in the order you provided.<br>
 *
 * @return (2D String Array) Containing columnar data from each data row.
 */
public static String[][] getFromCSV(String csvFilePath, String destinationPath,
                                    int... desiredLiteralColumns) {
    String ls = System.lineSeparator(); // The Line-Separator used for current OS.
    /* Does the destination folder exist? If not, create it before the
       file is created. getAbsoluteFile() resolves a bare file name against
       the working directory, so this works with any OS path separator. */
    File destPath = new File(destinationPath);
    File destFolder = destPath.getAbsoluteFile().getParentFile();
    if (destFolder != null) {
        destFolder.mkdirs();
    }
    ArrayList<String[]> list = new ArrayList<>();
    ArrayList<String> lineData = new ArrayList<>();
    File cisStaffHours = new File(csvFilePath);
    // 'Try With Resources' is used here to auto-close the reader.
    try (Scanner reader = new Scanner(cisStaffHours)) {
        String fileLine = "";
        // 'Try With Resources' is used here to auto-close the writer.
        try (PrintWriter writer = new PrintWriter(new FileWriter(destPath))) {
            while (reader.hasNextLine()) {
                /* Read lines one at a time. Trim each read in
                   line of leading or trailing white-spaces (if any). */
                fileLine = reader.nextLine().trim();
                // Skip blank lines (if any).
                if (fileLine.equals("")) {
                    continue;
                }
                /* Split the line based on a comma (,) delimiter...
                   (DO NOT split on commas within quotation marks!).
                   The regular expression used with the split() method
                   ignores any number of white-spaces before or after
                   the delimiter. */
                String[] lineParts = fileLine.split("\\s{0,},\\s{0,}(?=(?:[^\"]*\"[^\"]*\")*[^\"]*$)", -1);
                // Do we want specific columns only?
                if (desiredLiteralColumns.length > 0) {
                    // Yes...
                    lineData.clear(); // Clear the ArrayList in case it already contains something.
                    /* Retrieve the desired columns and place them into a String
                       ArrayList. Literal column 1 is array index 0. Column
                       numbers that don't exist in this row are skipped. */
                    for (int dc : desiredLiteralColumns) {
                        if (dc >= 1 && dc <= lineParts.length) {
                            lineData.add(lineParts[dc - 1]);
                        }
                    }
                    /* Convert the 'lineData' ArrayList to a 1D String Array
                       and then add that String Array to the 'list' ArrayList. */
                    list.add(lineData.toArray(new String[0]));
                    /* Build and write the acquired data to the desired output
                       file. The custom layout needs exactly five columns; any
                       other selection is written as a plain comma delimited row. */
                    String dataString;
                    if (lineData.size() == 5) {
                        dataString = lineData.get(0).replace("\"", "") + ", " +
                                lineData.get(1) + " " + lineData.get(2) + " , " +
                                lineData.get(3).replace(".", " ") + lineData.get(4);
                    }
                    else {
                        dataString = String.join(", ", lineData);
                    }
                    writer.println(dataString);
                    writer.flush();
                }
                else {
                    // No, we want all columns. Add all columnar data to the ArrayList...
                    list.add(lineParts);
                    /* Write the whole row to the desired output file
                       ('lineData' is empty here so 'lineParts' is used). */
                    writer.println(String.join(", ", lineParts));
                    writer.flush();
                }
            }
        }
        // Catch and display any exceptions.
        catch (IOException ex) {
            System.out.println("getFromCSV() Method Error!" + ls + ex.getMessage());
        }
    }
    catch (FileNotFoundException ex) {
        System.out.println("getFromCSV() Method Error!" + ls + ex.getMessage());
    }
    /* Convert list to a 2D String Array and then
       return the 2D Array... */
    String[][] array = new String[list.size()][];
    for (int i = 0; i < list.size(); i++) {
        array[i] = list.get(i);
    }
    return array;
}
As you can see the method requires three parameters:
The csvFilePath parameter:
A string argument must be supplied here which indicates where the text file to be read is located within the local file system. If the text file is located in the application's working directory (normally the project folder when run from an IDE) then just the file name should suffice. If not then the full path and file name is expected.
The destinationPath parameter:
A string argument must be supplied here which indicates where the output text file is to be created and written to within the local file system. If the output file is to be located within the application's project folder then just the file name should suffice. If not then the full path and file name to its desired location is expected. If the supplied destination path doesn't already exist within the local file system then it is automatically created. In both cases, make sure permissions exist within your Operating System for this to be achievable.
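The automatic folder creation described above can also be sketched with java.nio.file, which avoids searching the path string for a separator character. This is an alternative technique, not what getFromCSV() itself uses, and the class name DestinationDirs and the method ensureParentExists() are hypothetical names chosen for this example.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

public class DestinationDirs {

    // Creates any missing parent folders of the given output file and
    // returns the absolute path of that file. The file itself is not created.
    public static Path ensureParentExists(String destinationPath) throws IOException {
        Path target = Paths.get(destinationPath).toAbsolutePath();
        Path parent = target.getParent();      // null only for a root path
        if (parent != null) {
            Files.createDirectories(parent);   // no error if it already exists
        }
        return target;
    }

    public static void main(String[] args) throws IOException {
        // A temporary folder is used so the demo leaves the project folder alone.
        Path base = Files.createTempDirectory("csvdemo");
        String dest = base.resolve("reports").resolve("MY_Output_File.txt").toString();
        Path out = ensureParentExists(dest);
        System.out.println(Files.isDirectory(out.getParent())); // true
    }
}
```

A failure to create the folders (for example, missing permissions) surfaces here as an IOException, whereas File.mkdirs() only returns false.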
The desiredLiteralColumns parameter:
Either an integer array (int[ ]) can be supplied here or a series of comma delimited integer arguments which denote the desired literal columns to retrieve data from in each file data row. By "literal" we mean that the data located at Column Index 0 is literally Column 1. The data in Column Index 7 is literally Column 8. It is the literal values you want to supply. Here is a quick example:
If I have a data row in the file which looks like:
"Doe, John", 62, "6558 Cook Road, Atlanta, Georgia", 30336, $78,564.77
and we want to retrieve the data in the 1st column (person's name), the 3rd column (address), and the 4th column (postal code) then we could supply the following to the getFromCSV() method:
String[][] myData = getFromCSV("My_CSV_File.csv", "MY_Output_File.txt", 1, 3, 4);
or
int[] columns = {1, 3, 4};
String[][] myData = getFromCSV("C:\\MyDataFile\\My_CSV_File.csv",
                               "C:\\MyOuputFiles\\MY_Output_File.txt",
                               columns);
Then when the code is run the output file and the returned 2D String Array will contain:
"Doe, John", "6558 Cook Road, Atlanta, Georgia", 30336
If no arguments are supplied to the optional desiredLiteralColumns parameter then all Columnar data is retrieved, so:
String[][] myData = getFromCSV("My_CSV_File.csv", "MY_Output_File.txt");
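will place every column into the output file and into the returned 2D String Array. Be aware that the unquoted salary ($78,564.77) contains the delimiter, so it is split into two columns:
"Doe, John", 62, "6558 Cook Road, Atlanta, Georgia", 30336, $78, 564.77
Enclose such values in quotation marks within the data file if they must stay in one column.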
I believe there are delimiter positioning issues within the data lines you supplied in your post as examples. I think you are missing some commas, so look them over carefully. Once you've done that, to build exactly what you need you would do something like this:
String[][] data = getFromCSV("StaffHoursOverviewReport_10102019 (1).txt",
                             "outFile.txt",
                             13, 16, 17, 28, 29);
for (int i = 0; i < data.length; i++) {
    String dataString = data[i][0].replace("\"", "") + ", " +
                        data[i][1] + " " + data[i][2] + " , " +
                        data[i][3].replace(".", " ") + data[i][4];
    System.out.println(dataString);
}
This should output the following to the console window and place it within your desired output file:
Smith, Lehron, Billable 4.10 , Non Bill 2.57
which is exactly like the example you provided for the desired output. Tested!
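For reference, the literal (1-based) column numbering used by the desiredLiteralColumns parameter can be sketched in isolation. The class LiteralColumns and its pick() method are hypothetical helpers written only for this illustration; they follow the same idea as the selection step inside getFromCSV() but are not part of it.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class LiteralColumns {

    // Returns the values found at the given literal (1-based) columns, in
    // the order requested. Column numbers outside the row are skipped.
    public static String[] pick(String[] row, int... literalColumns) {
        List<String> picked = new ArrayList<>();
        for (int column : literalColumns) {
            int index = column - 1;            // literal Column 1 is index 0
            if (index >= 0 && index < row.length) {
                picked.add(row[index]);
            }
        }
        return picked.toArray(new String[0]);
    }

    public static void main(String[] args) {
        // A row as it looks after the quote-aware split.
        String[] row = {"\"Doe, John\"", "62",
                        "\"6558 Cook Road, Atlanta, Georgia\"", "30336"};
        System.out.println(Arrays.toString(pick(row, 1, 3, 4)));
        // ["Doe, John", "6558 Cook Road, Atlanta, Georgia", 30336]
        System.out.println(Arrays.toString(pick(row, 4, 1)));
        // [30336, "Doe, John"]
    }
}
```

Because the result follows the order of the arguments, supplying 4, 1 returns the postal code first and the name second.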