1

I'm working on a function where I have to store numbers received via uploading an excel file. It's written in Java, using apache poi library, under Spring framework (which is irrelevant).

The file which I'm trying to upload (Note that the column has already been set to Text) :

enter image description here

Code is as follow :

// function accepts "MultipartFile inputFile"
InputStream is = inputFile.getInputStream();
StreamingReader reader = StreamingReader.builder().rowCacheSize(100).bufferSize(4096).sheetIndex(0)
                        .read(is);
for (Row row : reader) {
    System.out.println("Reading next row.");
    System.out.println("row[0] is of type " + row.getCell(0).getCellType());
    Cell cell = row.getCell(0);
    String value = "";
    if (cell.getCellType() == Cell.CELL_TYPE_STRING) {
        value = cell.getStringCellValue().replaceAll("[\\D]", "");
    } else if (cell.getCellType() == Cell.CELL_TYPE_NUMERIC) {
        value = NumberToTextConverter.toText(cell.getNumericCellValue());
    }
    System.out.println("Value is " + value);
}

And I'm getting below output :

Reading next row.
row[0] is of type 0 // Which is equals to Cell.CELL_TYPE_NUMERIC
Value is 166609999

Issue is that I'd require to read it as '0166609999' instead of '166609999', strangely this only happens to xlsx file, if I save it as xls format and re-upload the file, I have no issue with detecting its cell type. Any ideas?

Edited for being marked as duplication:

  • Given answer https://stackoverflow.com/a/19401902/1131470 uses DataFormatter class which is not supported when we are streaming the sheet, as the Cell object retrieved is going to be a StreamingCell object, which throws an exception if we make a call to getCellStyle() function.

As of 29th Mar 2016

Seems that the class StreamingCell does not support DataFormatter, where DataFormatter is the only class available at the moment to grab what Excel displays. So current solution would be sticking back to reading the whole Excel file into memory. If anyone found an answer in future you may please post an answer here, I'd greatly appreciate that, as current solution is an absolutely terrible one.

As of 31st Mar 2016

Special thanks to Axel who pointed out its library version issue, updating streamer jar file to 0.2.12 solves the problem. Thanks!

4
  • I suspect it may lie at the github.com/monitorjbl/excel-streaming-reader. So make sure you are using the latest version of this. Commented Mar 28, 2016 at 10:25
  • @AxelRichter Thank you so much for pointing it out, yes when I look into the class of StreamingCell github.com/monitorjbl/excel-streaming-reader/blob/master/src/…, it seems that most functions throw UnsupportedException. No choice but I'd have to go back to reading the entire file while limiting upload size. Commented Mar 29, 2016 at 8:42
  • What you say can't be true. If the cell is formatted as Text, then with OpenXML the cell's type is t="s" and the cell's value points to the sharedStrings.xml. So row.getCell(0).getCellType() cannot be 0 and the value cannot be 166609647. This only can be if the cell is not formatted as Text. Then, and only then, you would need a formatter. Commented Mar 29, 2016 at 13:56
  • I haven't go into extracting it as a zip and look into the Xml files yet. But I've hosted the file at dropbox.com/s/hjz0dletdq3qk1i/test_formatted_number.xlsx?dl=0 which you may grab and have a look at the type, of you wish to, appreciate your help =) Commented Mar 30, 2016 at 5:13

2 Answers 2

3

Downloaded your file. Downloaded xlsx-streamer-0.2.12.jar, slf4j-api-1.7.20.jar and slf4j-nop-1.7.20.jar and placed in class path.

Having following code:

import com.monitorjbl.xlsx.*;
import org.apache.poi.ss.usermodel.*;

import java.io.*;

class StreamingReaderTest {

 public static void main(String[] args) throws Exception {

  try (
   InputStream is = new FileInputStream(new File("/home/axel/Downloads/test_formatted_number.xlsx"));
   StreamingReader reader = StreamingReader.builder()
           .rowCacheSize(100)
           .bufferSize(4096)
           .sheetIndex(0)
           .read(is);
  ) {
   for (Row row : reader) {
    System.out.println("row[0] is of type " + row.getCell(0).getCellType());
    Cell cell = row.getCell(0);
    String value = "";
    if (cell.getCellType() == Cell.CELL_TYPE_STRING) {
        value = cell.getStringCellValue();
    } else if (cell.getCellType() == Cell.CELL_TYPE_NUMERIC) {
        value = "" + cell.getNumericCellValue();
    }
    System.out.println("Value is " + value);
   }
  }
 }
}

Gives me:

enter image description here

So no problem here.

If I use DataFormatter with numeric cell values:

import com.monitorjbl.xlsx.*;
import org.apache.poi.ss.usermodel.*;

import java.io.*;

class StreamingReaderTest {

 public static void main(String[] args) throws Exception {

  try (
   InputStream is = new FileInputStream(new File("/home/axel/Downloads/test_formatted_number.xlsx"));
   StreamingReader reader = StreamingReader.builder()
           .rowCacheSize(100)
           .bufferSize(4096)
           .sheetIndex(0)
           .read(is);
  ) {
   for (Row row : reader) {
    System.out.println("row[0] is of type " + row.getCell(0).getCellType());
    Cell cell = row.getCell(0);
    String value = "";
    if (cell.getCellType() == Cell.CELL_TYPE_STRING) {
        value = cell.getStringCellValue();
    } else if (cell.getCellType() == Cell.CELL_TYPE_NUMERIC) {
        value = new DataFormatter().formatCellValue(cell);
    }
    System.out.println("Value is " + value);
   }
  }
 }
}

and put the number 166609647 in A2 formatted as 0000000000. then I get:

enter image description here

Sign up to request clarification or add additional context in comments.

2 Comments

Wow, absolutely awesome, let me try again later with your version of streamer jar file, hope it'd solves my issue..
It was indeed version issue. My project was using an outdated 0.2.3 version, updating it solves my problem instantly, thanks!
0

Why .replaceAll("[\D]", "");

Works fine for me, just tried. Wonder what the issue is here.

fis = new FileInputStream(inputFile);
XSSFWorkbook inputWorkBook = new XSSFWorkbook (fis);
XSSFSheet inputSheet = inputWorkBook.getSheetAt(0);
Iterator<Row> rowIterator = inputSheet.iterator();

while(rowIterator.hasNext())
{
    Row row = rowIterator.next();
    Iterator<Cell> cellIterator = row.cellIterator();

    while (cellIterator.hasNext()) 
    {
        Cell cell = cellIterator.next();

        switch (cell.getCellType())
        {
            case Cell.CELL_TYPE_STRING:
                System.out.print(cell.getStringCellValue() + "\t");
                break;
            case Cell.CELL_TYPE_NUMERIC:
                System.out.print(cell.getNumericCellValue() + "\t");
                break;
            case Cell.CELL_TYPE_BOOLEAN:
                System.out.print(cell.getBooleanCellValue() + "\t");
                break;
            default :  
        }
    }
    System.out.println(""); 
}  

3 Comments

Er, thanks for pointing out the weird stuffs =) but anyway it wouldn't affect what my question asks, hence I'll just leave it at the moment..
Hi there, thanks again for the effort, and sorry for not being very clear on the question. If I'm accepting a XLS file, I'd use the similar approach as yours to process the file, and when I do that, '0166609647' appears alright. Problem happens when I accept XLSX file, which could be extremely large, hence streaming is preferred instead of reading the whole file into memory, and that's where issue happens.
I've added the portion on how I declare the streaming object up on my question, you have have a look again there.

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.