I am using this simple function (taken from here) to export a PHP array into a simple binary Excel file. Writing a binary Excel file is a requirement.

public static function array_to_excel($input) 
{
    $ret = pack('ssssss', 0x809, 0x8, 0x0, 0x10, 0x0, 0x0);
    foreach (array_values($input) as $lineNumber => $row) 
    {
        foreach (array_values($row) as $colNumber => $data) 
        {
            if (is_numeric($data)) 
            {
                $ret .= pack('sssssd', 0x203, 14, $lineNumber, $colNumber, 0x0, $data);
            } 
            else 
            {
                $len = strlen($data);
                $ret .= pack('ssssss', 0x204, 8 + $len, $lineNumber, $colNumber, 0x0, $len) . $data;
            }
        }
    }
    $ret .= pack('ss', 0x0A, 0x00); 
    return $ret;
}

Calling it is simple:

Model_Utilities::array_to_excel($my_2d_array);

The function itself works great and makes it very easy to create a simple binary Excel file. The problem I have is with UTF-8 characters: I get strange characters like Ä¡ instead of the right ones. Is there a way to set the character encoding in my array_to_excel function?

  • That's because this simple code doesn't have any charset defined, and I'm not going to write a tutorial on how to modify this code to include that because it would simply take too long Commented Jan 8, 2014 at 15:15
  • @MarkBaker Yes, I have figured that out myself. I was looking around for how to define that but found no reference. Can you point me somewhere? Commented Jan 8, 2014 at 15:34
  • I've spent the best part of 10 years now developing my own Excel reader/writer library for PHP... that's my pointer (PHPExcel or on github) Commented Jan 8, 2014 at 15:36
  • Good job Mark, I know your library. It's very good, but 10000x more than I need. Maybe a quick question that I haven't found answered in your docs: can it produce binary Excel 5.0 files? This is unfortunately my requirement. The function above works perfectly for this purpose, I just need to fix the encoding. Commented Jan 8, 2014 at 15:47
  • It produces BIFF8 files as the standard .xls output, we dropped support for BIFF5 a year or so ago (on the grounds that it was nearly 2 decades old)... but BIFF8 xls files should still be readable in all versions of MS Excel from 97 onwards Commented Jan 8, 2014 at 15:50

1 Answer

EDIT:

After wading through hundreds of obfuscated Microsoft docs before locating the OpenOffice version of the XLS format spec, I managed to do something.

However, it relies on the BIFF8 format since, as far as I can tell, BIFF5 (the format used by Excel 95) has no direct UTF-16 support.

function array_to_excel($input) 
{
    $cells = '';
    foreach (array_values($input) as $lineNumber => $row) 
    {
        foreach (array_values($row) as $colNumber => $data) 
        {
            if (is_numeric($data)) 
            {
                $cells .= pack('sssssd', 0x203, 14, $lineNumber, $colNumber, 0x0, $data);
            } 
            else 
            {
                $data = mb_convert_encoding($data, "UTF-16LE", "UTF-8");
                // BIFF8 counts the string length in 16-bit code units, not code points
                $len = strlen($data) / 2;
                $cells .= pack('ssssssC', 0x204, 9+2*$len, $lineNumber, $colNumber, 0x0, $len, 0x1).$data;
            }
        }
    }
    return pack('s4', 0x809, 0x0004, 0x0600, // <- this selects BIFF8 format
                      0x10) . $cells . pack('ss', 0x0A, 0x00); 
}

$table = Array (
    Array ("Добрый день", "Bonne journée"),
    Array ("tschüß", "こんにちは。"),
    Array (30, 40));
    
$xls = array_to_excel($table);
file_put_contents ("sample.xls", $xls);

My (French) PC version of Excel 2007 managed to open the sample file in compatibility mode, Russian and Japanese included. There is no telling how this hack would work on other variants, though.
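To sanity-check the record layout without opening Excel, a single LABEL record can be built and decoded again. This is only a sketch: `biff8_label` is a hypothetical helper name, not part of the function above, and it uses the `v` pack code (explicit little-endian 16-bit) where the function above uses the machine-dependent `s`. It assumes the mbstring extension is available.

```php
<?php
// Hypothetical helper: builds one BIFF8 LABEL record (id 0x204) holding a UTF-16LE string.
function biff8_label(int $row, int $col, string $utf8): string
{
    $utf16 = mb_convert_encoding($utf8, 'UTF-16LE', 'UTF-8');
    $units = intdiv(strlen($utf16), 2);          // length in 16-bit code units
    // Payload: row, col, XF index, string length, flags byte (0x1 = 16-bit chars), then the text
    return pack('v6C', 0x204, 9 + 2 * $units, $row, $col, 0x0, $units, 0x1) . $utf16;
}

$record = biff8_label(1, 0, 'tschüß');

// Decode the 13 header bytes again, then convert the remaining bytes back to UTF-8.
$head = unpack('vid/vsize/vrow/vcol/vxf/vlen/Cflags', $record);
$text = mb_convert_encoding(substr($record, 13), 'UTF-8', 'UTF-16LE');

printf("id=0x%04X size=%d len=%d flags=%d text=%s\n",
    $head['id'], $head['size'], $head['len'], $head['flags'], $text);
// prints: id=0x0204 size=21 len=6 flags=1 text=tschüß
```

If the decoded text matches the input and `size` equals 9 plus twice `len`, the record is at least internally consistent. Whether a given Excel version accepts it is a separate question.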

EDIT (bis) : from the file specs linked above:

Byte Strings (BIFF2-BIFF5)

All Excel file formats up to BIFF5 contain simple byte strings. The byte string consists of the length of the string followed by the character array. The length is stored either as 8bit value or as 16bit value, depending on the current record. The string is not zero-terminated. The encoding of the character array is dependent on the current record.

Record LABEL, BIFF3-BIFF5:

Offset  Size  Contents
0       2     Index to row
2       2     Index to column
4       2     Index to XF record
6       var.  Byte string, 16-bit string length

Unless you generate a much more complex file, I'm afraid BIFF5 is a no go.
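To illustrate why the byte-string layout garbles UTF-8, here is a minimal sketch of a LABEL record built from the table above. `biff5_label` is a hypothetical helper name, and Windows-1252 is only an assumption about the code page the reading application falls back to; other systems may use a different one.

```php
<?php
// Hypothetical helper: one LABEL record laid out as in the BIFF3-BIFF5 table.
function biff5_label(int $row, int $col, string $bytes): string
{
    $len = strlen($bytes);                       // byte count, not character count
    // Record id, payload size, row, column, XF index, then 16-bit length + raw bytes
    return pack('v5', 0x204, 8 + $len, $row, $col, 0x0) . pack('v', $len) . $bytes;
}

$utf8   = 'ġ';                                   // U+0121, two bytes in UTF-8
$record = biff5_label(0, 0, $utf8);

// The string bytes start after the 12 header bytes.
echo bin2hex(substr($record, 12)), "\n";         // prints: c4a1

// A reader that treats each byte as one Windows-1252 character (assumption) shows:
echo mb_convert_encoding($utf8, 'UTF-8', 'Windows-1252'), "\n";   // prints: Ä¡
```

The record itself is well formed, since the length field matches the number of bytes written. The trouble is that nothing in it says the bytes are UTF-8, so each byte ends up displayed as a separate character.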

5 Comments

The only difference is that now I get ? instead of strange characters mentioned above ...
@PrimozRome Maybe that might suit your needs.
Yes, this works. Hopefully the BIFF8 format will be OK for what we need.
Your solution worked, but the BIFF8 format is not accepted where I need this export, so I still need to find a UTF-8 compatible solution for the BIFF5 format.
See my edit. The LABEL record definitely won't be the solution, and I fear you'll have to generate a much more complex file to work around this. For now, I have no idea what other means could be used to coax UTF-8 characters into a BIFF5 record that would eventually end up inside a cell.
