1

My CSV file: https://drive.google.com/file/d/0B-Z58iD3By5wb2R2TnV0Rjc3Zzg/view

I have already read many references, but I cannot split my CSV on "," (the delimiter does not seem to work properly). How can I get an array like this from that CSV file?

    Array[0] =>
    (
        ['username'] => Lexsa,
        ['date'] => 12/07/2017,
        ['retweet'] => null,
    )

    Array[1] =>
    (
        ['username'] => any,
        ['date'] => 12/07/2017,
        ['retweet'] => null
    )



function csv_to_array($filename='', $delimiter=',')
{
    if (!file_exists($filename) || !is_readable($filename))
        return FALSE;

    $header = NULL;
    $data = array();
    if (($handle = fopen($filename, 'r')) !== FALSE)
    {
        while (($row = fgetcsv($handle, 1000, $delimiter)) !== FALSE)
        {
            // The first row is the header; every later row is keyed by it.
            if (!$header)
                $header = $row;
            else
                $data[] = array_combine($header, $row);
        }
        fclose($handle);
    }
    return $data;
}

I have tried many references, but the result is always like this; the code won't split the line on ",":

Array ( [0] => Array (["username","date","retweets","favorites","text","geo","mentions","hashtags","id","permalink"] => "Lexsa911","01/12/2016 0:05",0.0,0.0,"Kecelakaan - Kecelakaan Pesawat yang Melibatkan Klub-Klub Sepakbola http:// ht.ly/1IdL306EzDH",,,,"8,04E+17","https://twitter.com/Lexsa911/status/804008435020865536" )

3
  • 1
    This probably shouldn't be tagged as python. Commented Jul 14, 2017 at 4:50
  • 1
    ah sorry i'm new to stackoverflow but i like to search reference in here. i will edit it. Commented Jul 14, 2017 at 4:52
  • First point is to see if you can alter how this file is generated as it seems to be full of ". Commented Jul 14, 2017 at 5:57

2 Answers

2

This is what I get when I open your tes.csv with less or gedit:

"""username"",""date"",""retweets"",""favorites"",""text"",""geo"",""mentions"",""hashtags"",""id"",""permalink"""
"""Lexsa911"",""01/12/2016 0:05"",0.0,0.0,""Kecelakaan - Kecelakaan Pesawat yang Melibatkan Klub-Klub Sepakbola http:// ht.ly/1IdL306EzDH"",,,,""8,04E+17"",""https://twitter.com/Lexsa911/status/804008435020865536"""
"""Widya_Davy"",""01/12/2016 0:05"",0.0,0.0,""Kecelakaan - Kecelakaan Pesawat yang Melibatkan Klub-Klub Sepakbola http:// ow.ly/h1Eh306EzHk"",,,,""8,04E+17"",""https://twitter.com/Widya_Davy/status/804008434588876803"""
"""redaksi18"",""01/12/2016 0:05"",0.0,0.0,""Klub Brasil Korban Kecelakaan Pesawat Didaulat Jadi Juara http:// beritanusa.com/index.php?opti on=com_content&view=article&id=39769:klub-brasil-korban-kecelakaan-pesawat-didaulat-jadi-juara&catid=43:liga-lain&Itemid=112 … pic.twitter.com/1K7OlZSX83"",,,,""8,04E+17"",""https://twitter.com/redaksi18/status/804008416188338176"""
"""JustinBiermen"",""01/12/2016 0:06"",0.0,0.0,""Video LUCU Kecelakaan Yg Sangat Koplak http://www. youtube.com/watch?v=pQFOY7 AdXck …"",,,,""8,04E+17"",""https://twitter.com/JustinBiermen/status/804008714738880512"""

So the issue is not the delimiter, but rather the enclosure. As you can see, each line is wrapped in quotes. So the entire line is considered to be a single column.
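To see this for yourself, here is a minimal sketch that feeds one wrapped line to str_getcsv(). The line is the header from tes.csv, shortened to three columns for brevity (that shortening is my assumption, not the real file content):

```php
<?php
// Header line in the same style as tes.csv, shortened to three columns.
$line = '"""username"",""date"",""retweets"""';

// Delimiter, enclosure and escape are passed explicitly; these are PHP's defaults.
$fields = str_getcsv($line, ',', '"', '\\');

// The outer quotes turn the whole line into one enclosed field, and the
// doubled quotes inside it are read as escaped literal quote characters.
var_dump(count($fields)); // int(1)
echo $fields[0], PHP_EOL; // "username","date","retweets"
```

The commas never act as delimiters because they all sit inside that single enclosed field.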

I suggest fixing the CSV, e.g. removing the extra quotes until a row looks like this:

"username","date","retweets","favorites","text","geo","mentions","hashtags","id","permalink"

If you cannot do that for some reason, preprocess the csv to clean it up:

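print_r(
    array_map(
        function ($line) {
            // Collapse the tripled quotes at the line ends and the doubled
            // quotes around each field into single quotes.
            $single_quoted_line = str_replace(['"""', '""'], '"', $line);
            return str_getcsv($single_quoted_line);
        },
        // Read the lines without trailing newlines and skip blank ones.
        file("tes.csv", FILE_IGNORE_NEW_LINES | FILE_SKIP_EMPTY_LINES)
    )
);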

2 Comments

ah i see . i'll check it because this csv is generated from tweepy and i use rapid miner to clean the data and export it as csv again. that file is just few row. thanks i'll try it
it works with the original file with delimiter ";". thanks but i need to do some research about the file generated from rapid miner csv file....
1

Your CSV is formatted with each field enclosed in two " characters to the left and right, and then each line is also enclosed within a single " character to the left and right. As a result, each line in your CSV is read in as a string. That's why your result is an associative array with the entire header as a single string as the key, and the value associated with the key is also an entire line as a single string.

Try reformatting your CSV so each field is enclosed to the left and right in a single " character, and remove the additional " characters from the beginning and end of each line. Your code should then produce your expected results.

If you aren't able to control the format of the CSV, you'll need to sanitize each line before (or while) parsing it with fgetcsv().
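One possible way to do that, sketched under the assumption that every line was simply CSV-quoted a second time, is to parse each line twice: the first pass removes the outer quotes and un-doubles the inner ones, and the second pass splits the recovered line into fields. The function name double_quoted_csv_to_array() and the two-line sample file are made up for illustration; the sample mimics the style of tes.csv with only three columns.

```php
<?php
// Hypothetical helper: reads a CSV in which every line was quoted a second
// time and returns one associative array per data row, keyed by the header.
function double_quoted_csv_to_array(string $filename, string $delimiter = ','): array
{
    if (!is_readable($filename)) {
        return [];
    }
    $handle = fopen($filename, 'r');
    if ($handle === false) {
        return [];
    }

    $header = null;
    $data = [];
    // First pass: fgetcsv() sees each wrapped line as a single field and
    // hands back the original CSV line in $outer[0]. Length 0 means no limit.
    while (($outer = fgetcsv($handle, 0, $delimiter, '"', '\\')) !== false) {
        if (!isset($outer[0])) {
            continue; // blank lines come back as [null]
        }
        // Second pass: split the recovered line into its real fields.
        $row = str_getcsv($outer[0], $delimiter, '"', '\\');
        if ($header === null) {
            $header = $row;
        } elseif (count($row) === count($header)) {
            $data[] = array_combine($header, $row);
        }
    }
    fclose($handle);
    return $data;
}

// Assumed sample data: two lines in the style of tes.csv, three columns only.
$sample = '"""username"",""retweets"",""id"""' . "\n"
        . '"""Lexsa911"",0.0,""8,04E+17"""' . "\n";
$tmp = tempnam(sys_get_temp_dir(), 'csv');
file_put_contents($tmp, $sample);

$rows = double_quoted_csv_to_array($tmp);
unlink($tmp);
print_r($rows);
```

Compared with a plain string replacement, this keeps commas and escaped quotes inside a field intact, because both passes go through the CSV parser. Rows whose field count does not match the header are skipped here; whether that is the right policy depends on your data.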

1 Comment

ah i see . i'll check it because this csv is generated from tweepy and i use rapid miner to clean the data and export it as csv again. that file is just few row. thanks i'll try it
