I have a "big" csv file (around 1GB of data, 3M lines) to import into a MariaDB table.
The thing is, almost every field of every line is considered as a string. So, I have to convert "1 337" (string) into 1337 (integer).
Here is the script used to import into the table :
LOAD DATA LOW_PRIORITY LOCAL
INFILE 'data.txt'
INTO TABLE `test`.`test_import`
CHARACTER SET utf8
FIELDS TERMINATED BY ';'
OPTIONALLY ENCLOSED BY '"'
ESCAPED BY '"'
LINES TERMINATED BY '\r\n'
(`id`,
`data`,
@NumberOne,
@NumberTwo,
@NumberThree,
@NumberFour)
SET `Number One` = REPLACE(@NumberOne, ' ', ''),
`Number Two` = REPLACE(@NumberOne, ' ', ''),
`Number Three` = REPLACE(@NumberOne, ' ', ''),
`Number Four` = REPLACE(@NumberOne, ' ', '');
Using this script, there is no problem importing numbers below 999. But beginning with 1000 (written "1 000" in my csv), all I have is a warning (Truncated incorrect INTEGER value: '1 000') and the value 1 in my database.
The "funny" thing is, when I try with this :
SET `Number One` = REPLACE(@NumberOne, '1', 'k'),
`Number Two` = REPLACE(@NumberOne, '1', 'k'),
`Number Three` = REPLACE(@NumberOne, '1', 'k'),
`Number Four` = REPLACE(@NumberOne, '1', 'k')
-> REPLACE() works, "1 000" become "k 000".
So, how can I use REPLACE() to remove spaces in numbers ? Or, how to make CAST()/CONVERT() working correctly on strings like "1 337" ?
Some more informations.
Here is a fresh test table :
CREATE OR REPLACE TABLE test_spaces_extr (
`Identifier` tinytext,
`First name` tinytext,
`Last name` tinytext,
`Number One` int unsigned,
`Number Two` int unsigned,
`Number Three` int unsigned,
`Number Four` int unsigned,
`Number Five` int unsigned,
`Number Six` int unsigned,
`Number Seven` int unsigned
);
Here is the script to import CSV :
LOAD DATA LOW_PRIORITY LOCAL
INFILE 'some_data.txt'
INTO TABLE `test`.`test_spaces_extr`
CHARACTER SET utf8
FIELDS TERMINATED BY ';'
OPTIONALLY ENCLOSED BY '"'
ESCAPED BY '"'
LINES TERMINATED BY '\r\n'
(`Identifier`,
`First name`,
`Last name`,
@NumberOne,
@NumberTwo,
@NumberThree,
@NumberFour,
@NumberFive,
@NumberSix,
@NumberSeven)
SET `Number One` = REPLACE(@NumberOne, ' ', ''),
`Number Two` = REPLACE(@NumberTwo, ' ', ''),
`Number Three` = REPLACE(@NumberThree, ' ', ''),
`Number Four` = REPLACE(@NumberFour, ' ', ''),
`Number Five` = REPLACE(@NumberFive, ' ', ''),
`Number Six` = REPLACE(@NumberSix, ' ', ''),
`Number Seven` = REPLACE(@NumberSeven, ' ', '');
Here is the full content of some_data.txt :
"3efa639b3a";"Censored";"Censored";"7 896";"3 468";"3 854";"5 000";"1 234";"9 654";"1 337"
(One line, yes.)
Here is the result :
"Identifier" "First name" "Last name" "Number One" "Number Two" "Number Three" "Number Four" "Number Five" "Number Six" "Number Seven"
"3efa639b3a" "Censored" "Censored" "7896" "3468" "3854" "5000" "1234" "9654" "0"
In fact, the "Number" fields became integers here. All of them, but not the last one ("Number Seven" -> "0").
It's becoming weirder ...
unsigned int?CAST(REPLACE(@var, ' ', '') as unsigned), it won't work.It's like if REPLACE() doesn't want to remove spaces.=SUBSTITUTE(A1," ","")to remove spaces from numeric and then load in MySQL ?1 000or1in your database?