I have a CSV file that I'm trying to load into a PostgreSQL 9.2.4 database using the COPY FROM command. One timestamp field is allowed to be null, but when I load "null values" (actually just "") I get the following error:

ERROR:  invalid input syntax for type timestamp with time zone: ""

An example CSV file looks as follows:

id,name,joined
1,"bob","2013-10-02 15:27:44-05"
2,"jane",""

The SQL looks as follows:

CREATE TABLE "users"
(
    "id" BIGSERIAL NOT NULL PRIMARY KEY,
    "name" VARCHAR(255),
    "joined" TIMESTAMP WITH TIME ZONE
);

COPY "users" ("id", "name", "joined")
    FROM '/path/to/data.csv'
    WITH (
        ENCODING 'utf-8',
        HEADER 1,
        FORMAT 'csv'
    );

According to the documentation, a null value in CSV format is represented by an empty string that is not wrapped in the quote character, which is the double quote (") in this case:

NULL

Specifies the string that represents a null value. The default is \N (backslash-N) in text format, and an unquoted empty string in CSV format. You might prefer an empty string even in text format for cases where you don't want to distinguish nulls from empty strings. This option is not allowed when using binary format.

Note: When using COPY FROM, any data item that matches this string will be stored as a null value, so you should make sure that you use the same string as you used with COPY TO.

I've tried the option NULL '' but that seems to have no effect. Advice, please!

  • As always, your version of Postgres please. Commented Feb 20, 2015 at 4:27
  • Hmm, 9.2.4 on my local machine. If I'm using 9.4 do you think FORCE_NULL would work? Commented Feb 20, 2015 at 14:33
  • Exactly. Goes to show that it's always important to declare your version of Postgres in such questions. Commented Feb 20, 2015 at 17:47

3 Answers

An empty string without quotes works as expected:

id,name,joined
1,"bob","2013-10-02 15:27:44-05"
2,"jane",

select * from users;
id | name |         joined
----+------+------------------------
 1 | bob  | 2013-10-03 03:27:44+07
 2 | jane |

It may be simplest to replace "" with an unquoted empty field using sed before loading.

2 Comments

Ok, I'll consider it. The problem is that these CSVs are coming to us from an external source, and I'm not the one manually preprocessing everything as it comes in, so I don't really want to add an extra step downstream.
Worked here. Is this the CSV "standard"? (Is there one?) :-)

The FORCE_NULL option for COPY FROM in Postgres 9.4+ would be the most elegant way to solve your problem. Per documentation:

FORCE_NULL

Match the specified columns' values against the null string, even if it has been quoted, and if a match is found set the value to NULL. In the default case where the null string is empty, this converts a quoted empty string into NULL. This option is allowed only in COPY FROM, and only when using CSV format.

Of course, it converts every matching value in each column you list for the option.
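
As a rough sketch of how this could look for the table in the question, assuming Postgres 9.4 or later and the same file path and column names as in the question:

```sql
-- Sketch only: requires PostgreSQL 9.4+; table, columns and path are taken from the question.
COPY "users" ("id", "name", "joined")
    FROM '/path/to/data.csv'
    WITH (
        FORMAT csv,
        HEADER true,
        ENCODING 'utf-8',
        -- Treat a quoted empty string ("") in "joined" as NULL.
        FORCE_NULL ("joined")
    );
```

Only "joined" is listed, so a quoted empty string in "name" would still be loaded as an empty string rather than NULL.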

In older versions, you can COPY to a temporary table with the same table layout - except data type text for the problem column. Then fix offending values and INSERT from there:
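
A minimal sketch of that approach, assuming the table and file from the question; the staging table name users_staging is made up for illustration:

```sql
-- Staging table (hypothetical name): same layout as "users",
-- except "joined" is text so that "" loads as an empty string.
CREATE TEMP TABLE users_staging (
    id     bigint,
    name   varchar(255),
    joined text
);

COPY users_staging (id, name, joined)
    FROM '/path/to/data.csv'
    WITH (FORMAT csv, HEADER true, ENCODING 'utf-8');

-- NULLIF turns the empty string into NULL; every other value is cast to timestamptz.
INSERT INTO "users" ("id", "name", "joined")
SELECT id, name, NULLIF(joined, '')::timestamptz
FROM users_staging;
```

The temporary table is dropped automatically at the end of the session.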

Could not get it to work. Ended up using this program: http://neilb.bitbucket.org/csvfix/

With that you can replace empty fields with other values.

So for example, in your case column 3 needs to have a timestamp value, so I give it a fake one, in this case '1900-01-01 00:00:00'. If needed, you can delete or filter them out once the data is imported.

"$CSVFIXHOME/csvfix" map -f 3 -fv '' -tv '1900-01-01 00:00:00' -rsep ',' "$YOURFILE" > "$FILEWITHDATES"

After that you can import the newly created file.
