
I have a CSV file; a sample of it looks like this:

[image of CSV sample]

Snowpipe is failing to load this CSV file with the following error:

Number of columns in file (5) does not match that of the corresponding table (3), use file format option error_on_column_count_mismatch=false to ignore this error

Can someone advise on a CSV file format definition that will allow this file to load without failing?


3 Answers


The issue is that the data you are trying to load contains commas (,) inside the data itself. Snowflake thinks that those commas represent new columns which is why it thinks there are 5 columns in your file. It is then trying to load these 5 columns into a table with only 3 columns resulting in an error.

You need to tell Snowflake that anything inside double-quotes (") should be loaded as-is, and not to interpret commas inside quotes as column delimiters.
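As a hypothetical illustration (not the asker's actual data), a row like the one below has three logical fields, but five comma-separated tokens if the quotes are ignored:

```
id,name,city
1,"Smith, John","Austin, TX"
```

With quote handling enabled, the second line parses as three fields; without it, the commas inside the quoted values split it into five, which matches the error message above.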

When you create your file format via the web interface, there is an option that tells Snowflake to do this: set the "Field optionally enclosed by" dropdown to "Double Quote", as in this picture:

[screenshot of the file format options dialog]

Alternatively, if you're creating your file format with SQL, there is an option called FIELD_OPTIONALLY_ENCLOSED_BY that you can set to '\042' (the octal escape for the double-quote character), which does the same thing:

CREATE FILE FORMAT "SIMON_DB"."PUBLIC".sample_file_format 
    TYPE = 'CSV' 
    COMPRESSION = 'AUTO' 
    FIELD_DELIMITER = ',' 
    RECORD_DELIMITER = '\n' 
    SKIP_HEADER = 0 
    FIELD_OPTIONALLY_ENCLOSED_BY = '\042' -- <---- set to double quote
    TRIM_SPACE = FALSE 
    ERROR_ON_COLUMN_COUNT_MISMATCH = TRUE 
    ESCAPE = 'NONE' 
    ESCAPE_UNENCLOSED_FIELD = '\134' 
    DATE_FORMAT = 'AUTO' 
    TIMESTAMP_FORMAT = 'AUTO';
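To use the format from Snowpipe or a manual load, reference it in the COPY statement. This is only a sketch: the table name, stage name, and file path below are placeholders, not from the question.

```sql
-- Hypothetical stage, table, and path; only the FILE_FORMAT reference matters here.
COPY INTO "SIMON_DB"."PUBLIC".my_table
  FROM @my_stage/sample.csv
  FILE_FORMAT = (FORMAT_NAME = 'SIMON_DB.PUBLIC.sample_file_format');
```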

2 Comments

What do they mean by COMPRESSION = 'AUTO'?
It means that Snowflake will automatically detect the type of compression and try to decompress the file for you.

If possible, share the file format and one sample record so we can figure out the issue. It seems to be a problem with the number of columns. Can you include the FIELD_OPTIONALLY_ENCLOSED_BY option in your COPY statement and try it once?
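A file format can also be specified inline in the COPY statement rather than as a named object. A minimal sketch, with hypothetical table and stage names:

```sql
-- Hypothetical names; the key part is the inline FIELD_OPTIONALLY_ENCLOSED_BY option.
COPY INTO my_table
  FROM @my_stage/sample.csv
  FILE_FORMAT = (TYPE = 'CSV' FIELD_OPTIONALLY_ENCLOSED_BY = '"');
```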



When the TAB character is unlikely to occur in the data, I tend to use TAB-delimited files. Together with a header row, this also makes the source files more human-readable if they need to be opened to troubleshoot loading failures:

FIELD_DELIMITER = '\t'

Also (although a bit off-topic), note that Snowflake recommends compressing data files: https://docs.snowflake.com/en/user-guide/data-load-prepare.html#data-file-compression

I mostly use GZip compression type:

COMPRESSION = GZIP

A (working) example:

CREATE FILE FORMAT Public.CSV_GZIP_TABDELIMITED_WITHHEADER_QUOTES_TRIM
    FIELD_DELIMITER = '\t'
    SKIP_HEADER = 1
    TRIM_SPACE = TRUE
    NULL_IF = ('NULL')
    COMPRESSION = GZIP
;
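Loading with this format could then look like the following. The stage, table, and file name are placeholders for illustration:

```sql
-- Hypothetical stage/table; loads a gzipped, tab-delimited file using the format above.
COPY INTO my_table
  FROM @my_stage/data.tsv.gz
  FILE_FORMAT = (FORMAT_NAME = 'Public.CSV_GZIP_TABDELIMITED_WITHHEADER_QUOTES_TRIM');
```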

