
Following this guide, https://clickhouse.com/docs/knowledgebase/mysql-to-parquet-csv-json, I've exported some tables from a MySQL server to Parquet.

But I'm not able to read these Parquet files with DuckDB.

I can inspect the structure:

DESCRIBE SELECT * FROM 'mytable.parquet';

But if I try to read the data, the query fails:

select ID from mytable.parquet;
Error: Invalid Error: Unsupported compression codec "7". Supported options are uncompressed, gzip, snappy or zstd

I guess that ClickHouse is writing LZ4-compressed Parquet files (codec 7 is LZ4_RAW in the Parquet format), and DuckDB doesn't support them. Can I change the compression format in clickhouse-local?

2 Answers

5

To change the Parquet compression method in ClickHouse, use the setting output_format_parquet_compression_method (all Parquet settings are listed at https://clickhouse.com/docs/en/sql-reference/formats#parquet-format-settings).

For example:

select ... format Parquet settings output_format_parquet_compression_method='snappy'

1

--output_format_parquet_compression_method Compression method for Parquet output format. Supported codecs: snappy, lz4, brotli, zstd, gzip, none (uncompressed)

Try setting output_format_parquet_compression_method='snappy':

clickhouse-client -q "select * from numbers(1e6)  settings
 output_format_parquet_compression_method='snappy' format Parquet " > test.parquet
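
Since the question is about clickhouse-local, here is a minimal sketch of the same idea there, followed by a check in DuckDB. The file name test.parquet, the numbers() test query, and the choice of zstd are only illustrative assumptions; the same SETTINGS clause should also work when appended to the export query from the guide. Each step is skipped if the corresponding tool is not on the PATH.

```shell
#!/bin/sh
# Sketch: write a small Parquet file with an explicit codec, then inspect it in DuckDB.
# Assumed for illustration only: the output name, the test query, and the codec.

# ClickHouse values that DuckDB's error message lists as readable:
# none (uncompressed), gzip, snappy, zstd
codec="zstd"
out="test.parquet"

# Same setting as in the answers above, passed in the query's SETTINGS clause.
query="SELECT number AS ID FROM numbers(1000) FORMAT Parquet SETTINGS output_format_parquet_compression_method='${codec}'"

if command -v clickhouse-local >/dev/null 2>&1; then
    clickhouse-local -q "$query" > "$out"
else
    echo "clickhouse-local not found; the query would be:" >&2
    echo "$query" >&2
fi

# Only check the file if it was written and the duckdb CLI is available.
if [ -s "$out" ] && command -v duckdb >/dev/null 2>&1; then
    # parquet_metadata() reports the codec of each column chunk.
    duckdb -c "SELECT DISTINCT compression FROM parquet_metadata('${out}');"
    # A real read, which is what failed with codec 7 in the question.
    duckdb -c "SELECT count(*) FROM '${out}';"
fi
```

If the first DuckDB query still reports an LZ4 codec, the setting most likely did not reach the query that produced the file.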
