I have Avro files in S3 that I want to query via Redshift. I have used external tables successfully in the past, but only with Parquet and JSON data, so I am wondering whether I'm missing something specific to the Avro format.

I set up a Glue crawler to infer the schema of the files, and that worked fine: I can query the data in Athena. I've also set up an external schema in Redshift, and the new external table appears when I query SVV_EXTERNAL_TABLES. However, when I query the new table I get the following error:

[XX000][500310] Amazon Invalid operation: Invalid DataCatalog response for external table "spectrum_google_analytics"."man": Cannot deserialize Table. Error:

I don't know why this would work in Athena but not in Redshift Spectrum. Hoping you can help. Thanks!

1 Answer

The same issue happened to me when I was deploying resources with aws-cdk. It turns out that a Glue table with no parameters in its properties causes this weird behaviour (https://github.com/aws/aws-cdk/issues/7826). Add a property such as classification=Parquet/JSON and try again; that worked for me.
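
For a table that already exists, the same fix can be sketched with boto3. This is only a minimal sketch under some assumptions: the Glue database name "google_analytics" is hypothetical (the question only shows the Redshift external schema name), the value "avro" is assumed because the question's files are Avro, and the helper names are made up for illustration. It reads the table definition, adds a classification parameter if none is set, and writes the definition back.

```python
from copy import deepcopy

# Keys accepted by Glue's TableInput structure. get_table also returns
# read-only fields (CreateTime, DatabaseName, ...) that update_table rejects,
# so only these keys are copied over.
TABLE_INPUT_KEYS = {
    "Name", "Description", "Owner", "LastAccessTime", "LastAnalyzedTime",
    "Retention", "StorageDescriptor", "PartitionKeys", "ViewOriginalText",
    "ViewExpandedText", "TableType", "Parameters", "TargetTable",
}


def with_classification(table, classification):
    """Build a TableInput dict from a get_table 'Table' dict.

    Sets Parameters['classification'] only when it is not already present.
    """
    table_input = {
        key: deepcopy(value)
        for key, value in table.items()
        if key in TABLE_INPUT_KEYS
    }
    params = table_input.get("Parameters") or {}
    params.setdefault("classification", classification)
    table_input["Parameters"] = params
    return table_input


def add_classification(database, table_name, classification):
    """Fetch the Glue table, add the parameter, and update the table."""
    import boto3  # imported here so the helper above works without AWS access

    glue = boto3.client("glue")
    table = glue.get_table(DatabaseName=database, Name=table_name)["Table"]
    glue.update_table(
        DatabaseName=database,
        TableInput=with_classification(table, classification),
    )


# Example call (hypothetical database name, assumed classification value):
# add_classification("google_analytics", "man", "avro")
```

Printing the Parameters of the get_table result first is a quick way to check whether the table really has an empty parameter map before changing anything.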
