I'm trying to unload table data from a PostgreSQL database into Amazon S3.

I'm aware that Redshift has an UNLOAD command for writing to S3. Since Redshift is based on PostgreSQL, I tried using the same command in my PostgreSQL database, but it was unsuccessful.

Can someone help me with unloading table data from PostgreSQL into S3 periodically?

  • COPY ... TO PROGRAM Commented Mar 10, 2017 at 7:12
  • And if you are not a superuser, just split it into two steps: copy to a file, then aws s3 cp the file to the bucket. Commented Mar 10, 2017 at 8:10

4 Answers


Redshift is based on PostgreSQL, but there is no 1-to-1 feature correspondence. If you want to load data from a PostgreSQL database into Redshift through S3, you should:

  1. Unload your data from PostgreSQL to a CSV file. To do that, use the \copy command of psql. See also this question.
  2. Copy the CSV file to S3. There are different ways to do that; check the documentation.
  3. Use the COPY command to load the data from S3 into Redshift.
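The three steps above can be sketched as a shell script. This is only a sketch: the host, database, table, bucket, and IAM role names are hypothetical, and it assumes psql and a configured AWS CLI are available, plus a Redshift role that can read from the bucket.

```shell
#!/bin/sh
# 1. Unload the PostgreSQL table to a local CSV file.
#    \copy runs client-side, so no superuser rights are needed on the server.
psql -h my-pg-host -d mydb \
  -c "\copy my_table TO 'my_table.csv' WITH (FORMAT csv, HEADER)"

# 2. Push the CSV file to S3 (AWS CLI must be configured with credentials).
aws s3 cp my_table.csv s3://my-bucket/exports/my_table.csv

# 3. Load the file from S3 into Redshift with COPY.
psql -h myhost.redshift.amazonaws.com -p 5439 -d mydb -c \
  "COPY redshift_schema.my_table
   FROM 's3://my-bucket/exports/my_table.csv'
   IAM_ROLE 'arn:aws:iam::123456789012:role/my-redshift-role'
   CSV IGNOREHEADER 1;"
```

To run this periodically, the script can be scheduled with cron, e.g. a crontab entry like 0 2 * * * /path/to/export.sh.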



On Redshift you can create a table to receive the data:

CREATE TABLE redshift_schema.redshift_table (...);

Then create a foreign data wrapper, a server, and a foreign table in PostgreSQL RDS that mirrors the Redshift table:

CREATE EXTENSION redshift_fdw;

-- optional:
--CREATE FOREIGN DATA WRAPPER redshift_fdw
--HANDLER postgres_fdw_handler
--VALIDATOR postgres_fdw_validator
--OPTIONS ();

CREATE SERVER redshift_server_mydb
FOREIGN DATA WRAPPER redshift_fdw
OPTIONS (dbname 'mydb', port '5439', connect_timeout '200000', host 'myhost.redshift.amazonaws.com');

CREATE USER MAPPING FOR mypguser
SERVER redshift_server_mydb
OPTIONS (user 'myrsuser', password 'mypassword');

IMPORT FOREIGN SCHEMA redshift_schema 
LIMIT TO (redshift_table) 
FROM SERVER redshift_server_mydb
INTO postgresql_schema;

Now in PostgreSQL you can (in a function, if you like) load the Redshift table (select, insert, update, delete) from the PostgreSQL table, without using dblink:

INSERT INTO postgresql_schema.redshift_table
SELECT *
FROM postgresql_schema.postgresql_table;
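As mentioned, the insert can be wrapped in a function so it is easy to call on a schedule. A sketch, with a hypothetical function name, run through psql:

```shell
psql -h my-pg-host -d mydb <<'SQL'
-- Runs in PostgreSQL; pushes local rows through the foreign table into Redshift.
CREATE OR REPLACE FUNCTION postgresql_schema.push_to_redshift()
RETURNS void
LANGUAGE plpgsql
AS $func$
BEGIN
    INSERT INTO postgresql_schema.redshift_table
    SELECT *
    FROM postgresql_schema.postgresql_table;
END;
$func$;

-- Call it whenever a refresh is needed:
-- SELECT postgresql_schema.push_to_redshift();
SQL
```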

Now when you look at the Redshift table, all the data is there, and you can UNLOAD the table to S3 as required.



I used the following snippet for that:

UNLOAD ('SELECT * FROM my_table')
TO 's3://my-bucket/path'
CREDENTIALS 'aws_access_key_id=*****;aws_secret_access_key=*****'
PARALLEL OFF
DELIMITER ','
ALLOWOVERWRITE ADDQUOTES HEADER ESCAPE GZIP;

You can also check the following links:

https://docs.aws.amazon.com/redshift/latest/dg/t_Unloading_tables.html
https://docs.aws.amazon.com/redshift/latest/dg/r_UNLOAD_command_examples.html



To export data from a PostgreSQL RDS instance to S3, use the aws_s3 extension: https://docs.aws.amazon.com/AmazonRDS/latest/UserGuide/postgresql-s3-export.html

CREATE EXTENSION aws_s3 CASCADE;

SELECT * from aws_s3.query_export_to_s3('select * from sample_table', 
   aws_commons.create_s3_uri('sample-bucket', 'sample-filepath', 'us-west-2')
);
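To run that export periodically, one option is to schedule the call with pg_cron. This is a sketch under assumptions: it presumes pg_cron is available and enabled on the RDS instance (via the shared_preload_libraries parameter group setting), and the job name and schedule are examples.

```shell
psql -h my-rds-host -d mydb <<'SQL'
CREATE EXTENSION IF NOT EXISTS pg_cron;

-- Export sample_table to S3 every hour, on the hour.
SELECT cron.schedule(
  'export-sample-table',
  '0 * * * *',
  $job$SELECT aws_s3.query_export_to_s3(
        'select * from sample_table',
        aws_commons.create_s3_uri('sample-bucket', 'sample-filepath', 'us-west-2'))$job$
);
SQL
```

Alternatively, an ordinary OS cron job running psql -c "SELECT aws_s3.query_export_to_s3(...)" achieves the same without pg_cron.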
