CDC: Pipeline 1 loads data for a list of tables from a replica database (RDS) into S3 (landing_zone), selecting rows by their timestamp columns (creation_date, updation_date).
If I create an RDS connection in Glue, do I still need to use JDBC directly? As mentioned, I need to pass the table names and filter conditions; otherwise I would have to load each full table and filter after loading.
connection_options = {
    "connectionName": "myconnection",
    "database": "dbname"
}

for table in tables_list:
    connection_options['dbtable'] = table
    # how to pass query if we filter here only
    datasource = glueContext.create_dynamic_frame.from_options(
        connection_type="postgresql",
        connection_options=connection_options
    )
    output_path = ""
    glueContext.write_dynamic_frame.from_options(
        frame=datasource,
        connection_type="s3",
        connection_options={"path": output_path},
        format="csv"
    )
Using a plain JDBC connection I am able to pass a query and get the filtered data, but the database is configured inside a VPC and cannot be reached without a VPN.
Is there a way to apply the filter while reading, or is it acceptable to load the full tables and filter afterwards?
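To make the goal concrete, here is a minimal sketch of the per-table filter I would like pushed down to the database. The function name `build_incremental_query`, the `last_run` watermark, and the placeholder table names are hypothetical and only for illustration; the column names come from the description above. As far as I can tell, Glue's JDBC connection options include a `sampleQuery` key that accepts a custom SQL statement, but whether it works together with `connectionName` in my setup is an assumption I have not verified.

```python
from datetime import datetime

# Timestamp columns named in the pipeline description.
TIMESTAMP_COLUMNS = ("creation_date", "updation_date")


def build_incremental_query(table: str, last_run: datetime) -> str:
    """Return a SELECT that keeps rows created or updated after last_run.

    Table names are assumed to come from a trusted, hard-coded list,
    so plain string formatting is used; do not do this with user input.
    """
    cutoff = last_run.strftime("%Y-%m-%d %H:%M:%S")
    predicate = " OR ".join(f"{col} > '{cutoff}'" for col in TIMESTAMP_COLUMNS)
    return f"SELECT * FROM {table} WHERE {predicate}"


if __name__ == "__main__":
    tables_list = ["orders", "customers"]  # placeholder table names
    last_run = datetime(2024, 1, 1)        # hypothetical watermark of the previous run
    for table in tables_list:
        # The resulting string is what I would like to hand to the reader,
        # e.g. via a query-style connection option, instead of only 'dbtable'.
        print(build_incremental_query(table, last_run))
```

The idea is that only the changed rows leave the database, rather than reading every table in full and discarding most of it in Glue.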